Search Results (1 - 25 of 463 Results)

Imbulgoda Liyangahawatte, Gihan Janith Mendis. Hardware Implementation and Applications of Deep Belief Networks
Master of Science in Engineering, University of Akron, 2016, Electrical Engineering
Deep learning is a subset of machine learning that contributes widely to the contemporary success of artificial intelligence. The essential idea of deep learning is to process complex data by abstracting hierarchical features via a deep neural network structure. As one type of deep learning technique, the deep belief network (DBN) has been widely used in various application fields. This thesis proposes an approximation-based hardware realization of DBNs that requires low hardware complexity. This thesis also explores a set of novel applications of the DBN-based classifier that benefit from a fast implementation of DBN. In my work, I have explored applications of DBNs in automatic modulation classification for cognitive radio, Doppler radar sensing for the detection and classification of micro unmanned aerial systems, cyber security applications that detect false data injection (FDI) attacks and localize flooding attacks, and prediction of link properties in social networks. The work in this thesis paves the way for further investigation and realization of deep learning techniques to address critical issues in various novel application fields.
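
The keyword list mentions a multiplierless digital architecture. As a general illustration of that idea (not the thesis's actual design; the quantization range, layer sizes and function names below are invented), the Python sketch rounds DBN layer weights to signed powers of two so that each multiply can become a shift-and-add in hardware.

```python
import numpy as np

def quantize_pow2(w, min_exp=-8, max_exp=0):
    """Round each weight to the nearest signed power of two (or zero)."""
    sign = np.sign(w)
    mag = np.abs(w)
    exp = np.clip(np.round(np.log2(np.maximum(mag, 2.0**min_exp))), min_exp, max_exp)
    q = sign * 2.0**exp
    q[mag < 2.0**(min_exp - 1)] = 0.0      # prune negligible weights
    return q

def layer_forward(x, w_q, b):
    """One DBN layer forward pass; with power-of-two weights each product
    x * 2**e corresponds to a bit shift in a hardware datapath."""
    return 1.0 / (1.0 + np.exp(-(x @ w_q + b)))   # logistic activation

# toy usage with made-up sizes
rng = np.random.default_rng(0)
w = rng.normal(scale=0.3, size=(16, 8))
x = rng.random((4, 16))
h = layer_forward(x, quantize_pow2(w), np.zeros(8))
print(h.shape)  # (4, 8)
```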

Committee:

Jin Wei (Advisor); Arjuna Madanayaka (Committee Co-Chair); Subramaniya Hariharan (Committee Member)

Subjects:

Artificial Intelligence; Computer Engineering; Electrical Engineering; Engineering; Experiments; Information Technology

Keywords:

deep belief networks; multiplierless digital architecture; Xilinx FPGA implementations; low-complexity; applications of deep belief networks; spectral correlation function; modulation classification; drone detection; doppler radar; cyber security

Chippa, Mukesh K. Goal-seeking Decision Support System to Empower Personal Wellness Management
Doctor of Philosophy, University of Akron, 2016, Computer Engineering
Obesity has reached epidemic proportions globally, with more than one billion adults overweight and at least three hundred million of them clinically obese; it is a major contributor to the global burden of chronic disease and disability. It is also associated with rising health care costs; in the USA more than 75% of health care costs relate to chronic conditions such as diabetes and hypertension. While there are various technological advancements in fitness tracking devices such as Fitbit, and many employers offer wellness programs, such programs and devices have not been able to create societal-scale transformations in the lifestyle of their users. The challenge of keeping healthy people healthy and helping them become intrinsically motivated to manage their own health is the focus of this investigation of Personal Wellness Management. In this dissertation, the problem is presented as decision making under uncertainty, where the participant takes an action at discrete time steps and the outcome of the action is uncertain. The main focus is to formulate the decision-making problem in the Goal-seeking framework. To evaluate this formulation, the problem was also formulated in two classical sequential decision-making frameworks --- the Markov Decision Process and the Partially Observable Markov Decision Process. The sequential decision-making frameworks allow us to compute optimal policies to guide the participants' choice of actions. One of the major challenges in formulating the wellness management problem in these frameworks is the need for clinically validated data. While it is unrealistic to find such experimentally validated data, it is also not clear that the models in fact capture all the constraints that are necessary to make the optimal solutions effective for the participant. The Goal-seeking framework offers an alternative approach that does not require explicit modeling of the participant or the environment. This dissertation presents a software system that is designed in the Goal-seeking framework. The architecture of the system is extensible. A modular subsystem that is useful for visualizing exercise performance data gathered from a Kinect camera is described.
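
For readers unfamiliar with the classical frameworks named above, the minimal Python sketch below runs value iteration on a toy, entirely hypothetical wellness MDP; the states, actions, transition probabilities and rewards are invented for illustration and are not from the dissertation.

```python
import numpy as np

# Hypothetical toy model: states are coarse wellness levels, actions are
# daily choices; probabilities and rewards are made up for illustration.
states = ["sedentary", "moderately_active", "active"]
actions = ["rest", "exercise"]

P = {  # P[a][s] = probability distribution over next states
    "rest":     np.array([[0.9, 0.1, 0.0],
                          [0.4, 0.5, 0.1],
                          [0.1, 0.5, 0.4]]),
    "exercise": np.array([[0.3, 0.6, 0.1],
                          [0.1, 0.5, 0.4],
                          [0.0, 0.2, 0.8]]),
}
R = {"rest": np.array([0.0, 0.2, 0.4]),       # immediate reward per state
     "exercise": np.array([0.1, 0.5, 0.9])}

def value_iteration(gamma=0.95, tol=1e-8):
    """Compute the optimal value function and greedy policy."""
    V = np.zeros(len(states))
    while True:
        Q = np.stack([R[a] + gamma * P[a] @ V for a in actions])
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, [actions[i] for i in Q.argmax(axis=0)]
        V = V_new

V, policy = value_iteration()
print(dict(zip(states, policy)))
```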

Committee:

Shivakumar Sastry, Dr (Advisor); Nghi Tran, Dr (Committee Member); Igor Tsukerman, Dr (Committee Member); William Schneider IV, Dr (Committee Member); Victor Pinheiro, Dr (Committee Member)

Subjects:

Computer Engineering

Keywords:

decision support system, personalized wellness management, Goal seeking paradigm, markov decision process, partially observable markov decision process

Scott, Kevon K. Occlusion-Aware Sensing and Coverage in Unmanned Aerial Vehicle (UAV) Networks
MS, University of Cincinnati, 2016, Engineering and Applied Science: Computer Engineering
The use of small and miniature Unmanned Aerial Vehicles (UAVs) for remote sensing and surveillance applications has become increasingly popular in the last two decades. Networks of UAVs, capable of providing flexible aerial views over large areas, are playing important roles in today's distributed sensing systems. Since camera sensors are sensitive to occlusions, it is more challenging to deploy UAVs for sensing in geometrically complex environments, such as dense urban areas and mountainous terrains. The intermittent connectivity in a sparse UAV network also makes it challenging to efficiently gather sensed multimedia data. This thesis is composed of two pieces of work. In the first piece of work, a new occlusion-aware UAV coverage technique is proposed, with the objective of sensing a target area with satisfactory spatial resolution subject to the energy constraints of the UAVs. An occlusion-aware waypoint generation algorithm is first designed to find the best set of waypoints for taking pictures in a target area. The selected waypoints are then assigned to multiple UAVs by solving a vehicle routing problem (VRP), which is formulated to minimize the maximum energy for the UAVs to travel through the waypoints. A genetic algorithm is designed to solve the VRP. Evaluation results show that the proposed coverage technique can reduce energy consumption while achieving better coverage than traditional coverage path planning techniques for UAVs. In the second piece of work, a communication scheme is designed to deliver the images sensed by a set of mobile survey UAVs to a static base station with the assistance of a relay UAV. Given the planned routes of the survey UAVs, a set of relay waypoints is found for the relay UAV to meet the survey UAVs and receive the sensed images. An Online Message Relaying technique (OMR) is proposed to schedule the relay UAV to collect images. Without any global collaboration between the relay UAV and the survey UAVs, OMR utilizes a Markov decision process (MDP) to determine the best schedules for the relay UAV such that the image acquisition rate is maximized. Evaluation results show that the proposed relaying technique outperforms traditional relaying techniques, such as traveling salesman problem (TSP)-based relaying and random walk, in terms of end-to-end delay and frame delivery ratio.
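
The min-max VRP step described above can be illustrated with a small genetic algorithm. The Python sketch below is illustrative only and is not the thesis's algorithm: the waypoint coordinates, genetic operators and parameters are invented, tour cost stands in for travel energy, and each UAV's route is ordered by a simple nearest-neighbour heuristic.

```python
import numpy as np

rng = np.random.default_rng(1)

def route_length(points, depot):
    """Greedy nearest-neighbour tour from the depot through `points` and back."""
    if len(points) == 0:
        return 0.0
    remaining = list(range(len(points)))
    pos, total = depot, 0.0
    while remaining:
        d = [np.linalg.norm(points[i] - pos) for i in remaining]
        j = int(np.argmin(d))
        total += d[j]
        pos = points[remaining.pop(j)]
    return total + np.linalg.norm(pos - depot)

def max_energy(assign, waypoints, depot, k):
    """Min-max objective: the largest per-UAV tour length."""
    return max(route_length(waypoints[assign == u], depot) for u in range(k))

def ga_assign(waypoints, depot, k=3, pop=40, gens=200, pmut=0.1):
    """Evolve waypoint-to-UAV assignments that minimize the worst tour."""
    n = len(waypoints)
    population = rng.integers(0, k, size=(pop, n))
    for _ in range(gens):
        fitness = np.array([max_energy(ind, waypoints, depot, k) for ind in population])
        parents = population[np.argsort(fitness)[: pop // 2]]   # truncation selection
        children = []
        while len(children) < pop - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n)
            child = np.concatenate([a[:cut], b[cut:]])           # one-point crossover
            mask = rng.random(n) < pmut                          # random reassignment mutation
            child[mask] = rng.integers(0, k, size=mask.sum())
            children.append(child)
        population = np.vstack([parents, children])
    fitness = np.array([max_energy(ind, waypoints, depot, k) for ind in population])
    return population[int(np.argmin(fitness))], float(fitness.min())

waypoints = rng.random((30, 2)) * 100.0
assignment, worst_route = ga_assign(waypoints, depot=np.zeros(2))
print(worst_route)
```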

Committee:

Rui Dai, Ph.D. (Committee Chair); Dharma Agrawal, D.Sc. (Committee Member); Carla Purdy, Ph.D. (Committee Member)

Subjects:

Computer Engineering

Keywords:

Unmanned Aerial Vehicle;UAV;Occlusion;FANET;Flying Ad-Hoc Networks;Remote Sensing

Abu Doleh, Anas. High Performance and Scalable Matching and Assembly of Biological Sequences
Doctor of Philosophy, The Ohio State University, 2016, Electrical and Computer Engineering
Next Generation Sequencing (NGS), the massively parallel and low-cost sequencing technology, is able to generate an enormous amount of sequencing data. This facilitates the discovery of new genomic sequences and expands biological and medical research. However, these big advancements in the technology also bring big computational challenges. In almost all NGS analysis pipelines, the most crucial and computationally intensive tasks are sequence similarity searching and de novo genome assembly. Thus, in this work, we introduce novel and efficient techniques that utilize advancements in High Performance Computing hardware and data computing platforms in order to accelerate these tasks while producing high quality results. For sequence similarity search, we have studied utilizing massively multithreaded architectures, such as the Graphical Processing Unit (GPU), to accelerate and solve two important problems: reads mapping and maximal exact matching. Firstly, we introduced a new mapping tool, Masher, which processes long (and short) reads efficiently and accurately. Masher employs a novel indexing technique that produces an index for a huge genome, such as the human genome, with a memory footprint small enough that it can be stored and efficiently accessed on a restricted-memory device such as a GPU. The results show that Masher is faster than state-of-the-art tools and obtains good accuracy and sensitivity on sequencing data with various characteristics. Secondly, the maximal exact matching problem has been studied because of its importance in detecting and evaluating the similarity between sequences. We introduced a novel tool, GPUMEM, which efficiently utilizes the GPU to build a lightweight index and find maximal exact matches between two genome sequences. The index construction is so fast that, even including its time, GPUMEM is faster in practice than state-of-the-art tools that use a pre-built index. De novo genome assembly is a crucial step in NGS analysis because of the novelty of discovered sequences. Firstly, we have studied parallelizing de Bruijn graph based de novo genome assembly on distributed memory systems using the Spark framework and the GraphX API. We proposed a new tool, Spaler, which assembles short reads efficiently and accurately. Spaler starts with the de Bruijn graph construction. Then, it applies iterative graph reduction and simplification techniques to generate contigs. After that, Spaler uses the reads mapping information to produce scaffolds. Spaler employs smart parallelism level tuning techniques to improve the performance of each of these steps independently. The experiments show promising results in terms of scalability, execution time and quality. Secondly, we addressed the problem of de novo metagenomics assembly. Spaler may not properly assemble sequenced data extracted from environmental samples because of the complexity and diversity of the living microbial communities. Thus, we introduced meta-Spaler, an extension of Spaler, to handle metagenomics datasets. meta-Spaler partitions the reads based on their expected coverage and applies an iterative assembly. The results show an improvement in the assembly quality of meta-Spaler in comparison to the assembly of Spaler.
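
The de Bruijn graph construction that Spaler starts from can be shown in a few lines. The sketch below is a toy, single-machine Python illustration, not Spaler's Spark/GraphX implementation; the reads and k value are made up. It builds k-mer edges from reads and flags the unambiguous nodes that a simplification pass could merge.

```python
from collections import defaultdict

def de_bruijn_graph(reads, k=5):
    """Build a de Bruijn graph: nodes are (k-1)-mers, edges are k-mers
    with multiplicity (coverage) counts."""
    edges = defaultdict(int)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            edges[(kmer[:-1], kmer[1:])] += 1
    return edges

def mergeable_nodes(edges):
    """List nodes with exactly one in-edge and one out-edge; compressing
    such chains is the kind of simplification an assembler iterates on."""
    out_deg, in_deg = defaultdict(int), defaultdict(int)
    for (u, v) in edges:
        out_deg[u] += 1
        in_deg[v] += 1
    return [u for u in out_deg if out_deg[u] == 1 and in_deg[u] == 1]

reads = ["ATGGCGTGCA", "GGCGTGCAAT", "GTGCAATCCA"]
g = de_bruijn_graph(reads, k=5)
print(len(g), "distinct k-mer edges;", len(mergeable_nodes(g)), "mergeable nodes")
```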

Committee:

Umit Catalyurek (Advisor); Kun Huang (Committee Member); Fusun Ozguner (Committee Member)

Subjects:

Bioinformatics; Computer Engineering

Keywords:

bioinformatics;sequence similarity;indexing;graphical processing unit;Apache Spark;de Bruijn graph;de novo assembly;metagenomics

Wang, Mingyang. Improving Performance And Reliability Of Flash Memory Based Solid State Storage Systems
PhD, University of Cincinnati, 2016, Engineering and Applied Science: Computer Science and Engineering
Flash memory based Solid State Disk systems (SSDs) are becoming increasingly popular in enterprise applications where high performance and high reliability are paramount. While SSDs outperform traditional Hard Disk Drives (HDDs) in read and write operations, they pose some unique and serious challenges to I/O and file system designers. The performance of an SSD has been found to be sensitive to access patterns. Specifically, read operations perform much faster than writes, and sequential accesses deliver much higher performance than random accesses. The unique properties of SSDs, together with the asymmetric overheads of different operations, imply that many traditional solutions tailored for HDDs may not work well for SSDs. The close relation between performance overhead and access patterns motivates us to design a series of novel algorithms for I/O scheduling and buffer cache management. By exploiting refined access patterns such as sequential, page clustering, and block clustering in a per-process, per-file manner, this series of innovative I/O scheduler and buffer cache algorithms can deliver higher performance from the file system and SSD devices. Beyond the performance issues, SSDs also face some unique reliability challenges due to the natural properties of flash memory. Along with the well-known write-endurance limit, flash memory also suffers from read-disturb and write-disturb. Even repeatedly reading from an SSD may cause data corruption, because the read voltage may stress neighboring memory cells. As the density of flash memory keeps increasing, the disturb problems make it ever harder for memory cells to store data reliably. One of the structural merits of an SSD is its internal parallelism. Such parallelism of flash memory chips can be exploited to support data redundancy in a similar fashion to traditional HDD RAID. Recently, emerging non-volatile memories (NVM) such as PCM have been receiving increasing research interest, as they outperform flash memory by providing in-place updates and better performance and reliability. Hybrid solutions, which combine both flash memory and NVM to balance performance and cost, are under special investigation to address the reliability and performance issues of flash memory based storage systems. To address the reliability concerns, we present a novel storage architecture called i-RAID (internal RAID) that introduces RAID-like parity-based redundancy while avoiding many of its problems. What makes i-RAID unique is its deferred parity maintenance, selective RAID protection and dynamic RAID organization. It solves traditional RAID’s small update problem and avoids SSD RAID pitfalls. Unlike traditional disk drives, SSDs cannot perform in-place updates. We view this unique characteristic as an opportunity instead of a hurdle. The out-of-place update feature means that old data will not be over-written by new data, which enables us to design some fundamentally new algorithms that defer the computing and updating of parity blocks until garbage collection time, thereby significantly reducing the overhead and possibly increasing the lifetime of SSDs. Our algorithms also dynamically and selectively construct parity stripes only on aged, error-prone blocks, and utilize the internal parallelism of SSDs to further improve performance.
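
The deferred-parity idea can be illustrated in a few lines. The toy Python sketch below is my own illustration under simplifying assumptions, not the i-RAID implementation; the class, page identifiers and data values are hypothetical. It buffers out-of-place writes and computes XOR parity only over the live pages at garbage-collection time.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bitwise XOR of equal-sized data blocks (RAID-5 style parity)."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

class DeferredParityStripe:
    """Toy model of deferred parity: because flash is written out of place,
    parity for a stripe is computed only when the garbage collector
    consolidates the latest version of each page, not on every small update."""
    def __init__(self):
        self.latest = {}                    # logical page -> newest data

    def write(self, page, data):
        self.latest[page] = data            # out-of-place update, no parity I/O

    def garbage_collect(self):
        live = list(self.latest.values())
        parity = xor_blocks(live) if live else b""
        return live, parity                 # parity written once, at GC time

stripe = DeferredParityStripe()
stripe.write(0, b"\x01\x02\x03\x04")
stripe.write(1, b"\x10\x20\x30\x40")
stripe.write(0, b"\xAA\xBB\xCC\xDD")        # logically overwrites page 0
live_pages, parity = stripe.garbage_collect()
print(parity.hex())
```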

Committee:

Yiming Hu, Ph.D. (Committee Chair); Kenneth Berman, Ph.D. (Committee Member); Karen Davis, Ph.D. (Committee Member); Wen-Ben Jone, Ph.D. (Committee Member); Carla Purdy, Ph.D. (Committee Member)

Subjects:

Computer Engineering

Keywords:

Flash Memory;RAID;Solid State Disk;Non Volatile Memory;Write Endurance;Read Write Disturb

Yuan, Yuan. Advanced Concurrency Control Algorithm Design and GPU System Support for High Performance In-Memory Data Management
Doctor of Philosophy, The Ohio State University, 2016, Computer Science and Engineering
The design and implementation of data management systems have been significantly affected by application demands and hardware advancements. On one hand, with the emergence of various new applications, the traditional one-size-fits-all data management system has evolved into domain-specific systems optimized for each application (e.g., OLTP, OLAP, streaming, etc.). On the other hand, with increasing memory capacity and advancements in multi-core CPUs and massively parallel co-processors (e.g., GPUs), the performance bottleneck of data management systems has shifted from I/O to memory accesses, which has led to a redesign of data management systems for memory-resident data. Although many in-memory systems have been developed to deliver much better performance than disk-based systems, they all face the challenge of how to maximize the system’s performance through massive parallelism. In this Ph.D. dissertation, we explore how to design high performance in-memory data management systems for massively parallel processors. We have identified three critical issues in in-memory data processing. First, the Optimistic Concurrency Control (OCC) method has been commonly used in in-memory databases to ensure transaction serializability. Although OCC achieves high performance at low contention, it causes a large number of unnecessary transaction aborts at high contention, which wastes system resources and significantly degrades database throughput. To solve the problem, we propose a new concurrency control method named Balanced Concurrency Control (BCC) that aborts transactions more accurately while maintaining OCC’s merits at low contention. Second, we study how to use massively parallel co-processors (GPUs) to improve the performance of in-memory analytical systems. Existing works have demonstrated the GPU's performance advantage over the CPU on simple analytical operations (e.g., join), but it is unclear how to optimize complex queries with various optimizations. To address the issue, we comprehensively examine analytical query behaviors on GPUs and design a new GPU in-memory analytical system to efficiently execute complex analytical workloads. Third, we investigate how to use GPUs to accelerate the performance of various analytical applications on production-level distributed in-memory data processing systems. Most existing GPU works adopt a GPU-centric design, which completely redesigns a system for GPUs without considering the performance of CPU operations. It is unclear how much a CPU-optimized, distributed in-memory data processing system can benefit from GPUs. To answer the question, we use Apache Spark as a platform and design Spark-GPU, which addresses a set of real-world challenges incurred by the mismatches between Spark and the GPU. Our research includes both algorithm design and system design and implementation in the form of open source software.
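
The validate-at-commit behaviour that makes OCC abort under contention is easy to see in miniature. The Python sketch below is an illustrative toy, not the dissertation's BCC or database code; all class, key and method names are invented. It buffers writes, validates read versions at commit, and aborts the second of two conflicting transactions.

```python
import threading

class OCCStore:
    """Minimal optimistic concurrency control: transactions read versioned
    values, buffer writes, and validate their read set at commit; a changed
    version means another writer intervened, so the transaction aborts."""
    def __init__(self):
        self.data = {}                       # key -> (value, version)
        self.lock = threading.Lock()

    def begin(self):
        return {"reads": {}, "writes": {}}

    def read(self, txn, key):
        value, version = self.data.get(key, (None, 0))
        txn["reads"][key] = version
        return value

    def write(self, txn, key, value):
        txn["writes"][key] = value

    def commit(self, txn):
        with self.lock:                      # validation + write phase
            for key, seen_version in txn["reads"].items():
                if self.data.get(key, (None, 0))[1] != seen_version:
                    return False             # conflict detected -> abort
            for key, value in txn["writes"].items():
                _, version = self.data.get(key, (None, 0))
                self.data[key] = (value, version + 1)
            return True

store = OCCStore()
t1, t2 = store.begin(), store.begin()
store.read(t1, "x"); store.write(t1, "x", 1)
store.read(t2, "x"); store.write(t2, "x", 2)
print(store.commit(t1), store.commit(t2))   # True False: second txn aborts
```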

Committee:

Xiaodong Zhang (Advisor)

Subjects:

Computer Engineering; Computer Science

Keywords:

Advanced Concurrency Control Algorithm Design, GPU System Support, High Performance In-Memory Data Management

Alsubail, Rayan A. Aesthetics vs. Functionality in User Prompt Design: A Mobile Interface Usability Study on the iOS Touch ID Feature
Master of Science (MS), Ohio University, 2015, Computer Science (Engineering and Technology)
The usability of smartphone software presents unique challenges as compared to desktop software. Both aesthetics and functionality play an important role in mobile interface design. In this paper, we examined the usability of the iOS Touch ID feature with different user prompts. We compared three different types of user prompt designs for the Touch ID feature: a user prompt with no guidance (NG), a user prompt with an aesthetic-first guidance design (AF), and a user prompt with a functionality-first guidance design (FF). An experiment with 30 participants showed an improvement for 90% of them when using the FF prompt for the fingerprint inputs, as compared to using the AF prompt. Additionally, fingerprint inputs improved for all participants using the FF prompt as compared to the NG prompt. We concluded that user prompt designs do have a material impact on the usability of mobile software, and that functionality rather than aesthetics should be the primary consideration in user prompt design.

Committee:

Chang Liu (Advisor); Frank Drews (Committee Member); Jundong Liu (Committee Member)

Subjects:

Computer Engineering; Computer Science; Experiments

Keywords:

Usability; Touch ID Usability; Touch ID; iPhone Usability; Interface Design; User Prompt Design; Aesthetic; Aesthetic Design; Aesthetic and Usability

Sedaghati Mokhtari, Naseraddin. Performance Optimization of Memory-Bound Programs on Data Parallel Accelerators
Doctor of Philosophy, The Ohio State University, 2016, Computer Science and Engineering
High performance applications depend on high utilization of memory bandwidth and computing resources, and data parallel accelerators have proven to be very effective in providing both, when needed. However, memory-bound programs push the limits of system bandwidth, causing under-utilization of computing resources and thus energy-inefficient execution. The objective of this research is to investigate opportunities on data parallel accelerators (i.e., SIMD units and GPUs) and design solutions for improving the performance of three classes of memory-bound applications: stencil computation, sparse matrix-vector multiplication (SpMV) and graph analytics. This research first focuses on performance bottlenecks of stencil computations on short-vector SIMD ISAs and presents StVEC, a hardware-based solution for extending the vector ISA and improving data movement and bandwidth utilization. StVEC includes an extension to the standard addressing mode of vector floating-point instructions in contemporary vector ISAs (e.g., SSE, AVX, VMX). A code generation approach is designed and implemented to help a vectorizing compiler generate code for processors with StVEC extensions. Using an optimistic as well as a pessimistic emulation of the proposed StVEC instructions, it is shown that the proposed solution can be effective on top of SSE- and AVX-capable processors. To analyze hardware overhead, parts of the proposed design are synthesized using a 45nm CMOS library and shown to have minimal impact on processor cycle time. For the second class of memory-bound programs, this research focuses on sparse matrix-vector multiplication (SpMV) on GPUs and shows that no sparse matrix representation is consistently superior, with the best representation depending on the matrix sparsity patterns. This part focuses on four standard sparse representations (i.e., CSR, ELL, COO and a hybrid ELL-COO) and studies the correlations between SpMV performance and the sparsity features. The research then uses machine learning techniques to automatically select the best sparse representation for a given matrix. Extensive characterization of pertinent sparsity features is performed on around 700 sparse matrices and their SpMV performance with different sparse representations. Applying learning to such a rich dataset leads to a decision model that automatically selects the best representation for a given sparse matrix on a given target GPU. Experimental results on three GPUs demonstrate that the approach is very effective in selecting the best representation. The last part is dedicated to characterizing the performance of graph processing systems on GPUs. It focuses on a vertex-centric graph programming framework (Virtual Warp Centric, VWC), and characterizes performance bottlenecks when running different graph primitives. The analysis shows how sensitive the VWC parameter is to the input graph and signifies the importance of selecting the correct warp size in order to avoid performance penalties. The study also applies machine learning techniques to the input dataset in order to predict the best VWC configuration for a given graph. It shows the applicability of simple machine learning models to improve performance and reduce the auto-tuning time for graph algorithms on GPUs.
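
To make the representation-selection idea concrete, the Python sketch below computes the kind of per-row sparsity features such a model might consume and applies a hand-written rule in place of the learned decision model; the thresholds, feature names and matrix are invented for illustration and are not from the dissertation.

```python
import numpy as np

def sparsity_features(dense):
    """Per-matrix features of the sort used to pick a representation:
    average and maximum non-zeros per row, plus their variability."""
    nnz_per_row = (dense != 0).sum(axis=1)
    return {
        "mean_nnz_row": float(nnz_per_row.mean()),
        "max_nnz_row": int(nnz_per_row.max()),
        "cv_nnz_row": float(nnz_per_row.std() / max(nnz_per_row.mean(), 1e-9)),
    }

def choose_format(features):
    """Hand-written stand-in for a learned decision model: ELL likes uniform
    rows, CSR tolerates skew, a hybrid suits very irregular rows."""
    if features["cv_nnz_row"] < 0.2:
        return "ELL"
    if features["max_nnz_row"] > 4 * features["mean_nnz_row"]:
        return "HYB (ELL + COO)"
    return "CSR"

rng = np.random.default_rng(0)
A = rng.random((1000, 1000)) * (rng.random((1000, 1000)) < 0.01)   # ~1% non-zeros
print(choose_format(sparsity_features(A)))
```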

Committee:

Ponnuswamy Sadayappan (Advisor); Louis-Noel Pouchet (Committee Member); Mircea-Radu Teodorescu (Committee Member); Atanas Ivanov Rountev (Committee Member)

Subjects:

Computer Engineering; Computer Science; Engineering

Keywords:

Stencil Computation, GPU, CUDA, SpMV, Graph Processing, Performance Analysis, SIMD

Wang, Kaibo. Algorithmic and Software System Support to Accelerate Data Processing in CPU-GPU Hybrid Computing Environments
Doctor of Philosophy, The Ohio State University, 2015, Computer Science and Engineering
Massively data-parallel processors, Graphics Processing Units (GPUs) in particular, have recently entered the mainstream of general-purpose computing as powerful hardware accelerators for a large scope of applications including databases, medical informatics, and big data analytics. However, despite their performance benefit and cost effectiveness, the utilization of GPUs in production systems still remains limited. A major reason behind this situation is the slow development of the supporting GPU software ecosystem. More specifically, (1) CPU-optimized algorithms for some critical computation problems have irregular memory access patterns with intensive control flows, which cannot be easily ported to GPUs to take full advantage of their fine-grained, massively data-parallel architecture; (2) commodity computing environments are inherently concurrent and require coordinated resource sharing to maximize throughput, while existing systems are still mainly designed for dedicated usage of GPU resources. In this Ph.D. dissertation, we develop efficient software solutions to support the adoption of massively data-parallel processors in general-purpose commodity computing systems. Our research mainly focuses on the following areas. First, to make a strong case for GPUs as indispensable accelerators, we apply GPUs to significantly improve the performance of spatial data cross-comparison in digital pathology analysis. Instead of trying to port existing CPU-based algorithms to GPUs, we design a new algorithm and fully optimize it for the GPU's hardware architecture to achieve high performance. Second, we propose operating system support for automatic device memory management to improve the usability and performance of GPUs in shared general-purpose computing environments. Several effective optimization techniques are employed to ensure the efficient usage of GPU device memory space and to achieve high throughput. Finally, we develop resource management facilities in GPU database systems to support concurrent analytical query processing. By allowing multiple queries to execute simultaneously, the resource utilization of GPUs can be greatly improved. It also enables GPU databases to be utilized in important application areas where multiple user queries need to make continuous progress simultaneously.

Committee:

Xiaodong Zhang (Advisor); P. Sadayappan (Committee Member); Christopher Stewart (Committee Member); Harald Vaessin (Committee Member)

Subjects:

Computer Engineering; Computer Science

Keywords:

GPUs, Memory Management, Operating Systems, GPU Databases, Resource Management, Digital Pathology

Chen, Lin. MEASUREMENTS OF AUTOCORRELATION FUNCTIONS USING A COMBINATION OF INTRA- AND INTER-PULSES
Master of Science, Miami University, 2015, Computational Science and Engineering
Incoherent scatter radar (ISR) is a versatile tool for studying the ionosphere by measuring the autocorrelation function (ACF). An accurate ACF in the E-region is difficult to obtain because the relatively short range limits the length of a pulse, and the short correlation time of the ionosphere renders correlation using the pulse-to-pulse technique useless. In this thesis, we study a method that combines intra-pulse and inter-pulse techniques and apply it to data taken at the Arecibo Observatory. We show simultaneously measured ACFs at short and long lags and summarize the merits of the ACF. Applications of the ACF and its advantages are discussed. The technique used here will make the derivation of ionospheric parameters more accurate.
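
As a rough illustration of the two lag regimes described above (not the thesis's actual processing; the sample arrays, shapes and lag choices are invented), the numpy sketch below estimates short lags from products within one pulse and long lags from products across successive pulses at the same range gate.

```python
import numpy as np

def acf_short_lags(samples, max_lag):
    """Intra-pulse estimate: lagged products within one pulse's complex
    voltage samples, averaged over the available pairs."""
    x = np.asarray(samples)
    return np.array([np.mean(x[: len(x) - lag] * np.conj(x[lag:]))
                     for lag in range(max_lag + 1)])

def acf_long_lags(pulses, lags):
    """Inter-pulse estimate: products between samples taken at the same
    range gates on successive pulses, one estimate per inter-pulse lag."""
    p = np.asarray(pulses)                 # shape: (num_pulses, num_range_gates)
    return np.array([np.mean(p[: len(p) - lag] * np.conj(p[lag:]))
                     for lag in lags])

rng = np.random.default_rng(0)
pulse = rng.normal(size=64) + 1j * rng.normal(size=64)
pulses = rng.normal(size=(32, 64)) + 1j * rng.normal(size=(32, 64))
print(acf_short_lags(pulse, 5).shape, acf_long_lags(pulses, [1, 2, 3]).shape)
```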

Committee:

Qihou Zhou (Advisor); Chi-Hao Cheng (Committee Member); Dmitriy Garmatyuk (Committee Member)

Subjects:

Aeronomy; Aerospace Engineering; Computer Engineering; Computer Science; Earth; Radiology

Keywords:

Incoherent scatter radar; ionosphere; E-region; parameters; autocorrelation function; accurate

Lipkin, Ilya. Testing Software Development Project Productivity Model
Doctor of Philosophy in Manufacturing and Technology Management, University of Toledo, 2011, Manufacturing and Technology Management

Software development is an increasingly influential factor in today’s business environment, and a major issue affecting software development is how an organization estimates projects. If the organization underestimates cost, schedule, and quality requirements, the end results will not meet customer needs. On the other hand, if the organization overestimates these criteria, resources that could have been used more profitably will be wasted.

There is no accurate model or measure available that can guide an organization in estimating software development, with existing estimation models often underestimating software development effort by as much as 500 to 600 percent. To address this issue, existing models are usually calibrated using local data with small sample sizes, but the resulting estimates do not offer improved cost analysis.

This study presents a conceptual model for accurately estimating software development, based on an extensive literature review and theoretical analysis based on Sociotechnical Systems (STS) theory. The conceptual model serves as a solution to bridge organizational and technological factors and is validated using an empirical dataset provided by the DoD.

Practical implications of this study allow for practitioners to concentrate on specific constructs of interest that provide the best value for the least amount of time. This study outlines key contributing constructs that are unique for Software Size E-SLOC, Man-hours Spent, and Quality of the Product, those constructs having the largest contribution to project productivity. This study discusses customer characteristics and provides a framework for a simplified project analysis for source selection evaluation and audit task reviews for the customers and suppliers.

Theoretical contributions of this study provide an initial theory-based hypothesized project productivity model that can be used as a generic overall model across several application domains, such as IT, Command and Control, and Simulation. This research validates findings from previous work concerning software project productivity and leverages those results in this study. The hypothesized project productivity model provides statistical support and validation of expert opinions used by practitioners in the field of software project estimation.

Committee:

Jeen Su Lim (Committee Chair); James Pope (Committee Member); Michael Mallin (Committee Member); Michael Jakobson (Committee Member); Wilson Rosa (Advisor)

Subjects:

Aerospace Engineering; Armed Forces; Artificial Intelligence; Business Administration; Business Costs; Computer Engineering; Computer Science; Economic Theory; Economics; Electrical Engineering; Engineering; Industrial Engineering; Information Science; Information Systems; Information Technology; Management; Marketing; Mathematics

Keywords:

"Software Estimation"; "Software Cost Model"; "Department of Defense Data"; COCOMO; "Software Project Productivity Model"

Zheng, Yu. Low-cost and Robust Countermeasures against Counterfeit Integrated Circuits
Doctor of Philosophy, Case Western Reserve University, 2015, EECS - Computer Engineering
Counterfeit integrated circuits (ICs) in the supply chain have emerged as a major threat to the semiconductor industry, with serious potential consequences such as reliability degradation of the end product and revenue/reputation loss for the original manufacturer. Counterfeit ICs come in various forms, including aged chips resold in the market, remarked/defective dies, and cloned unauthorized copies. In many cases, these ICs have only minor functional, structural and parametric deviations from genuine ones, which makes them extremely difficult to isolate through conventional testing approaches. On the other hand, existing design approaches that aim at facilitating identification of counterfeit chips often incur unacceptable design and test costs. In this thesis, we present novel low-overhead and robust solutions for addressing various forms of counterfeiting attacks in ICs. The solutions presented here fall into two classes: (1) test methods to isolate counterfeit chips, in particular cloned or recycled ones; and (2) design methods to authenticate each IC instance with a unique signature from each chip. The first set of solutions is based on constructing robust fingerprints of genuine chips through parametric analysis after mitigating process variations. The second set of solutions is based on novel low-cost physical unclonable functions (PUFs) that create a unique and random signature from a chip for reliable identification of counterfeit instances. We propose two test methods with complementary capabilities. The first primarily targets cloned ICs by constructing the fingerprint from scan path delays. It uses the scan chain, a prevalent design-for-testability (DFT) structure, to create a robust authentication signature. A practical method based on clock phase sweep is proposed to measure the small delays of scan paths with high resolution. The second targets isolation of aged chips under large inter- and intra-die process variations without the need for any golden chips. It is based on comparing dynamic current fingerprints from two adjacent and self-similar modules (e.g., different parts of an adder) which experience differential aging. We propose two delay-based PUFs built into the scan chain which convert scan path delays into a robust authentication signature without affecting testability. Another novel PUF structure is realized in an embedded SRAM array, an integral component in modern processors and systems-on-chip (SoCs), with virtually no design modification. It leverages voltage-dependent memory access failures (during write) to produce a large volume of high-quality challenge-response pairs. Since many modern ICs integrate SRAM arrays of varying sizes with isolated power grids, the proposed PUF can be easily retrofitted into these chips. Finally, we extend our work to authenticate printed circuit boards (PCBs) based on extraction of boundary-scan path delay signatures from each PCB. The proposed approach exploits the standard boundary scan architecture based on the IEEE 1149.1 standard to create a unique signature for each PCB. The design and test approaches are validated through extensive simulations and hardware measurements, whenever possible. These approaches can be effectively integrated to provide nearly comprehensive protection against various forms of counterfeiting attacks in ICs and PCBs.

Committee:

Swarup Bhunia (Committee Chair); Christos Papachristou (Committee Member); Frank Merat (Committee Member); Philip Feng (Committee Member); Christian Zorman (Committee Member)

Subjects:

Computer Engineering

Keywords:

Counterfeit Integrated Circuits; Chip Fingerprint; Physical Unclonable Function; Golden-free Detection; Design for Test

Zhao, Xiaojia. CASA-BASED ROBUST SPEAKER IDENTIFICATION
Doctor of Philosophy, The Ohio State University, 2014, Computer Science and Engineering
As a primary topic in speaker recognition, speaker identification (SID) aims to identify the underlying speaker(s) given a speech utterance. SID systems perform well under matched training and test conditions. In real-world environments, mismatch caused by background noise, room reverberation or competing voices significantly degrades the performance of such systems. Achieving robustness in SID systems thus becomes an important research problem. Existing approaches address this problem from different perspectives, such as proposing robust speaker features, introducing noise to clean speaker models, and using speech enhancement methods to restore clean speech characteristics. Inspired by auditory perception, computational auditory scene analysis (CASA) typically segregates speech from interference by producing a time-frequency mask. This dissertation aims to address the SID robustness problem in the CASA framework. We first deal with the noise robustness of SID systems. We employ an auditory feature, the gammatone frequency cepstral coefficient (GFCC), and show that this feature captures speaker characteristics and performs substantially better than conventional speaker features under noisy conditions. To deal with noisy speech, we apply CASA separation and then either reconstruct or marginalize corrupted components indicated by a CASA mask. We find that both reconstruction and marginalization are effective. We further combine these two methods into a single system based on their complementary advantages, and this system achieves significant performance improvements over related systems under a wide range of signal-to-noise ratios (SNRs). In addition, we conduct a systematic investigation into why GFCC shows superior noise robustness and conclude that nonlinear log rectification is likely the reason. Speech is often corrupted by both noise and reverberation. There have been studies addressing each of them, but the combined effects of noise and reverberation have rarely been studied. We address this issue in two phases. We first remove background noise through binary masking using a deep neural network (DNN) classifier. Then we perform robust SID with speaker models trained in selected reverberant conditions, on the basis of bounded marginalization and direct masking. Evaluation results show that the proposed method substantially improves SID performance compared to related systems over a wide range of reverberation times and SNRs. The aforementioned studies handle mixtures of target speech and non-speech intrusions by taking advantage of their different characteristics. Such methods may not apply if the intrusion is a competing voice, which has characteristics similar to the target. SID in cochannel speech, where two speakers are talking simultaneously over a single recording channel, is a well-known challenge. Previous studies address this problem in the anechoic environment under the Gaussian mixture model (GMM) framework. On the other hand, cochannel SID in reverberant conditions has not been addressed. This dissertation studies cochannel SID in both anechoic and reverberant conditions. We first investigate GMM-based approaches and propose a combined system that integrates two cochannel SID methods. Secondly, we explore DNNs for cochannel SID and propose a DNN-based recognition system. Evaluation results demonstrate that our proposed systems significantly improve SID performance over recent approaches in both anechoic and reverberant conditions and at various target-to-interferer ratios.
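
The masking step described above is easy to illustrate. The numpy sketch below is an illustrative toy with invented spectrogram shapes, not the dissertation's CASA or DNN code: it builds an ideal binary mask from a local SNR criterion and applies it to a mixture spectrogram.

```python
import numpy as np

def ideal_binary_mask(speech_tf, noise_tf, lc_db=0.0):
    """Keep a time-frequency unit when its local SNR exceeds the criterion."""
    snr_db = 10.0 * np.log10(np.abs(speech_tf) ** 2 /
                             np.maximum(np.abs(noise_tf) ** 2, 1e-12))
    return (snr_db > lc_db).astype(float)

def apply_mask(mixture_tf, mask):
    """Direct masking keeps the reliable units; marginalization would
    instead skip (or bound) the zeroed units in the model likelihood."""
    return mixture_tf * mask

rng = np.random.default_rng(0)
speech = np.abs(rng.normal(size=(64, 100)))   # toy magnitude spectrograms
noise = np.abs(rng.normal(size=(64, 100)))
mask = ideal_binary_mask(speech, noise)
masked = apply_mask(speech + noise, mask)
print(mask.mean())                            # fraction of units kept
```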

Committee:

DeLiang Wang, Professor (Advisor); Eric Fosler-Lussier, Professor (Committee Member); Mikhail Belkin, Professor (Committee Member)

Subjects:

Computer Engineering; Computer Science

Ma, Tao. A Framework for Modeling and Capturing Social Interactions
PhD, University of Cincinnati, 2015, Engineering and Applied Science: Electrical Engineering
The understanding of human behaviors in the scope of computer vision is beneficial to many different areas. Although great achievements have been made, human behavior research still targets isolated, low-level, individual activities without considering other important factors such as human-human interactions, human-object interactions, social roles, and surrounding environments. Numerous publications focus on recognizing a small number of individual activities from body motion features with pattern recognition models, and are satisfied with small improvements in recognition rate. Furthermore, the methods employed in these investigations are far from suitable for use in real cases, considering the complexity of human society. To address this issue, more attention should be paid to the cognition level rather than the feature level. In fact, a deeper understanding of social behavior requires studying its semantic meaning against the social context, known as social interaction understanding. A framework for detecting social interaction needs to be established to initiate the study. In addition to individual body motions, more factors, including social roles, voice, related objects, the environment, and other individuals' behaviors, were added to the framework. To meet these needs, this dissertation proposes a 4-layered hierarchical framework to mathematically model social interactions, and then explores several challenging applications based on the framework to demonstrate the great value of the study. No existing multimodality social interaction datasets were available for this research. Thus, in Research Topic I, two typical scenes were created with a total of 24 takes (a take means a shot for a scene) as a social interaction dataset. Topic II introduced a 4-layered hierarchical framework of social interactions, which contains, from bottom to top: 1) a feature layer, 2) a simple behavior layer, 3) a behavior sequence layer, and 4) a pairwise social interaction layer. The top layer eventually generates two persons' joint behaviors in the form of descriptions with semantic meanings. To deal with the recognition within each layer, different statistical models were adopted. In Topic III, three applications based on the social interaction framework were presented: social engagement, interesting moments, and visualization. The first application measures how strong the interaction is between an interacting pair. The second detects unusual (interesting) individual behaviors and interactions. The third aims to better visually represent the data so that users can access useful information quickly. All experiments in Research Topics II and III were based on the social interaction dataset created for the study. The performance of the different layers was evaluated by comparing the experimental results with those of the existing literature. The framework was demonstrated to successfully capture and model certain social interactions, and it can be applied to other situations. The pairwise social interaction layer generated joint behaviors with high accuracy because of the coupling nature of the model. The exploration of social engagement, interesting moments, and visualization shows the great practical value of the current research and may stimulate discussion and further research in the area.

Committee:

William Wee, Ph.D. (Committee Chair); Raj Bhatnagar, Ph.D. (Committee Member); Chia Han, Ph.D. (Committee Member); Anca Ralescu, Ph.D. (Committee Member); Xuefu Zhou, Ph.D. (Committee Member)

Subjects:

Computer Engineering

Keywords:

Human behavior understanding;Social interaction;Machine learning;Computer vision;Interesting moment;Social engagement

Han, Qiang. On Resilient System Testing and Performance Binning
PhD, University of Cincinnati, 2015, Engineering and Applied Science: Computer Science and Engineering
By allowing timing errors to occur and recovering from them on-line, resilient systems are designed to eliminate the frequency or voltage margin, to improve circuit performance or reduce power consumption. With the existence of error detection and correction circuits, resilient systems bring about new timing constraints for path delay testing. Because timing errors are allowed to occur and are recovered on-line, the metrics of resilient system performance differ from those of traditional circuits, which results in new challenges for resilient system performance binning. Due to these new characteristics of resilient systems, it is essential to develop new testing and binning methodologies for them. In this research, we focus on resilient system testing and performance binning, and attempt to push forward the pace of resilient system commercialization. We make the following contributions. First, we propose a new DFT (design-for-testability) technique, which is able to deal with all the different types of timing faults existing in resilient systems, and we develop an efficient test method based on binary search for error collection circuits. Then, a performance binning method based on structural at-speed delay testing is developed for resilient systems to greatly reduce the binning cost, and an adaptive clock configuration technique is proposed for yield improvement. Last but not least, we propose a new statistical performance analysis tool for resilient systems, called SERA (statistical error rate analysis), which takes process variations into consideration for error rate analysis and produces a performance distribution function. With the help of SERA, we develop a profit-oriented binning methodology for resilient systems.
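
Binary search also underlies the general binning intuition of finding the fastest clock at which a part still passes. The toy Python sketch below is an illustration only, not SERA or the thesis's DFT method; the pass/fail predicate, critical-path delay and resolution are invented.

```python
def fastest_passing_period(passes, lo_ns, hi_ns, resolution_ns=0.01):
    """Binary search for the shortest clock period at which the (simulated
    or measured) part still meets its target; the result decides the bin."""
    assert passes(hi_ns), "part must pass at the slowest tested clock"
    while hi_ns - lo_ns > resolution_ns:
        mid = (lo_ns + hi_ns) / 2.0
        if passes(mid):
            hi_ns = mid                     # still passing: try a faster clock
        else:
            lo_ns = mid                     # failing: back off
    return hi_ns

# toy stand-in for an at-speed structural test of one die
critical_path_ns = 1.37
print(fastest_passing_period(lambda period: period >= critical_path_ns, 0.5, 3.0))
```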

Committee:

Wen-Ben Jone, Ph.D. (Committee Chair); Chien-In Henry Chen, Ph.D. (Committee Member); Harold Carter, Ph.D. (Committee Member); Carla Purdy, Ph.D. (Committee Member); Ranganadha Vemuri, Ph.D. (Committee Member)

Subjects:

Computer Engineering

Keywords:

Resilient computing;Delay testing;Performance binning;Yield improvement;Error rate modeling;Statistical analysis

Jangid, Anuradha. Verifying IP-Cores by Mapping Gate to RTL-Level Designs
Master of Sciences (Engineering), Case Western Reserve University, 2013, EECS - Computer Engineering
Since 1965, with the invention of Integrated Circuit (IC) devices, the number of transistors on ICs has doubled every two years, as predicted by Moore. Today, the scaling of digital ICs has reached a point where a chip contains billions of interconnected transistors. As anticipated by the International Technology Roadmap for Semiconductors (ITRS), mass-produced silicon will contain over 6.08 billion transistors per chip by 2014, based on 14nm design technology standards. This enormous density of transistors places immense pressure on the verification of IC designs at each stage of silicon development. Hardware verification is the process of validating the correctness of a design implemented from the design specs. It accounts for nearly 70% - 80% of the total effort in an IC development process. To validate the implementation, a typical silicon development cycle includes functional, logic and layout verification processes. Therefore, it is desirable to incorporate a standard verification methodology which can certify point-to-point symmetry between the designs at different abstraction levels. Moreover, if such a methodology is applied, it would facilitate early detection of hardware defects which might arise from design synthesis, thereby reducing the verification effort in silicon development. In our work, we introduce a novel technique to verify the implementation of an IC at different design phases. Our technique is based on mapping of design models, using Distinguishing Experiments, Distinguishing Sequence Generation, Simulation and Automatic Test Pattern Generation (ATPG). ATPG produces input sequences such that, when these sequences are applied to a pair of gates from a circuit, they generate different logic values at the corresponding outputs. Both designs are simulated with these input sequences and, based on the simulation results, a distinguishing tree is constructed. Our technique utilizes a recursive simulation approach where feedback to the distinguishing sequence generation module is provided by the tree after each simulation. Intelligence drawn from the distinguishing tree establishes the correspondence or mismatch between the designs. A System on Chip (SoC) is an IC design containing a wide range of Intellectual Property (IP) cores. Verifying the equivalency of these IP cores at different abstraction levels, such as Register Transfer Level (RTL) and gate level, is extremely important. Our approach requires examination of the gate-level design and its equivalent RTL-level design to identify the correspondence between gates and wires/variables. For the implementation, we propose an algorithm which accepts a gate-level and an RTL-level circuit, matches the wires/variables in the RTL-level design to the gates in the gate-level design, and identifies the location(s) where the two descriptions differ (if any) from each other. Similarly, a mapping between gates from the gate-level design and transistors (pMOS, nMOS) from the layout-level design can be established. Our methodology is applicable to both combinational and sequential designs. We designed an algorithm based on the Time Frame Expansion concept in sequential ATPG. This algorithm generates distinguishing input sequences for both classes of circuits. We have used several heuristics to improve our ATPG algorithm in terms of speed and efficiency, for example: loop avoidance, controllability to select objectives and guide backtracking, unreachable states, etc.
To validate our approach, we have performed various experiments on standard designs, including an ALU, USB 2.0 and OpenRISC 1200, wherein we have successfully established a correspondence between the designs. Also, we have introduced several variations in both designs and carried out experiments to identify those differences and to evaluate the precision and efficiency of our approach.
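
The core notion of a distinguishing input, an input on which two design models disagree, can be shown with a brute-force stand-in for the ATPG step. The Python sketch below is illustrative only; the toy 3-input functions and the injected error are invented, and real sequential ATPG searches far more cleverly than exhaustive enumeration.

```python
from itertools import product

def find_distinguishing_input(f_rtl, f_gate, num_inputs):
    """Search for an input vector on which the two combinational models
    disagree; None means they agree over this exhaustive input space."""
    for bits in product([0, 1], repeat=num_inputs):
        if f_rtl(bits) != f_gate(bits):
            return bits
    return None

# toy 3-input models: the "gate-level" version has an error injected on one minterm
rtl = lambda b: (b[0] & b[1]) ^ b[2]
gate = lambda b: ((b[0] & b[1]) ^ b[2]) ^ (b[0] & b[1] & b[2])
print(find_distinguishing_input(rtl, gate, 3))   # (1, 1, 1)
```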

Committee:

Daniel G Saab (Advisor)

Subjects:

Computer Engineering; Design; Systems Design

Keywords:

Distinguishing Sequence, Distinguishing Experiments, Distinguishing Tree Generation

Konda, Niranjan. Evaluating a Reference Enterprise Application as a Teaching Tool in a Distributed Enterprise Computing Class
Master of Science, The Ohio State University, 2013, Computer Science and Engineering
The difference between academic projects in colleges and enterprise projects in industry is quite significant. Students working on academic projects are not accustomed to enterprise applications, with their challenging technology stacks and complex architectures, and do not follow most of the industry best practices and techniques that accompany typical enterprise applications. When they transition to industry, they face a steep learning curve which can be quite challenging. The idea is to build a reference enterprise application that can demonstrate to students complex technology stacks and architectures, industry best practices like supporting documentation and code commenting, along with techniques like logging. To effectively demonstrate the above-mentioned goals, a reference enterprise application, the Buckeye Job Portal, was first created. Later, a course lecture was created demonstrating all the various technologies in use in the application. Learning outcomes were determined based on Bloom’s taxonomy, and an experimental protocol was created that contained specific, hands-on tasks related to the application, keeping in mind these learning outcomes. A sample student group was provided with the course lecture and asked to work through the experimental protocol. Observations were made during the process and feedback was collected. Results showed that, in general, the student group was successful in creating a new development environment and in importing and running the existing reference Buckeye Job Portal enterprise application. They were also able to modify and add new functionality to the application, thereby demonstrating a good grasp of all the enterprise application concepts.

Committee:

Rajiv Ramnath, Dr. (Advisor); Jay Ramanathan, Dr. (Committee Member)

Subjects:

Computer Engineering

Keywords:

Reference Distributed Enterprise Computing Teaching Tool Application

Van Hook, Richard L. A Comparison of Monocular Camera Calibration Techniques
Master of Science in Computer Engineering (MSCE), Wright State University, 2014, Computer Engineering
Extensive use of visible electro-optical (visEO) cameras for machine vision techniques shows that most camera systems produce distorted imagery. This thesis investigates and compares several of the most common techniques for correcting the distortions based on a pinhole camera model. The methods being examined include a common chessboard pattern based on (Sturm 1999), (Z. Zhang 1999), and (Z. Zhang 2000), as well as two "circleboard" patterns based on (Heikkila 2000). Additionally, camera models from the visual structure from motion (VSFM) software (Wu n.d.) are used. By comparing reprojection error from similar data sets, it can be shown that the asymmetric circleboard performs the best. Finally, a software tool is presented to assist researchers with the procedure for calibration using a well-known fiducial.
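
The reprojection-error metric used for the comparison above is straightforward to compute once points are projected through a pinhole model. The numpy sketch below is a simplified illustration, not the thesis's calibration pipeline: the intrinsic matrix, distortion coefficient and points are invented, and rotation/translation and higher-order distortion terms are omitted.

```python
import numpy as np

def project(points_3d, K, k1=0.0):
    """Minimal pinhole projection with one radial distortion term."""
    x = points_3d[:, 0] / points_3d[:, 2]
    y = points_3d[:, 1] / points_3d[:, 2]
    r2 = x**2 + y**2
    x_d, y_d = x * (1 + k1 * r2), y * (1 + k1 * r2)
    u = K[0, 0] * x_d + K[0, 2]
    v = K[1, 1] * y_d + K[1, 2]
    return np.stack([u, v], axis=1)

def rms_reprojection_error(observed_2d, projected_2d):
    """Root-mean-square distance between detected and reprojected points."""
    return float(np.sqrt(np.mean(np.sum((observed_2d - projected_2d) ** 2, axis=1))))

K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
pts = np.array([[0.1, 0.2, 2.0], [-0.3, 0.1, 2.5], [0.0, -0.2, 3.0]])
proj = project(pts, K, k1=-0.05)
noisy = proj + np.random.default_rng(0).normal(scale=0.5, size=proj.shape)
print(rms_reprojection_error(noisy, proj))
```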

Committee:

Kuldip Rattan, Ph.D. (Advisor); Juan Vasquez, Ph.D. (Committee Member); Thomas Wischgoll, Ph.D. (Committee Member)

Subjects:

Computer Engineering; Computer Science; Optics; Scientific Imaging

Keywords:

Calibration; Reprojection Error; Visual Structure from Motion; OpenCV; calibration patterns

Sun, Xinyu. Fault Modeling and Fault Type Distinguishing Test Methods for Digital Microfluidics Chips
MS, University of Cincinnati, 2013, Engineering and Applied Science: Computer Engineering
Physical defects in digital microfluidics chips (DMCs) can be very complicated and extremely difficult to model precisely, because a defect may occur anywhere. In this thesis, we develop high-level abstract fault models based on investigating the faulty and fault-free behaviors of droplet movement. Two new fault models that were not identified previously are proposed to enhance the reliability of DMCs. We believe that the high-level fault models completely cover all defects involving two cells in a DMC array. Based on the new high-level fault models, we propose march algorithms (march-d and march-p/p+) to generate test patterns that can detect and distinguish fault types for each faulty digital microfluidics chip. This is accomplished by merging march-d and part of march-p without a large increase in test length. These algorithms are implemented on an FPGA board attached to the simulated digital microfluidics chip, such that built-in self-test can be accomplished without human intervention. We also develop an EDA tool and simulation platform for the proposed DMC-BIST system. Experimental results demonstrate that the proposed fault models, test and fault type distinguishing methods, built-in self-test circuit design, and emulation tool can effectively and efficiently achieve high quality test at minimal test cost.

Committee:

Wen Ben Jone, Ph.D. (Committee Chair); Xingguo Xiong, PhD (Committee Member); Ian Papautsky, Ph.D. (Committee Member)

Subjects:

Computer Engineering

Keywords:

Fault modeling;Test methods;BIST;Digital microfluidics chip;Microfluidics EDA;March algorithm

Ford, Gregory Fick. Hardware Emulation of Sequential ATPG-Based Bounded Model Checking
Master of Sciences (Engineering), Case Western Reserve University, 2014, EECS - Computer Engineering
The size and complexity of integrated circuits are continually increasing, in accordance with Moore’s law. Along with this growth comes an expanded exposure to subtle design errors, thus leaving a greater burden on the process of formal verification. Existing methods for formal verification, including Automatic Test Pattern Generation (ATPG), are susceptible to exploding model sizes and run times for larger and more complex circuits. In this paper, a method is presented for emulating the process of sequential ATPG-based Bounded Model Checking on reconfigurable hardware. This achieves a speed-up over software-based methods, due to the fine-grained massive parallelism inherent in hardware.

Committee:

Daniel Saab (Committee Chair); Francis Merat (Committee Member); Christos Papachristou (Committee Member)

Subjects:

Computer Engineering

Keywords:

Formal verification, ATPG, BMC, PODEM

Tsitsoulis, Athanasios. A Methodology for Extracting Human Bodies from Still Images
Doctor of Philosophy (PhD), Wright State University, 2013, Computer Science and Engineering PhD
Monitoring and surveillance of humans is one of the most prominent applications today, and it is expected to be part of many aspects of future life, for safety, assisted living and many other reasons. Many efforts have been made towards automatic and robust solutions, but the general problem is very challenging and remains open. In this PhD dissertation we examine the problem from many perspectives. First, we study the performance of a hardware architecture designed for large-scale surveillance systems. Then, we focus on the general problem of human activity recognition, present an extensive survey of methodologies that deal with this subject and propose a maturity metric to evaluate them. Image segmentation is one of the most popular image processing algorithms in the field, and we propose a blind metric to evaluate segmentation results with respect to activity in local regions. Finally, we propose a fully automatic system for segmenting and extracting human bodies from challenging single images, which is the main contribution of the dissertation. Our methodology is a novel bottom-up approach relying mostly on anthropometric constraints and is facilitated by our research in the fields of face, skin and hand detection. Experimental results and comparison with state-of-the-art methodologies demonstrate the success of our approach.

Committee:

Nikolaos Bourbakis, Ph.D. (Advisor); Soon Chung, Ph.D. (Committee Member); Yong Pei, Ph.D. (Committee Member); Ioannis Hatziligeroudis, Ph.D. (Committee Member)

Subjects:

Computer Engineering; Computer Science

Keywords:

image segmentation metric; human activity recognition; human body segmentation; monitoring and surveillance

Gideon, John: The Integration of LlamaOS for Fine-Grained Parallel Simulation
MS, University of Cincinnati, 2013, Engineering and Applied Science: Computer Engineering
LlamaOS is a custom operating system that provides much of the basic functionality needed for low-latency applications. It is designed to run in a Xen-based virtual machine on a Beowulf cluster of multi/many-core processors. The software architecture of llamaOS is decomposed into two main components, namely the llamaNET driver and llamaApps. The llamaNET driver contains Ethernet drivers and manages all node-to-node communication between user application programs contained within a llamaApp instance. Typically, each node of the Beowulf cluster runs one instance of the llamaNET driver with one or more llamaApps bound to parallel applications. These capabilities provide a solid foundation for the deployment of MPI applications, as evidenced by our initial benchmarks and case studies. However, a message passing standard still needed to be either ported or implemented in llamaOS. To minimize latency, llamaMPI was developed as a new implementation of the Message Passing Interface (MPI) that is compliant with the core MPI functionality. This provides a standardized and easy way to develop for this new system. Performance assessment of llamaMPI was carried out using both standard parallel computing benchmarks and a locally (but independently) developed program that executes parallel discrete event-driven simulations. In particular, the NAS Parallel Benchmarks are used to show the performance characteristics of llamaMPI. In the experiments, most of the NAS Parallel Benchmarks ran faster than, or on par with, their native performance. The benefit of llamaMPI was also shown with the fine-grained parallel application WARPED. The order-of-magnitude lower communication latency of llamaMPI greatly reduced the amount of time the simulation spent in rollbacks. This resulted in an overall faster and more efficient computation, because less time was spent off the critical path due to causality errors.
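The kind of latency measurement driving this work can be sketched with a standard ping-pong microbenchmark. The sketch below uses mpi4py on a stock MPI installation purely for illustration; llamaMPI itself is a native implementation, and this is not its API.

    # Run with: mpiexec -n 2 python pingpong.py
    import time
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    reps = 1000
    msg = bytearray(64)                 # small message, so latency dominates

    comm.Barrier()
    start = time.perf_counter()
    for _ in range(reps):
        if rank == 0:
            comm.Send(msg, dest=1, tag=0)
            comm.Recv(msg, source=1, tag=0)
        elif rank == 1:
            comm.Recv(msg, source=0, tag=0)
            comm.Send(msg, dest=0, tag=0)
    elapsed = time.perf_counter() - start

    if rank == 0:
        # Each rep is one round trip, so divide by 2*reps for one-way latency.
        print("approx one-way latency: %.1f us" % (elapsed / (2 * reps) * 1e6))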

Committee:

Philip Wilsey, Ph.D. (Committee Chair); Fred Beyette, Ph.D. (Committee Member); Carla Purdy, Ph.D. (Committee Member)

Subjects:

Computer Engineering

Keywords:

Parallel Computing;Time Warp Simulation;MPI;Operating Systems;Beowulf Cluster;Parallel Discrete Event Simulation

Franzese, Anthony L.: Real-time Location with ZigBee Hardware
MS, University of Cincinnati, 2011, Engineering and Applied Science: Computer Engineering

Mechanisms for tracking assets and managing inventory are widespread and well developed. Tracking is achieved by attaching “tags” with unique identifiers to assets and deploying “readers” throughout a facility to read the identity of the tagged assets. In general, tools and solutions for asset tracking fall into one of two categories: passive (RFID or optical barcode) solutions and real-time location systems. Passive solutions provide coarse-grained location services that record a tracked item’s movement past fixed-position “reader” devices. Asset movement from location to location and into and out of a facility is recorded. Passive systems are highly effective for inventory control and management, and they are pervasive in the consumer products markets. In contrast, Real-Time Location Systems (RTLS) provide pin-point location services that can identify an asset’s location at all times. RTLS systems generally require a much larger number of expensive readers distributed throughout the monitored facility to ensure continuous communication with the tags and to allow triangulation services to precisely locate the tagged assets. Thus, existing asset tracking systems provide either inexpensive coarse-grained location services (passive solutions) or high-cost pin-point accuracy services (RTLS solutions).

In many cases, the requirements for real-time asset tracking do not demand pin-point accuracy or continuous, second-by-second location service. For example, a solution tracking assets every 30 seconds to a coarse-grained location with an accuracy of 50-100 feet would be more than sufficient for locating wheelchairs or baggage carts in an airport, beds in a hospital, or baggage carts in a hotel. Passive solutions are ineffective because the readers can generally read only over short distances (15 feet maximum), and RTLS solutions are far too expensive to deploy throughout an airport or a large facility such as a major hospital. This thesis examines the design of a coarse-grained asset tracking solution suitable for the needs of tracking wheelchairs in airports. The solution must be low-cost and self-organizing. In this work a solution using ZigBee networking hardware is developed and analyzed. The result is a solution whose tags are small enough to fit comfortably on wheelchairs and baggage carts and can provide identifying broadcast signaling for at least one year on two AA batteries. The technology provides a self-organizing network in which readers can be placed at reasonable distances (100-200 feet) from one another and that can provide asset tracking coverage over the largest airports in the world.
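A coarse-grained location estimate of the kind described above can be sketched as follows. This is a minimal illustration, not the thesis's algorithm; the reader names, coordinates, and RSSI values are made up.

    def coarse_location(sightings, reader_positions):
        """Estimate a tag's coarse location as the position of the reader
        that heard its broadcast with the strongest signal (RSSI in dBm)."""
        if not sightings:
            return None
        best_reader = max(sightings, key=sightings.get)
        return reader_positions[best_reader]

    # Three readers hear a wheelchair tag; the strongest reading wins.
    readers = {"gateA": (0, 0), "gateB": (150, 0), "gateC": (75, 120)}
    rssi = {"gateA": -82, "gateB": -61, "gateC": -77}
    print(coarse_location(rssi, readers))        # -> (150, 0), i.e. near gateB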

Committee:

Philip Wilsey, PhD (Committee Chair); Fred Beyette, PhD (Committee Member); Carla Purdy, PhD (Committee Member)

Subjects:

Computer Engineering

Keywords:

ZigBee; RTLS; Asset Tracking; RFID; Real-time Locating Systems; Sensor Network

Nagavaram, Ashish: Cloud Based Dynamic Workflow with QOS For Mass Spectrometry Data Analysis
Master of Science, The Ohio State University, 2011, Computer Science and Engineering

Lately, there has been growing interest in the use of cloud computing for scientific applications, including scientific workflows. Key attractions of the cloud include the pay-as-you-go model and elasticity. While the elasticity offered by clouds can benefit many applications and use scenarios, it also imposes significant challenges in the development of applications or services. For example, no general framework exists that enables a scientific workflow to execute in a dynamic fashion with QOS (Quality of Service) support, i.e., exploiting the elasticity of clouds and automatically allocating and de-allocating resources to meet time and/or cost constraints while providing the quality of results the user desires.

This thesis presents a case study in creating a dynamic cloud workflow implementation with QOS support for a scientific application. We work with MassMatrix, an application that searches proteins and peptides in tandem mass spectrometry data. In order to use cloud resources, we first parallelize the search method used in this algorithm. Next, we create a flexible workflow using the Pegasus Workflow Management System from ISI. We then add a new dynamic resource allocation module, which can use fewer or more resources based on a time constraint specified by the user. Finally, we extend this with QOS support to provide the user with the desired quality of results. We use the desired quality metric to calculate the values of the application parameters; here, the desired quality metric refers to the parameters computed to maximize the user-specified benefit function while meeting the time constraint. We evaluate our implementation using several different data sets and show that the application scales quite well. Our implementation effectively allocates resources adaptively, and the parameter prediction scheme succeeds in choosing parameters that help meet the time constraint.
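The resource allocation decision can be illustrated with a deliberately simple model. This is a sketch only: it assumes the parallel portion of the work scales linearly with node count, which is much cruder than the parameter prediction scheme evaluated in the thesis, and the numbers in the example are invented.

    def pick_resource_count(serial_time, parallel_time_one_node, deadline, max_nodes):
        """Return the smallest node count whose predicted makespan meets the
        user's deadline, or None if the deadline cannot be met."""
        for n in range(1, max_nodes + 1):
            predicted = serial_time + parallel_time_one_node / n
            if predicted <= deadline:
                return n, predicted
        return None

    # Example: 5 min of serial work, 120 min of parallelizable search,
    # a 20-minute deadline, and up to 32 instances available.
    print(pick_resource_count(5.0, 120.0, 20.0, 32))   # -> (8, 20.0)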

Committee:

Gagan Agrawal, PhD (Advisor); Rajiv Ramnath, PhD (Committee Member); Michael Freitas, PhD (Committee Member)

Subjects:

Bioinformatics; Biomedical Engineering; Biomedical Research; Computer Engineering; Computer Science

Keywords:

cloud;dynamic workflow;adaptive execution on cloud;parallelization on cloud;time constraint execution;QOS on cloud;parameter prediction;parameter modeling

Jayaram, Indira: Adding non-traditional constraints to the embedded systems design process
MS, University of Cincinnati, 2011, Engineering and Applied Science: Computer Engineering
Embedded systems are ubiquitous and have a large number of applications. Their requirements are not restricted to functionality but also include many non-functional properties such as cost, reliability, safety, and ease of use. This makes developing a standard design methodology for embedded systems challenging. In this thesis, we attempt to include the non-traditional, non-functional constraints of embedded systems in the design process by weighting them in order of importance. We propose developing UML models for a system and annotating them with the non-functional constraints using standard profile extensions and weighted constraint charts. We demonstrate the application of this design technique by developing a few example systems. One of the systems is implemented on the Altera UP3 platform and demonstrates how the design technique leads us to choose the implementation that satisfies all the requirements, including the non-functional ones.
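The weighting idea can be illustrated with a small scoring sketch. The constraint names, weights, and candidate scores below are hypothetical and are not taken from the thesis's examples.

    def rank_candidates(weights, candidates):
        """Rank candidate implementations by a weighted sum of their scores
        on each non-functional constraint (scores in [0, 1], higher is better)."""
        ranked = [(sum(w * scores.get(c, 0.0) for c, w in weights.items()), name)
                  for name, scores in candidates.items()]
        return sorted(ranked, reverse=True)

    weights = {"cost": 0.4, "reliability": 0.3, "safety": 0.2, "ease_of_use": 0.1}
    candidates = {
        "soft_core_cpu": {"cost": 0.9, "reliability": 0.7, "safety": 0.6, "ease_of_use": 0.8},
        "custom_logic":  {"cost": 0.5, "reliability": 0.9, "safety": 0.9, "ease_of_use": 0.4},
    }
    print(rank_candidates(weights, candidates))  # highest total score first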

Committee:

Carla Purdy, PhD (Committee Chair); Philip Wilsey, PhD (Committee Member); Xuefu Zhou, PhD (Committee Member)

Subjects:

Computer Engineering

Keywords:

Embedded systems;UML;MARTE
