Department: Computer Science and Engineering PhD
43 matches in the database.
These are records: 1 - 30.

1.
Anderson, Paul Edward.
ALGORITHMIC TECHNIQUES EMPLOYED IN THE QUANTIFICATION AND CHARACTERIZATION OF NUCLEAR MAGNETIC RESONANCE SPECTROSCOPIC DATA.
Degree: PhD, Computer Science and Engineering PhD, 2010, Wright State University
▼ Nuclear magnetic resonance (NMR) based metabolomics is a developing research field with broad applicability, including the identification of biomarkers associated with pathophysiologic changes, sample classification based on the mechanism of toxicity, and clinical diagnosis. Intrinsic to these applications is the need for statistical and computational techniques to facilitate the associated data analysis. Further, a typical 1H NMR spectrum of pure proteins, biofluids, or tissue may contain thousands of resonances (i.e., peaks); thus, visual inspection alone is insufficient to fully utilize the spectral information. Common practice within the NMR-based metabolomics community is to evaluate and validate novel algorithms on empirical and simplified simulated data. Empirical data captures the complex characteristics of experimental data; however, evaluations on empirical data often rely on indirect performance metrics because the optimal or correct output is difficult to obtain a priori. To overcome the drawback of relying on indirect performance metrics, researchers often evaluate their algorithms on simplified simulated data. The conclusions derived from this type of data can be difficult to generalize to true experimental data. This dissertation combines the advantages of both empirical and simplified simulated data by generating exacting synthetic data sets that emulate the salient features of experimental data. The analysis of NMR metabolic spectroscopic data can be divided into four steps: (1) standard post-instrumental processing of spectroscopic data; (2) quantification of spectral features; (3) normalization and scaling; and (4) multivariate statistical modeling of data. Quantification of spectral features, step (2), is a key step in the development of classification algorithms and biomarker identification (i.e., pattern recognition).
Algorithms for spectral quantification are designed to enhance the efficacy of pattern recognition and multivariate statistical techniques for metabolomics. This is accomplished by reducing the dimensionality of the spectra, while retaining salient information and mitigating peak misalignment. This dissertation develops two novel spectral quantification techniques: Gaussian binning and dynamic adaptive binning. Gaussian binning utilizes a kernel-based binning algorithm to decrease the sensitivity to peak misalignment. Dynamic adaptive binning optimizes the bin boundaries through an objective function using a dynamic programming strategy. Both Gaussian binning and dynamic adaptive binning are compared to common spectral binning techniques by analyzing their ability to reduce the probability of peaks spanning bin boundaries and increase the interpretability of the results. Finally, a case study is presented to show the ability of dynamic adaptive binning and Gaussian binning to enhance the analysis of a 1H NMR-based experiment to monitor rat urinary metabolites following exposure to the toxin α-naphthylisothiocyanate.
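As an illustration of the kernel-based idea behind Gaussian binning, the sketch below weights every spectral point by a Gaussian centered on each bin rather than assigning it to exactly one bin. It is a minimal sketch, not the dissertation's implementation: the function names, the `centers` and `sigma` parameters, and the fixed-edge comparison are all assumptions made for this example.

```python
import numpy as np

def gaussian_bin(ppm, intensity, centers, sigma):
    """Kernel-weighted bin values (hypothetical helper).

    Each bin value is a Gaussian-weighted sum of the whole spectrum,
    so a peak near a bin boundary contributes to both neighbors.
    """
    ppm = np.asarray(ppm, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # weights[i, j] is the contribution of spectral point j to bin i
    weights = np.exp(-0.5 * ((ppm[None, :] - centers[:, None]) / sigma) ** 2)
    return weights @ intensity

def hard_bin(ppm, intensity, edges):
    """Conventional fixed-edge binning for comparison (hypothetical helper)."""
    totals, _ = np.histogram(ppm, bins=edges, weights=intensity)
    return totals

# The same unit-height peak, observed at two slightly different shifts
before = gaussian_bin([0.99], [1.0], centers=[0.5, 1.5], sigma=0.5)
after = gaussian_bin([1.01], [1.0], centers=[0.5, 1.5], sigma=0.5)
```

With a hard edge at 1.0 ppm, a peak drifting from 0.99 to 1.01 ppm moves entirely from one bin to the next, while the Gaussian-weighted values in `before` and `after` differ only slightly.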
Advisors/Committee Members: Raymer, Michael.
Subjects: Computer science
Keywords: Metabolomics; Metabonomics

2.
Bender, Paul Anthony.
Energy efficient Image Video Sensor Networks.
Degree: PhD, Computer Science and Engineering PhD, 2008, Wright State University
▼ Image Video Sensor Networks are emerging applications for sensor network technologies. The relatively large size of the data collected by image video sensors presents new challenges for the sensor network in terms of energy consumption and channel capacity. We address each of these issues through the use of a high density network deployment utilizing some nodes as dedicated relay nodes. A high density network allows network nodes to reduce their transmission power. This reduction in transmission power allows each node to conserve power and simultaneously increases the potential for spatially concurrent transmissions within the network, resulting in improved network throughput. The use of additional relay nodes may further increase the potential for such spatially concurrent transmissions, without increasing the relay burden for each node by maintaining the same number of data generating sources in the network. In this work, we show analytically how a high density network affects energy consumption and network capacity. We discuss the constraints placed on a high density sensor network deployment due to application latency requirements, sensor coverage requirements, connectivity requirements, and node costs. Furthermore, we implement an Image/Video Sensor Web, an Internet enabled testbed for studying the implementation of a high density network deployment for Image/Video Sensor Networks. We utilize this testbed to verify our analytical energy results, and to study the reliable data delivery requirements necessary to successfully deploy an Image/Video Sensor Network.
Advisors/Committee Members: Pei, Yong.
Subjects: Computer science
Keywords: Sensor Networks; Multimedia; Energy Efficiency

3.
Boddhu, Sanjay Kumar.
Evolution and Analysis of Neuromorphic Flapping-Wing Flight Controllers.
Degree: PhD, Computer Science and Engineering PhD, 2010, Wright State University
▼ The control of insect-sized flapping-wing micro air vehicles is attracting increasing interest. Solution of the problem requires construction of a controller that is physically small, extremely power efficient, and capable. In addition, process variation in the creation of very small wings and armatures as well as the potential for accumulating damage and wear over the course of a vehicle's lifetime suggest that controllers be able to self-adapt to the specific and possibly changing nature of the vehicles in which they are embedded. Previous work with Evolvable Hardware Continuous Time Recurrent Neural Networks (CTRNNs) as applied to adaptive control of walking in legged robots suggests that CTRNNs may provide a suitable control solution for flapping-wing micro air vehicles. However, upon complete analysis, it can be seen that perceived similarities between the two problems are somewhat superficial, and that flapping-wing vehicle control requires its own study. This dissertation constitutes the first attempt to apply evolved CTRNN devices to the control of a feasible flapping-wing micro air vehicle. It is organized as a sequence of control experiments of increasing difficulty and explores the following issues: the development of behavior-based analog circuit modules, architectures to combine those modules into multi-functional controllers, and low-level circuit analyses to explain how evolved modules operate and interact. Also included are experiments in the creation of physically polymorphic behavior modules that combine multiple flight functions into a monolithic analog device. In addition to providing first-of-its-kind feasibility results, this dissertation develops a new frequency-grouping based analysis method to explain the operation of evolved devices.
Advisors/Committee Members: Gallagher, John.
Subjects: Computer science
Keywords: Flapping Wing Flight Control, Neuromorphic Control, CTRNN-EH Control, Evolvable Hardware, Evolutionary Robotics

4.
Camp, John L.
3-D Model Characterization and Identification from Intrinsic Landmarks.
Degree: PhD, Computer Science and Engineering PhD, 2011, Wright State University
▼ A method to automatically characterize and identify 3-D range scans based on intrinsic landmarks is presented. Intrinsic landmarks represent locally unique, intrinsic properties of a scanned surface, regardless of scale or rotation. The number, location, and characteristics of landmarks are used to characterize the scanned models. This method contains a selection process to identify stable, intrinsic landmarks for range scans as well as the identification of those scans. The selection process requires no user interaction or surface assumptions. It uses the principal curvatures at the range points to select the landmarks. First, a large number of landmarks are generated by fitting a bi-cubic polynomial surface to points surrounding each range point and calculating the principal curvatures at the range point. Points of locally extremal principal curvature are then considered candidate landmarks. Using a random sample and consensus (RANSAC) algorithm, candidate landmarks that match with landmarks in other scans of the same subject are selected as final, stable landmarks. Our main goal is to provide a means to characterize models in a range data base. With several scans of each subject available in the data base, a number of stable landmarks are determined for each subject. The locations and characteristics of the landmarks are used to describe a subject and distinguish it from other subjects. The main contribution of this work is considered to be the selection of unique and stable landmarks in a range scan and generation of a descriptor for each landmark that characterizes the intrinsic properties of the surface in the neighborhood of the landmark. The effectiveness of the method is presented through the successful identification of processed subjects and characterization of new subjects.
Advisors/Committee Members: Goshtasby, Arthur.
Subjects: Computer Engineering
Keywords: intrinsic landmarks; shape descriptor; principal curvature; 3D Models; Landmark Selection; Range Scans; Intrinsic Property

5.
Chen, Lijun.
SUMMARITIVE DIGEST FOR LARGE DOCUMENT REPOSITORIES WITH APPLICATION TO E-RULEMAKING.
Degree: PhD, Computer Science and Engineering PhD, 2007, Wright State University
▼ Large document repositories need to be organized and summarized to make them more accessible and understandable. Such needs exist in many applications, including web search, e-rulemaking (electronic rulemaking) and document archiving. Even though much has been done in the areas of document clustering and summarization, there are still many new challenges and issues that need to be addressed as the repositories become larger, more prevalent and dynamic. In this dissertation, we investigate more informative ways to organize and summarize large document repositories, especially e-rulemaking feedback repositories (ERFRs), so that the large repositories can be managed and digested more efficiently and effectively. Specifically, we mainly consider the following four tasks: 1) identifying important aspects of ERFR, 2) constructing cluster descriptions for document clustering, 3) clustering of ERFR with simultaneous construction of succinct cluster descriptions, and 4) selecting representative arguments for ERFR clustering. We propose to organize and summarize e-rulemaking feedback based on three different major aspects of the rulemaking process, in order to meet the different needs of the rule-writers or analysts; the three aspects are: opinions (O), issues (I) and stakeholders (S). We introduce an OIS-based approach to producing an informative summaritive digest (SD) for given ERFRs. In addition, several novel concepts, approaches and algorithms are introduced, including the CDD measure, active feature selection (AFS), Pagoda search algorithms, etc. An SD, simply put, consists of a document clustering, along with certain succinct cluster descriptions (SCDs) and representative arguments (RAs) for each cluster in the clustering. The clustering of an SD can be constructed in either a flat or hierarchical manner. For hierarchical clustering, each level of the hierarchy can be constructed by emphasizing one of the O, I, and S aspects.
Different orders of O, I and S can be used for the levels of the hierarchy. Different clusterings could be used to meet the needs of different users. Given a goodness measure, a "best" clustering can be recommended to the user. An SCD consists of a set of carefully selected terms along with some statistics, and the RAs are some typical arguments selected from each cluster. An RA should be a statement where certain major stakeholders have expressed opinions on some of the important issues. Collectively, an SD provides an informative navigation aid for the rule-writers and analysts to manage and digest large ERFRs. We conduct an experimental evaluation of our approaches by using some publicly available ERFRs. The results suggest that the SD not only helps users browse the feedback, but also gives them a high-level sense of the feedback before they dig into each individual comment. The results also show that our approaches are efficient and scalable for managing large document repositories. Even though we devoted special attention to the application of e-rulemaking, we believe that most of the ideas are very generic and can be easily applied to other types of repositories, including digital archives.
Advisors/Committee Members: Dong, Guozhu.
Subjects: Computer Science
Keywords: clustering; cluster; different; SCDs; CDs; ERFRs; PagodaCD

6.
Chivers, Daniel Stephen.
Human Action Recognition by Principal Component Analysis of Motion Curves.
Degree: PhD, Computer Science and Engineering PhD, 2012, Wright State University
▼ Human action recognition is used to automatically detect and recognize actions performed by humans in a video. Applications include visual surveillance, human-computer interaction, and robot intelligence, to name a few. An example of a surveillance application is a system that monitors a large public area, such as an airport, for suspicious activity. In human-machine interaction, computers may be controlled by simple human actions. For example, the motion of an arm may instruct the computer to rotate a 3-D model that is being displayed. Human action recognition is also an important capability of intelligent robots that interact with humans. General approaches to human action recognition fall under two categories: those that are based on tracking and those that do not use tracking. Approaches that do not use tracking often cannot recognize complex motions where movement of different parts of the body is important. Tracking-based approaches that use motion of different parts of the body are generally more powerful but are computationally more expensive, making them inappropriate for applications that require real-time responses. We propose a new approach to human action recognition that is able to learn various human actions and later recognize them in an efficient manner. In this approach, motion trajectories are formed by tracking one or more key points on the human body. In particular, points on the hands and feet are tracked. A curve is fitted to each motion trajectory to smooth noise and to form a continuous and differentiable curve. A motion curve is then segmented into “basic motion” segments by detecting peak curvature points. To recognize an observed basic motion, a vector of curve features describing the motion is created, the vector is projected to the eigenspace created during PCA training, and the action most similar to a learned action is identified using the k-nearest neighbor decision rule.
The proposed approach simplifies action recognition by requiring that only a small number of points on a subject's body be tracked. It is shown that the motion curves obtained by tracking a small number of points are sufficient to recognize various human actions with a high degree of accuracy. Furthermore, the proposed approach can improve the recognition power of other approaches by recognizing detailed basic motions, such as foot steps, while introducing efficient tracking and recognition compared to previous approaches. Recognition of basic motions allows a high-level recognizer to recognize more complex or composite actions by using the proposed system as a low-level recognizer. Contributions of this work include reducing each video frame to a few key points on the subject's body, using curve fitting to smooth trajectory data and provide reliable segmentation of the motion, and efficient recognition of basic motions using PCA.
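The recognition step described in this abstract (project a feature vector into a PCA eigenspace, then apply the k-nearest neighbor rule) can be sketched as follows. This is a minimal sketch under stated assumptions, not the author's code: the function names, the toy three-dimensional feature vectors, and the labels "wave" and "step" are hypothetical, and the curve-feature extraction itself is not shown.

```python
import numpy as np
from collections import Counter

def fit_pca(X, n_components):
    """Learn an eigenspace from training feature vectors (rows of X)."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(X, mean, components):
    """Project feature vectors into the learned eigenspace."""
    return (X - mean) @ components.T

def knn_predict(train_proj, train_labels, query_proj, k=3):
    """Majority label among the k nearest training points."""
    dists = np.linalg.norm(train_proj - query_proj, axis=1)
    nearest = np.argsort(dists)[:k]
    return Counter(train_labels[i] for i in nearest).most_common(1)[0][0]

# Toy curve-feature vectors for two hypothetical basic motions
X_train = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 0.1, 0.0],
                    [5.0, 5.0, 5.0], [5.1, 5.0, 5.0], [5.0, 5.1, 5.0]])
labels = ["wave", "wave", "wave", "step", "step", "step"]

mean, components = fit_pca(X_train, n_components=2)
train_proj = project(X_train, mean, components)
query = project(np.array([4.9, 5.0, 5.0]), mean, components)
predicted = knn_predict(train_proj, labels, query, k=3)
```

Projecting before the nearest-neighbor search keeps the distance computation in a low-dimensional space, which is one plausible reading of why the abstract describes the recognition as efficient.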
Advisors/Committee Members: Goshtasby, Arthur.
Subjects: Computer Engineering; Computer Science
Keywords: Human Action Recognition; Motion Analysis; Principal Component Analysis

7.
Cooper, Gina Marie.
IMPROVING REMOTE HOMOLOGY DETECTION USING A SEQUENCE PROPERTY APPROACH.
Degree: PhD, Computer Science and Engineering PhD, 2009, Wright State University
▼ Understanding the structure and function of proteins is a key part of understanding biological systems. Although proteins are complex biological macromolecules, they are made up of only 20 basic building blocks known as amino acids. The makeup of a protein can be described as a sequence of amino acids. One of the most important tools in modern bioinformatics is the ability to search for biological sequences (such as protein sequences) that are similar to a given query sequence. There are many tools for doing this (Altschul et al., 1990; Hobohm and Sander, 1995; Thomson et al., 1994; Karplus and Barrett, 1998). Most of these tools, however, focus on closely related, or homologous, sequences. Distantly related protein sequences (remote homologs) are of interest to biologists but remain notoriously difficult to find. This dissertation presents a novel method for finding remote homologs in databases of protein sequences. In this method, proteins are characterized according to physicochemical and sequence-based features. Features are then weighted according to their utility in identifying distantly related protein sequences. The feature weights are optimized by a custom genetic algorithm. Position-specific scoring matrices are used to further increase the ability of the tuned algorithm to generalize its search capability to new sequences. The resulting search method outperforms the most well-known techniques for finding distant homologs, both in terms of accuracy and computation time.
Advisors/Committee Members: Raymer, Michael.
Subjects: Bioinformatics; Computer science
Keywords: bioinformatics; sequence search; database search

8.
Dakopoulos, Dimitrios.
TYFLOS: A WEARABLE NAVIGATION PROTOTYPE FOR BLIND & VISUALLY IMPAIRED; DESIGN, MODELLING AND EXPERIMENTAL RESULTS.
Degree: PhD, Computer Science and Engineering PhD, 2009, Wright State University
▼ The need for assistive devices has had, and will continue to have, great merit in many engineering research arenas. This dissertation deals with the design, modeling, implementation and experimentation of the navigation component of a wearable assistive system for blind and visually impaired people, called TYFLOS (ΤΥΦΛΟΣ), which is the Greek word for “Blind”. The current prototype consists of two mini cameras attached to a pair of conventional eye-glasses, a 2D tactile display (vibration array) which consists of 16 vibrating elements arranged in a 4x4 manner, attached to an elastic vest worn on the user's abdomen, a portable computer, an ear-speaker and a microphone. The Tyflos Navigator is an Electronic Travel Aid (ETA) whose primary goal is to help users achieve independent mobility in indoor environments. Its main sensor unit, the stereo vision system, captures environmental information from the user's field-of-view. 3D representations are created and moving objects are identified using stereoscopic vision and motion detection methodologies. The high resolution output of the methodologies is projected on the low resolution vibration array via a high-to-low methodology based on navigation criteria and modeled with a formal language called Vibration Array Language (VAL). The spatial distribution and temporal characteristics (varying frequencies) of the vibrating elements of the vibration array can inform the user of safe navigation paths and obstacles, giving distance and location information. All parts of the system will be continuously adapted until the users' needs are fulfilled or the technological constraints are reached. A step towards that goal will be shown in the last part of this work with the development of a tactile vocabulary and the experimentation with users where they provide feedback giving us directions for refinements, changes and future work.
Advisors/Committee Members: Bourbakis, Nikolaos.
Subjects: Computer science; Engineering
Keywords: blind; visually impaired; wearable system; navigation; electronic travel aid; ETA; stereo vision; formal language; vibration array language; tactile vocabulary

9.
Fan, Yuqi.
Burst Scheduling, Grooming and QoS Provisioning in Optical Burst-Switched Networks.
Degree: PhD, Computer Science and Engineering PhD, 2009, Wright State University
▼ The demand for network capacity has been increasing steadily with more users than ever connected to the Internet through broadband access and the popularity of video based applications, such as YouTube. Optical wavelength division multiplexing (WDM) networks are expected to form the next-generation backbone network and to fulfill the insatiable appetite for bandwidth. Wavelength routed WDM optical networks offer the granularity of switching at a fiber, waveband, or a wavelength level. The finest granularity offered is at a wavelength level by provisioning lightpaths for different clients/services. All-optical packet switching is still deemed technically infeasible and its competitiveness as a backbone technology is debatable. Optical burst switching (OBS) presents itself as a promising technology for bridging the gap between optical wavelength switching and optical packet switching. OBS operates at the sub-wavelength level and is designed to improve the bandwidth utilization of wavelengths by exploiting statistical multiplexing to deal with bursty traffic, and is therefore more resource efficient than optical wavelength switching. In OBS networks, arriving data packets (e.g., IP packets) are assembled at the ingress OBS nodes to form a data burst. A burst control packet (CP) is sent on a control channel ahead of the data burst to reserve resources and configure the switches along the route traversed by the data burst. In this dissertation, we will explore several important and challenging issues in OBS networks in order to improve the utilization of network resources. To reduce the switching overhead, small bursts may be groomed to reduce resource waste and switching penalty. We have studied the per-hop burst grooming problem where bursts with the same next hop may be groomed together to minimize the number of formed larger bursts and strike a proper balance between burst grooming and grooming cost, assuming all the network nodes have the grooming capability.
In order to reduce computation overhead and processing delay incurred at the core nodes, we assume that grooming can only be performed at edge nodes and the core node can send a burst to multiple downstream links, that is, the core node has light-splitting capability. We have attempted to groom small bursts into larger bursts, and select a proper route for each large burst, such that total network resources used and/or wasted for delivering the small bursts is minimized. Optical signal transmission quality is subject to various types of physical impairment introduced by optical fibers, switching equipment, or other network components. The signal degradation due to physical impairment may be significant enough that the bit-error rate of received signals is unacceptably high at the destination, rendering the signal unusable. Based on earlier work, we have studied scheduling and QoS provisioning problems in OBS networks, taking physical impairments into consideration. In the context of the JET signaling protocol, we have studied the burst scheduling problem and proposed three effective burst scheduling algorithms in OBS networks, taking into account physical impairment effects. Because the offset time of bursts varies in OBS networks, the voids or fragmentation on the channels in the outgoing links can severely degrade the network throughput and blocking probability performance, if not dealt with carefully. A signalling architecture called Dual-header Optical Burst Switching (DOBS) is proposed to reduce the scheduling algorithm complexity. We study the burst scheduling problem and propose an impairment aware scheduling algorithm in DOBS networks. QoS provisioning is an important issue in OBS networks. We have addressed the relative QoS support problem and proposed a QoS provisioning algorithm subject to the physical impairment constraints.
A high-priority burst requires a better quality of service in terms of blocking probability, and at the same time, the transmission of the burst should satisfy physical impairment constraints.
Advisors/Committee Members: Wang, Bin.
Subjects: Computer science
Keywords: Optical burst switching, Burst grooming, Edge node grooming, Burst scheduling, Physical impairment aware scheduling, Dual-header OBS, QoS provisioning

10.
Gilder, Jason R.
Computational methods for the objective review of forensic DNA testing results.
Degree: PhD, Computer Science and Engineering PhD, 2007, Wright State University
▼ Since the advent of criminal investigations, investigators have sought a "gold standard" for the evaluation of forensic evidence. Currently, deoxyribonucleic acid (DNA) technology is the most reliable method of identification. Short Tandem Repeat (STR) DNA genotyping has the potential for impressive match statistics, but the methodology is not infallible. The condition of an evidentiary sample and potential issues with the handling and testing of a sample can lead to significant issues with the interpretation of DNA testing results. Forensic DNA interpretation standards are determined by laboratory validation studies that often involve small sample sizes. This dissertation presents novel methodologies to address several open problems in forensic DNA analysis and demonstrates the improvement of the reported statistics over existing methodologies. Establishing a dynamically calculated RFU threshold specific to each analysis run improves the identification of signal from noise in DNA test data. Objectively identifying data consistent with degraded DNA sample input allows for a better understanding of the nature of an evidentiary sample and affects the potential for identifying allelic dropout (missing data). The interpretation of mixtures of two or more individuals has been problematic and new mathematical frameworks are presented to assist in that interpretation. Assessing the weight of a DNA database match (a cold hit) relies on statistics that assume that all individuals in a database are unrelated – this dissertation explores the statistical consequences of related individuals being present in the database. Finally, this dissertation presents a statistical basis for determining if a DNA database search resulting in a very similar but nonetheless non-matching DNA profile indicates that a close relative of the source of the DNA in the database is likely to be the source of an evidentiary sample.
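To make the idea of a run-specific RFU threshold concrete, the sketch below derives a threshold from the baseline noise of a single analysis run. It assumes a generic limit-of-detection convention (mean of the baseline plus k standard deviations); the abstract does not state the formula used in the dissertation, so the function name, the choice of k = 3, and the sample readings are all illustrative assumptions.

```python
import statistics

def rfu_threshold(baseline_rfu, k=3.0):
    """Run-specific detection threshold (hypothetical helper).

    Assumed convention: mean baseline signal plus k sample standard
    deviations, computed from non-peak readings of one analysis run.
    """
    mu = statistics.mean(baseline_rfu)
    sigma = statistics.stdev(baseline_rfu)
    return mu + k * sigma

# Made-up baseline readings (RFU) from non-peak regions of one run
baseline = [4, 6, 5, 5]
threshold = rfu_threshold(baseline)

# A reading counts as signal only if it exceeds the run's own threshold
is_signal = [reading > threshold for reading in (6, 40)]
```

Because the threshold is recomputed from each run's own baseline, a noisy run raises the bar for calling a peak while a clean run lowers it, in contrast to a single fixed cutoff applied to every run.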
Advisors/Committee Members: Doom, Travis E.
Keywords: Bioinformatics; Forensics; PCR-STR DNA; Limit of detection; Degraded DNA; Mixture interpretation; DNA databases; Familial searching

11.
Gomadam, Karthik Rajagopal.
Semantics Enriched Service Environments.
Degree: PhD, Computer Science and Engineering PhD, 2009, Wright State University
▼ During the past seven years services centric computing has emerged as the preferred approach to architect complex software. Software is increasingly developed by integrating remotely existing components, popularly called services. This architectural paradigm, also called Service Oriented Architecture (SOA), brings with it the benefits of interoperability, agility and flexibility to software design and development. One can easily add or change features of existing systems, either by the addition of new services or by replacing existing ones. Two popular approaches have emerged for realizing SOA. The first approach is based on the SOAP protocol for communication and the Web Service Description Language (WSDL) for service interface description. SOAP and WSDL are built over XML, thus guaranteeing minimal structural and syntactic interoperability. In addition to SOAP and WSDL, the WS-* (WS-Star) stack or SOAP stack comprises other standards and specifications that enable features such as security and services integration. More recently, the RESTful approach has emerged as an alternative to the SOAP stack. This approach advocates the use of the HTTP operations of GET/PUT/POST/DELETE as standard service operations and the REpresentational State Transfer (REST) paradigm for maintaining service states. The RESTful approach leverages the HTTP protocol and has gained a lot of traction, especially in the context of consumer Web applications such as Maps. Despite their growing adoption, the stated objectives of interoperability, agility, and flexibility have been hard to achieve using either of the two approaches. This is largely because of the various heterogeneities that exist between different service providers. These heterogeneities are present both at the data and the interaction levels. Fundamental to addressing these heterogeneities are the problems of service Description, Discovery, Data mediation and Dynamic configuration.
Currently, service descriptions capture the various operations, the structure of the data, and the invocation protocol. They do not, however, capture the semantics of either the data or the interactions. This minimal description impedes the ability to find the right set of services for a given task, thus affecting the important task of service discovery. Data mediation is by far the most arduous task in service integration. This has been a well studied problem in the areas of workflow management, multi-database systems and services computing. Data models that describe real world data, such as enterprise data, often involve hundreds of attributes. Approaches for automatic mediation have not been very successful, while the complexity of the models requires considerable human effort. The above mentioned problems in description, discovery and data mediation pose a considerable challenge to creating software that can be dynamically configured. This dissertation is one of the first attempts to address the problems of description, discovery, data mediation and dynamic configuration in the context of both SOAP and RESTful services. This work builds on past research in the areas of Semantic Web, Semantic Web services and Service Oriented Architectures. In addition to addressing these problems, this dissertation also extends the principles of services computing to the emerging area of social and human computation. The core contributions of this work include a mechanism to add semantic metadata to RESTful services and resources on the Web, an algorithm for service discovery and ranking, and techniques for aiding data mediation and dynamic configuration. This work also addresses the problem of identifying events during service execution, and data integration in the context of socially powered services.
Advisors/Committee Members: Sheth, Amit.
Subjects: Computer science
Keywords: Web 2.0; Service Oriented Architecture; Semantic Web; Semantic Web Services; REST

12.
Herner, Alan Eugene.
Measuring Uncertainty of Protein Secondary Structure.
Degree: PhD, Computer Science and Engineering PhD, 2011, Wright State University
▼ This dissertation develops and demonstrates a method to measure the uncertainty of secondary structure of protein sequences using Shannon's information theory. This method is applied to a newly developed large dataset of chameleon sequences and to several protein hinges culled from the Hinge Atlas. The uncertainty of the central residue in each tripeptide is computed for each amino acid in a sequence using Cuff and Barton's CB513 as the reference set. It is shown that while secondary structure uncertainty is relatively high in chameleon regions [avg = 1.27 bits] it is relatively low in the regions 1-7 residues nearest a chameleon [N terminus flank avg = 1.12 bits; C terminus flank avg = 1.16 bits]. This difference is shown to be highly statistically significant [p = 9.6E-18 and p = 2.9E-12, respectively]. It is also shown that the secondary structure uncertainty of hinge regions does not differ to a statistically significant degree once a Bonferroni multiple test correction is applied. A new hand-curated database of long “chameleon” sequences was developed. It contains nine sequences eight residues in length and eighty-five sequences of length seven.
Advisors/Committee Members: Raymer, Michael.
Subjects: Computer Science
Keywords: proteins, secondary structure, information theory, Chou-Fasman numbers, chameleon sequences
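
The uncertainty measure named in this abstract is Shannon entropy, reported in bits. The sketch below shows only that general formula applied to a list of observed structure states; the function name, the three-letter state alphabet (H, E, C) and the sample observations are illustrative assumptions, not the dissertation's actual pipeline or the CB513 data.

```python
import math
from collections import Counter

def structure_uncertainty_bits(observed_states):
    """Shannon entropy (in bits) of the secondary-structure states observed
    for one tripeptide's central residue in a reference set."""
    if not observed_states:
        raise ValueError("need at least one observation")
    counts = Counter(observed_states)
    total = sum(counts.values())
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log2(p)
    return entropy

# Hypothetical observations: H = helix, E = strand, C = coil.
always_helix = ["H"] * 8
evenly_split = ["H", "E", "C"] * 4
print(structure_uncertainty_bits(always_helix))  # a single state: no uncertainty
print(structure_uncertainty_bits(evenly_split))  # uniform over three states: log2(3) bits
```

With three possible states the entropy is bounded by log2(3), roughly 1.585 bits, which gives a sense of scale for the averages quoted in the abstract.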

13.
Hoff, Carl C.
Pattern Recognition via Machine Learning with Genetic Decision-Programming.
Degree: PhD, Computer Science and Engineering PhD, 2005, Wright State University
▼ In the intersection of pattern recognition, machine learning, and evolutionary computation is a new search technique by which computers might program themselves. That technique is called genetic decision-programming. A computer can gain the ability to distinguish among the things that it needs to recognize by using genetic decision-programming for pattern discovery and concept learning. Those patterns and concepts can be easily encoded in the spines of a decision program (tree or diagram). A spine consists of two parts: (1) the test-outcome pairs along a path from the program's root to any of its leaves and (2) the conclusion in that leaf. The test-outcome pairs specify a pattern and the conclusion identifies the corresponding concept. Genetic decision-programming combines and extends discrete decision theory with the principles of genetics and natural selection. The resulting algorithm searches for those decision programs that best satisfy some user-defined criteria. Each program mates problem decompositions with subproblem solutions, and consists of overlapping spines. Those spines are manipulated by three context-sensitive operators. The context defines a subproblem and is determined by the operator's point of application within a decision program. Macro-mutation generates a new solution for that context; mini-mutation restructures the existing solution for that context; and spine crossover inserts another program's solution for that context. Those solutions are encoded in the spines. Thus the operators recompose, restructure and recombine spines as the search technique evolves a population of decision programs to satisfy the user-defined criteria. Genetic decision-programming overcomes the difficulties encountered when evolving decision programs with genetic programming techniques that rely on subtree crossover. Those impractical techniques require too much memory and computation. 
Subtree crossover exchanges random subtrees of broken spines without regard for context. Meaning is lost. In contrast, the spine crossover of genetic decision-programming crosses entire spines and uses them in context. Meaning is retained. This means that genetic decision-programming can be applied to practical problems. In an experiment, it consistently gave very good results without the variability from problem to problem of other more conventional decision-tree construction techniques.
Advisors/Committee Members: Rizki, Mateen M.
Subjects: Computer Science
Keywords: Pattern Recognition; Machine Learning; Evolutionary Computation; Genetic Algorithm; Genetic Programming; Genetic Decision-Programming
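
The definition of a spine given in this abstract (the test-outcome pairs on a root-to-leaf path, plus the conclusion in that leaf) can be illustrated with a short sketch. The nested-dictionary tree encoding, the function name and the toy tests are assumptions made for illustration; the dissertation's own data structures and operators are not described in this record.

```python
def spines(tree, path=()):
    """Yield (test_outcome_pairs, conclusion) for every root-to-leaf path.

    A tree is either a leaf (any non-dict value, used as the conclusion) or a
    dict with a "test" label and a "branches" mapping of outcome -> subtree.
    """
    if not isinstance(tree, dict):
        yield list(path), tree
        return
    test = tree["test"]
    for outcome, subtree in tree["branches"].items():
        yield from spines(subtree, path + ((test, outcome),))

# Hypothetical decision program with two tests and three conclusions.
program = {
    "test": "round?",
    "branches": {
        "yes": "ball",
        "no": {"test": "flat?", "branches": {"yes": "plate", "no": "box"}},
    },
}

for pattern, concept in spines(program):
    print(pattern, "->", concept)
```

Spines that share a prefix of test-outcome pairs overlap, which matches the abstract's remark that a program consists of overlapping spines.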

14.
Immaneni, Trivikram.
A HYBRID APPROACH TO RETRIEVING WEB DOCUMENTS AND SEMANTIC WEB DATA.
Degree: PhD, Computer Science and Engineering PhD, 2008, Wright State University
▼ The Semantic Web has been evolving into a property-linked web of RDF data, conceptually divorced from (but physically housed in) the World Wide Web of hyperlinked documents. Data Retrieval techniques are typically used to retrieve data from the Semantic Web while Information Retrieval techniques are used to retrieve documents from the Hypertext Web. Conceptually unifying the two webs enables the exploitation of their interconnections resulting in benefits to both data and document retrieval. Towards this end, we present the Unified Web model that integrates the two webs and formalizes the structure and the semantics of their interconnections. We present a hybrid approach to retrieving data and documents that is enabled by this unification. We specify the Hybrid Query Language that embodies the approach and the system SITAR that implements it.
Advisors/Committee Members: Thirunarayan, Krishnaprasad.
Subjects: Computer Science
Keywords: Semantic Web; Information Retrieval; Hybrid Query Language; Document Retrieval; Data Retrieval

15.
Jain, Prateek.
Linked Open Data Alignment & Querying.
Degree: PhD, Computer Science and Engineering PhD, 2012, Wright State University
▼ The recent emergence of the “Linked Data” approach for publishing data represents a major step forward in realizing the original vision of a web that can "understand and satisfy the requests of people and machines to use the web content", i.e., the Semantic Web. This new approach has resulted in the Linked Open Data (LOD) Cloud, which includes more than 295 large datasets contributed by experts belonging to diverse communities such as geography, entertainment, and life sciences. However, the current interlinks between datasets in the LOD Cloud, as we will illustrate, are too shallow to realize many of the benefits promised. If this limitation is left unaddressed, then the LOD Cloud will merely be more data that suffers from the same kinds of problems that plague the Web of Documents, and hence the vision of the Semantic Web will fall short. This thesis presents a comprehensive solution to address the issue of alignment and relationship identification using a bootstrapping based approach. By alignment we mean the process of determining correspondences between classes and properties of ontologies. We identify subsumption, equivalence and part-of relationships between classes. The work identifies part-of relationships between instances. Between properties we will establish subsumption and equivalence relationships. By bootstrapping we mean the process of being able to utilize the information which is contained within the datasets for improving the data within them. The work showcases use of bootstrapping based methods to identify and create richer relationships between LOD datasets. The BLOOMS project (http://wiki.knoesis.org/index.php/BLOOMS) and the PLATO project, both built as part of this research, have provided evidence of the feasibility and the applicability of the solution.
Advisors/Committee Members: Sheth, Amit.
Subjects: Computer Science
Keywords: LINKED OPEN DATA; SEMANTIC WEB; RELATIONSHIP IDENTIFICATION; WEB OF DATA; LOD; SCHEMA MATCHING; FEDERATED QUERYING; COMPUTER SCIENCE

16.
Jiang, Chunyu.
DATA MINING AND ANALYSIS ON MULTIPLE TIME SERIES OBJECT DATA.
Degree: PhD, Computer Science and Engineering PhD, 2007, Wright State University
▼ A huge amount of data is available in our society and the need for turning such data into useful information and knowledge is urgent. Data mining is an important field addressing that need and significant progress has been achieved in the last decade. In several important application areas, data arises in the format of Multiple Time Series Object (MTSO) data, where each data object is an array of time series over a large set of features and each has an associated class or state. Very little research has been conducted on this kind of data. Examples include computational toxicology, where each data object consists of a set of time series over thousands of genes, and operational stress management, where each data object consists of a set of time series over different measuring points on the human body. The purpose of this dissertation is to conduct a systematic data mining study over microarray time series data, with applications to computational toxicology. More specifically, we aim to consider several issues: feature selection algorithms for different classification cases, gene markers or feature set selection for toxic chemical exposure detection, toxic chemical exposure time prediction, wildness concept development and applications, and organizing a diversified and parsimonious committee. We will formalize and analyze these research problems, design algorithms to address these problems, and perform experimental evaluations of the proposed algorithms. All these studies are based on a microarray time series data set provided by Dr. McDougal.
Advisors/Committee Members: Dong, Guozhu.
Subjects: Computer Science
Keywords: genes; MTSO; RANKING; microarray; classification; MTSO Data; TIME SERIES

17.
Kannavara, Raghudeep.
DESIGN AND PERFORMANCE ANALYSIS OF A SECURE PROCESSOR SCAN-SP WITH CRYPTO-BIOMETRIC CAPABILITIES.
Degree: PhD, Computer Science and Engineering PhD, 2009, Wright State University
▼ Secure computing is gaining importance in recent times as computing capability is increasingly becoming distributed and information is everywhere. Prevention of piracy and digital rights management have become very important. Information security is mandatory rather than an additional feature. Numerous software techniques have been proposed to provide a certain level of copyright and intellectual property protection. Techniques like obfuscation attempt to transform the code into a form that is harder to reverse engineer. Tamper-proofing causes a program to malfunction when it detects that it has been modified. Software watermarking embeds a copyright notice in the software code to allow the owners of the software to assert their intellectual property rights. The software techniques discourage software theft, can trace piracy, and can prove ownership, but cannot prevent copying itself. Thus, software based security firewalls and encryption are not completely safe from determined hackers. This creates the need for information security at the hardware level, where secure processors assume importance. In this dissertation, a detailed architecture and instruction set of the SCAN-Secure Processor is proposed. The SCAN-SP is a modified SparcV8 processor architecture with a new instruction set to handle image compression, encryption, information hiding based on SCAN methodology and biometric authentication based on Local Global Graph methodology. A SCAN based methodology for encryption and decryption of 32 bit instructions and data is proposed. The modules to support the new instructions are synthesized in reconfigurable logic and the results of FPGA synthesis are presented. The ultimate goal of the proposed work is a detailed study of the tradeoffs that exist between speed of execution and security of the processor. Designing a faster processor is not the goal of the proposed work, rather exploring the architecture to provide security is of prime importance.
Advisors/Committee Members: Bourbakis, Nikolaos.
Subjects: Computer science; Electrical engineering; Engineering
Keywords: Secure Processor, Cryptography, Steganography, Biometrics, SCAN methodology, Local-Global graphs

18.
Karargyris, Alexandros.
A Novel Synergistic Diagnosis Methodology for identifying Abnormalities in Wireless Capsule Endoscopy videos.
Degree: PhD, Computer Science and Engineering PhD, 2010, Wright State University
▼ Wireless Capsule Endoscopy (WCE) is a new technology that allows medical personnel to view the gastrointestinal (GI) mucosa. It is a swallowable miniature capsule device the size of a pill that transmits thousands of screenshots of the digestive tract to a wearable receiver. When the procedure finishes the video is uploaded to a workstation for viewing. Capsule Endoscopy has been established as a tool to identify various gastrointestinal (GI) conditions, such as blood-based abnormalities, polyps, ulcers, Crohn's disease in the small intestine, where the classical endoscopy is not regularly used. As of 2009 the market is dominated by Given Imaging Inc. capsule (PillCam SB). More than 300,000 capsules have been sold since 2001 when it was first introduced. The company provides a software package (RAPID) to view the WCE video, offering a bleeding detector feature based on red color. It provides a position estimator of the capsule inside the digestive tract. Additionally its multi-view feature gives a simultaneous view of two or four consecutive video frames in multiple windows. Finally a library of reference images (RAPID Atlas) is provided so that the user can have easy access to on-screen case images. Although the company's software is a useful tool, the viewing of a WCE video is still a time consuming process (~ 2 hours), even for experienced gastroenterologists. In addition, the company's software has serious limitations (35% bleeding detection) and no capability of detecting polyps or ulcers according to gastroenterologists. Therefore, the need for computer aided model-methodology with robust detection performance on various conditions (blood, polyps, ulcers, etc) is clearly obvious. 
Thus, our research studies have been successfully carried out on: a) the automatic detection of malignant intestinal features like polyps, bleeding, and abnormal regions (tumors); b) finding the boundaries of the digestive organs; and c) reducing the viewing-examination time with a robust registration methodology. These studies have led to the development of the ATRC Video Toolbox (ATRC-VT). ATRC-VT incorporates signal processing methods, color and image processing techniques, and artificial intelligence tools to detect blood-based abnormalities, polyps and ulcers in the small intestine. It is the first computer aided detection (CAD) software with multiple capabilities for WCE videos designed with a Graphics User Interface so that it is easy to use.
Advisors/Committee Members: Bourbakis, Nikolaos.
Subjects: Computer science
Keywords: capsule; endoscopy; imaging; medical; computer; graphics; computer vision; locomotion; bowel; digestive tract; gastroenterology

19.
Keefer, Robert B.
A User Centered Design and Prototype of a Mobile Reading Device for the Visually Impaired.
Degree: PhD, Computer Science and Engineering PhD, 2011, Wright State University
▼ While mobile reading devices have been on the market and investigated by researchers in recent years, there is still work required to make these devices highly accessible to the visually impaired. A usability test with one such device revealed gaps in the current state of the art devices. These gaps focus mostly on the user interaction and his or her ability to quickly consume written reading material. In this dissertation a voice user interface (VUI) is presented that improves the ability of a blind user of a mobile reading device to interact with written material. The image processing techniques required to facilitate this interaction with a document image are also presented. Contributions of this research include a model of the VUI, which was validated by user tests of a prototype with visually impaired participants, a document image perspective correction technique, a document image dewarping technique, and a headline identification technique, among others. Three separate user tests with visually impaired participants were used to guide and validate the interaction research. The document image processing techniques, combined to form a document image processing pipeline, were tested on 25 document images. Comparative results from the user tests and processing of the document images are presented in this dissertation.
Advisors/Committee Members: Bourbakis, Nicholaos.
Subjects: Computer Science
Keywords: Document Image Processing; Document Image Layout Analysis; Voice User Interface; User Interface Modeling

20.
Koehler, Christopher M.
Visualization of Complex Unsteady 3D Flow: Flowing Seed Points and Dynamically Evolving Seed Curves with Applications to Vortex Visualization in CFD Simulations of Ultra Low Reynolds Number Insect Flight.
Degree: PhD, Computer Science and Engineering PhD, 2010, Wright State University
▼ Three dimensional integration-based geometric visualization is a very powerful tool for analyzing flow phenomena in time dependent vector fields. Streamlines in particular have many perceptual benefits due to their ability to provide a snapshot of the vectors near key features of complex 3D flows at any instant in time. However, streamlines do not lend themselves well to animation. Subtle changes in the vector field at each time step lead to increasingly large changes between streamlines with the same seed point the longer they are integrated. Path lines, which show particle trajectories over time, suffer from similar problems when attempting to animate them. Dynamic deformable objects in the flow domain also complicate the use of integration-based visualization. Current methods such as streamlines, path lines, streak lines, particle advection and their many conceptual and higher dimensional variants produce undesirable results for this kind of data when the most important flow phenomena occur near and move with the objects. In this work I present methods to handle both of these problems. First, the flowing seed point algorithm is introduced, which visually captures the perceptual benefits of smoothly animated streamlines and path lines by generating a series of seed points that travel through space and time on streak lines and timelines. Next, a novel dynamic seeding strategy for both streamlines and generalized streak lines is introduced to handle deformable moving objects in the flow domain in situations where static seeding objects fail for most time steps. These two methods are then combined in order to visualize the instantaneous direction and orientation of a flow which results from flapping objects in a fluid. Initial tests are performed with a single rigid flapping disk. Further tests are performed on a more complex biologically inspired CFD simulation of the deformable flapping wings of a dragonfly as it takes off and begins to maneuver. 
For this test, seeds are automatically chosen such that the formation, evolution and breakdown of the leading edge vortex is highlighted as well as the wing wake interactions that occur between the forewings and hind wings.
Advisors/Committee Members: Wischgoll, Thomas.
Subjects: Computer Engineering; Computer Science
Keywords: flow visualization, insect flight, streamlines, streak lines, dynamic seeds, flowing seeds

21.
Kramer, Gregory Robert.
An analysis of neutral drift's effect on the evolution of a CTRNN locomotion controller with noisy fitness evaluation.
Degree: PhD, Computer Science and Engineering PhD, 2007, Wright State University
▼ This dissertation focuses on the evolution of Continuous Time Recurrent Neural Networks (CTRNNs) as controllers for control systems. Existing research suggests that the process of neutral drift can greatly benefit evolution for problems whose fitness landscapes contain large-scale neutral networks. CTRNNs are known to be highly degenerate, providing a possible source of large-scale landscape neutrality, and existing research suggests that neutral drift benefits the evolution of simple CTRNNs. However, there has been no in-depth examination of the effects of neutral drift on complex CTRNN controllers, especially in the presence of noisy fitness evaluation. To address this problem, this dissertation presents an analysis of the effect of neutral drift on the evolution of a complex CTRNN locomotion controller for a simulated hexapod robot in the presence of noisy fitness evaluations. In particular, two stochastic hill-climber-based EAs are examined and compared, one that does not engage in neutral drift, and one that does. The experimental results show that while neutral drift provides a significant advantage early in the evolutionary process, the later effects of noisy fitness evaluations seriously degrade the utility of neutral drift, and overall, there is no significant difference between the non-drifting and drifting EAs. These results provide evidence that large-scale neutral networks do exist in complex CTRNN fitness landscapes and highlight the important role that noisy fitness evaluations play in influencing evolutionary performance.
Advisors/Committee Members: Gallagher, John C.
Subjects: Computer Science
Keywords: Continuous Time Recurrent Neural Network; Neutrality; Neutral Drift; Artificial Evolution; Evolutionary Computation; Neural Network; Controller; Evolutionary Algorithm; Noisy Fitness Evaluation
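
The contrast this abstract draws between a drifting and a non-drifting hill climber is commonly expressed as a difference in the acceptance rule: a drifting climber also accepts candidates of equal fitness. The sketch below illustrates that general idea only. The function signature, the toy fitness function and the mutation operator are assumptions, and the noisy fitness evaluation and CTRNN encoding studied in the dissertation are not modeled.

```python
import random

def hill_climb(fitness, mutate, start, steps, allow_drift, rng):
    """Stochastic hill climber.

    With allow_drift=False only strictly better candidates are accepted.
    With allow_drift=True equally fit candidates are accepted too, so the
    search can wander along a neutral network (a fitness plateau).
    """
    current, current_fit = start, fitness(start)
    for _ in range(steps):
        candidate = mutate(current, rng)
        candidate_fit = fitness(candidate)
        if candidate_fit > current_fit or (allow_drift and candidate_fit == current_fit):
            current, current_fit = candidate, candidate_fit
    return current, current_fit

# Hypothetical landscape: a plateau of width 10, then a step up.
def plateau_fitness(x):
    return x // 10

def step_mutation(x, rng):
    return x + rng.choice([-1, 1])

rng = random.Random(0)
print(hill_climb(plateau_fitness, step_mutation, 5, 2000, False, rng))
print(hill_climb(plateau_fitness, step_mutation, 5, 2000, True, rng))
```

On this toy landscape the non-drifting climber can never leave its starting point, because every single-step neighbor of 5 has the same fitness, while the drifting climber is free to cross the plateau.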

22.
Li, Tianjian.
On Optimal Survivability Design in WDM Optical Networks under Scheduled Traffic Models.
Degree: PhD, Computer Science and Engineering PhD, 2007, Wright State University
▼ Wavelength division multiplexing (WDM) optical networks are widely viewed as the most appropriate choice for future Internet backbone with the potential to fulfill the ever-growing demands for bandwidth. WDM divides the enormous bandwidth of an optical fiber into many non-overlapping wavelength channels, each of which may operate at the rate of 10 Gigabit per second or higher. A failure in a network such as a cable cut may result in a tremendous loss of data. Therefore, survivability is a very important issue in WDM optical networks. The objective of this dissertation is to address the survivability provisioning problem in WDM optical networks under a scheduled traffic model and a sliding scheduled traffic model that we propose. In contrast to the conventional traffic models considered in communication networks such as static traffic model and dynamic random traffic model, the scheduled traffic model and the sliding scheduled traffic model are able to capture the traffic characteristics of applications that require capacity on a time-limited basis. They also give service providers more flexibility in provisioning the requested demands and a better opportunity to optimize the network resources. The survivability provisioning problem is to determine a pair of link-disjoint paths under the link failure model or a pair of SRLG-disjoint paths under the Shared Risk Link Group (SRLG) failure model, one working path and one protection path, for each demand in a given set of traffic demands with the objective of minimizing the total resources used by all traffic demands while 100% restorability is guaranteed against any single failure. To provision survivable service under the scheduled traffic model, we develop two sets of integer linear program (ILP) formulations for joint and non-joint optimizations using different protection schemes such as dedicated and shared path based protections. 
We also design a capacity provision matrix based Iterative Survivable Routing (ISR) algorithm with different demand scheduling policies to solve the survivable routing and wavelength assignment (RWA) problem. In addition, we extend the heuristic algorithm design from dealing with single link failure to single SRLG failure. The issue of survivability provisioning in WDM optical networks under the sliding scheduled traffic model has never been addressed by the research community. In the dissertation, we carry out the following tasks under this traffic model: (a) development of RWA ILP optimization formulations for dedicated and shared path based protection; (b) design and implementation of efficient heuristic algorithms for shared path based protection. Specifically, in the proposed heuristic algorithm, we introduce a demand time conflict reduction algorithm to minimize the time overlapping among a set of demands by properly placing a demand within its associated time window; and (c) extending the heuristic algorithm design under the single link failure model to the single SRLG failure model.
Advisors/Committee Members: Wang, Bin.
Keywords: Traffic; SRLG; path and one protection; Traffic Models; working path; number of wavelength-links
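
The provisioning problem in this abstract asks for a working path and a link-disjoint protection path for each demand. As a rough illustration of what "link-disjoint" means, the sketch below uses a naive two-step heuristic: find a shortest path by hop count, remove its links, then search again. This is not the ILP formulation or the ISR algorithm from the dissertation, it ignores wavelengths, costs and scheduling, and it can fail on topologies where a disjoint pair exists but the first path blocks it. All names and the sample topology are hypothetical.

```python
from collections import deque

def shortest_path(adj, src, dst):
    """Breadth-first search; returns a minimum-hop node list or None."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in sorted(adj.get(node, ())):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None

def link_disjoint_pair(edges, src, dst):
    """Return (working, protection) paths sharing no link, or None."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    working = shortest_path(adj, src, dst)
    if working is None:
        return None
    # Remove every link used by the working path before the second search.
    for u, v in zip(working, working[1:]):
        adj[u].discard(v)
        adj[v].discard(u)
    protection = shortest_path(adj, src, dst)
    if protection is None:
        return None
    return working, protection

# Hypothetical four-node ring: two disjoint routes from A to D.
ring = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "D")]
print(link_disjoint_pair(ring, "A", "D"))
```

Extending the same idea to SRLG-disjointness would mean removing every link that shares a risk group with the working path, rather than only the links on the path itself.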

23.
Li, Yanjun.
High Performance Text Document Clustering.
Degree: PhD, Computer Science and Engineering PhD, 2007, Wright State University
▼ Data mining, also known as knowledge discovery in database (KDD), is the process of discovering interesting unknown knowledge from a large amount of data. Text mining applies data mining techniques to extract interesting and nontrivial information and knowledge from unstructured text. Text clustering is one of the important techniques of text mining, which is the unsupervised classification of similar documents into different groups. This research focuses on improving the performance of text clustering. We investigated the text clustering algorithms in four aspects: document representation, documents closeness measurement, high dimension reduction and parallelization. We propose a group of high performance text clustering algorithms, which target the unique characteristics of unstructured text database. First, two new text clustering algorithms are proposed. Unlike the vector space model, which treats a document as a bag of words, we use a document representation which keeps the sequential relationship between words in the documents. In these two algorithms, the dimension of the database is reduced by considering the frequent word (meaning) sequences, and the closeness of two documents is measured based on the sharing of frequent word (meaning) sequences. Second, a text clustering algorithm with feature selection is proposed. This algorithm gradually reduces the high dimension of the database by performing feature selection during the clustering. The new feature selection method applied is based on the well-known chi-square statistic and a new statistic that can measure the positive and negative term-category dependence. Third, a group of new text clustering algorithms is developed based on the k-means algorithm. Instead of using the cosine function, a new function involving global information is proposed to measure the closeness between two documents. This new function utilizes the neighbor matrix introduced in [Guha:2000]. 
A new method for selecting initial centroids and a new heuristic function for selecting a cluster to split are adopted in the proposed algorithms. Last, a new parallel algorithm for bisecting k-means is proposed for the message-passing multiprocessor systems. This new algorithm, named PBKP, fully utilizes the data-parallelism of the bisecting k-means algorithm, and adopts a prediction step to balance the workloads of multiple processors to achieve a high speedup. Comprehensive performance studies were conducted on all the proposed algorithms. In order to evaluate the performance of these algorithms, we compared them with existing text clustering algorithms, such as k-means, bisecting k-means [Steinbach:2000] and FIHC [Fung:2003]. The experimental results show that our clustering algorithms are scalable and have much better clustering accuracy than existing algorithms. For the parallel PBKP algorithm, we tested it on a 9-node Linux cluster system and analyzed its performance. The experimental results suggest that the speedup of PBKP is linear with the number of processors and data points. Moreover, PBKP scales up better than the parallel k-means with respect to the desired number of clusters.
Advisors/Committee Members: Chung, Soon M.
Keywords: Document Clustering, Text Mining, K-means, Bisecting K-means, Algorithm, Performance Analysis.
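
The feature selection step in this abstract builds on the chi-square statistic for term-category dependence. The sketch below shows the standard 2x2 contingency form of that statistic only; the dissertation's additional measure of positive and negative dependence is not specified in this record and is not reproduced here. The function name, argument names and document counts are illustrative assumptions.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for one term and one category (or cluster).

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    if denominator == 0:
        return 0.0  # a row or column is empty, so no dependence can be measured
    return n * (a * d - c * b) ** 2 / denominator

# Hypothetical counts over 20 documents.
print(chi_square(10, 0, 0, 10))  # term occurs only inside the category
print(chi_square(5, 5, 5, 5))    # term is spread evenly: independent
```

Because the difference `a * d - c * b` is squared, the statistic alone does not say whether a term is over-represented or under-represented in the category; the sign of that difference carries that information.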

24.
Luo, Xubin.
Wavelength Division Multiplexing Optical Networks for Supporting Grid Computing.
Degree: PhD, Computer Science and Engineering PhD, 2008, Wright State University
▼ Grid computing is a computing model in which various resources, such as processing power, storage systems, data sources, or instruments, are interconnected by a communication infrastructure and accessible as a public utility for solving large scale resource intensive problems. Grid computing requires dynamic communication between distributed resources and high bandwidth quality assured survivable communication services. Optical fiber communication and networking with wavelength division multiplexing (WDM) provides a promising means to support the communication needs of grid computing, by offering huge capacity, relatively low latency, as well as dynamic control and allocation of bandwidth at various granularities. In this dissertation, we explore several important and challenging issues in optical WDM networks for supporting grid computing: Diverse Routing for Survivable Service: Grid computing requires survivable communication services due to the huge traffic volume involved. We study the diverse routing problem in networks with shared risk link group (SRLG) failures. An SRLG includes a set of optical links that are affected by a single failure in the physical layer of a WDM network. The objective is to determine a pair of SRLG-disjoint paths with the minimum total cost. Quality of Service Routing: To ensure the quality of service for grid communication, connections with bounded risk of failure and transmission delay are highly desirable. We study the multi-constrained routing problem that finds a path that guarantees the end-to-end failure risk bound in terms of total weight of SRLGs while minimizing the end-to-end transmission delay bound, or the path cost. Scheduled Service Provisioning using Light-trails: The recently proposed light-trail concept is a promising technique in WDM optical networks for supporting the grid computing traffic given light-trails' dynamic provisioning ability. 
We model the grid communication demands as a scheduled traffic model, in which a set of demands is given and the setup time, teardown time, and the requested bandwidth of a demand are known in advance. We consider efficient provisioning scheduled services for supporting grid communication using static and dynamic light-trails. Cost Effective Services with Waveband Switching and Light-trails: Waveband switching allows several wavelength channels to be batch switched and therefore reduces the number of ports needed in the switches, so that the complexity and cost of switching nodes can be significantly reduced. We study the waveband switching technology in conjunction with the light-trail architecture for supporting grid computing. This study provides us with critical insights into the designing problem of applying waveband switching technology into light-trail optical networks. Joint Scheduling of Grid Tasks and Communication: Finally, we propose to tackle the service provisioning problem in light-trail WDM networks that jointly considers the grid task scheduling problem for supporting the grid computing. A flexible task model (FTM) is presented for modeling the grid tasks. This model is more general and flexible than the conventional task graph model considered in previous work. Simulation results show that our scheduling algorithm with FTM significantly reduces the total task completion time, compared with the results obtained using the conventional method. Note: This research is partially funded by DAGSI and DOE.
Advisors/Committee Members: Wang, Bin.
Subjects: Computer science
Keywords: WDM, lightpath, light-trail, grid computing

25.
Maarouf, Marwan Younes.
XML Integrated Environment For Service-Oriented Data Management.
Degree: PhD, Computer Science and Engineering PhD, 2007, Wright State University
▼ XML, as a family of related standards including a markup language (XML), formatting semantics (XSL style sheets), a linking syntax (XLINK), and appropriate data schema standards, has emerged as a de facto standard for encoding and sharing data between various applications. XML is designed to be simple, easily parsed and self-describing. XML is based on and supports the idea of separation of concerns: information content is separated from information rendering, and relationships between data elements are provided via simple nesting and references. As the XML content grows, the ability to handle schemaless XML documents becomes more critical, as most XML documents do not have a schema or Document Type Definition (DTD). In addition, XML content and XML tools are often required to be combined in effective ways for better performance and higher flexibility. In this research, we proposed the XML Integrated Environment (XIE), a general-purpose service-oriented architecture for processing XML documents in a scalable and efficient fashion. The XIE supports a new software service model that provides a proper abstraction to describe a service and divides it into four components: structure, connection, interface and logic. We also proposed and implemented the XIE Service Language (XIESL), which can capture the creation and maintenance of the XML processes and the data flow specified by the user and then orchestrates the interactions between different XIE services. Moreover, XIESL manages the complexity of XML processing by implementing an XML processing pipeline that enables better management, control, interpretation and presentation of the XML data even for non-professional users. The XML Integrated Environment is envisioned to revolutionize the way non-professional programmers see, work with, and manage their XML assets. 
It offers them powerful tools and constructs to fully utilize the XML processing power embedded in its unified framework and service-oriented architecture.
Advisors/Committee Members: Chung, Soon M.
Subjects: Computer Science
Keywords: Schemaless XML documents, XML Integrated Environment (XIE), XIE Service Language (XIESL), XML processing pipeline, service-oriented computing (SOC).

26.
Mao, Shihong.
Comparative Microarray Data Mining.
Degree: PhD, Computer Science and Engineering PhD, 2007, Wright State University
▼ As a revolutionary technology, microarrays have great potential to provide genome-wide patterns of gene expression, to make accurate medical diagnosis, and to explore genetic causes underlying diseases. It is commonly believed that suitable analysis of microarray datasets can achieve the above goals. While much has been done in microarray data mining, few previous studies, if any, focused on multiple datasets at the comparative level. This dissertation aims to fill this gap by developing tools and methods for set-based comparative microarray data mining. Specifically, we mine highly differentiative gene groups (HDGGs) from given datasets/classes, evaluate the concordance of datasets generated from different platforms/laboratories, investigate the impact of variability in microarray datasets on data mining results, provide tools and algorithms for the above tasks, and identify reliable invariant HDGG patterns for better understanding diseases. It is a big challenge to discover high-quality discriminating (emerging) patterns from high dimensional microarray datasets. We develop a novel feature-group selection method to help discover HDGGs, especially signature HDGGs that completely characterize some disease classes. In addition to giving insights on the diseases, better classification results are also obtained using HDGG-based classifiers compared with other existing classifiers. As microarray datasets are often generated from different platforms/laboratories, it is necessary to evaluate their concordance/consistency before they can be studied together. We provide measures and techniques to quantitatively test such concordance at the comparative level. In addition to applying measures to evaluate the degree of variability in microarray datasets, we also develop a novel algorithm called C-loocv to effectively minimize the variability. 
As an indicator of the utility of C-loocv, classifiers trained from C-loocv-refined datasets become more robust and predict test samples with significantly higher accuracy than classifiers trained from original datasets. Based on the variability minimization algorithm, we provide a novel strategy to mine invariant patterns from multiple datasets concerning a common disease. As a demonstration, invariant patterns are identified from two datasets concerning lung cancer; these patterns may shed light on the mechanism underlying the pathogenesis of lung cancer. Our methods are generic and can be applied to microarrays concerning any human disease.
Advisors/Committee Members: Dong, Guozhu.
Subjects: Computer Science
Keywords: data mining, microarray data, comparative

27.
Muppavarapu, Vineela.
Semantic and Role-Based Access Control for Data Grid Systems.
Degree: PhD, Computer Science and Engineering PhD, 2009, Wright State University
▼ A Grid is an integration infrastructure for sharing and coordinated use of diverse resources in dynamic, distributed virtual organizations (VOs). A Data Grid is an architecture for the access, exchange, and sharing of data in the Grid environment. Distributed data resources can be diverse in their formats, schema, quality, access mechanisms, ownership, access policies, and capabilities. In recent years, several organizations have started utilizing Grid technologies to deploy data-intensive and/or computation-intensive applications. As more and more organizations are sharing data resources and participating in Data Grids, the complexity and heterogeneity of the systems is increasing constantly, but their management techniques are not evolving, making the systems more complicated and error-prone and indicating a clear need for standardized mechanisms to manage access control for the shared data resources. The Open Grid Services Architecture - Data Access and Integration (OGSA-DAI) and the Storage Resource Broker (SRB) are widely used frameworks for the integration of heterogeneous data resources in Data Grid systems. However, in these systems, access control causes substantial administration overhead for the resource providers because the authorization information has to be maintained for individual Grid users. In addition, access control policies need to be specified and managed across multiple organizations. Moreover, each organization in a Data Grid may use its own terminology to describe a resource, making it difficult to coordinate between the organizations. This dissertation focuses on solving these problems and provides access control systems that are based on existing standards. We developed a role-based access control (RBAC) system with Shibboleth, which is an attribute authorization service currently being used in many Grid applications. 
We used the Core and Hierarchical RBAC profile of the eXtensible Access Control Markup Language (XACML) standard for specifying access control policies uniformly across different organizations. For distributed administration of those policies, we used the Object, Metadata and Artifacts Registry (OMAR). OMAR is based on the e-business eXtensible Markup Language (ebXML) registry specifications developed to achieve interoperable registries and repositories. We developed a semantic-based access control method using the ontology to resolve the semantic differences in terminologies. Understanding the semantics of the data being protected is often helpful in determining which users can access the data and what access level the users can have. Web Ontology Language (OWL) is used to represent the ontology of the data resources and users. By using ontology, VOs can resolve the differences in their terminologies and specify access control policies based on concepts and user roles, instead of individual data resources and user identities. Administration of XACML policies is a difficult task because each XACML policy has several components, and the number of XACML policies may be very large in a Data Grid environment. However, no efficient tool is available for the creation and update of XACML policies. So, we developed an XACML administration tool and a GUI in Java. The tool allows the creation of XACML policies from existing RBAC policies. The tool also provides capabilities to update or create new RBAC policies. Using this tool, the policy administrator can create new users, roles, data resources, and actions. It allows the administrator to change the user-role assignment and the permissions on a role. Our proposed access control systems allow quick and easy deployments, and privacy protection. The systems are scalable, and support interoperability and fine-grain access control. 
Administration overheads for the resource providers are reduced because they do not need to maintain the individual user information. Moreover, our system allows unauthorized requests to be denied before establishing a connection to the resource, thereby reducing the connection overheads and keeping the data resources available to authorized users. Performance analysis shows that our systems add very little overhead to the existing security infrastructures of SRB and OGSA-DAI.
Advisors/Committee Members: Chung, Soon M.
Subjects: Computer science
Keywords: Data Grid; role-based access control; semantic-based access control; Open Grid Services Architecture-Data Access and Integration (OGSA-DAI); Storage Resource Broker; Shibboleth; eXtensible Access Control Markup Language (XACML); ontology

28.
Nagarajan, Bala Meenakshi.
Understanding User-Generated Content on Social Media.
Degree: PhD, Computer Science and Engineering PhD, 2010, Wright State University
▼ Over the last few years, there has been a growing public and enterprise fascination with 'social media' and its role in modern society. At the heart of this fascination is the ability for users to participate, collaborate, consume, create and share content via a variety of platforms such as blogs, micro-blogs, email, instant messaging services, social network services, collaborative wikis, social bookmarking sites, and multimedia sharing sites. This dissertation is devoted to understanding informal user-generated textual content on social media platforms and using the results of the analysis to build Social Intelligence Applications. The body of research presented in this thesis focuses on understanding what a piece of user-generated content is 'About' via two sub-goals of Named Entity Recognition and Key Phrase Extraction on informal text. In light of the poor context and informal nature of content on social media platforms, we investigate the role of contextual information from documents, domain models and the social medium to supplement and improve the reliability and performance of existing text mining algorithms for Named Entity Recognition and Key Phrase Extraction. In all cases we find that using multiple contextual cues together leads to reliable inter-dependent decisions, better than using the cues in isolation, and that such improvements are robust across domains and content of varying characteristics, from micro-blogs like Twitter, social networking forums such as those on MySpace and Facebook, and blogs on the Web. Finally, we showcase two deployed Social Intelligence applications that build over the results of Named Entity Recognition and Key Phrase Extraction algorithms to provide near real-time information about the pulse of an online populace. 
Specifically, we describe what it takes to build applications that wish to exploit the 'wisdom of the crowds', highlighting challenges in data collection, processing informal English text, metadata extraction and presentation of the resulting information.
Advisors/Committee Members: Sheth, Amit.
Subjects: Computer science
Keywords: social media, user-generated content, informal text analysis, domain knowledge

29.
Pantelopoulos, Alexandros A.
PROGNOSIS: A WEARABLE SYSTEM FOR HEALTH MONITORING OF PEOPLE AT RISK.
Degree: PhD, Computer Science and Engineering PhD, 2010, Wright State University
▼ Wearable Health Monitoring Systems (WHMS) have drawn a lot of attention from the research community and the industry during the last decade. The development of such systems has been motivated mainly by increasing healthcare costs and by the fact that the world population is ageing. In addition to that, R&D in WHMS has been propelled by recent technological advances in miniature bio-sensing devices, smart textiles, microelectronics and wireless communications techniques. These portable health systems can comprise various types of small physiological sensors, which enable continuous monitoring of a variety of human vital signs and other physiological parameters such as heart rate, respiration rate, body temperature, blood pressure, perspiration, oxygen saturation, electrocardiogram (ECG), body posture and activity etc. As a result, and also due to their embedded transmission modules and processing capabilities, wearable health monitoring systems can constitute low-cost and unobtrusive solutions for ubiquitous health, mental and activity status monitoring. The majority of the currently developed WHMS research prototypes and products provide the basic functionality of continuously logging and transmitting physiological data. However, WHMS have the potential of achieving early detection and diagnosis of critical health changes that could enable prevention of health-hazardous episodes. To do that, they should be able to learn individual user baselines and also employ advanced information processing algorithms and diagnostics in order to discover problems autonomously and detect alarming health trends, and consequently, inform medical professionals for further assistance. In an effort to advance the capabilities of a wearable system towards these goals, we focus in this dissertation on the development of a novel WHMS, called Prognosis. 
The developed prototype platform includes the following innovative features, which constitute the main research contributions of this work: a) a novel and highly accurate methodology for classifying ECG recordings on a resource-constrained device, which is based on the Matching Pursuits algorithm and a Neural Network, b) a physiological data fusion scheme based on a fuzzy regular formal language model, whereby the current state of the corresponding fuzzy Finite State Machine signifies the current health state and context of the patient, c) the extension of the decision making methodology based on a modified Fuzzy Petri Net (FPN) model, d) the integration of a user-learning strategy based on a neural-fuzzy extension of the FPN, e) the incorporation of a system-patient dialogue interaction in order to capture non-measurable patient symptoms such as chest pain, dizziness, malaise, etc., and finally f) the prototyping of the system based on a smart-phone that runs multi-threaded J2ME software for handling multiple simultaneous Bluetooth connections with off-the-shelf wireless bio-sensors.
Advisors/Committee Members: Bourbakis, Nikolaos.
Subjects: Computer science; Engineering; Health care; Information Systems
Keywords: wearable health monitoring system, ECG classification, ECG denoising, medical decision support system, smart-phone

30.
Park, Sang Mork.
PRIVACY-PRESERVING ATTRIBUTE-BASED ACCESS CONTROL IN A GRID.
Degree: PhD, Computer Science and Engineering PhD, 2010, Wright State University
▼ A Grid community is composed of diverse stakeholders, such as data resource providers, computing resource providers, service providers, and the users of the resources and services. In traditional security systems for Grids, most of the authentication and authorization mechanisms are based on the user's identity or the user's classification information. If the authorization mechanism is based on the user's identity, fine-grained access control policies can be implemented but the scalability of the security system would be limited. If the authorization mechanism is based on the user's classification, the scalability can be improved but the fine-grained access control policies may not be supported. We developed an enhanced version of the Community Authorization Service (CAS) which supports centralized, fine-grained access control by managing the memberships, service types, resource objects and security policies of a Virtual Organization (VO). The current CAS provides fundamental solutions regarding user privacy, authentication and authorization, but it has some limitations due to its centralized management of the security policies of a VO, in terms of scalability, flexibility and interoperability. We enhanced the CAS to support diverse security requirements within a dynamic Grid environment by enabling the CAS server to publish a proxy certificate embedding additional attributes of users. It allows the service providers to support customized services by analyzing the attributes of users and security policies. Previous research on privacy preservation in a Grid has focused on protecting the data stored in a data server and on securing the communication to protect exchanged data. The issue of preserving the privacy of users has not been a major issue in the security domain. However, as on-line transactions prevail and diverse user attributes are required for authorization decisions, privacy preservation becomes an important issue. 
Attribute-Based Access Control (ABAC) employs multiple attributes for authorization decisions, which enables the security system to be flexible, interoperable, and multifunctional. However, ABAC has disadvantages with regard to privacy preservation because it requires the circulation of the user attributes, which can increase the risk of privacy violation. To enhance the privacy-preserving capability of ABAC in a Grid, we developed an attribute release control mechanism to publish an optimal set of attributes that are essential to access a desired resource (or service), while exposing the least amount of sensitive user information. To facilitate the selection of an optimal set of attributes, we also developed the Security Policy Publication Service (SPPS), which retrieves the access condition from the access control policies in eXtensible Access Control Markup Language (XACML) and converts it into a Disjunctive Normal Form (DNF) of attributes. We modified the Shibboleth Identity Provider and GridShib for the implementation of our privacy-preserving ABAC, and the performance analysis shows that the overhead of the proposed system is very small.
Advisors/Committee Members: Chung, Soon M.
Subjects: Computer science
Keywords: Grid; Security; Attribute-Based Access Control; Privacy; DNF conversion; CNF conversion; Globus; Shibboleth