Search Results (1 - 25 of 40 Results)


Ma, Tao. A Framework for Modeling and Capturing Social Interactions
PhD, University of Cincinnati, 2015, Engineering and Applied Science: Electrical Engineering
The understanding of human behaviors in the scope of computer vision benefits many different areas. Although great achievements have been made, human behavior research still targets isolated, low-level, individual activities without considering other important factors, such as human-human interactions, human-object interactions, social roles, and surrounding environments. Numerous publications focus on recognizing a small number of individual activities from body-motion features with pattern recognition models, and settle for small improvements in recognition rate. Furthermore, the methods employed in these investigations are far from suitable for real cases, given the complexity of human society. To address this issue, more attention should be paid to the cognition level rather than the feature level. In fact, a deeper understanding of social behavior requires studying its semantic meaning against the social context, known as social interaction understanding. A framework for detecting social interaction needs to be established to initiate the study. In addition to individual body motions, more factors, including social roles, voice, related objects, the environment, and other individuals' behaviors, were added to the framework. To meet these needs, this dissertation proposed a 4-layered hierarchical framework to mathematically model social interactions, and then explored several challenging applications based on the framework to demonstrate the value of the study. No existing multimodality social interaction datasets were available for this research; thus, in Research Topic I, two typical scenes were created, with a total of 24 takes (a take is a shot of a scene), as the social interaction dataset. Topic II introduced the 4-layered hierarchical framework of social interactions, which contained, from bottom to top, 1) a feature layer, 2) a simple behavior layer, 3) a behavior sequence layer, and 4) a pairwise social interaction layer. The top layer generated two persons' joint behaviors in the form of descriptions with semantic meaning. Different statistical models were adopted for the recognition within each layer. Topic III presented three applications based on the social interaction framework: social engagement, interesting moments, and visualization. The first application measured how strong the interaction was between an interacting pair. The second detected unusual (interesting) individual behaviors and interactions. The third aimed to represent data visually so that users can access useful information quickly. All experiments in Research Topics II and III were based on the social interaction dataset created for the study. The performance of different layers was evaluated by comparing the experimental results with those in the existing literature. The framework was demonstrated to successfully capture and model certain social interactions, and it can be applied to other situations. The pairwise social interaction layer generated joint behaviors with high accuracy because of the coupling nature of the model. The exploration of social engagement, interesting moments, and visualization shows the great practical value of this research and may stimulate discussions and inspire further research in the area.

Committee:

William Wee, Ph.D. (Committee Chair); Raj Bhatnagar, Ph.D. (Committee Member); Chia Han, Ph.D. (Committee Member); Anca Ralescu, Ph.D. (Committee Member); Xuefu Zhou, Ph.D. (Committee Member)

Subjects:

Computer Engineering

Keywords:

Human behavior understanding; Social interaction; Machine learning; Computer vision; Interesting moment; Social engagement

LI, XIAOKUN. THREE-DIMENSIONAL OBJECT RECONSTRUCTION FROM RANGE IMAGES
PhD, University of Cincinnati, 2004, Engineering : Electrical Engineering
This research work focuses on reconstructing surface models from range images of three-dimensional (3D) objects. Surface reconstruction from range images is very important in the design of any 3D computer vision system and normally consists of several phases of data processing. This work presents novel algorithms that provide efficient solutions in the following four key phases: data acquisition, data registration from multiple views, data integration, and surface reconstruction. In the data acquisition phase, two model-based approaches, the area model (AM) and line model (LM), are proposed to model the systematic error of a given ranging system (the 4DI system); an error lookup table is built from these models and used to reduce the systematic error of the acquired data. In the data registration phase, a registration method involving geometric transformation is presented for reconstructing large objects from multiple views with high accuracy. The algorithm provides highly precise numerical values for the elements of the transformation matrices; these values are determined by specific system parameters estimated through carefully designed tests. In the data integration phase, a novel integration approach based on predefined criteria and nearest-neighbor searching is developed. The method manipulates surface points directly, providing a simple and fast way to remove overlap. In the surface reconstruction phase, a meshing algorithm is proposed. It considers the shape change at the boundary of the mesh area and forces the mesh area to propagate according to a priority-driven strategy. New triangulation criteria are developed to construct a triangle at each step of mesh growing. All these approaches have been successfully applied to various range data sets of objects with different geometrical shapes.
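For the data integration phase, the following is a minimal sketch of overlap removal by nearest-neighbor searching, in Python with SciPy; the tolerance value and function names are illustrative, not the dissertation's:

    import numpy as np
    from scipy.spatial import cKDTree

    def integrate_views(base_points, new_points, overlap_tol=0.5):
        # Distance from each new point to its nearest existing surface sample.
        d, _ = cKDTree(base_points).query(new_points, k=1)
        # Keep only new points that do not duplicate the existing surface,
        # then merge; this manipulates surface points directly.
        return np.vstack([base_points, new_points[d > overlap_tol]])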

Committee:

Dr. William G. Wee (Advisor)

Keywords:

Computer Vision; Data Visualization; 3D Reconstruction

Feather, Ryan K. TRACKING AND ACTIVITY ANALYSIS IN WIDE AREA AERIAL SURVEILLANCE VIDEO
Master of Science, The Ohio State University, 2011, Computer Science and Engineering
In this work, we provide tracking and activity analysis for an aerial video data set. We propose algorithms that are both scalable and able to handle the additional challenges presented by this form of data. Specifically, we present a method that consists of two main parts: track extraction and traffic activity analysis. After a preprocessing stabilization step, we use a constrained interest-point matching algorithm to generate tracking data for vehicles in the scene. Finally, we present algorithms that use this data to recognize traffic activity patterns such as traffic direction, bidirectional roads, bidirectional stops, and accelerations/decelerations via analysis of average speed patterns. We provide a thorough analysis of our results, including quantitative analysis of our activity analysis algorithms.
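As an illustration of detecting accelerations/decelerations from average speed patterns, a hedged sketch (the thresholds, smoothing window, and sampling assumptions are ours):

    import numpy as np

    def speed_events(track_xy, fps, accel_thresh=2.0, window=5):
        # track_xy: (N, 2) vehicle positions in meters, sampled at fps Hz.
        v = np.linalg.norm(np.diff(track_xy, axis=0), axis=1) * fps   # m/s
        v = np.convolve(v, np.ones(window) / window, mode="valid")    # smooth
        a = np.diff(v) * fps                                          # m/s^2
        accels = np.where(a > accel_thresh)[0]
        decels = np.where(a < -accel_thresh)[0]
        return accels, decels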

Committee:

James W. Davis, PhD (Advisor); Eric Fosler-Lussier, PhD (Committee Member)

Subjects:

Computer Science

Keywords:

computer vision; wide area surveillance; surveillance; activity analysis;

Karargyris, Alexandros. A Novel Synergistic Diagnosis Methodology for Identifying Abnormalities in Wireless Capsule Endoscopy Videos
Doctor of Philosophy (PhD), Wright State University, 2010, Computer Science and Engineering PhD

Wireless Capsule Endoscopy (WCE) is a new technology that allows medical personnel to view the gastrointestinal (GI) mucosa. It is a swallowable miniature capsule device the size of a pill that transmits thousands of screenshots of the digestive tract to a wearable receiver. When the procedure finishes, the video is uploaded to a workstation for viewing. Capsule endoscopy has been established as a tool to identify various GI conditions, such as blood-based abnormalities, polyps, ulcers, and Crohn's disease, in the small intestine, where classical endoscopy is not regularly used.

As of 2009, the market is dominated by the Given Imaging Inc. capsule (PillCam SB). More than 300,000 capsules have been sold since 2001, when it was first introduced. The company provides a software package (RAPID) to view the WCE video, offering a bleeding-detector feature based on red color. It also provides a position estimator for the capsule inside the digestive tract. Additionally, its multi-view feature gives a simultaneous view of two or four consecutive video frames in multiple windows. Finally, a library of reference images (RAPID Atlas) is provided so that the user has easy access to on-screen case images.

Although the company's software is a useful tool, viewing a WCE video is still a time-consuming process (~2 hours), even for experienced gastroenterologists. In addition, according to gastroenterologists, the company's software has serious limitations (35% bleeding detection) and no capability of detecting polyps or ulcers. Therefore, the need for a computer-aided methodology with robust detection performance on various conditions (blood, polyps, ulcers, etc.) is clear.

Thus, our research studies have been successfully carried out on: a) the automatic detection of malignant intestinal features such as polyps, bleeding, and abnormal regions (tumors); b) finding the boundaries of the digestive organs; and c) reducing the viewing-examination time with a robust registration methodology. These studies have led to the development of the ATRC Video Toolbox (ATRC-VT).

ATRC-VT incorporates signal processing methods, color and image processing techniques, and artificial intelligence tools to detect blood-based abnormalities, polyps, and ulcers in the small intestine. It is the first computer-aided detection (CAD) software with multiple capabilities for WCE videos, designed with a graphical user interface so that it is easy to use.

Committee:

Nikolaos Bourbakis, PhD (Advisor); Soon Chung, PhD (Committee Member); Thomas Hangartner, PhD (Committee Member); Yong Pei, PhD (Committee Member); Marios Pouagare, PhD (Committee Member)

Subjects:

Computer Science

Keywords:

capsule; endoscopy; imaging; medical; computer; graphics; computer vision; locomotion; bowel; digestive tract; gastroenterology;

Cui, Chen. Convolutional Polynomial Neural Network for Improved Face Recognition
Doctor of Philosophy (Ph.D.), University of Dayton, 2017, Electrical and Computer Engineering
Deep learning is the state-of-the-art technology in pattern recognition, especially in face recognition. The robustness of a deep network leads to better performance as the training set grows larger. The Convolutional Neural Network (CNN) is one of the most popular deep learning technologies in the modern world. It obtains various features from multiple filters in the convolutional layer and performs well in handwritten digit classification. Unlike the unique structure of each handwritten digit, face features are more complex, and many difficulties exist for face recognition in the current research field, such as variations in lighting conditions, poses, ages, etc. The limitation of the nonlinear feature fitting of the regular CNN therefore appears in the face recognition application. In order to create a better fitting curve for face features, we introduce a polynomial structure into the regular CNN to increase the nonlinearity of the obtained features. The modified architecture is named the Convolutional Polynomial Neural Network (CPNN). CPNN creates a polynomial input for each convolutional layer and captures the nonlinear features for better classification. We first prove the proposed concept on the MNIST handwritten-digit database and compare the proposed CPNN with the regular CNN. Then, different parameters in CPNN are tested on the CMU AMP face recognition database. After that, the performance of the proposed CPNN is evaluated on three different face databases, CMU AMP, Yale, and JAFFE, as well as on images captured in a real-world environment. The proposed CPNN obtains the best recognition rates (CMU AMP: 99.95%, Yale: 90.89%, JAFFE: 98.33%, real world: 97.22%) when compared to other machine learning technologies. As future work, we plan to apply state-of-the-art structures, such as inception and residual blocks, to the current CPNN to increase its depth and stability.
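A minimal sketch of the polynomial-input idea as we read it (single channel, SciPy convolution; the dissertation's exact CPNN formulation may differ, and all names here are ours):

    import numpy as np
    from scipy.signal import convolve2d

    def polynomial_conv_layer(x, kernels):
        # Expand the input into power terms [x, x^2, ..., x^n], one per kernel,
        # so the layer realizes a polynomial rather than purely linear filtering.
        powers = [x ** (p + 1) for p in range(len(kernels))]
        out = sum(convolve2d(p, k, mode="same") for p, k in zip(powers, kernels))
        return np.maximum(out, 0.0)   # ReLU-style nonlinearity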

Committee:

Vijayan Asari (Advisor)

Subjects:

Artificial Intelligence; Bioinformatics; Computer Engineering; Electrical Engineering

Keywords:

Deep Learning, Convolutional Polynomial Neural Network, Face Recognition, Computer Vision, Image Processing

Floyd, Beatrice K. Vision-Based Techniques for Cognitive and Motor Skill Assessments
Master of Sciences (Engineering), Case Western Reserve University, 2012, EMC - Mechanical Engineering
This thesis presents computer vision algorithms and associated applications for automating cognitive and motor skill assessments. These assessments are used to diagnose cognitive and motor impairments and behavioral problems. Due to the high prevalence of such disorders and the limitations of traditional diagnosis methods, there is an urgent need for improved approaches. Automation through computer vision enables low-cost, comprehensive assessments that can be implemented more extensively; improves precision of measurement; provides quantitative behavioral and performance data; records results electronically; and allows professionals to concentrate on other assessment factors. The presented algorithms include wrist tracking, object recognition and tracking, and gaze detection. These algorithms are applied to create an automated version of the Wechsler Block Design subtest, kinematic modeling of the upper extremities, methods of path accuracy evaluation, an automated system for the Soda Pop Coordination Test, a tracking/scoring system for cup stacking, and a demonstration of gaze tracking.

Committee:

Kiju Lee (Committee Chair); Francis Merat (Committee Member); Wyatt Newman (Committee Member); Joseph Prahl (Committee Member)

Subjects:

Computer Engineering; Computer Science; Mechanical Engineering; Psychological Tests; Psychology

Keywords:

Computer Vision; Image Processing; Automation; Performance Assessment; Fine-motor Skill Assessment; Cognitive Assessment; Wrist Tracking; Pose Estimation; Gaze Detection; Pose Estimation from Shape

Huggins, Kevin Robert. Computer Vision Localization Based on Pseudo-Satellites
Master of Science, The Ohio State University, 2009, Electrical and Computer Engineering

Computer vision offers a unique opportunity for stand-alone navigation or for use in combination with other systems. In this thesis, we propose new localization and navigation methods that use computer vision techniques to detect the distance to targets and localize a platform based on that distance. There are multiple methods of navigation, such as the global positioning system (GPS) and the global system for mobile communications (GSM). However, these methods rely on receiving information from extrinsic sources. GPS requires signals from 4 satellites in order to accurately localize the receiver. If signals from 4 satellites cannot be obtained, which can happen in urban areas or areas with high foliage, the position of the receiver cannot be calculated accurately. We propose to use a hybrid computer vision system with GPS when one or more of the satellites are unavailable. The pseudo-satellite approach uses a network of nodes, each of which can act as a pseudo-satellite for another node that does not have the necessary 4 satellite signals.

We propose an approach to localization that uses computer vision to select the most probable current location based on preregistered local knowledge and new knowledge gathered as the platform moves in the environment. This computer vision approach uses multiple techniques to sense distance using a camera mounted on the agent. This distance information may then be integrated with the pseudo-satellite approach for increased accuracy. The computer vision approaches range from using predetermined knowledge of object size or location to using orientation of fixed objects and vehicle motion for determining distance and orientation. The approach relies on the camera parameters calculated from the camera calibration.

In this thesis, each technique for calculating distance is tested and the results are compared with the actual distances. A program that automatically detects objects in the environment and calculates distance is compared with the other methods, in which the object is detected manually. The program performs well and gives better error rates at the tested distances to objects.
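One of the simplest ranging techniques mentioned above, distance from predetermined object size under the pinhole model, uses the focal length obtained from camera calibration; the values below are illustrative:

    def distance_from_size(focal_px, real_height_m, pixel_height):
        # Pinhole model: apparent size scales inversely with range,
        # so Z = f * H / h, with f in pixels from camera calibration.
        return focal_px * real_height_m / pixel_height

    # A 2.5 m object imaged 50 px tall with f = 800 px is roughly 40 m away.
    print(distance_from_size(800, 2.5, 50))   # 40.0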

Committee:

Yuan Zheng, PhD (Advisor); Bradley Clymer, PhD (Committee Member)

Subjects:

Electrical Engineering

Keywords:

Computer Vision; Pseudo-Satellite

Melikian, Simon Haig. Visual Search for Objects with Straight Lines
Doctor of Philosophy, Case Western Reserve University, 2006, Electrical Engineering
I present a new method of visual search for objects that include straight lines, as is usually the case for machine-made objects. I describe existing machine vision search methods and show how my method of visual search gives better performance on objects that have straight lines. Inspired by human vision, a two-step process is used. First, straight line segments are detected in an image and characterized by their length, mid-point location, and orientation. Second, hypotheses that a particular straight line segment belongs to a known object are generated and tested. The set of hypotheses is constrained by spatial relationships in the known objects. I discuss the implementation of my method and its performance and limitations on real and synthetic images. The speed and robustness of my method make it immediately applicable to many machine vision problems.
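A sketch of the first step, characterizing a detected segment by the three attributes named above (NumPy; the function and variable names are ours):

    import numpy as np

    def describe_segment(p0, p1):
        # Length, mid-point location, and orientation of a segment p0 -> p1.
        p0, p1 = np.asarray(p0, float), np.asarray(p1, float)
        d = p1 - p0
        length = np.hypot(d[0], d[1])
        midpoint = (p0 + p1) / 2.0
        theta = np.arctan2(d[1], d[0]) % np.pi   # undirected line: [0, pi)
        return length, midpoint, theta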

Committee:

Christos Papachristou (Advisor)

Subjects:

Computer Science

Keywords:

Visual search; Image search; Pattern recognition; Object recognition; Computer vision; Machine vision; Robotic guidance; Salient icons; Straight lines

Rudraraju, Prasad V. Motion parameter evaluation, camera calibration and surface code generation using computer vision
Master of Science (MS), Ohio University, 1989, Electrical Engineering & Computer Science (Engineering and Technology)

This research focuses on the evaluation of motion parameters, camera calibration, and surface code generation using camera-acquired images. The motion parameters of a moving object are studied using a static camera. Camera calibration is done by observing four non-coplanar static points whose positions in space are known. The method developed to accomplish this uses the principles of distance invariance between rigid points and angular invariance between fixed lines. Because depth information is lost in a 2-D image of a 3-D scene, each point in space contributes one unknown. The equations formulated are solved using the IMSL subroutine ZSPOW. The evaluated unknowns are used to compute positions in space, which are subsequently utilized in the estimation of motion parameters and camera calibration. The method benefits when the actual distance between points of observation is known in advance. A surface code is an object-identifying feature that may be used for object recognition purposes. The code developed is simple, orientation independent, and computationally fast.

Considering the error-contributing factors in locating image points, camera distortions, and measurements of actual distances, the computed results compare favorably with actual values for motion estimation and camera calibration. The surface codes for all the images worked with have been generated successfully. The results indicate that the method is practically feasible.
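ZSPOW was IMSL's hybrid (Powell-type) nonlinear-system solver; SciPy's least-squares machinery is a modern stand-in. A hedged sketch of the distance-invariance setup, with one unknown depth per observed point along its calibrated ray (the exact equations in the thesis may differ):

    import numpy as np
    from scipy.optimize import least_squares

    def solve_depths(rays, dists, lam0):
        # rays: (n, 3) unit rays through the image points; dists: {(i, j): d_ij}
        # known inter-point distances; lam0: initial guess for the n depths.
        def residuals(lam):
            pts = lam[:, None] * rays              # 3-D points along the rays
            return [np.linalg.norm(pts[i] - pts[j]) - d
                    for (i, j), d in dists.items()]
        return least_squares(residuals, lam0).x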

Committee:

G. Raju (Advisor)

Keywords:

Motion Parameter Evaluation; Camera Calibration; Surface Code Generation; Computer Vision

Diskin, Yakov. Dense 3D Point Cloud Representation of a Scene Using Uncalibrated Monocular Vision
Master of Science (M.S.), University of Dayton, 2013, Electrical Engineering
We present a 3D reconstruction algorithm designed to support various automation and navigation applications. The algorithm focuses on the 3D reconstruction of a scene using only a single moving camera. Utilizing video frames captured at different points in time allows us to determine the depths of a scene; in this way, the system can construct a point cloud model of its unknown surroundings. In this thesis, we present the step-by-step development of the reconstruction technique. The original reconstruction process, which produced a point cloud, was based on feature matching and depth triangulation analysis. In an improved version of the algorithm, we utilized optical flow features to create an extremely dense representation model. Although dense, this model is hindered by its low disparity resolution: as feature points were matched from frame to frame, the resolution of the input images and the discrete nature of disparities limited the depth computations within a scene. With the third algorithmic modification, we introduce a nonlinear super-resolution preprocessing step. With this addition, the accuracy of the point cloud, which relies on precise disparity measurement, increases significantly. Using a pixel-by-pixel approach, the super-resolution technique computes the phase congruency of each pixel's neighborhood and produces nonlinearly interpolated high-resolution input frames; thus, a feature point's disparity can be measured at finer discrete steps. The quantity of points within the 3D point cloud model also increases significantly, since the number of features is directly proportional to the resolution and high frequencies of the input image. Our final contribution, additional preprocessing steps designed to filter noise points and mismatched features, yields the complete Dense Point-cloud Representation (DPR) technique. We measure the success of DPR by evaluating the visual appeal, density, accuracy, and computational expense of the reconstruction technique, and we compare it with two state-of-the-art techniques. After rigorous analysis and comparison, we conclude by presenting the future direction of development and plans for deployment in real-world applications.
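The triangulation step rests on the standard relation Z = f * B / d. Below is a sketch that back-projects a disparity map into a point cloud; here B is the effective baseline induced by camera motion between frames, and the parameter names are ours:

    import numpy as np

    def disparity_to_cloud(disp, f, baseline, cx, cy, min_disp=0.5):
        # Finer disparity steps (e.g., after super-resolution) give finer depth.
        v, u = np.nonzero(disp > min_disp)
        z = f * baseline / disp[v, u]          # depth from disparity
        x = (u - cx) * z / f                   # back-project through the pinhole
        y = (v - cy) * z / f
        return np.column_stack([x, y, z])      # (N, 3) point cloud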

Committee:

Vijayan Asari, PhD (Committee Chair); Raul Ordonez, PhD (Committee Member); Eric Balster, PhD (Committee Member)

Subjects:

Electrical Engineering; Engineering

Keywords:

monocular vision; 3D Scene Reconstruction; Dense Point-cloud Representation; Point Cloud Model; DPR; Super Resolution; Vision Lab; University of Dayton; Computer Vision; Vision Navigation; UAV; UAS; UGV; RAIDER; Yakov Diskin; Depth Resolution Enhancement

Penzias, Gregory. Identifying the Histomorphometric Basis of Predictive Radiomic Markers for Characterization of Prostate Cancer
Master of Sciences (Engineering), Case Western Reserve University, 2017, Biomedical Engineering
Radiomics has shown promise for in vivo prediction of cancer risk, providing a potential avenue for reducing over-treatment and unnecessarily invasive biopsy-based diagnosis. Radiomics could be particularly beneficial for stratifying patients into different risk groups in the context of prostate cancer (PCa), for which limitations of current in vivo risk assessment result in over-diagnosis and over-treatment. Despite this promise, successful translation of radiomics into the clinic may require a more comprehensive understanding of the underlying morphologic tissue characteristics that radiomic features reflect. Few studies, however, have attempted to establish the biological or histomorphometric basis for the performance of radiomics. Accomplishing this requires fusing the information obtained from the imaging modalities of radiology and histopathology, since the gold-standard definition of PCa comes from histopathologic analysis of whole-mount specimens. The first step in performing this radiology-pathology fusion in PCa entails achieving spatial correspondence between preoperative in vivo magnetic resonance imaging (MRI) and ex vivo hematoxylin & eosin (H&E)-stained whole-mount radical prostatectomy specimens via deformable co-registration. Co-registration, however, requires whole-mount histology sections (WMHSs), which are not always feasible to obtain; in such cases, large specimens are cut into multiple smaller tissue fragments. This thesis presents work on two related modules of radiology-pathology fusion in PCa. First, a novel automated program called AutoStitcher reconstructs pseudo whole-mount histology sections (PWMHSs) by digitally stitching together multiple smaller tissue fragments, thus enabling co-registration with in vivo radiographic imagery; AutoStitcher reconstructed PWMHSs with less than 3% error relative to manually stitched PWMHSs. Second, comprehensive sets of radiomic features from MRI and quantitative histomorphometric (QH) features from H&E were extracted and then spatially co-localized to characterize each tumor region. Correlative analysis revealed a set of promising predictive radiomic markers that could accurately distinguish low- from intermediate-/high-risk PCa, together with a set of QH features that may form their histomorphometric basis. Results were validated on an independent dataset from a different institution.

Committee:

Anant Madabhushi (Advisor); Satish Viswanath (Committee Member); David Wilson (Committee Member)

Subjects:

Biomedical Engineering; Computer Science; Engineering; Medical Imaging; Oncology; Radiology

Keywords:

radiomics; quantitative histomorphometry; prostate cancer; imaging biomarkers; digital pathology; data fusion; computer vision; image reconstruction; image stitching

Youssef, Menatoallah M. Hull Convexity Defect Features for Human Action Recognition
Doctor of Philosophy (Ph.D.), University of Dayton, 2011, Electrical Engineering

Human action recognition is a rapidly developing field in computer vision. Accurate algorithmic modeling of action recognition must contend with a multitude of challenges. Machine vision and pattern recognition algorithms can be used to aid in the identification of these actions. In recent years, research has focused on recognizing complex actions using simple features. Simple cases of action recognition, wherein one individual is captured performing a single action, form the foundation for developing more complex scenarios in real environments. This can be especially useful for surveillance of public locations such as subways, shopping centers, or parking lots in order to reduce crime, monitor traffic flow, and offer security in general. An effective action recognition algorithm must address the following challenges that affect feature extraction for accurate representation: non-rigidity, spatial variance, temporal variance, and camera perspective. Where face detection seeks to identify the location of an individual's face, activity recognition seeks to recognize the motion or action of an individual. With face recognition there is generally a commonality of features in the true positive set; certain rigid features are present on every human face. Action recognition, on the other hand, must deal with the non-rigidity of the human body. The arms and legs can be at a number of positions relative to one another, and at varying distances and angles. These relative positions describe actions or intermediary poses.

We develop a taxonomic shape-driven algorithm to solve the problem of human action recognition, along with a new feature extraction technique using hull convexity defects. To test and validate this approach, we use silhouettes of subjects performing ten actions from a video database commonly used by action recognition researchers. A morphological algorithm is used to filter noise from the silhouette. A convex hull is then created around the silhouette frame, and its convexity defects are used as the features for analysis. A complete feature consists of thirty individual values representing the five largest convex hull defect areas. A consecutive sequence of these features forms a complete action. Action frame sequences are preprocessed to separate the data into two sets based on perspective planes and bilateral symmetry. Features are then normalized to create a final set of action sequences. We then formulate and investigate three methods to classify the ten actions from the database. Testing and training on the nine test subjects is performed using a leave-one-out methodology. Classification utilizes both PCA and minimally encoded neural networks. Performance evaluation results show that the hull convexity defect algorithm provides comparable results with less computational complexity. This research can lead to a real-time application that distinguishes more complex actions and multiple-person interactions.
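A sketch of the defect feature with OpenCV; the 30-value layout shown (start, end, and far point coordinates for the five deepest defects) is our assumption about how the dissertation packs the values:

    import cv2
    import numpy as np

    def hull_defect_feature(silhouette, k=5):
        cnts, _ = cv2.findContours(silhouette, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
        cnt = max(cnts, key=cv2.contourArea)             # largest silhouette
        hull = cv2.convexHull(cnt, returnPoints=False)   # hull as indices
        defects = cv2.convexityDefects(cnt, hull)        # (N,1,4): s, e, f, depth
        order = np.argsort(-defects[:, 0, 3])[:k]        # k deepest defects
        pts = [cnt[i][0] for d in order for i in defects[d, 0, :3]]
        return np.float32(pts).ravel()                   # 6 values x 5 = 30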

Committee:

Vijayan Asari, PhD (Committee Chair); Eric Balster, PhD (Committee Member); Keigo Hirakawa, PhD (Committee Member); Donald Kessler, PhD (Committee Member)

Subjects:

Electrical Engineering

Keywords:

Human Action Recognition; Computer Vision; Biometrics; Convex Hulls

Unsalan, Cem. Multispectral satellite image understanding
Doctor of Philosophy, The Ohio State University, 2003, Electrical Engineering
A problem of major interest to regional planning organizations, disaster relief agencies, and the military is the identification and tracking of land development across large-scale regions and over time. We develop an autonomous image analysis system to understand land development, especially residential and urban building organizations, from satellite images. We introduce a set of measures based on straight lines to assess land development levels in high-resolution satellite images. Urban areas exhibit a preponderance of straight line features, while rural areas produce line structures in more random spatial arrangements. We use this observation to perform an initial triage on the image to restrict the attention of subsequent, more computationally intensive analyses. Vegetation indices have been used extensively for many years to estimate vegetation density from satellite and airborne images. We use these as the multispectral information for classification and for house and road extraction. We focus on the normalized difference vegetation index (NDVI) and introduce a statistical framework to analyze and extend it. Using the established statistical framework, we introduce a new group of shadow-water indices. We then extend our straight-line-based measures by developing a synergistic approach that combines structural and multispectral information; in particular, the structural features serve as cue regions for multispectral features. After the initial classification of regions, we introduce computationally more expensive but more precise graph-theoretical measures over grayscale images to detect residential regions. The graphs are constructed using lines as vertices, while graph edges encode their spatial relationships. We introduce a set of measures based on various properties of the graph. These measures are monotonic with increasing structure (organization) in the image, and we present a theoretical basis for them. Having detected the residential regions, we introduce a novel system to detect houses and street networks within them, making extensive use of multispectral information and graph theory. We evaluated the performance of each step statistically and obtained very promising results; in particular, the detection performance for houses and streets in residential regions is noteworthy. These results indicate the functionality of our satellite image understanding system.
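The NDVI at the center of the statistical framework is the standard band ratio; a minimal sketch (the input arrays and the 0.3 cutoff are illustrative, not the dissertation's):

    import numpy as np

    def ndvi(nir, red, eps=1e-6):
        # Normalized difference vegetation index: (NIR - R) / (NIR + R).
        nir = np.asarray(nir, dtype=float)
        red = np.asarray(red, dtype=float)
        return (nir - red) / (nir + red + eps)

    nir_band, red_band = np.random.rand(64, 64), np.random.rand(64, 64)
    vegetation_mask = ndvi(nir_band, red_band) > 0.3   # illustrative threshold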

Committee:

Kim Boyer (Advisor)

Keywords:

land classification; house detection; road detection; building detection; computer vision

Deo, Ashwin P. A Fast Localization Method Based on Distance Measurement in a Modeled Environment.
Master of Sciences (Engineering), Case Western Reserve University, 2009, EECS - Computer and Information Sciences
Accurate localization is one of the core requirements for autonomous vehicle navigation. This thesis presents a localization algorithm that operates on an a priori map represented as a collection of line segments. The method treats the entire map as a solid-body template and fits this template to the available sensory data. Combining an analytic translation-optimization technique with an iterative heading-search technique enables an accurate and computationally efficient method for finding the localization parameters. The algorithm is further extended to deal with singularities in situations of partial observability. Statistical analysis of simulations with synthetic data indicated that the algorithm was able to handle ideal as well as noisy data with good accuracy. Its performance was further evaluated using actual sensory data from LIDAR. LIDAR-based localization proved to be an effective technique for localizing a vehicle in complex indoor environments. The thesis also explores the capabilities of stereo vision and describes an initial assessment of replacing LIDAR with stereo vision for localization.
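A sketch of the template fit, with the line-segment map discretized to points for brevity: for a fixed heading the optimal translation is analytic (the mean residual offset to nearest map points), and the heading is found by 1-D search. The discretization and scoring are ours:

    import numpy as np
    from scipy.spatial import cKDTree

    def localize(scan_xy, map_pts, headings):
        tree, best = cKDTree(map_pts), (np.inf, None)
        for th in headings:                          # iterative heading search
            c, s = np.cos(th), np.sin(th)
            pts = scan_xy @ np.array([[c, s], [-s, c]])   # rotate the scan
            _, idx = tree.query(pts)                      # nearest map points
            t = (map_pts[idx] - pts).mean(axis=0)         # analytic translation
            err = np.linalg.norm(map_pts[idx] - (pts + t), axis=1).mean()
            if err < best[0]:
                best = (err, (th, t))
        return best[1]                               # (heading, translation)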

Committee:

Wyatt Newman (Committee Chair); Frank Merat (Committee Member); Cenk Cavusoglu (Committee Member)

Subjects:

Computer Science; Electrical Engineering; Robots

Keywords:

Mobile Robotics; Localization; Computer Vision; Template Matching

Jackovitz, Kevin S. Integrated Coarse to Fine and Shot Break Detection Approach for Fast and Efficient Registration of Aerial Image Sequences
Master of Science (M.S.), University of Dayton, 2013, Electrical Engineering
Image registration is a core task in many fields that deal with object detection and tracking in video sequences. When tracking an object throughout a scene, image registration is more often than not used to align video frames and help segment moving objects from the background. With that in mind, a new registration method employing a two-stage approach is proposed that efficiently registers aerial imagery. The proposed coarse-to-fine approach uses a combination of two efficient algorithms: Speeded Up Robust Features (SURF) for the generation of an estimated homography, and Efficient Second-Order Minimization (ESM) for fine-tuning the homography generated by the coarse SURF stage. Experiments are performed on several different aerial image databases, which vary in both size and resolution. The proposed algorithm proves to be effective and accurate across these databases; however, registration sometimes fails, specifically when very large warping parameters occur between two scenes. When registering consecutive image pairs within a sequence, accurate registration is needed to support many of a tracking algorithm's downstream processes. When one or several bad frames are present within a sequence of images, it becomes necessary to exclude these frames from use. A "shot" is a sequence of frames within a video sequence where an object or objects are tracked consistently; in registration terms, it is a sequence of images that have been registered correctly without disrupting the tracks for targets. A shot break occurs when one frame cannot be linked by a transformation homography to another frame within an image sequence. The goal of a shot break detection algorithm is to exclude bad frames from use and to detect when shot breaks occur. This thesis implements several internal and external shot break detection algorithms in which bad frames and shot breaks are detected within a sequence of images. Internal checks occur within the registration algorithm, before the transformation homography is produced, to give a pass/fail on any given frame. External checks compare the level of overlap after the homography has been applied to align one image to another. The shot break detection algorithms are tested on a sequence of images, using the proposed registration algorithm to register frame-to-frame. The proposed techniques show good results for detecting bad frames within a sequence while maintaining speed and accuracy on the good frames. Research is progressing to increase the speed of registration and the detection accuracy of the shot break detection algorithms.
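A sketch of the coarse stage plus one possible external check, using OpenCV; ORB stands in for SURF (which lives in OpenCV's non-free module), and the inlier cutoff is an illustrative shot-break criterion, not the thesis's:

    import cv2
    import numpy as np

    def coarse_homography(img1, img2, min_inliers=20):
        orb = cv2.ORB_create(2000)
        k1, d1 = orb.detectAndCompute(img1, None)
        k2, d2 = orb.detectAndCompute(img2, None)
        matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
        matches = sorted(matches, key=lambda m: m.distance)[:500]
        src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
        dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
        H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
        if H is None or mask.sum() < min_inliers:
            return None          # frames cannot be linked: flag a shot break
        return H                 # coarse estimate, to be refined (ESM stage)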

Committee:

Vijayan Asari, Ph.D. (Committee Chair); Juan Vasquez, Ph.D. (Committee Member); Eric Balster, Ph.D. (Committee Member)

Subjects:

Electrical Engineering; Engineering

Keywords:

Image Registration; Object Tracking; Shot Break Detection; SSIM; Aerial Image; Registration; Wide Area Motion Imagery; Full Motion Imagery; Computer Vision; Image Sequence Registration

Ballard, Brett S. Feature Based Image Mosaicing using Regions of Interest for Wide Area Surveillance Camera Arrays with Known Camera Ordering
Master of Science (M.S.), University of Dayton, 2011, Electrical Engineering
Today, modern surveillance systems utilizing camera arrays can capture several square miles of ground activity at high resolution from a single aircraft. A camera array uses multiple cameras to capture images synchronously with partial overlap between the cameras' fields of view. This allows a wide area to be monitored continuously in real time by image analysts or processed for information such as object identification and location tracking. The task of combining the images from each individual camera into one large image containing all of the images' views of the scene activity is commonly called image mosaicing in the field of computer vision. Though the process of image mosaicing is not new, the difficulty and variety of both problems and solutions keep it a topic of current research. The objective of this thesis is to demonstrate the most suitable system for mosaicing images captured by wide area surveillance camera arrays with known camera ordering, using regions of interest combined with a feature-based approach. The proposed system utilizes algorithms for feature extraction, matching, and estimation. The key difference between the proposed mosaicing system and prior successful mosaicing systems in other application domains is the use of known camera ordering. Many previously researched mosaicing systems make no assumption about camera order, and in some applications there is no assumption that the images even view the same scene at all. For wide area surveillance camera arrays, however, these assumptions are perfectly valid. This allows bounded regions of interest near the appropriate image borders to be used, which the proposed system demonstrates to increase performance in both pixel accuracy and mosaic computation time over the more generalized mosaicing approach.
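The region-of-interest idea reduces to cropping border strips before feature extraction, since known camera ordering tells us where the overlap must lie; the 20% strip width below is illustrative:

    def border_rois(img_left, img_right, frac=0.2):
        # With known camera ordering, matches can only occur near the shared
        # border; feature coordinates must be offset back to full-image frames.
        w = img_left.shape[1]
        roi_left = img_left[:, int((1 - frac) * w):]   # right strip of left image
        roi_right = img_right[:, :int(frac * w)]       # left strip of right image
        return roi_left, roi_right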

Committee:

Eric Balster, PhD (Committee Chair); Vijayan Asari, PhD (Committee Member); John Loomis, PhD (Committee Member)

Subjects:

Electrical Engineering; Remote Sensing; Scientific Imaging

Keywords:

regions of interest; roi; known camera ordering; image mosaicing; feature based homography; image stitching; stereo computer vision; wide area surveillance; camera array;

Ding, Lei. From Pixels to People: Graph Based Methods for Grouping Problems in Computer Vision
Doctor of Philosophy, The Ohio State University, 2010, Computer Science and Engineering

In this dissertation, we study grouping problems in computer vision using graph-based machine learning techniques. Grouping problems abound in computer vision and are typically challenging if the results are to be perceptually and semantically consistent. In the context of this dissertation, we strive to (1) group image pixels into meaningful objects and backgrounds; and (2) group interacting people present in a video into sound social communities. Traditionally, in a graph-based formulation, the entities (e.g., image pixels) are treated as graph vertices and their interrelations are encoded in a weighted adjacency matrix of the graph. In this dissertation, we go beyond standard graph construction methods by building on probabilistic image hypergraphs and learned social graphs (or social networks) for the two parts of the work, respectively. Learning on graphs results in a labeling of entities. In our work, graph-based smoothness and modularity measures are examined and adapted to the problems under study.

Under this general graph-based framework, the first pursued direction is interactive image segmentation, or the problem of grouping image pixels into meaningful objects and their backgrounds, given a limited number of user-supplied seeds. Our contributions in this direction include the probabilistic hypergraph image model (PHIM) to address higher-order relations among pixels in segment labels, which are commonly ignored in competing approaches. To further alleviate the dependence of interactive segmentation on user-supplied seeds, we introduce diffusion signatures derived from salient boundaries and present a framework for automatically introducing new seeds at critical image locations, in order to enhance segmentation results. Both proposed frameworks are extensively tested on a standard image dataset and achieved excellent quantitative and qualitative results in segmentation.

In the second direction, we contribute an automatic framework to infer relations among actors from videos. In particular, we propose a principled graph-based affinity learning method, which synthesizes both co-occurrence information among actors and local grouping cue estimates at the scene level in order to make informed decisions. Once the pairwise affinities between actors are learned from the video content using visual and auditory features, we perform social network analysis based on modularity measures to detect communities, which are groups of actors. Experiments on a dataset of ten movies that we collected have shown promising results. Moreover, the proposed framework has considerably outperformed baseline methods not using visual or auditory features, suggesting the importance of audiovisual cues in high-level relational understanding tasks.

In summary, built on a graph-based learning framework, this dissertation makes contributions to grouping problems in computer vision. Specifically, we have proposed effective techniques to solve problems in both low-level analysis of images (segmentation) and high-level understanding of videos (relational inference).
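For the community detection step described above, a toy example of modularity-based grouping on a weighted actor graph with NetworkX; the edge weights stand in for the learned audiovisual affinities and are illustrative:

    import networkx as nx
    from networkx.algorithms.community import greedy_modularity_communities

    G = nx.Graph()
    G.add_weighted_edges_from([("A", "B", 0.9), ("B", "C", 0.8), ("C", "A", 0.7),
                               ("D", "E", 0.9), ("E", "F", 0.8), ("C", "D", 0.1)])
    groups = greedy_modularity_communities(G, weight="weight")
    print([sorted(g) for g in groups])   # two communities: A,B,C and D,E,F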

Committee:

Mikhail Belkin (Committee Chair); Alper Yilmaz (Committee Co-Chair); DeLiang Wang (Committee Member); Simon Dennis (Committee Member)

Subjects:

Computer Science

Keywords:

Graph-based Machine Learning; Computer Vision

Cooper, Lee Alex Donald. High Performance Image Analysis for Large Histological Datasets
Doctor of Philosophy, The Ohio State University, 2009, Electrical and Computer Engineering

The convergence of emerging challenges in biological research and developments in imaging and computing technologies suggests that image analysis will play an important role in providing a better understanding of biological phenomena. The ability of imaging to localize molecular information is a key capability in the post-genomic era and will be critical in discovering the roles of genes and the relationships that connect them. The scale of the data in these emerging challenges is daunting; high-throughput microscopy can generate hundreds of gigabytes to terabytes of high-resolution imagery even for studies limited in scope to a single gene or interaction. In addition to the scale of the data, the analysis of microscopic image content presents significant problems for the state of the art in image analysis.

This dissertation addresses two significant problems in the analysis of large histological images: reconstruction and tissue segmentation. The proposed methods form a framework that is intended to provide researchers with tools to explore and quantitatively analyze large image datasets.

The work on reconstruction addresses several problems in the reconstruction of tissue from sequences of serial sections using image registration. A scalable algorithm for nonrigid registration is presented that features a novel method for matching small, nondescript anatomical features using geometric reasoning. Methods for the nonrigid registration of images with different stains are presented for two application scenarios. Correlation sharpness is proposed as a new measure of image similarity and is used to map tumor suppressor gene expression to structure in mouse mammary tissues. An extended process of geometric reasoning based on the matching of cliques of anatomical features is presented and demonstrated for the nonrigid registration of immunohistochemical stains to hematoxylin and eosin stains in human cancer images. Finally, a method for incorporating structural constraints into the reconstruction process is proposed and demonstrated on the reconstruction of ducts in mammary tissues.

The work on tissue segmentation focuses on the use of statistical geometrical methods to describe the spatial distributions of biologically meaningful elements such as nuclei in tissue. The two point correlation function is demonstrated to be an effective feature for the segmentation of tissues, and is shown to possess a peculiar low-dimensional distribution in feature space that permits unsupervised segmentation by robust methods. The relationship between two-point functions for proximal image regions is derived and used to accelerate computation, resulting in a 7-68x improvement over a naive FFT-based implementation.

In addition to the methods proposed for reconstruction and segmentation, a significant portion of this dissertation is devoted to applying high performance computing to enable the analysis of large datasets. In particular, multi-node parallelization as well as multi-core and general-purpose computing on graphics processors are used to form a heterogeneous multiprocessor platform, which is used to demonstrate the segmentation and reconstruction methods on images up to 62K × 23K pixels in size.
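The two-point correlation function used for tissue segmentation can be computed for a binary phase image via the FFT (Wiener-Khinchin relation); this is the naive baseline that the dissertation accelerates by sharing work across overlapping regions:

    import numpy as np

    def two_point_correlation(mask):
        # S2(r): probability that both endpoints of a displacement r fall in
        # the phase (e.g., nuclei); the FFT autocorrelation gives all r at once.
        f = np.fft.fft2(mask.astype(np.float64))
        s2 = np.fft.ifft2(f * np.conj(f)).real / mask.size
        return np.fft.fftshift(s2)   # zero displacement at the array center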

Committee:

Bradley Clymer (Advisor); Kun Huang (Advisor); Ashok Krishnamurthy (Committee Member)

Subjects:

Electrical Engineering

Keywords:

computer vision; image processing; bioinformatics; bioimaging; GPU; microscopy; high performance computing

KAIMAL, VINOD GOPALKRISHNA. A NEURAL METHOD OF COMPUTING OPTICAL FLOW BASED ON GEOMETRIC CONSTRAINTS
MS, University of Cincinnati, 2002, Engineering : Electrical Engineering
Determining optical flow is a fundamental problem in computer vision and image analysis. Optical flow is the projection of the three-dimensional motion of objects onto a plane, estimated from a sequence of two-dimensional images. However, this is an ill-posed problem, i.e., it has no unique solution based on the available information, and thus requires additional constraints to make it well posed. We propose an algorithm based on feature matching using Hopfield networks to determine the optical flow field between a sequence of images. We formulate the correspondence between features extracted from the two images as an optimization problem using a new robust disparity measure. The Hopfield network is then used to minimize the cost function and obtain an optimal match between the features in successive frames. This thesis offers a review of the relevant literature and a detailed description of the algorithm. We then evaluate the performance of the algorithm on a variety of test sequences, both artificial and real. The results are found to be comparable to those obtained by other feature-based algorithms, which suggests the use of Hopfield-type networks in practical vision algorithms.
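A hedged sketch of Hopfield-style matching: V[i, j] = 1 proposes that feature i in one frame matches feature j in the next, and asynchronous updates trade match similarity against one-to-one constraint penalties. The thesis's robust disparity measure is not reproduced here; sim stands in for its negation:

    import numpy as np

    def hopfield_match(sim, iters=5000, lam=2.0, seed=0):
        # sim[i, j]: similarity of feature pair (i, j); higher is better.
        n, m = sim.shape
        V = np.zeros((n, m))
        rng = np.random.default_rng(seed)
        for _ in range(iters):                    # asynchronous unit updates
            i, j = rng.integers(n), rng.integers(m)
            rivals = V[i].sum() + V[:, j].sum() - 2 * V[i, j]
            # Turn the unit on only if similarity outweighs the penalty for
            # violating the one-match-per-row/column constraints.
            V[i, j] = 1.0 if sim[i, j] - lam * rivals > 0 else 0.0
        return V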

Committee:

Dr. Ali Minai (Advisor)

Keywords:

optical flow; computer vision; neural networks; Hopfield networks; image processing

Treaster, Delia E. An investigation of postural and visual stressors and their interactions during computer work
Doctor of Philosophy, The Ohio State University, 2003, Industrial and Systems Engineering
The continuing dominance of computers and the rising chorus of complaints from computer users highlight the importance of understanding the risks associated with computer use. Particularly challenging are the issues of eyestrain and muscle pain, the latter especially puzzling because of the low force levels and static postures of computer work. To study eyestrain and muscle pain during computer work, a multi-disciplinary approach was developed, using techniques from three diverse fields: biomechanics, myofascial pain, and vision. A laboratory study examined the effects of the independent variables, postural and visual stress, during a 30-minute typing task. Sixteen healthy females (ages 19-29) participated in the experiment; all were touch-typists. The study design was a 2 x 2 repeated measures design with randomized order of testing. The dependent variables included the development of trigger points in the upper trapezius, subjective measures of discomfort, visual function, and surface electromyography (EMG). Trapezius EMG data were collected at locations of known trigger points, providing information about EMG as the trigger points developed during the experiment. An experienced myofascial specialist performed onsite examinations to identify the trigger points before and after each experimental session. Cyclical changes in the EMG median frequency that occurred throughout the experiment were quantified; these cyclic changes provided information about motor unit rotation patterns. A method for quantifying eyestrain through EMG changes in the orbicularis oculi was also developed. There was a significant interaction between postural and visual factors on both the perception of eyestrain and the trapezius EMG. In particular, the high visual stress condition, when combined with the low postural stress condition, produced fewer cyclic changes in median frequency (i.e., less motor unit rotation) and greater trigger point pain. A hypothesized injury pathway for the development of myofascial trigger points was developed, and the role of high visual stress, perhaps mediated through the stress reaction of the autonomic nervous system, was postulated. The importance of good ergonomics, in terms of the location of computer components as well as visual parameters, is highlighted. The findings of the interactions show the value of a multi-disciplinary approach to a complex problem.

Committee:

William Marras (Advisor)

Keywords:

ergonomics; electromyography; EMG; computer vision syndrome; CVS; myofascial pain syndrome; trigger points; computer work; low level static exertions; Cinderella fibers; motor unit rotation; motor unit recruitment; eyestrain; asthenopia; glare; posture

Keck, Mark A. Occlusion Recovery and Reasoning for 3D Surveillance
Doctor of Philosophy, The Ohio State University, 2009, Computer Science and Engineering

In this work we propose algorithms to learn the locations of static occlusions and reason about both static and dynamic occlusion scenarios in multi-camera scenes for 3D surveillance (e.g., reconstruction, tracking). We will show that this leads to a computer system that is able to more effectively track (follow) objects in video when they are obstructed from some of the views. Because of the nature of the application area, our algorithm will be under the constraint of using few cameras (no more than 3) configured wide-baseline.

Our algorithm consists of a learning phase, where a 3D probabilistic model of occlusions is estimated per-voxel, per-view over time via an EM-style framework. In this framework, at each frame the visual hull of the foreground objects (people) is computed via a Markov Random Field that integrates the occlusion model. The model is then updated at each frame using this solution, providing an iterative process that can accurately estimate the occlusion model by accumulating temporal information and overcome the few-camera constraint. We demonstrate the application of such a model to a number of areas, including visual hull reconstruction, 3D tracking, and the reconstruction of the occluding structures themselves.
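A sketch of the per-frame voting step under a fixed occlusion model; the MRF integration and EM-style updates are omitted, and the array layouts are our assumptions:

    import numpy as np

    def visual_hull(voxels, cameras, masks, occ_prob, thresh=0.5):
        # voxels: (N, 3); cameras: list of 3x4 projection matrices; masks: list
        # of binary foreground masks; occ_prob: (N, n_views) occlusion model.
        keep = np.ones(len(voxels), dtype=bool)
        X = np.hstack([voxels, np.ones((len(voxels), 1))])   # homogeneous
        for c, (P, mask) in enumerate(zip(cameras, masks)):
            x = X @ P.T
            u = (x[:, 0] / x[:, 2]).astype(int)
            v = (x[:, 1] / x[:, 2]).astype(int)
            inside = (0 <= u) & (u < mask.shape[1]) & (0 <= v) & (v < mask.shape[0])
            seen = np.zeros(len(voxels), dtype=bool)
            seen[inside] = mask[v[inside], u[inside]] > 0
            # A view vetoes a voxel only if the voxel projects into the image,
            # misses the silhouette, and is unlikely to be occluded there.
            keep &= seen | (occ_prob[:, c] > thresh) | ~inside
        return keep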

Committee:

James Davis, Ph.D. (Advisor); Rick Parent, Ph.D. (Committee Member); James Todd, Ph.D. (Committee Member)

Subjects:

Computer Science

Keywords:

Computer Vision; Occlusion Recovery; Tracking

McMichael, Scott Thomas. Lane Detection for DEXTER, an Autonomous Robot, in the Urban Challenge
Master of Sciences, Case Western Reserve University, 2008, Computer Engineering
This thesis describes the lane detection system developed for the autonomous robot DEXTER in the 2007 DARPA Urban Challenge. Though DEXTER was capable of navigating purely from GPS signals, it often needed to drive in areas where GPS navigation could not be trusted completely. In these areas it was necessary to automatically detect the lane of travel so that DEXTER could drive properly within it. The developed system functions by merging the outputs of a number of independent road detection modules, fed by several sensors, into a single drivable output path. This sensor-derived path is compared with the map-derived path to produce an optimal output based on the relative confidences of the two information sources. The full lane detection system is able to adaptively drive according to the best information source and performs well in a variety of diverse driving environments.

Committee:

Wyatt Newman (Advisor)

Keywords:

autonomous robot; DARPA Urban Challenge; lane detection; road detection; computer vision; sensor fusion

Johnson, Andrew. Fragment Association Matching Enhancement (FAME) on a Video Tracker
Master of Science in Computer Engineering (MSCE), Wright State University, 2014, Computer Engineering
In the field of surveillance, algorithms are developed to extract meaningful information from a video feed captured via a camera. One type of algorithm used in surveillance is a tracking algorithm, which allows a user to watch the movement of an object in the camera's field of view. The tracker used in this thesis research is a feature aided tracker (FAT), which uses both features and kinematics to generate tracks. However, camera movement affects the tracker's ability to accurately track an object. Specifically, the camera introduces the multi-fragmentation problem, which occurs when an object is marked with two tracks instead of a single track; marking the object with two tracks decreases the tracker's performance and accuracy. This thesis research proposes matching the features of small foreground objects (fragments) to create larger foreground objects. A pair of fragments has its features combined into a score; if the pair's score is below a specific threshold, the fragments are matched to create a larger fragment. Many of the concepts used to design this tracking algorithm (FAME) stem from the fields of computer vision, pattern recognition, and tracking.
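A minimal sketch of the fragment association idea: score each fragment pair from its features and merge pairs scoring below a threshold. The score function and threshold here are illustrative, not FAME's:

    import numpy as np

    def merge_fragments(boxes, feats, score_thresh=0.3):
        # boxes: list of (x0, y0, x1, y1); feats: (N, d) per-fragment features.
        merged, used = [], set()
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                if i in used or j in used:
                    continue
                score = np.linalg.norm(feats[i] - feats[j])  # lower = more alike
                if score < score_thresh:
                    x0 = min(boxes[i][0], boxes[j][0])
                    y0 = min(boxes[i][1], boxes[j][1])
                    x1 = max(boxes[i][2], boxes[j][2])
                    y1 = max(boxes[i][3], boxes[j][3])
                    merged.append((x0, y0, x1, y1))          # union bounding box
                    used.update((i, j))
        merged += [boxes[k] for k in range(len(boxes)) if k not in used]
        return merged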

Committee:

Thomas Wischgoll, Ph.D. (Advisor); Juan Vasquez, Ph.D. (Committee Member); Arthur Goshtasby, Ph.D. (Committee Member)

Subjects:

Computer Engineering

Keywords:

Tracking, Computer Vision, Pattern Recognition

Li, Yue. Active Vision through Invariant Representations and Saccade Movements
Master of Science (MS), Ohio University, 2006, Electrical Engineering & Computer Science (Engineering and Technology)

This thesis presents an innovative approach to pattern recognition that uses self-organized, invariant representations integrating continuous observation and saccade movements. This biologically motivated approach can achieve visual perception through retina-like sampling of high-resolution images with a lower-resolution artificial retina.

The neural network uses hierarchical feedback structures to build object representations and self-organizes invariant transformations while iterating over the images received from the retina model. The network identifies the whole image using a winner-take-all scheme through temporal association of sufficiently accurate saccades. By using our invariance-building scheme, the network can identify different views of the same object.

Committee:

Janusz Starzyk (Advisor)

Keywords:

computer vision; pattern recognition; invariant representation; saccade movements; machine intelligence; retina sampling

Nedrich, Matthew. Detecting Behavioral Zones in Local and Global Camera Views
Master of Science, The Ohio State University, 2011, Computer Science and Engineering
We present a complete end-to-end framework to detect and exploit entry and exit regions in video using behavioral models for object trajectories. We first describe how weak tracking data (short and frequently broken tracks) may be utilized to hypothesize entry and exit regions by assembling the weak tracks into a more usable set of "entity" tracks. The entities provide a more reliable set of entry and exit observations, which are clustered to produce a set of potential entry and exit regions within a scene. A behavior-based reliability metric is then used to score each potential entry and exit region, and unreliable regions are removed. Using the detected regions, we then present a method to learn scene occlusions and causal relationships between entry-exit pairs. An extension is also presented that allows our entry/exit detection algorithm to detect global entry and exit regions with respect to the viewspace of a pan-tilt-zoom camera. We provide thorough evaluation of our local and viewspace region discovery approaches, including quantitative experiments, and compare our local method to existing approaches. We also provide experimental results for our region exploitation methods (occlusion discovery and entry-exit region relationships), and demonstrate that they may be incorporated to aid in tasks such as tracking and anomaly detection.
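As an illustration of the clustering step, entry and exit observations (entity-track endpoints) can be grouped with a density-based clusterer; DBSCAN is our stand-in, and its parameters are illustrative:

    import numpy as np
    from sklearn.cluster import DBSCAN

    def entry_exit_regions(tracks, eps=25.0, min_samples=5):
        # tracks: list of (T, 2) arrays of image coordinates (entity tracks).
        starts = np.array([t[0] for t in tracks])    # entry observations
        ends = np.array([t[-1] for t in tracks])     # exit observations
        entry = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(starts)
        exit_ = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(ends)
        return entry, exit_                          # label -1 marks noise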

Committee:

James Davis, Prof. (Advisor); Richard Parent, Prof. (Committee Member)

Subjects:

Computer Science

Keywords:

scene modeling; computer vision; scene understanding
