Search Results (1 - 9 of 9 Results)

Abounia Omran, Behzad
Application of Data Mining and Big Data Analytics in the Construction Industry
Doctor of Philosophy, The Ohio State University, 2016, Food, Agricultural and Biological Engineering
In recent years, the digital world has experienced an explosion in the magnitude of data being captured and recorded in various industry fields. Accordingly, big data management has emerged to analyze and extract value out of the collected data. The traditional construction industry is also experiencing an increase in data generation and storage. However, its potential and ability for adopting big data techniques have not been adequately studied. This research investigates the trends of utilizing big data techniques in the construction research community, which eventually will impact construction practice. For this purpose, the application of 26 popular big data analysis techniques in six different construction research areas (represented by 30 prestigious construction journals) was reviewed. Trends, applications, and their associations in each of the six research areas were analyzed. Then, a more in-depth analysis was performed for two of the research areas including construction project management and computation and analytics in construction to map the associations and trends between different construction research subjects and selected analytical techniques. In the next step, the results from trend and subject analysis were used to identify a promising technique, Artificial Neural Network (ANN), for studying two construction-related subjects, including prediction of concrete properties and prediction of soil erosion quantity in highway slopes. This research also compared the performance and applicability of ANN against eight predictive modeling techniques commonly used by other industries in predicting the compressive strength of environmentally friendly concrete. 
The results of this research provide a comprehensive analysis of the current status of applying big data analytics techniques in construction research, including trends, frequencies, and usage distribution in six different construction-related research areas, and demonstrate the applicability and performance level of selected data analytics techniques with an emphasis on ANN in construction-related studies. The main purpose of this dissertation was to help practitioners and researchers identify a suitable and applicable data analytics technique for their specific construction/research issue(s) or to provide insights into potential research directions.

Committee:

Qian Chen, Dr. (Advisor)

Subjects:

Civil Engineering; Comparative Literature; Computer Science

Keywords:

Construction Industry; Big Data; Data Analytics; Data mining; Artificial Neural Network; ANN; Compressive Strength; Environmentally Friendly Concrete; Soil Erosion; Highway Slope; Predictive Modeling; Comparative Analysis

Pickering, Ethan M
EDIFES 0.4: Scalable Data Analytics for Commercial Building Virtual Energy Audits
Master of Sciences (Engineering), Case Western Reserve University, 2016, EMC - Mechanical Engineering
Energy Diagnostics Investigator for Efficiency Savings (EDIFES) has been developed for scalable data analytics to conduct virtual energy audits on commercial buildings. Built as a software package in R, EDIFES ingests building electricity data and readily available weather data, applying various data analytics to determine building markers, characteristics, and operational tendencies. Through these analyses, building systems are identified, including Heating, Ventilation and Air Conditioning (HVAC), lighting, and plug load or other equipment, with characteristics such as load and system scheduling. Once building systems have been identified, EDIFES conducts virtual energy audits to diagnose efficiency issues and to determine the impact (i.e. return-on-investment or payback) of potential retrofit actions (e.g. rescheduling HVAC to occupied hours or conducting a lighting retrofit). After this stage, it can be used for measurement and verification (M&V) or continuous commissioning. Six buildings are presented in this thesis.

Committee:

Alexis Abramson, PhD (Committee Chair); Roger French, PhD (Committee Co-Chair); Joseph Prahl, PhD (Committee Member)

Subjects:

Civil Engineering; Computer Science; Energy; Engineering; Mechanical Engineering

Keywords:

building energy efficiency; data analytics; virtual energy audit; disaggregation; electricity consumption; commercial buildings; time series decomposition; statistical modeling; regression analysis; R; HVAC; EDIFES
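
The EDIFES abstract above mentions return-on-investment and payback as measures of a retrofit's impact. As a rough illustration only, the standard simple-payback and simple-ROI formulas can be sketched as follows. The function names and the dollar figures are hypothetical and are not taken from the thesis or from the EDIFES package, which is written in R.

```python
def simple_payback_years(retrofit_cost: float, annual_savings: float) -> float:
    """Years needed for cumulative savings to equal the up-front cost."""
    if annual_savings <= 0:
        raise ValueError("annual_savings must be positive")
    return retrofit_cost / annual_savings


def simple_roi(retrofit_cost: float, annual_savings: float, years: float) -> float:
    """Net gain over the horizon divided by the up-front cost (undiscounted)."""
    if retrofit_cost <= 0:
        raise ValueError("retrofit_cost must be positive")
    return (annual_savings * years - retrofit_cost) / retrofit_cost


# Hypothetical lighting retrofit: 12,000 up front, 4,000 saved per year.
payback = simple_payback_years(12_000, 4_000)
roi_5yr = simple_roi(12_000, 4_000, years=5)
```

Both measures ignore discounting and energy-price changes, so they are best read as a first screening figure rather than a full financial analysis.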

Awodokun, Olugbenga
Classification of Patterns in Streaming Data Using Clustering Signatures
MS, University of Cincinnati, 2017, Engineering and Applied Science: Electrical Engineering
Streaming datasets often pose a myriad of challenges for machine learning algorithms, including insufficient storage and changes in the underlying distributions of the data during different time intervals. This thesis proposes a hierarchical clustering based method (unsupervised learning) for determining signatures of data in a time window and thus building a classifier based on the match between the observed clusters and known patterns of clustering. When new clusters are observed, they are added to a global list of clusters, which is used to generate a signature for the data in a time window. Dendrograms are created from each time window, and their clusters are compared to the global list of clusters. The global cluster list is updated only if none of the existing global clusters can model the data points in a later time window. The global clusters are then used in the testing phase to classify novel data chunks according to their Tanimoto similarities. Although the training samples were taken from only 20% of the entire KDD Cup 99 dataset, we validated our approach by using test data from different regions of the dataset at multiple intervals, and the classifier performance achieved was comparable to other methods that had used the entire dataset for training.

Committee:

Raj Bhatnagar, Ph.D. (Committee Chair); Gowtham Atluri (Committee Member); Nan Niu, Ph.D. (Committee Member)

Subjects:

Computer Science

Keywords:

data mining; hierarchical clustering; unsupervised learning; data analytics; machine learning; intrusion detection
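
The abstract above classifies data chunks by the Tanimoto similarity between cluster signatures. A minimal sketch of that idea follows, assuming a signature can be represented as the set of global cluster IDs observed in a time window. That representation, the function names, and the example labels are assumptions for illustration and are not the thesis's actual implementation.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity of two sets: |A & B| / |A | B|."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty signatures are identical
    return len(a & b) / len(a | b)


def classify(signature: set, known_signatures: dict) -> str:
    """Return the label whose stored signature is most similar to `signature`."""
    return max(known_signatures,
               key=lambda label: tanimoto(signature, known_signatures[label]))


# Hypothetical signatures: sets of global cluster IDs seen per traffic pattern.
known = {
    "normal": {1, 2, 3},
    "attack": {7, 8},
}
label = classify({1, 2}, known)
```

Representing signatures as sets keeps the comparison independent of window size, since only the presence or absence of each global cluster matters.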

Pidaparthy, Hemanth
Recognizing and Detecting Errors in Exercises using Kinect Skeleton Data
Master of Science, University of Akron, 2015, Computer Engineering
A novel approach to recognizing and correcting errors in exercise activity based on skeletal joint data obtained from the Kinect 2.0 sensor is presented. Many approaches in the literature for analyzing human motion focus on training a classifier to recognize and/or rank the motions. While effective in some situations, the computational costs of training the models, the unavailability of reference motions and the inability to provide real-time guidance and feedback limit the utility of such approaches for empowering wellness management. A classification technique is used to recognize exercises and a geometric characterization of poses is used to detect errors in the recognized exercises. All the features used are extracted from the time-series data collected from a Microsoft Kinect 2.0 camera. Expert domain knowledge was easily integrated to identify errors in exercise performance. The simplicity and the low computational costs make this approach useful for providing real-time feedback and guidance to participants, thus improving exercise adherence. Experimental results that demonstrate the viability of the approach are presented. In the future, this approach can be extended to a wider range of exercises and similar techniques can be applied to address related problems in rehabilitation, surveillance and remote user interaction.

Committee:

Shivakumar Sastry, Dr. (Advisor); Forrest Bao, Dr. (Committee Member); Jin Kocsis, Dr. (Committee Member); Victor Pinheiro, Dr. (Committee Member)

Subjects:

Computer Engineering

Keywords:

Microsoft Kinect Sensor, Personalized Wellness Management, Exercise Data Analytics
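
The abstract above detects exercise errors from a geometric characterization of poses. One common geometric feature is the angle at a joint formed by three 3D skeleton points, sketched below. The joint coordinates, the tolerance range, and the function names are hypothetical, and the thesis may use different features.

```python
import math


def joint_angle(a, b, c) -> float:
    """Angle in degrees at joint b, formed by the segments b->a and b->c."""
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    norm_ba = math.sqrt(sum(x * x for x in ba))
    norm_bc = math.sqrt(sum(x * x for x in bc))
    if norm_ba == 0 or norm_bc == 0:
        raise ValueError("joints must not coincide")
    cos_theta = sum(x * y for x, y in zip(ba, bc)) / (norm_ba * norm_bc)
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


def is_pose_error(angle: float, low: float, high: float) -> bool:
    """Flag a frame whose joint angle falls outside an expert-chosen range."""
    return not (low <= angle <= high)


# Hypothetical shoulder, elbow, wrist positions in metres.
elbow = joint_angle((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
error = is_pose_error(elbow, low=150.0, high=180.0)
```

A threshold check like this needs no training data, which is consistent with the low computational cost the abstract emphasizes.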

Khasawneh, Ahmad Ali
GUIDELINES FOR COMPARING INTERVENTIONS, PREDICTING HIGH-RISK PATIENTS, AND CONDUCTING OPTIMIZATION FOR EARLY HF READMISSION
Doctor of Philosophy, University of Akron, 2017, Mechanical Engineering
Reducing 30-day readmission for certain chronic diseases has gained healthcare providers’ attention, especially since the Center for Medicare and Medicaid Services (CMS) started penalizing hospitals for excess readmissions. The Hospital Readmission Reduction Program (HRRP) was established by CMS in 2012 and released in 2013 with a 1% penalty on the total CMS reimbursement. This penalty increased in 2014 and 2015 to a maximum of 2% and 3%, respectively. This study focuses on Congestive Heart Failure (CHF), which has the highest readmission rate among the conditions faced with the financial impact of this program. Our research effort on reducing preventable readmission is divided into three main parts: comparing the effectiveness of intervention strategies, finding the characteristics of patients at high risk of readmission, and combining the outcomes of the first two parts to target the right patient with the right and cost-effective actions. To assess the effectiveness of the most widely used intervention strategies in reducing the preventable early readmission rate, several techniques and approaches have been implemented in this work to investigate, analyze, and compare the role of those interventions, including Analytical Hierarchy Process (AHP), descriptive modeling, visualization, statistical analysis, and Lean Six-Sigma (LSS). More than thirty-five studies were carefully collected and analyzed to get the needed data for this research. The overall results showed that educating patients/caregivers (focusing on “Teach Back”) as a pre-discharge strategy and home visits as a post-discharge strategy are the most recommended strategies, followed by telephone and discharge planning and/or instructions (using clear instruction sheets) intervention strategies. Readmission predictive modeling is one of the main proposed readmission reduction methods and has been extensively researched in recent years. 
However, little has been done to systematically synthesize and analyze the results from the existing literature. Therefore, in this research initiative, the results from more than 40 studies have been collected and used to identify the most significant variables in predicting readmissions for Congestive Heart Failure (CHF) patients. Furthermore, CHF readmission data from two community hospitals in Northeast Ohio were analyzed and compared with these findings. The outcomes of implementing numerous predictive models showed a good match. Multiple/univariate logistic regression and univariate chi-square tests were used to identify the characteristics of patients at high risk of readmission. The results showed that “severity of illness”, “mortality risk”, “type of payer”, “previous admission”, and “diabetes” seem to be significant predictors of readmission. Combining the findings of those areas of research is still unexplored or has not been clearly reported. Therefore, a cost optimization model has been developed in this research to systematically study the effectiveness of the readmission predictive model and its financial impact on reducing readmissions through various intervention strategies. The cost optimization model considers a few key factors, such as “revenue per readmission”, “national readmission rate”, “current readmission rate”, “CMS penalty”, and “the number of high and low-risk patients”, which are extracted from the confusion matrix, an output of the predictive model. The results are summarized in a set of guidelines that help hospitals select intervention strategies and the target patient population for optimal financial gain.

Committee:

Shengyong Wang, Dr. (Advisor); Chen Ling, Dr. (Committee Member); Gregory Morscher, Dr. (Committee Member); Yilmaz Sozer, Dr. (Committee Member); Nao Mimoto, Dr. (Committee Member)

Subjects:

Engineering; Health Care; Health Care Management; Industrial Engineering; Management; Statistics

Keywords:

HF Readmission, Comparing Interventions, Analytical Hierarchy Process AHP, Lean Six-Sigma LSS, Predictive Models, Data Analytics, Logistic Regression, Chi-Square, Optimization Model

Li, Yuanxu
HealthyLifeData Analytics: A DATA ANALYTICS TOOL FOR THE HealthyLifeHRA HEALTH RISK ASSESSMENT SYSTEM
Master of Sciences, Case Western Reserve University, 2016, EECS - Computer and Information Sciences
Traditional HRA (Health Risk Assessment) tools mostly focus on providing questionnaires and generating reports for users. However, due to the need for more detailed information on the relationships between people's lifestyles and health risks, a new data analytics tool for HRA is necessary. This thesis proposes and implements a family of data analytics tools as part of the HealthyLife HRA application. It consists of Data Analytics A – Population and Range Based Aggregation and Visualization Queries, Data Analytics B – Time Series Queries, and Data Analytics C – Single User Targeted Time Series Queries. The tool has three front-end graphical interfaces, one for each family member, and a back-end execution engine. It enables users to specify general and time-series queries in a simple yet expressive way, without any previous knowledge of the database structure or SQL. Visualization functionality is also provided as part of the tool.

Committee:

Gultekin Ozsoyoglu (Advisor)

Subjects:

Computer Science; Health Care

Keywords:

health risk assessment; data analytics; web query interface; SQL; health care

Aring, Danielle C
Integrated Real-Time Social Media Sentiment Analysis Service Using a Big Data Analytic Ecosystem
Master of Computer and Information Science, Cleveland State University, 2017, Washkewicz College of Engineering
Big data analytics is at the center of modern science and business. Our social media networks, mobile devices and enterprise systems generate enormous volumes of data on a daily basis. This wide availability gives organizations in every field opportunities to discover valuable intelligence for critical decision-making. However, traditional analytic architectures are insufficient to handle the unprecedented volume of data and the complexity of data processing. This thesis presents an analytic framework that addresses the unprecedented scale of big data and performs data stream sentiment analysis effectively in real time. The work presents a Social Media Big Data Sentiment Analytics Service System (SMBDSASS). The architecture leverages the Apache Spark stream data processing framework, coupled with a NoSQL Hive big data ecosystem. Two sentiment analysis models were developed. The first is a topic-based model: given a user-provided topic or person of interest, sentiment (opinion) analysis is performed on related topic sentences in a tweet stream. The second is an aspect (feature) based model: given a user-provided product of interest and related product features, aspect (feature) analysis is performed on reviews containing important feature terms. The experimental results of the proposed framework using a real-time tweet stream and product reviews show improvements comparable to the results in the existing literature, with 73% accuracy for the topic-based sentiment model and 74% accuracy for the aspect (feature) based sentiment model. The work demonstrates that our topic and aspect based sentiment analysis models, running on the real-time stream data processing framework using Apache Spark and machine learning classifiers coupled with a NoSQL big data ecosystem, offer an efficient, scalable, real-time stream data-processing alternative to common batch data mining frameworks for complex multiphase sentiment analysis.

Committee:

Sun Sunnie Chung, Ph.D. (Committee Chair); Yongjigan Fu, Ph.D. (Committee Member); Ifthkar Sikder, Ph.D. (Committee Member)

Subjects:

Computer Science

Keywords:

big data analytics, sentiment analysis, stream data-processing

Fathi Salmi, Meisam
Processing Big Data in Main Memory and on GPU
Master of Science, The Ohio State University, 2016, Computer Science and Engineering
Many large-scale systems were designed with the assumption that I/O is the bottleneck, but this assumption has been challenged in the past decade by new trends in hardware capabilities and workload demands. The computational power of CPU cores has not improved in proportion to the performance of disks and network interfaces in the past decade, but the demand for computational power in various workloads has grown out of proportion. GPUs outperform CPUs for various workloads such as query processing and machine learning workloads. When such workloads run on a single computer, the data processing systems must use GPUs to stay competitive. But GPUs have never been studied for large-scale data analytics systems. To maximize GPU performance, core assumptions about the behavior of large-scale systems should be shaken and whole systems should be redesigned. In this report, we used Apache Spark as a case to study the performance benefits of using GPUs in a large-scale, distributed, in-memory, data analytics system. Our system, Spark-GPU, exploits the massively parallel processing power of GPUs in a large-scale, in-memory system and accelerates crucial data analytics workloads. Spark-GPU minimizes memory management overhead, reduces extraneous garbage collection, minimizes internal and external data transfers, converts data into a GPU-friendly format, and provides batch processing. Spark-GPU detects GPU-friendly tasks based on predefined patterns in computation and automatically schedules them on the available GPUs in the cluster. We have evaluated Spark-GPU with a set of representative data analytics workloads to show its effectiveness. The results show that Spark-GPU significantly accelerates data mining and statistical analysis workloads, but provides limited performance speedup for traditional query processing workloads.

Committee:

Xiaodong Zhang, Dr. (Advisor); Yang Wang, Dr. (Committee Member)

Subjects:

Computer Science; Information Technology; Statistics

Keywords:

big-data; GPU; Spark; Hadoop; data analytics

Campbell, Cory A
The Changing Landscape of Finance in Higher Education: Bridging the Gap Through Data Analytics
Doctor of Philosophy, Case Western Reserve University, 2018, Management
In the higher education sector, external forces are influencing funding sources, which affect both public and private schools. Institutions need to react to the “new normal” of the fiscal landscape. As institutions of higher learning adapt to the changing environment, they must adopt new ways to use information more efficiently in their decision-making. Data analytics could potentially help schools recognize trends, ask what-if questions, and apply predictive models to improve their strategic capability while creating a more sustainable future with better educational services and business models. Yet in the wake of a rapidly changing environment, slow-moving organizations often find it difficult to respond to such new challenges. In the higher education sector, incorporating business analytics into the decision-making process remains a challenge. Using institutional theory and resource based theory lenses, the dissertation employs a sequential mixed methods approach involving a three-part empirical study to address the research question: how can data analytics influence institutional performance in higher education? The first, qualitative study explores the use of data analytics in higher education finance. This study offers evidence of under-performance of ERP implementations by higher education institutions and considers the effects of organizational culture, the degree of autonomy at the departmental level, and the strong reliance on various bolt-on or shadow systems to use information from the core ERP system. The second study analyzes ERP effectiveness and the mediating effects of perceived organizational support. The third study considers the relationship between analytics investment and organizational performance as mediated by data-driven culture and data quality. 
This research provides empirical support showing that data quality and data-driven culture are key antecedents to using data analytics to improve organizational performance. Understanding the similarities and differences between public and private institutions has implications for the practitioner. The theoretical contribution introduces a proxy for efficiency and organizational performance in the higher education sector into the literature. These findings extend concepts in accounting information systems theories as well as within the higher education literature. Gaining insight into the key levers for improving data quality and efficiency could pave the way for a more sustainable future in higher education.

Committee:

Timoth Fogarty (Committee Chair); Lyytinen Kalle (Committee Member); Cola Philip (Committee Chair); Richardson Vernon (Committee Chair)

Subjects:

Accounting; Higher Education; Higher Education Administration; Information Systems; Information Technology; Systems Design

Keywords:

Enterprise Resource Planning or ERP systems, data analytics, organizational performance, higher education, resource based theory, institutional theory