Search Results (1 - 19 of 19 Results)

Aldakheel, Eman A. - A Cloud Computing Framework for Computer Science Education
Master of Science (MS), Bowling Green State University, 2011, Computer Science
With the rapid growth of Cloud Computing, the use of Clouds in educational settings can provide great opportunities for Computer Science students to improve their learning outcomes. In this thesis, we introduce a Cloud-Based Education architecture (CBE) as well as Cloud-Based Education for Computer Science (CBE-CS) and propose an automated CBE-CS ecosystem for implementation. This research employs the Cloud as a learning environment for teaching Computer Science courses by removing locality constraints, while simultaneously improving students' understanding of the material through practical experience with its finer details and complexities. In addition, this study compares Cloud-based virtual classrooms with the traditional e-learning system to highlight the advantages of using Clouds in such a setting. We argue that by deploying Computer Science courses on the Cloud, the institution, administrators, faculty, and students would gain significant advantages from the new educational setting. Infrastructure buildup, software updates and license management, hardware configuration, infrastructure space, maintenance, power consumption, and many other issues would be either eliminated or minimized using Cloud technology. On the other hand, the number of enrolled students is likely to increase, since the Cloud increases the availability of the resources needed for interactive education of a larger number of students; it can deliver advanced technology for hands-on training and can increase students' readiness for the job market. The CBE-CS approach is also more likely to allow faculty to better demonstrate the subjects' complexities to students by renting the needed facilities whenever desired. The research also identified several potential Computer Science courses which could be launched and taught through Clouds. In addition, the selected courses have been classified based on the three well-known levels of Cloud services: Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS). Subsequently, we propose a framework for CBE-CS considering the service layers and the selected courses. The proposed CBE-CS framework is intended to be integrated into a Virtual Classroom Ecosystem for Computer Science based on Cloud Computing, referred to as VCE-CS. This ecosystem is scalable, available, reliable, and cost effective. Examples from selected pilot courses (i.e., Database, Operating Systems, Networks, and Parallel Programming) are discussed. This research describes VCE-CS and argues for the benefits of such systems.
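
As an illustration of the course-to-service-layer classification described in this abstract, the following is a minimal sketch; the specific course-to-layer assignments are hypothetical, not taken from the thesis:

    # Hypothetical mapping of pilot CS courses to Cloud service layers (SaaS/PaaS/IaaS).
    COURSE_LAYERS = {
        "Database": "SaaS",             # e.g., a hosted DBMS exposed to students as a service
        "Parallel Programming": "PaaS", # e.g., a managed runtime for deploying parallel jobs
        "Operating Systems": "IaaS",    # e.g., raw VMs that students configure themselves
        "Networks": "IaaS",
    }

    def courses_for_layer(layer: str) -> list[str]:
        """Return the pilot courses that would be delivered through a given service layer."""
        return [course for course, l in COURSE_LAYERS.items() if l == layer]

    print(courses_for_layer("IaaS"))  # ['Operating Systems', 'Networks']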

Committee:

Hassan Rajaei, PhD (Advisor); Guy Zimmerman, PhD (Committee Member); Jong Lee, PhD (Committee Member)

Subjects:

Computer Science

Keywords:

Cloud Computing; Clouds in educational settings; Cloud-Based Education for Computer Science (CBE-CS); Cloud Computing for Computer Science students; Cloud-Based Education architecture; Cloud-Based Education (CBE)

Zhang, Dean - Data Parallel Application Development and Performance with Azure
Master of Science, The Ohio State University, 2011, Computer Science and Engineering
The Microsoft Windows Azure technology platform provides on-demand, cloud-based computing, where the cloud is a set of interconnected computing resources located in one or more data centers. The Windows Azure platform has recently become available in data centers near most of the world's large cities. MPI is a message passing library standard with many participating organizations, including vendors, researchers, software library developers, and users. The goal of the Message Passing Interface is to establish a portable, efficient, and flexible standard for message passing that will be widely used for writing parallel programs. The advantages of developing message passing software using MPI are well known: portability, efficiency, and flexibility. This thesis shows how to develop MPI-like applications on the Windows Azure platform and simulate parallel computing in the cloud. The specific goal is to simulate MPI_Reduce and MPI_Allreduce on Windows Azure, and to use this simulation to support and build data parallel applications on Windows Azure. We also compare the performance of three data parallel applications on three platforms: traditional clusters, Azure with queues, and Azure with WCF.
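
The thesis simulates MPI_Reduce over Azure queues; the sketch below illustrates the general idea with an in-memory queue standing in for an Azure storage queue. The names and structure are illustrative, not the thesis implementation:

    import queue
    from functools import reduce
    from operator import add

    # Stand-in for an Azure storage queue: each worker enqueues its partial result,
    # and a designated "root" worker dequeues and combines them (MPI_Reduce-like).
    results_queue = queue.Queue()

    def worker(rank: int, local_data: list[int]) -> None:
        partial = sum(local_data)           # local reduction on each worker
        results_queue.put((rank, partial))  # send the partial result through the queue

    def root_reduce(num_workers: int) -> int:
        partials = [results_queue.get()[1] for _ in range(num_workers)]
        return reduce(add, partials)        # combine partials, as MPI_Reduce would at the root

    # Example: 4 simulated workers, each owning a slice of the data.
    data = list(range(16))
    for rank in range(4):
        worker(rank, data[rank * 4:(rank + 1) * 4])
    print(root_reduce(4))  # 120, i.e., sum(range(16))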

Committee:

Gagan Agrawal (Advisor); Feng Qin (Committee Member)

Subjects:

Computer Science

Keywords:

MPI; Cloud computing

Ranabahu, Ajith Harshana - Abstraction Driven Application and Data Portability in Cloud Computing
Doctor of Philosophy (PhD), Wright State University, 2012, Computer Science and Engineering PhD
Cloud computing has changed the way organizations create, manage, and evolve their applications. While many organizations are eager to use the cloud, tempted by substantial cost savings and convenience, the implications of using clouds are still not well understood. One of the major concerns in cloud adoption is vendor lock-in of applications, caused by the heterogeneity of the numerous cloud service offerings. Vendor-locked applications are difficult, if not impossible, to port from one cloud system to another, forcing cloud service consumers to use undesired or suboptimal solutions. This dissertation investigates a complete and comprehensive solution to address the issue of application lock-in in cloud computing. The primary philosophy is the use of carefully defined abstractions in a manner that makes the heterogeneity of clouds invisible. The first part of this dissertation focuses on the development of cloud applications using abstract specifications. Given the domain-specific nature of many cloud workloads, we focused on using Domain Specific Languages (DSLs). We applied DSL-based development techniques to two domains with different characteristics and learned that abstraction-driven methods are indeed viable and result in significant savings in cost and effort. We also showcase two publicly hosted, Web-based application development tools pertaining to the two domains. These tools use abstractions in every step of the application life-cycle and allow domain experts to conveniently create applications and deploy them to clouds, irrespective of the target cloud system. The second part of this dissertation presents the use of process abstractions for application deployment and management in clouds. Many cloud service consumers are focused on specific application-oriented tasks, so we provide abstractions for the most useful cloud interactions via a middleware layer. Our middleware system not only provides independence from the various process differences, but also provides the means to reuse known best practices. The success of this middleware system also influenced a commercial product.
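
As a rough illustration of abstraction-driven development, the sketch below shows a toy cloud-neutral application description being translated into provider-specific deployment descriptors. The spec fields, generator names, and target formats are invented for illustration and are not the dissertation's actual languages or tools:

    # Toy abstraction: one cloud-neutral application description...
    app_spec = {
        "name": "survey-app",
        "runtime": "python3",
        "instances": 2,
    }

    # ...translated by per-cloud "generators" into provider-specific artifacts.
    def to_hypothetical_cloud_a(spec: dict) -> dict:
        return {"service": spec["name"], "engine": spec["runtime"], "replicas": spec["instances"]}

    def to_hypothetical_cloud_b(spec: dict) -> dict:
        return {"app_id": spec["name"], "stack": spec["runtime"], "scale": {"count": spec["instances"]}}

    # The same abstract spec deploys to either target, which is what keeps the
    # application free of vendor lock-in at the specification level.
    print(to_hypothetical_cloud_a(app_spec))
    print(to_hypothetical_cloud_b(app_spec))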

Committee:

Amit Sheth, PhD (Advisor); Krishnaprasad Thirunarayan, PhD (Committee Member); Keke Chen, PhD (Committee Member); Eugene Maximilien, PhD (Committee Member)

Subjects:

Computer Science

Keywords:

Cloud computing; Domain Specific Languages; Program Generation; Program Portability

Snyder, Brett W - Tools and Techniques for Evaluating the Reliability of Cloud Computing Systems
Master of Science in Engineering, University of Toledo, 2013, College of Engineering
This research introduces a computationally efficient approach to evaluating the reliability of a cloud computing system (CCS). The cloud computing paradigm has ushered in the need to provide computing resources in a highly scalable, flexible, and transparent fashion. The rapid uptake of cloud resource utilization has led to a need for methods that can assess the reliability of a CCS while aiding the process of expansion and planning. This thesis proposes using reliability assessments similar to those performed on industrial-grade power systems to establish methods for evaluating the reliability of a CCS and the corresponding performance metrics. Specifically, non-sequential Monte Carlo Simulation (MCS) is used to evaluate CCS reliability at system scale. Further contributions are made regarding the design, development, and exploration of standardized test systems, a novel state representation of CCSs, and the use of test systems based on real-world CCSs. Results demonstrate that the method is effective, and multiple insights are provided into the nature of CCS reliability and CCS design. A scalable, graphical, web-based cloud simulation tool called ReliaCloud-NS is also presented. ReliaCloud-NS provides a RESTful API for performing non-sequential MCS reliability evaluations of cloud computing systems. ReliaCloud-NS allows users to design and simulate complex CCSs built from CCS components. Simulation results are stored and presented to the user in the form of interactive charts and graphs within a Web browser. ReliaCloud-NS contains multiple types of simulations as well as multiple VM allocation schemes. It also contains a novel feature that evaluates CCS reliability across a range of VM allocations and graphs a CCS reliability curve. This thesis describes the interactive web-based interface, the different types of simulations available, and an overview of the results generated from a simulation. The contributions of this thesis lay the foundation for computationally efficient methods that allow for the design and evaluation of highly resilient CCSs. Coupled with the ReliaCloud-NS software, these contributions allow complex yet reliable CCSs to be designed, simulated, and analyzed efficiently, leading to improved customer experience and cost savings.
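
Non-sequential Monte Carlo simulation samples component states independently of time and checks whether each sampled system state can serve the demanded load; the sketch below is a generic illustration of that idea. The component counts, failure probabilities, and success criterion are made-up values, not the test systems from the thesis:

    import random

    # Hypothetical CCS: 10 servers, each hosting up to 4 VMs, each unavailable with prob. 0.05.
    NUM_SERVERS, VMS_PER_SERVER, P_FAIL = 10, 4, 0.05
    DEMANDED_VMS = 30   # the system "succeeds" in a sampled state if it can still place 30 VMs

    def sample_state() -> bool:
        """Non-sequential MCS: sample each component's up/down state independently."""
        up_servers = sum(random.random() > P_FAIL for _ in range(NUM_SERVERS))
        return up_servers * VMS_PER_SERVER >= DEMANDED_VMS

    def estimate_reliability(samples: int = 100_000) -> float:
        successes = sum(sample_state() for _ in range(samples))
        return successes / samples   # estimated probability the CCS meets demand

    print(f"Estimated CCS reliability: {estimate_reliability():.4f}")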

Committee:

Robert Green, Ph.D. (Committee Chair); Vijay Devabhaktuni, Ph.D. (Committee Member); Mansoor Alam, Ph.D. (Committee Member); Hong Wang, Ph.D. (Committee Member)

Subjects:

Computer Engineering; Computer Science; Electrical Engineering

Keywords:

cloud computing; reliability; availability; Monte Carlo simulation; modeling; software

Richards, Craig - Development of Cyber-Technology Information for Remotely Accessing Chemistry Instrumentation
Master of Computing and Information Systems, Youngstown State University, 2011, Department of Computer Science and Information Systems

There exists a wide variety of technologies that allow for remote desktop access, data transfer, encryption, and worldwide communication through the Internet. These technologies, while each solving a unique problem independently, can be combined into a single system that resolves all of those problems together. Youngstown State University's Chemistry Department required a highly reliable, unified system to provide remote access, web cam feeds, user security, and encrypted file transfer for computer equipment operating scientific instrumentation. A suitable software solution was developed at Youngstown State University, in collaboration with Zethus Software, through analysis of technological resources and project requirements and a process of software development.

This thesis describes the cumulus::CyberLab project developed to meet the above requirements. The cumulus::CyberLab project allows students, faculty, and scientists to remotely access millions of dollars of scientific equipment offered by our university from anywhere in the world. To best describe this project, this thesis outlines an overview of the project, the work performed in it, and how it created unique software that is valuable not only to our university but also to other users worldwide.

Committee:

Graciela Perera, PhD (Advisor); Allen Hunter, PhD (Committee Member); John Sullins, PhD (Committee Member)

Subjects:

Biology; Chemistry; Communication; Computer Science

Keywords:

Remote access; Scientific instrumentation; Cloud computing; Secure file storage

Jayapandian, Catherine Praveena - Cloudwave: A Cloud Computing Framework for Multimodal Electrophysiological Big Data
Doctor of Philosophy, Case Western Reserve University, 2014, EECS - Computer and Information Sciences
Multimodal electrophysiological data, such as electroencephalography (EEG) and electrocardiography (ECG), are central to effective patient care and clinical research in many disease domains (e.g., epilepsy, sleep medicine, and cardiovascular medicine). Electrophysiological data is an example of clinical 'big data' characterized by volume (on the order of terabytes (TB) of data generated every year), velocity (gigabytes (GB) of data per month per facility), and variety (about 20-200 multimodal parameters per study), referred to as the '3Vs of Big Data.' Current approaches for storing and analyzing signal data using desktop machines and conventional file formats are inadequate to meet the challenges of the growing volume of data and the need to support multi-center collaborative studies with real-time and interactive access. This dissertation introduces a web-based electrophysiological data management framework called Cloudwave using a highly scalable open-source cloud computing approach and a hierarchical data format. Cloudwave has been developed as part of the National Institute of Neurological Disorders and Stroke (NINDS) funded multi-center project called Prevention and Risk Identification of SUDEP Mortality (PRISM). The key contributions of this dissertation are: 1. An expressive data representation format called Cloudwave Signal Format (CSF) suitable for data interchange in cloud-based web applications; 2. Cloud-based storage of CSF files processed from EDF using Hadoop MapReduce and HDFS; 3. A web interface for visualization of multimodal electrophysiological data in CSF; and 4. Computational processing of ECG signals using Hadoop MapReduce for measuring cardiac functions. Comparative evaluations of Cloudwave with traditional desktop approaches demonstrate one order of magnitude improvement in performance over 77GB of patient data for storage, one order of magnitude improvement in computing cardiac measures for single-channel ECG data, and a 20-times improvement for four-channel ECG data using a 6-node cluster in a local cloud. Therefore, our Cloudwave approach helps address the challenges in the management, access, and utilization of an important type of multimodal big data in biomedicine.
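
MapReduce-style processing of ECG signals can be pictured as mapping over signal segments and reducing per-segment statistics into a cardiac measure; the sketch below shows that pattern in plain Python. The R-peak threshold, segmenting scheme, and synthetic data are simplified placeholders, not the Cloudwave pipeline:

    # Map step: count R-peaks in one ECG segment using a naive threshold detector.
    def map_segment(segment: list[float], threshold: float = 0.8) -> int:
        return sum(
            1 for i in range(1, len(segment) - 1)
            if segment[i] > threshold and segment[i] >= segment[i - 1] and segment[i] > segment[i + 1]
        )

    # Reduce step: combine per-segment peak counts into an average heart rate.
    def reduce_counts(counts: list[int], seconds_per_segment: float) -> float:
        total_beats = sum(counts)
        total_minutes = len(counts) * seconds_per_segment / 60.0
        return total_beats / total_minutes   # beats per minute

    # Example with two tiny synthetic segments (10 s each).
    segments = [[0.1, 0.9, 0.1, 0.2, 1.0, 0.0], [0.0, 0.85, 0.1, 0.1, 0.95, 0.2]]
    counts = [map_segment(s) for s in segments]
    print(f"Estimated heart rate: {reduce_counts(counts, seconds_per_segment=10):.1f} bpm")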

Committee:

Guo-Qiang Zhang, PhD (Committee Chair); Satya Sahoo, PhD (Committee Member); Xiang Zhang, PhD (Committee Member); Samden Lhatoo, MD, FRCP (Committee Member)

Subjects:

Bioinformatics; Biomedical Research; Computer Science; Neurosciences

Keywords:

Big Data; Data management; Cloud computing; Electrophysiology; Web application; Ontology; Signal analysis

Gera, Amit - Provisioning for Cloud Computing
Master of Science, The Ohio State University, 2011, Industrial and Systems Engineering
The paradigm of cloud computing has started a new era of service computing. While there are many research efforts on developing enabling technologies for cloud computing, few focus on how to strategically set price and capacity and on what key components lead to success in this emerging market. In this thesis, we present quantitative modeling and optimization approaches for assisting such decisions in cloud computing services. We first show that learning curve models help in understanding the potential market of cloud services and explain quantitatively why cloud computing is most attractive to small and medium businesses. We then present a Single Instance model to depict a particular type of cloud network and aid resource provisioning for cloud service providers. We further present a Multiple Instance model to depict a generic cloud network. We map the resource provisioning problem to Kelly's Loss Network and propose a Genetic Algorithm to solve it. The approach provides the cloud service provider with a quantitative framework to obtain management solutions and to learn and react to the critical parameters in the operation management process by gaining useful business insights.
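
In a loss-network view of provisioning, a request that arrives when all capacity is busy is blocked, and the Erlang-B formula gives the blocking probability for a single resource pool; the sketch below uses it to size capacity for a target blocking probability. The traffic figures are invented for illustration, and the thesis's Kelly's Loss Network model with a Genetic Algorithm is more general than this single-pool case:

    def erlang_b(offered_load: float, capacity: int) -> float:
        """Blocking probability for a pool with `capacity` servers and offered load in Erlangs."""
        b = 1.0
        for n in range(1, capacity + 1):
            b = (offered_load * b) / (n + offered_load * b)   # numerically stable recursion
        return b

    def provision(offered_load: float, target_blocking: float = 0.01) -> int:
        """Smallest capacity whose blocking probability stays below the target."""
        capacity = 1
        while erlang_b(offered_load, capacity) > target_blocking:
            capacity += 1
        return capacity

    # Example: 40 Erlangs of offered VM demand, at most 1% of requests blocked.
    print(provision(offered_load=40.0))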

Committee:

Dr. Cathy Xia (Advisor); Dr. Theodore Allen (Committee Member)

Subjects:

Operations Research

Keywords:

Cloud Computing; Learning Curves; Stochastic Modeling; Pricing; Provisioning

Chakraborty, Suryadip - Data Aggregation in Healthcare Applications and BIGDATA set in a FOG based Cloud System
PhD, University of Cincinnati, 2016, Engineering and Applied Science: Computer Science and Engineering
The Wireless Body Area Sensor Network (WBASN) is a wireless network of wearable computing devices, including medical body sensors, which capture and transmit different physiological data wirelessly to a monitoring base station. When a physiological sensor continuously senses and generates huge amounts of data, the network may become congested due to heavy traffic, which can lead to starvation and ineffectiveness of the WBASN system. This leads to the first problem addressed in this research: the use of data aggregation to reduce traffic, enhance network lifetime, and save network energy. This research also focuses on dealing with huge amounts of healthcare data, widely known today as 'BIGDATA'. Our research investigates the use of BIGDATA and ways to analyze it using a cloud-based architecture that we propose as FOG Networks, which improves the use of the cloud architecture. For data aggregation, we propose using statistical regression polynomials of orders 4 and 8; for computational reasons we performed the 6th-order coefficient computation and analyzed our results with real-time patient data using compression ratio and correlation coefficients. We also study the energy-saving behavior of our method and investigate how node failures would be handled. While building a polynomial-based data aggregation approach in the WBASN system, which involves summing and aggregating the wireless body sensor data of patients, we noticed the problem of dealing with thousands or millions of patients' data records when a WBASN system runs for continuous monitoring. Such large amounts of data cannot be handled within the small storage and limited computational abilities of the physiological sensors. There is thus an immediate need for an architecture and tools to deal with this data, commonly known today as BIGDATA. To analyze the BIGDATA, we propose to implement a robust cloud-based structure that uses a Hadoop-based MapReduce system and produces meaningful interpretations of the patients' monitoring data for medical practitioners, doctors, and medical representatives in a very time-efficient manner. Because obtaining large volumes of patients' protected health information raises proprietary and licensing issues, we examined our cloud-based BIGDATA architecture using Twitter and Google N-gram data, which are freely available in the public domain. In our next proposed task, we plan to implement a robust and scalable architecture over the existing cloud system that addresses the shortcomings of public cloud offerings such as Amazon S3 and Microsoft Azure. Therefore, we propose to use a newly introduced system known as FOG networks, which significantly helps clients (medical workers monitoring patients' vital parameters) to assess, interpret, and analyze patients' data on injuries, health parameter performance, improvements in health condition, associated vital parameters, and emergency events efficiently and effectively.
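
The polynomial-based aggregation described above can be pictured as fitting a low-order polynomial to a window of sensor samples and transmitting only the coefficients; the sketch below illustrates the idea with NumPy. The window size, polynomial order, and the compression-ratio calculation are illustrative choices, not the thesis's exact parameters:

    import numpy as np

    def aggregate_window(samples: np.ndarray, order: int = 6) -> np.ndarray:
        """Fit a polynomial of the given order to one window of sensor samples
        and return only its coefficients (the 'aggregated' representation)."""
        t = np.arange(len(samples))
        return np.polyfit(t, samples, order)

    def reconstruct(coeffs: np.ndarray, length: int) -> np.ndarray:
        return np.polyval(coeffs, np.arange(length))

    # Example: a 64-sample window of a synthetic physiological signal.
    window = np.sin(np.linspace(0, 3, 64)) + 0.01 * np.random.randn(64)
    coeffs = aggregate_window(window, order=6)

    compression_ratio = window.size / coeffs.size          # samples sent vs. coefficients sent
    correlation = np.corrcoef(window, reconstruct(coeffs, window.size))[0, 1]
    print(f"compression ratio ~{compression_ratio:.1f}x, correlation {correlation:.3f}")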

Committee:

Dharma Agrawal, D.Sc. (Committee Chair); Amit Bhattacharya, Ph.D. (Committee Member); Rui Dai, Ph.D. (Committee Member); Chia Han, Ph.D. (Committee Member); Carla Purdy, Ph.D. (Committee Member)

Subjects:

Computer Science

Keywords:

Wireless body area sensor networks; Data aggregation; Cloud computing; Fog computing

Deng, Nan - Systems Support for Carbon-Aware Cloud Applications
Doctor of Philosophy, The Ohio State University, 2015, Computer Science and Engineering
Datacenters, which are large server farms, host cloud applications providing services ranging from search engines to social networks and video streaming. Such applications may belong to the datacenter owner or to third-party developers. Due to the growth of cloud applications, datacenters account for a larger fraction of worldwide carbon emissions each year. To reduce these emissions, many datacenter owners are slowly but gradually adopting clean, renewable energy, such as solar or wind energy. To encourage datacenter owners to invest in renewable energy, its usage should lead to profit. However, in most cases the renewable energy supply is intermittent and may be limited, which makes renewable energy more expensive than traditional dirty energy. On the other hand, not all customers need renewable energy for their applications. Our approach is to devise accountable and effective mechanisms to deliver renewable energy only to users who will pay for renewable-powered services. According to our research, datacenter owners could make a profit if they concentrate the renewable energy supply on carbon-aware applications, which prefer cloud resources powered by renewable energy. We develop two carbon-aware applications as use cases. We conclude that if an application takes carbon emissions as a constraint, it ends up using more resources from renewable-powered datacenters. This observation helps datacenter owners distribute renewable energy wisely within their systems. Our first attempt at concentrating renewable energy focuses on the architectural level. Our approach requires datacenters to have on-site renewable energy generators that use grid ties to integrate renewable energy into their power supply system. To measure the concentration of renewable energy, we introduce a new metric, the renewable-powered instance. Using this metric, we found that grid-tie placement has first-order effects on renewable-energy concentration. On-site renewable energy requires an initial investment to install the generator. Although this cost can be gradually amortized over time, some operators prefer renewable energy credits, which can be bought from utility companies by paying a premium for renewable energy transmitted through the grid and produced in other locations. To let datacenters, with or without on-site renewable energy generators, attract more carbon-aware customers, we designed a system for Adaptive Green Hosting. It identifies carbon-aware customers by signaling customers' applications when renewable energy is available and observing their behavior. Since carbon-aware applications tend to use more resources in a datacenter with low emission rates, datacenter owners can make a profit by attributing more renewable energy to carbon-aware applications, thereby encouraging them to use more resources. Our experiments show that adaptive green hosting can increase profit by 152% for one of today's larger green hosts. Although it is possible for cloud applications to maintain a low carbon footprint while making a profit, most existing applications are not carbon-aware, and the carbon footprint of most existing workloads is large. Without forcing them to switch to renewable energy, we believe responsible end users could take the first step. We propose a method to help end users discover implementation-level details about a cloud application by extracting its internal software delays. Such details are unlikely to be exposed to third-party users. Instead, our approach probes the target application from the outside and extracts normalized software delay distributions using only response times. Such software delay distributions are not only useful for revealing the normalized energy footprint of an application, but can also be used to diagnose root causes of tail response times for live applications.
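
A carbon-aware application in the sense described above adjusts its resource usage when the host signals that renewable energy is available; the sketch below shows a minimal control loop under that assumption. The signal source and scaling bounds are hypothetical, not the dissertation's Adaptive Green Hosting implementation:

    # Minimal carbon-aware scaling loop: grow the replica count only when the host
    # reports that renewable energy is currently available.
    def desired_replicas(current: int, renewable_available: bool,
                         min_replicas: int = 2, max_replicas: int = 10) -> int:
        if renewable_available:
            return min(current + 1, max_replicas)   # absorb optional, deferrable work
        return max(current - 1, min_replicas)       # shrink back toward the baseline

    replicas = 2
    # Hypothetical hourly renewable-availability signal from the hosting platform.
    for signal in [True, True, True, False, True, False, False]:
        replicas = desired_replicas(replicas, signal)
        print(f"renewable={signal}, replicas={replicas}")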

Committee:

Christopher Stewart, Dr. (Advisor); Xiaorui Wang, Dr. (Committee Member); Gagan Agrawal, Dr. (Committee Member)

Subjects:

Computer Engineering; Computer Science

Keywords:

Datacenter; Renewable Energy; Performance Analysis; Black-box Analysis; Cloud Computing

Bicer, Tekin - Supporting Fault Tolerance and Dynamic Load Balancing in FREERIDE-G
Master of Science, The Ohio State University, 2010, Computer Science and Engineering

Over the last 2-3 years, the importance of data-intensive computing has increasingly been recognized, closely coupled with the emergence and popularity of map-reduce for developing this class of applications. Besides programmability and ease of parallelization, fault tolerance and load balancing are clearly important for data-intensive applications, because of their long-running nature and the potential for using a large number of nodes to process massive amounts of data.

Two important goals in supporting fault tolerance are low overheads and efficient recovery. With these goals, our work proposes a different approach for enabling data-intensive computing with fault tolerance. Our approach is based on an API for developing data-intensive computations that is a variation of map-reduce and involves an explicit, programmer-declared reduction object. We show how more efficient fault-tolerance support can be developed using this API. In particular, since the reduction object represents the state of the computation on a node, we can periodically cache the reduction object from every node at another location and use it to support failure recovery.
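
The reduction-object idea can be sketched as an API in which each data element is folded into an explicit accumulator that can be checkpointed cheaply; the sketch below is a generic illustration of that pattern, not the FREERIDE-G API itself:

    import copy

    class ReductionObject:
        """Explicit, programmer-declared accumulator representing a node's computation state."""
        def __init__(self):
            self.count = 0
            self.total = 0.0

        def accumulate(self, element: float) -> None:
            self.count += 1
            self.total += element

        def merge(self, other: "ReductionObject") -> None:
            self.count += other.count
            self.total += other.total

    checkpoints = {}   # stand-in for a remote location caching each node's reduction object

    def process(node_id: int, elements: list[float], checkpoint_every: int = 1000) -> ReductionObject:
        robj = ReductionObject()
        for i, e in enumerate(elements, 1):
            robj.accumulate(e)
            if i % checkpoint_every == 0:
                checkpoints[node_id] = copy.deepcopy(robj)   # cheap failure-recovery point
        return robj

    # On failure, a replacement node restarts from checkpoints[node_id] and only
    # reprocesses the elements seen after the last checkpoint.
    robj = process(node_id=0, elements=[1.0] * 2500)
    print(robj.count, robj.total, len(checkpoints))  # 2500 2500.0 1 (last checkpoint after element 2000)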

Supporting dynamic load balancing in heterogeneous environments, such as grids, is another crucial problem. Successfully distributing tasks among processing units according to their processing capabilities, while minimizing system overhead, is critical to the overall execution. Our approach is based on independent data elements that can be processed by any processing resource in the system. We evaluated our dynamic load balancing system and showed how effectively it performs with our data-intensive computing API. Specifically, the problem space that our API targets lets us process data elements independently; this property can be exploited so that any data element in the system can be assigned to any computational resource during execution.

We have extensively evaluated our approaches using two data-intensive applications. Our results show that the overheads of our schemes are extremely low. Furthermore, we compared our fault tolerance approach with Hadoop, a widely used Map-Reduce implementation. Our fault tolerance system outperforms Hadoop both in the absence and in the presence of failures.

Committee:

Gagan Agrawal, Prof. (Advisor); Ponnuswamy Sadayappan, Prof. (Committee Member)

Subjects:

Computer Science; Information Systems; Systems Design

Keywords:

Fault tolerance; Load balancing; Data-intensive computing; Cloud computing; Map-reduce

Yang, Daiyi - Zoolander: Modeling and managing replication for predictability
Master of Science, The Ohio State University, 2011, Computer Science and Engineering

Social networking and scientific computing workloads access networked storage much more frequently than traditional static-content workloads. These workloads improve their speed by issuing requests in parallel, which offers large speedups when the slowest parallel access is fast, but suffers unexpected slowdowns when that assumption does not hold. In this work, we study replication for predictability, which speeds up slow storage accesses by running the same workload on duplicate nodes and using the fastest response, in contrast to traditional replication for throughput.

Based on the mechanism of replication for predictability, we propose Zoolander, an analytical model that predicts the percentage of quickly completed accesses, i.e., the SLA. Zoolander combines the replication strategy, the heavy-tailed access distribution, and queuing delay to output the most efficient solution. We then created an enhanced ZooKeeper-like coordination service supporting replication for predictability. Zoolander was precise, achieving SLA predictions within 0.002 absolute error over diverse workloads. Using Zoolander, we achieved speedups of 375% and reduced the number of cloud servers needed by 50%.
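
Replication for predictability issues the same request to several replicas and takes whichever response arrives first; the sketch below shows that pattern with threads and a queue. The latencies are simulated, and this is a generic illustration rather than the Zoolander service itself:

    import queue
    import random
    import threading
    import time

    def replica_read(replica_id: int, out: queue.Queue) -> None:
        # Simulated storage access with a heavy-ish tail: occasionally very slow.
        latency = random.uniform(0.001, 0.005) if random.random() < 0.9 else random.uniform(0.05, 0.2)
        time.sleep(latency)
        out.put((replica_id, latency))

    def predictable_read(num_replicas: int = 3) -> tuple[int, float]:
        """Send the same read to every replica and return the first response."""
        out: queue.Queue = queue.Queue()
        for r in range(num_replicas):
            threading.Thread(target=replica_read, args=(r, out), daemon=True).start()
        return out.get()   # the fastest replica wins; stragglers are ignored

    winner, latency = predictable_read()
    print(f"served by replica {winner} in {latency * 1000:.1f} ms")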

Committee:

Christopher Stewart, Dr. (Advisor); Gagan Agrawal, Dr. (Committee Member)

Subjects:

Computer Science

Keywords:

Cloud Computing; Data-intensive Workload; Replication

Yang, Shanhu - An Adaptive Prognostic Methodology and System Framework for Engineering Systems under Dynamic Working Regimes
PhD, University of Cincinnati, 2016, Engineering and Applied Science: Mechanical Engineering
Prognostics and Health Management (PHM) as a research discipline focuses on assessing degradation behavior and predicting the time to failure of an engineering system using condition monitoring data collected throughout the system's lifespan. Information on the predicted Remaining Useful Life (RUL) and potential failure modes further enables just-in-time maintenance, reduced operational cost, and optimized production. In recent years, with the development of information systems such as cloud computing and the Internet of Things (IoT), machine data from factory floors can be collected more conveniently, with higher speed, volume, and variety, which brings about new opportunities and much wider application of PHM technologies. On the other hand, the emerging industrial big data, with its real-world complications, also imposes greater challenges on the PHM research community. Data collected from a large number of machine units under dynamic working regimes requires algorithms that adaptively and autonomously recognize and handle different situations. Autonomous PHM algorithms can further be implemented in centralized computing platforms for more efficient, faster, and larger-scale data mining and analytics, which will eventually lead to more effective handling and exploitation of industrial big data. PHM algorithms have typically been developed for specific applications and datasets, and most PHM tools are developed based on limited working regimes. In reality, engineered machinery and systems often work under dynamic working regimes, and consequently it is always a challenge to implement PHM in such conditions. This dissertation presents the development of a systematically designed and implementation-ready methodology for adaptive health assessment and prognostics for real-world machine fleets that undergo dynamic working regimes and other complications. Due to limitations in data and knowledge for in-field systems, the approach assumes no prior knowledge or available training data and attempts to extract degradation information only from condition monitoring data streamed in real time. The approach contains a generalized state space model for machine degradation and an adaptive, online methodology for real-time degradation assessment and prediction. The degradation model is a generalized yet comprehensive description of the relationships among the three key aspects of PHM-related research: system degradation, system measurements, and working regimes. The online methodology consists of an adaptive segmentation method for identifying health stages based on local variation, a variable selection algorithm for selecting related working regime parameters, and an Adaptive Kalman Filter (AKF) based online filtering method for model identification and prediction. The methodology is demonstrated and validated using both simulated data and data from real-world industrial applications. The case studies show that the proposed approach delivers robust and accurate results with little algorithm tuning needed for different applications, which is ideal for facilitating automated data processing and analytics in online PHM platforms.
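
A Kalman-filter-based online method, as mentioned above, maintains a degradation state estimate that is updated each time a new condition-monitoring measurement streams in; the sketch below shows a scalar Kalman filter tracking a drifting health indicator. The model and noise values are illustrative, not the dissertation's generalized state space model or its adaptive variant:

    import random

    # Scalar Kalman filter tracking a slowly degrading health indicator.
    def kalman_step(x_est, p_est, z, drift=0.01, q=1e-4, r=0.04):
        # Predict: degradation drifts downward by `drift` per step, with process noise q.
        x_pred = x_est - drift
        p_pred = p_est + q
        # Update: blend the prediction with the noisy measurement z (measurement noise r).
        k = p_pred / (p_pred + r)
        x_new = x_pred + k * (z - x_pred)
        p_new = (1 - k) * p_pred
        return x_new, p_new

    true_health, x_est, p_est = 1.0, 1.0, 1.0
    for step in range(50):
        true_health -= 0.01                          # hidden degradation process
        z = true_health + random.gauss(0, 0.2)       # noisy condition-monitoring reading
        x_est, p_est = kalman_step(x_est, p_est, z)
    print(f"true health {true_health:.2f}, estimated {x_est:.2f}")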

Committee:

Jay Lee, Ph.D. (Committee Chair); Rajkumar Roy, Ph.D. (Committee Member); Thomas Richard Huston, Ph.D. (Committee Member); J. Kim, Ph.D. (Committee Member); David Thompson, Ph.D. (Committee Member)

Subjects:

Engineering, Mechanical; Mechanics

Keywords:

Prognostics and Health Management; Adaptive Learning; Accelerated Life Testing; Cloud Computing

Hangwei, Qian - Dynamic Resource Management of Cloud-Hosted Internet Applications
Doctor of Philosophy, Case Western Reserve University, 2012, EECS - Computer and Information Sciences

The Internet is evolving toward service-oriented computing platforms (e.g., cloud computing platforms such as Amazon EC2 and Microsoft Azure). In these platforms, service providers (owners of the platforms) offer resource pools by building multiple geo-distributed data centers; application providers (owners of the applications) outsource the hosting of their applications to these platforms and pay for the amount of resources used, as a utility. These multi-tenant platforms need to dynamically allocate resources to applications so as to meet variations in demand.

In this thesis, we address several issues of dynamic resource management in these platforms. On the one hand, we consider resource provisioning problems within data centers. In order to allocate resources to applications quickly, we propose deploying ghost virtual machines (VMs), which host spare application instances across the physical machines. When an application needs more instances, we can configure the request distributor to forward requests to ghost VMs, which takes only 5-7 seconds. Also, to deal with scalability issues in mega data centers (with hundreds of thousands of servers), we introduce a hierarchical resource management scheme in which servers are divided into groups (pods), each with about 5k servers, and existing techniques are employed to manage resources in each pod efficiently. Meanwhile, multiple strategies are explored to balance the load among the pods. In addition, we propose a new data center architecture in which a DNS-based mechanism balances the load among the access links that connect the data center to the Internet.

On the other hand, we address resource management problems across multiple data centers. We propose a unified approach to decide in how many, and which, data centers each application should be deployed, and how client requests are forwarded to the geo-distributed service replicas. We make these decisions based on a min-cost network flow model and apply a novel demand clustering technique to overcome the scalability issue in solving the min-cost problem. Furthermore, we introduce a new client-side DNS architecture that brings the local DNS server close to clients so that DNS-based server selection can precisely choose nearby service replicas for clients.
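
A min-cost network flow formulation of placement, as described above, routes each unit of application demand to some data center at the lowest total cost subject to capacity; the sketch below sets up a tiny instance with networkx. The demands, capacities, and costs are made-up numbers, and this is only a schematic version of the dissertation's model:

    import networkx as nx

    G = nx.DiGraph()
    # Two applications with demand (in request units) and two data centers with capacity.
    G.add_node("app1", demand=-30)   # negative demand = supply (requests to be placed)
    G.add_node("app2", demand=-20)
    G.add_node("dc_east", demand=0)
    G.add_node("dc_west", demand=0)
    G.add_node("sink", demand=50)    # all demand must be absorbed somewhere

    # Edge weight ~ latency/hosting cost per unit of demand; capacity bounds each assignment.
    G.add_edge("app1", "dc_east", weight=2, capacity=30)
    G.add_edge("app1", "dc_west", weight=5, capacity=30)
    G.add_edge("app2", "dc_east", weight=4, capacity=20)
    G.add_edge("app2", "dc_west", weight=1, capacity=20)
    G.add_edge("dc_east", "sink", weight=0, capacity=35)   # data center capacity
    G.add_edge("dc_west", "sink", weight=0, capacity=35)

    flow = nx.min_cost_flow(G)
    print(flow["app1"], flow["app2"])   # how each application's demand is split across data centers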

Committee:

Michael Rabinovich, PhD (Committee Chair); Vincenzo Liberatore, PhD (Committee Member); Guo-Qiang Zhang, PhD (Committee Member); Christos Papachristou, PhD (Committee Member)

Subjects:

Computer Science

Keywords:

cloud computing; resource management; agility; load balance; data center; scalability; application placement; server selection; clustering; local DNS; peer to peer

Manjunatha, Ashwin Kumar - A Domain Specific Language Based Approach for Developing Complex Cloud Computing Applications
Master of Science in Computer Engineering (MSCE), Wright State University, 2011, Computer Engineering

Computing has changed. Lately, a slew of cheap, ubiquitous, connected mobile devices, as well as seemingly unlimited, utility-style, pay-as-you-go computing resources, have become available at the disposal of the common man. The latter, commonly called Cloud Computing (or just the Cloud), is democratizing computing by making large computing power accessible to people and corporations around the world easily and economically.

However, taking full advantage of this computing landscape, especially for data-intensive domains, has been hampered by many factors, the primary one being the complexity of developing applications for the variety of available platforms.

This thesis attempts to alleviate many of the issues faced in developing complex Cloud-centric applications by using Domain Specific Language (DSL) based methods. The research focuses on two main areas. One is hybrid applications with mobile device based front-ends and Cloud based back-ends. The other is data- and compute-intensive biological experiments, exemplified by applying a DSL to metabolomics data analysis. This research investigates the viability of using a DSL in each domain and provides evidence of successful application.

Committee:

Amit Sheth, PhD (Advisor); Krishnaprasad Thirunarayan, PhD (Committee Member); Paul Anderson, PhD (Committee Member); Ramakanth Kavuluru, PhD (Committee Member)

Subjects:

Computer Engineering; Computer Science

Keywords:

Cloud Computing; Mobile Computing; Domain Specific Language; DSL; Cloud Mobile Hybrid Application; Metabolomics; Mobicloud; Mobicloud Toolkit; mobi-cloud; Metabolink; SCALE toolkit

Bicer, Tekin - Supporting Data-Intensive Scientific Computing on Bandwidth and Space Constrained Environments
Doctor of Philosophy, The Ohio State University, 2014, Computer Science and Engineering
Scientific applications, simulations, and instruments generate massive amounts of data. This data not only contributes to existing scientific areas but also leads to new sciences. However, managing and analyzing this large-scale data are both challenging processes. In this context, we require tools, methods, and technologies such as reduction-based processing structures, cloud computing and storage, and efficient parallel compression methods. In this dissertation, we first focus on parallel and scalable processing of data stored in S3, a cloud storage resource, using compute instances in Amazon Web Services (AWS). We develop MATE-EC2, which allows specification of data processing using a variant of the Map-Reduce paradigm. We show various optimizations, including data organization, job scheduling, and data retrieval strategies, that can be leveraged based on the performance characteristics of cloud storage resources. Furthermore, we investigate the efficiency of our middleware in both homogeneous and heterogeneous environments. Next, we improve our middleware so that users can perform transparent processing on data distributed among local and cloud resources. With this work, we maximize the utilization of geographically distributed resources. We evaluate our system's overhead, scalability, and performance with varying data distributions. The users of data-intensive applications have different requirements in hybrid cloud settings; two of the most important are the execution time of the application and the resulting cost on the cloud. Our third contribution is a time and cost model for data-intensive applications that run in hybrid cloud environments. The proposed model lets our middleware adapt to performance changes and dynamically allocate the necessary resources from its environments, so that applications can meet user-specified constraints. Fourth, we investigate compression approaches for scientific datasets and build a compression system. The proposed system focuses on the implementation and application of domain-specific compression algorithms. We port our compression system into the aforementioned middleware and implement different compression algorithms. Our framework enables our middleware to maximize bandwidth utilization of data-intensive applications while minimizing storage requirements. Although compression can help minimize the input and output overhead of data-intensive applications, using compression during parallel operations is not trivial. Specifically, the inability to determine compressed data chunk sizes in advance complicates parallel write operations. In our final work, we develop different methods for enabling compression during parallel input and output operations. We then port our proposed methods into PnetCDF, a widely used scientific data management library, and show how transparent compression can be supported during parallel output operations. The proposed system lets an existing parallel simulation program output and store data in a compressed fashion; similarly, data analysis applications can transparently access compressed data using our system.
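
The difficulty with compressed parallel writes mentioned above is that each process only learns its compressed chunk size after compressing, so file offsets must be agreed on afterwards; the sketch below shows the basic exclusive-prefix-sum idea in plain Python, as a stand-in for what an MPI allgather/exscan would do rather than the actual PnetCDF extension:

    import zlib
    from itertools import accumulate

    # Each "process" compresses its own chunk; sizes are unknown until after compression.
    chunks = [bytes([i % 256]) * 100_000 for i in range(4)]        # simulated per-process data
    compressed = [zlib.compress(c) for c in chunks]
    sizes = [len(c) for c in compressed]

    # An exclusive prefix sum of compressed sizes gives each process its write offset,
    # so all processes can write into the shared file without overlapping.
    offsets = [0] + list(accumulate(sizes))[:-1]

    for rank, (off, size) in enumerate(zip(offsets, sizes)):
        print(f"rank {rank}: write {size} bytes at offset {off}")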

Committee:

Gagan Agrawal (Advisor); Feng Qin (Committee Member); Spyros Blanas (Committee Member)

Subjects:

Computer Science

Keywords:

Data-Intensive Computing; Map-Reduce; Cloud Computing; Big Data; Scientific Data Management; Compression

Singh, Abhishek - Mobile Crowd Instrumentation: Design of Surface Solar Irradiance Instrument
MS, Kent State University, 2017, College of Arts and Sciences / Department of Computer Science
This thesis explores mobile crowd instrumentation, a distinct approach to data acquisition and mining supported by a crowd and an instrument (the smartphone). For many decades, traditional ways of gathering data relied either on physical hardware input or on human interaction. Combining both techniques yields an approach called crowdsourcing, in which humans and machines collectively produce large amounts of analytical data. The recent surge in popularity of smartphones equipped with built-in sensors (such as gyroscopes, infrared, and ambient light sensors) makes it possible to collect data without any external hardware input. This broadens the scope and extensibility of data gathering to any available crowd of humans, not just a targeted or specialized one. We have developed a mobile crowd instrumentation system to map Surface Solar Irradiance, i.e., the amount of solar energy radiated per square meter of area. The system collects data from target smartphone sensors, such as the camera, gyroscope, accelerometer, GPS, GPRS, and the mobile clock, performs computational analysis using a mobile processing engine, and estimates Surface Solar Irradiance. The crowd submits this data through a crowd mobile application to a cloud server. The proposed mobile crowd instrumentation architecture, which incorporates different cloud servers, web services, and APIs, helps interpret the data gathered from a mobile crowd and map Surface Solar Irradiance from one specific region up to a worldwide scale. Jointly, crowd input and cloud computing can support many complex crowd instruments mapping different data sources, for example a worldwide electromagnetic field map, or collections of audio files (e.g., specific sounds or vocals), video files (e.g., videos of specific events), still images (e.g., images of an event), or human inputs from a specific region to a worldwide scale. This instrumentation brings more extensibility for data gathering and analysis to researchers without requiring any external hardware. As the complexity of data acquisition grows, it also brings more constraints and challenges: the entire architecture relies on smartphone sensors, which can be limited, and on a crowd, whose interaction may produce human errors. Nevertheless, the scope and extensibility of this system can lead to the design of remarkably complex mobile crowd instruments.

Committee:

Javed Khan, Dr. (Advisor); Austin Melton, Dr. (Committee Member); Xiang Lian, Dr. (Committee Member)

Subjects:

Computer Science

Keywords:

Surface Solar Irradiance, Crowdsourcing, Mobile sensors, Mobile computing, Cloud computing, Mobile instrumentation

Zhu, Jiedan - An Autonomic Framework Supporting Task Consolidation and Migration in the Cloud Environment
Master of Science, The Ohio State University, 2011, Computer Science and Engineering
Cloud Computing systems provide a variety of storage and computation resources. One advantage is the pay-as-you-go model, where users pay only for the amount of resources they have used. There may be user-specific concerns such as a time constraint and a cost budget. However, without resource scheduling and management in the Cloud environment, virtual instances can be underutilized and users may pay more than expected, which might not satisfy the user requirements. Cloud service providers hide the control of physical resources from users. In this thesis, we designed an autonomic framework that supports task consolidation and lightweight migration over virtual resources in the Cloud environment. We focus on DAG-based workflows, with a time constraint and a cost budget as the user constraints. Our goal is to complete the application within the time constraint while keeping the cost within the budget. We have developed three techniques using different kinds of prior knowledge: the CPU and memory requirements of tasks, the iteration structures of workflows, and the iteration structures of tasks. We show that our system is effective and can reduce cost by up to 66% compared with the case where there is no resource scheduling. In addition, we compared system performance across the three techniques and found that with the CPU and memory requirements of tasks as the prior knowledge, our system has a better performance-price ratio than with the other two kinds of prior knowledge.
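
Task consolidation of the kind described above packs tasks onto as few virtual instances as their CPU and memory requirements allow, so that idle capacity (and cost) shrinks; the sketch below shows a simple first-fit consolidation pass. The instance size and task requirements are invented values, not the thesis's scheduling policy:

    # First-fit consolidation: place each task on the first instance with enough spare
    # CPU and memory, opening a new instance only when nothing fits.
    INSTANCE_CPU, INSTANCE_MEM = 4.0, 8.0   # hypothetical instance size (vCPUs, GB)

    def consolidate(tasks: list[tuple[float, float]]) -> list[list[tuple[float, float]]]:
        instances: list[dict] = []
        for cpu, mem in tasks:
            for inst in instances:
                if inst["cpu"] + cpu <= INSTANCE_CPU and inst["mem"] + mem <= INSTANCE_MEM:
                    inst["cpu"] += cpu
                    inst["mem"] += mem
                    inst["tasks"].append((cpu, mem))
                    break
            else:
                instances.append({"cpu": cpu, "mem": mem, "tasks": [(cpu, mem)]})
        return [inst["tasks"] for inst in instances]

    tasks = [(1.0, 2.0), (2.0, 3.0), (1.0, 1.0), (3.0, 4.0), (0.5, 0.5)]
    packed = consolidate(tasks)
    print(f"{len(tasks)} tasks consolidated onto {len(packed)} instances: {packed}")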

Committee:

Gagan Agrawal, PhD (Advisor); Christopher Stewart, PhD (Committee Member)

Subjects:

Computer Science

Keywords:

Cloud Computing; migration; framework

Al-Olimat, Hussein S. - Optimizing Cloudlet Scheduling and Wireless Sensor Localization using Computational Intelligence Techniques
Master of Science, University of Toledo, 2014, Engineering (Computer Science)
Optimization algorithms are truly complex procedures that consider many elements when optimizing a specific problem. Cloud computing (CCom) and wireless sensor networks (WSNs) are full of optimization problems that need to be solved. One of the main problems of using clouds is underutilization of the reserved resources, which causes longer makespans and higher usage costs. The optimization of sensor nodes' power consumption in WSNs is also critical, because sensor nodes are small in size and have constrained resources in terms of power/energy, connectivity, and computational power. This thesis addresses how CCom systems and WSNs can take advantage of computational intelligence techniques, using single- or multi-objective particle swarm optimization (SOPSO or MOPSO), with the overall aim of concurrently minimizing makespans, localization time, and energy consumption during localization, and maximizing the number of nodes fully localized. The cloudlet scheduling method is implemented inside CloudSim, advancing the work of the broker; it was able to maximize resource utilization and minimize the makespan, demonstrating improvements of 58% in some cases. Additionally, the localization method optimized power consumption during a Trilateration-based localization (TBL) procedure through the adjustment of sensor nodes' output power levels. Finally, a parameter study of the applied PSO variants for WSN localization is performed, leading to results up to 32% better than the baseline in the evaluated objectives.
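
Particle swarm optimization for cloudlet scheduling can be pictured as each particle encoding one cloudlet-to-VM assignment and the swarm searching for the assignment with the smallest makespan; the sketch below is a bare-bones PSO over continuous positions decoded into discrete assignments. The cloudlet lengths, VM speeds, and PSO parameters are invented for illustration, and the thesis's CloudSim-based implementation is more elaborate:

    import random

    CLOUDLET_LEN = [400, 900, 300, 700, 500, 650]   # hypothetical cloudlet lengths (MI)
    VM_SPEED = [250, 500, 1000]                     # hypothetical VM speeds (MIPS)

    def makespan(assignment: list[int]) -> float:
        """Finish time of the busiest VM under a given cloudlet-to-VM assignment."""
        load = [0.0] * len(VM_SPEED)
        for cloudlet, vm in enumerate(assignment):
            load[vm] += CLOUDLET_LEN[cloudlet] / VM_SPEED[vm]
        return max(load)

    def decode(position: list[float]) -> list[int]:
        return [int(abs(x)) % len(VM_SPEED) for x in position]

    dim, swarm_size, w, c1, c2 = len(CLOUDLET_LEN), 20, 0.7, 1.5, 1.5
    pos = [[random.uniform(0, len(VM_SPEED)) for _ in range(dim)] for _ in range(swarm_size)]
    vel = [[0.0] * dim for _ in range(swarm_size)]
    pbest = [p[:] for p in pos]
    gbest = min(pbest, key=lambda p: makespan(decode(p)))

    for _ in range(100):
        for i in range(swarm_size):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * random.random() * (pbest[i][d] - pos[i][d])
                             + c2 * random.random() * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            if makespan(decode(pos[i])) < makespan(decode(pbest[i])):
                pbest[i] = pos[i][:]
        gbest = min(pbest, key=lambda p: makespan(decode(p)))

    print("best assignment:", decode(gbest), "makespan:", round(makespan(decode(gbest)), 2))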

Committee:

Mansoor Alam (Committee Chair); Robert Green, II (Committee Co-Chair); Weiqing Sun (Committee Member); Vijay Devabhaktuni (Committee Member)

Subjects:

Artificial Intelligence; Computer Science; Engineering

Keywords:

Cloud Computing; Particle Swarm Optimization; Random Inertia Weight; Cloudlet Scheduling; Makespan; Utilization; CloudSim; Wireless Sensor Network; Trilateration; Localization; Multi-objective; ZigBee; RSSI; Genetic Algorithm; Simulated Annealing

Jamaliannasrabadi, Saba - High Performance Computing as a Service in the Cloud Using Software-Defined Networking
Master of Science (MS), Bowling Green State University, 2015, Computer Science
Benefits of Cloud Computing (CC) such as scalability, reliability, and resource pooling have attracted scientists to deploy their High Performance Computing (HPC) applications on the Cloud. Nevertheless, HPC applications can face serious challenges on the cloud that could undermine the gained benefits if care is not taken. This thesis aims to address the shortcomings of the Cloud for HPC applications through a platform called HPC as a Service (HPCaaS). Further, a novel scheme is introduced to improve the performance of HPC task scheduling on the Cloud using the emerging technology of Software-Defined Networking (SDN). The research introduces "ASETS: A SDN-Empowered Task Scheduling System" as an elastic platform for scheduling HPC tasks on the cloud. In addition, a novel algorithm called SETSA is developed as part of the ASETS architecture to manage the scheduling task of the HPCaaS platform. The platform monitors network bandwidths to take advantage of changes when submitting tasks to the virtual machines. The experiments and benchmarking of HPC applications on the Cloud identified virtualization overhead, cloud networking, and cloud multi-tenancy as the primary shortcomings of the cloud for HPC applications. A private Cloud Test Bed (CTB) was set up to evaluate the capabilities of ASETS and SETSA in addressing such problems. Subsequently, the Amazon AWS public cloud was used to assess the scalability of the proposed systems. The results obtained for ASETS and SETSA on both the private and public cloud indicate that significant performance improvements for HPC applications can be achieved. Furthermore, the results suggest that the proposed system benefits both the cloud service providers and the users, since ASETS performs better as the degree of multi-tenancy increases. The thesis also proposes SETSAW (SETSA Window) as an improved version of the SETSA algorithm. Unlike other proposed solutions for HPCaaS, which have either optimized the cloud to make it more HPC-friendly or required adjusting HPC applications to make them more cloud-friendly, ASETS provides a platform for existing cloud systems to improve the performance of HPC applications.
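
Bandwidth-aware scheduling in the spirit described above submits each task to the VM that the network currently favors; the sketch below illustrates only that core decision, assuming a monitor that reports per-VM bandwidth. The scoring rule, bandwidth numbers, and helper names are hypothetical and are not the SETSA algorithm itself:

    # Pick a VM for each incoming HPC task using freshly monitored bandwidth readings:
    # prefer the VM with the highest bandwidth per unit of already-queued work.
    def pick_vm(bandwidth_mbps: dict[str, float], queued_work: dict[str, float]) -> str:
        return max(bandwidth_mbps, key=lambda vm: bandwidth_mbps[vm] / (1.0 + queued_work[vm]))

    def schedule(tasks: list[float], bandwidth_mbps: dict[str, float]) -> dict[str, list[float]]:
        queued = {vm: 0.0 for vm in bandwidth_mbps}
        placement = {vm: [] for vm in bandwidth_mbps}
        for task in tasks:
            vm = pick_vm(bandwidth_mbps, queued)   # re-evaluated per task, as conditions change
            placement[vm].append(task)
            queued[vm] += task
        return placement

    # Hypothetical snapshot from an SDN controller's bandwidth monitor.
    bandwidth = {"vm-a": 900.0, "vm-b": 450.0, "vm-c": 300.0}
    print(schedule([5.0, 3.0, 8.0, 2.0, 6.0], bandwidth))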

Committee:

Hassan Rajaei, Ph.D (Advisor); Robert Green, Ph.D (Committee Member); Jong Kwan Lee, Ph.D (Committee Member)

Subjects:

Computer Engineering; Computer Science; Technology

Keywords:

High Performance Computing; HPC; Cloud Computing; Scientific Computing; HPCaaS; Software Defined Networking; SDN; Cloud Networking; Virtualization