There has recently been growing interest in the use of cloud computing for scientific applications, including scientific workflows. Key attractions of the cloud include the pay-as-you-go model and elasticity. While the elasticity offered by clouds can benefit many applications and usage scenarios, it also poses significant challenges for the development of applications and services. For example, no general framework exists that enables a scientific workflow to execute dynamically with QoS (Quality of Service) support, i.e., to exploit the elasticity of clouds by automatically allocating and de-allocating resources to meet time and/or cost constraints while providing the quality of results the user needs.
This thesis presents a case study in creating a dynamic cloud workflow implementation, with QoS support, of a scientific application. We work with MassMatrix, an application that searches for proteins and peptides in tandem mass spectrometry data. In order to use cloud resources, we first parallelize the search method used by this application. Next, we create a flexible workflow using the Pegasus Workflow Management System from ISI. We then add a new dynamic resource allocation module, which can use fewer or more resources depending on a time constraint specified by the user. Finally, we extend this module with QoS support to provide the user with the desired quality of results. We use the desired quality metric to calculate the values of the application parameters; here, the desired quality metric refers to the parameter values computed to maximize the user-specified benefit function while meeting the time constraint. We evaluate our implementation on several different datasets and show that the application scales well. Our implementation allocates resources adaptively, and the parameter prediction scheme succeeds in choosing parameters that help meet the time constraint.
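The two decisions described above, how many resources to use under a time constraint and which parameter values to choose, can be illustrated with a minimal sketch. It is not the implementation developed in this thesis: the function names, the ideal linear-speedup assumption, and the caller-supplied `predict_time` and `benefit` models are all hypothetical and serve only to make the idea concrete.

```python
import math
from typing import Callable, Iterable, Optional, TypeVar

P = TypeVar("P")  # type of one candidate parameter setting (hypothetical)


def workers_needed(remaining_work_s: float, time_left_s: float,
                   max_workers: int) -> int:
    """Estimate how many workers are needed to finish before the deadline.

    Assumption: ideal linear speedup, i.e. `remaining_work_s` seconds of
    single-worker work take remaining_work_s / n seconds on n workers.
    """
    if time_left_s <= 0:
        # Deadline already passed: use everything that is available.
        return max_workers
    needed = math.ceil(remaining_work_s / time_left_s)
    # Always keep at least one worker, never exceed the resource cap.
    return max(1, min(max_workers, needed))


def choose_parameters(candidates: Iterable[P],
                      predict_time: Callable[[P], float],
                      benefit: Callable[[P], float],
                      deadline_s: float) -> Optional[P]:
    """Pick the candidate with the highest benefit whose predicted
    run time fits within the deadline; return None if none fits."""
    feasible = [p for p in candidates if predict_time(p) <= deadline_s]
    if not feasible:
        return None
    return max(feasible, key=benefit)
```

Under these assumptions, a workflow with 3600 seconds of remaining single-worker work and 600 seconds left would request six workers, and the module would release workers again whenever the estimate drops.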