Search Results


(Total results 11)


  • 1. Alnaeli, Saleh EMPIRICALLY EXAMINING THE ROADBLOCKS TO THE AUTOMATIC PARALLELIZATION AND ANALYSIS OF OPEN SOURCE SOFTWARE SYSTEMS

    PHD, Kent State University, 2015, College of Arts and Sciences / Department of Computer Science

    Modern multicore architectures have become ubiquitous and are present in almost all of today's computers and mobile devices. That is, the hardware resources are available to parallelize all types of general-purpose software applications. This fact has pushed software engineers to rethink how the code they write can better utilize the underlying hardware. Because most existing software systems were developed with sequential processors in mind, they typically make inefficient use of multicore technology (in many cases by using only one core). Parallelizing existing software systems manually can be a very time-consuming and risky task because of the errors and bugs such an approach can introduce. This dissertation explores issues related to the analysis of large-scale general-purpose software systems developed in C/C++ and asks whether it is practical and warranted to parallelize such systems. A series of empirical studies is conducted on a variety of general-purpose open source software systems to better understand the roadblocks to applying automated and/or semi-automated parallelization tools. The primary contributions of this dissertation are, broadly, the study and description of inhibitors to automated parallelization and a demonstration of which inhibitors are most prevalent. That is, the main interest is in determining the most prevalent inhibitors across a wide variety of software applications and whether there are general trends. Additionally, new source code analysis techniques and tools for analyzing large-scale software repositories are presented. Empirical studies of the historical change in the number of inhibitors over the lifetime of the software systems are also conducted. 
Empirical analysis of the prevalence and usage of function calls that involve function pointers or virtual methods is also conducted as this can greatly increase the computational cost of perfor (open full item for complete abstract)

    Committee: Jonathan Maletic Dr. (Advisor); Mikhail Nesterenko Dr. (Committee Member); Ye Zhao Dr. (Committee Member); Joseph Ortiz Dr. (Committee Member); Michael Collard Dr. (Committee Member); John Portman Dr. (Committee Member) Subjects: Computer Science
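
    The inhibitors this abstract studies, such as calls made through function pointers, can be illustrated with a deliberately simplified source scan. The regex and function name below are hypothetical and are not the dissertation's actual analysis tooling; they only show the kind of call site that blocks a compiler from proving a loop body safe to parallelize.

```python
import re

# Simplified, illustrative check for one parallelization inhibitor:
# C call sites of the form (*ptr)(args), i.e. calls through function
# pointers, whose targets a static parallelizer cannot resolve.
FP_CALL = re.compile(r"\(\s*\*\s*\w+\s*\)\s*\(")

def count_fp_calls(source: str) -> int:
    """Count (*ptr)(...) call sites in C source text."""
    return len(FP_CALL.findall(source))

code = """
for (int i = 0; i < n; i++) {
    (*handler)(a[i]);   /* indirect call: parallelization inhibitor */
    b[i] = a[i] * 2;    /* plain statement: no inhibitor */
}
"""
print(count_fp_calls(code))  # prints 1
```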
  • 2. Scyphers, Madeline Bayesian Optimization for Anything (BOA): An Open-Source Framework for Accessible, User-Friendly Bayesian Optimization

    Master of Science, The Ohio State University, 2024, Environmental Science

    I introduce Bayesian Optimization for Anything (BOA), a high-level BO framework and model-wrapping toolkit that presents a novel approach to simplifying Bayesian Optimization (BO), with the goal of making it more accessible and user-friendly, particularly for those with limited expertise in the field. BOA addresses common barriers to implementing BO, focusing on ease of use, reducing the need for deep domain knowledge, and cutting down on extensive coding requirements. A notable feature of BOA is its language-agnostic architecture. Using JSON serialization, BOA facilitates communication between different programming languages, enabling a wide range of users to integrate BOA with their existing models, regardless of the programming language used, through a simple and easy-to-use interface. This enhances BOA's applicability, allowing for broader use across fields and by a wider audience. I highlight BOA's application through several real-world examples. BOA has been successfully employed in a high-dimensional (184-parameter) optimization of the Soil & Water Assessment Tool (SWAT+) model, demonstrating its capability for parallel optimization with SWAT as well as for non-parallel models such as SWAT+. I also employed BOA in a multi-objective optimization of the FETCH3.14 model. These case studies illustrate BOA's effectiveness in addressing complex optimization challenges in diverse scenarios.

    Committee: Gil Bohrer (Advisor); James Stagge (Committee Member); Joel Paulson (Committee Member) Subjects: Artificial Intelligence; Computer Science; Environmental Engineering; Environmental Studies; Statistics
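
    The language-agnostic wrapping idea this abstract describes can be sketched in a few lines: the optimizer and the external model exchange JSON documents, so the model can be written in any language. The function names and JSON keys here are illustrative assumptions, not BOA's actual API.

```python
import json

def serialize_trial(params: dict) -> str:
    """Optimizer side: serialize a parameter suggestion for the model."""
    return json.dumps({"parameters": params})

def run_model(trial_json: str) -> str:
    """Model side (could equally be Fortran, R, ...): read the parameters,
    evaluate, and return the objective as JSON."""
    params = json.loads(trial_json)["parameters"]
    # stand-in objective: a simple quadratic to minimize
    value = sum((v - 1.0) ** 2 for v in params.values())
    return json.dumps({"objective": value})

result = json.loads(run_model(serialize_trial({"x": 0.5, "y": 2.0})))
print(result["objective"])  # 1.25
```

    Because only JSON crosses the boundary, swapping the toy objective for a real simulation changes nothing on the optimizer side.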
  • 3. Kang, Younghun Development of computational techniques for coupled 1D/2D hydrodynamic models

    Doctor of Philosophy, The Ohio State University, 2023, Civil Engineering

    Multidimensional (coupled one-dimensional (1D) and two-dimensional (2D)) hydrodynamic models are developed to achieve computational efficiency for study areas with small-scale channel networks. Fine-scale computational domains are required to adequately resolve the geometry of such study areas with typical 2D hydrodynamic models, which results in high computational cost. Coupled 1D/2D hydrodynamic models, which use 1D models for small-scale areas (typically small-scale channels), make it possible to preserve geometric features of the study area at moderate computational cost and have been applied in various numerical studies. In this dissertation, we present computational techniques that further enhance coupled 1D/2D hydrodynamic models. The first is an automatic mesh generation tool for coupled 1D/2D hydrodynamic models. Meshes are a required input for hydrodynamic models, and automatic mesh generation tools for 2D hydrodynamic models are well developed. However, developing such tools becomes challenging when they are designed for coupled 1D/2D hydrodynamic models. The difficulty of mesh generation in this case comes from the fact that the resolutions of the 1D/2D domains are closely intertwined with each other; however, the desired mesh resolutions for each domain may be quite different. The proposed mesh generator provides features to automatically identify 1D domains from given input data and to generate collocated meshes with efficient sizing along 1D domains. The developed techniques are demonstrated on three test cases, including two inland watersheds and a coastal basin. Second, a new smoothing method for digital elevation models (DEMs) is developed to enhance the application of an existing coupled 1D/2D kinematic wave model based on discontinuous Galerkin (DG) methods. The model has shown great success in rainfall-runoff simulations; however, it is highly sensitive to the topography represented by the mesh. 
The proposed method is compared to straightforwar (open full item for complete abstract)

    Committee: Ethan J. Kubatko Dr. (Advisor); James H. Stagge Dr. (Committee Member); Yulong Xing Dr. (Committee Member); Ryan Winston Dr. (Committee Member) Subjects: Civil Engineering; Environmental Engineering; Fluid Dynamics
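
    For illustration only, a generic moving-average smoother applied to a 1D elevation profile shows the kind of preprocessing "DEM smoothing" refers to; the dissertation's method is more targeted (designed for DG-based kinematic wave models) and is not reproduced here.

```python
def smooth(elev, radius=1):
    """Moving-average smoothing of a 1D elevation profile, shrinking the
    window at the boundaries so every point is averaged over valid cells."""
    n = len(elev)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(elev[lo:hi]) / (hi - lo))
    return out

# A sharp spike is damped, which is what reduces a kinematic wave
# model's sensitivity to mesh-represented topography.
print(smooth([0, 10, 0]))
```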
  • 4. Weaver, Josh The Self-Optimizing Inverse Methodology for Material Parameter Identification and Distributed Damage Detection

    Master of Science, University of Akron, 2015, Civil Engineering

    Understanding and predicting the behavior of structures under specific operating conditions is a fundamental task of structural engineers. Scientific principles are used to model the characteristics of a material's response to various mechanical loads. Using experimental data, constitutive models can be created that provide a mathematical description of a material's response. However, these constitutive models require numerous parameters to be identified. In order to calculate these parameters, inverse parameter identification algorithms can be used. These constitutive models apply a homogeneous distribution of the material parameters across a structural component. In reality, however, there is often a heterogeneous distribution of these material parameters across the structure. This can be due to a variety of reasons, including the characteristics of the raw material, geometry, manufacturing processes, fatigue, and damage. In order to model this heterogeneous distribution, stochastic methods can be deployed. In this research, an inverse parameter identification method known as the Self-Optimizing Inverse Methodology (Self-OPTIM) will be used to create a powerful and easy-to-use software framework for parameter identification. This software framework includes capabilities to parallelize finite element simulations to reduce the time of optimization. In addition, the framework will include a stochastic methodology that can be used to model heterogeneous distributions of material parameters across a structural component. Using this software, the capabilities of Self-OPTIM will be tested on various constitutive models to demonstrate its ease of use as well as its superiority to other methods, using boundary information as its primary input.

    Committee: Gunjin Yun Dr. (Advisor); Robert Goldberg Dr. (Committee Member); Weislaw Binienda Dr. (Committee Member) Subjects: Civil Engineering
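
    A minimal sketch of what inverse parameter identification means (not Self-OPTIM itself): recover a constitutive parameter by minimizing the mismatch between measured and simulated response. The "model" here is an assumed linear elasticity law, sigma = E * eps, for which the least-squares fit of E has a closed form.

```python
def identify_modulus(strains, stresses):
    """Least-squares fit of E in sigma = E * eps:
    E = sum(eps * sigma) / sum(eps^2)."""
    num = sum(e * s for e, s in zip(strains, stresses))
    den = sum(e * e for e in strains)
    return num / den

# Synthetic "measurements" consistent with E = 2.0e5 MPa (200 GPa)
eps = [0.001, 0.002, 0.003]
sig = [200.0, 400.0, 600.0]   # MPa
E = identify_modulus(eps, sig)
```

    Real constitutive models have many coupled parameters and no closed-form fit, which is why iterative optimization frameworks like the one described above are needed.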
  • 5. Shivashankar, Nithin Design and Analysis of Modular Architectures for an RNS to Mixed Radix Conversion Multi-processor

    MS, University of Cincinnati, 2014, Engineering and Applied Science: Computer Engineering

    Cryptography is the study of techniques for secure communication in the presence of unknown threats or adversaries. It is used to hide the data exchanged between two or more parties from the view of outsiders, and to encode communications such as e-mail, telephone calls, bank transactions, credit card transactions, electronic signatures, and military applications. In this thesis we develop a Verilog design that implements algorithms for secure information exchange. Our design uses a Residue Number System representation, which is then converted into a Mixed Radix Number System. Several algorithms are used to implement these computations in Verilog. We analyze our design for area, power, and timing by simulating its performance using the Altera ModelSim and Quartus II tools.

    Committee: Carla Purdy Ph.D. (Committee Chair); Raj Bhatnagar Ph.D. (Committee Member); George Purdy Ph.D. (Committee Member) Subjects: Computer Engineering
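
    The conversion at the heart of this thesis has a compact reference form in software. The sketch below implements the standard mixed-radix conversion (Garner's algorithm) for pairwise-coprime moduli; it illustrates the arithmetic the hardware parallelizes, not the thesis's Verilog architecture.

```python
def mixed_radix_digits(residues, moduli):
    """Convert an RNS value to mixed-radix digits a_i such that
    X = a_0 + a_1*m_0 + a_2*m_0*m_1 + ... (moduli pairwise coprime)."""
    digits = list(residues)
    n = len(moduli)
    for i in range(n):
        for j in range(i + 1, n):
            # peel off digit i, then divide by m_i modulo m_j
            digits[j] = (digits[j] - digits[i]) * pow(moduli[i], -1, moduli[j]) % moduli[j]
    return digits

def from_mixed_radix(digits, moduli):
    """Weighted reconstruction, used here only to verify the conversion."""
    x, weight = 0, 1
    for d, m in zip(digits, moduli):
        x += d * weight
        weight *= m
    return x

moduli = [3, 5, 7]
residues = [52 % m for m in moduli]          # RNS encoding of 52: [1, 2, 3]
digits = mixed_radix_digits(residues, moduli)
print(from_mixed_radix(digits, moduli))      # prints 52
```

    Unlike the residues themselves, the mixed-radix digits are positionally weighted, which is what makes magnitude comparison and sign detection possible, a key reason for this conversion in cryptographic hardware.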
  • 6. Dosopoulos, Stylianos Interior Penalty Discontinuous Galerkin Finite Element Method for the Time-Domain Maxwell's Equations

    Doctor of Philosophy, The Ohio State University, 2012, Electrical and Computer Engineering

    This dissertation investigates a discontinuous Galerkin (DG) methodology to solve Maxwell's equations in the time domain. More specifically, we focus on an Interior Penalty (IP) approach to derive a DG formulation. In general, discontinuous Galerkin methods decompose the computational domain into a number of disjoint polyhedra (elements). For each polyhedron, we define local basis functions and approximate the fields as a linear combination of these basis functions. To ensure equivalence to the original problem, the tangential continuity of the electric and magnetic fields needs to be enforced across polyhedra interfaces. This condition is applied in the weak sense through proper penalty terms in the variational formulation, also known as numerical fluxes. Owing to this way of coupling adjacent polyhedra, DG methods offer great flexibility and a nice set of properties, such as explicit time-marching, support for non-conformal meshes, freedom in the choice of basis functions, and high efficiency in parallelization. Here, we first introduce an Interior Penalty (IP) approach to derive a DG formulation and give a physical interpretation of this approach. The physical interpretation provides insight into the IP method and links important concepts, like the duality pairing principle, to a physical meaning. Furthermore, we discuss the time discretization and stability condition aspects of our scheme. Moreover, to address the issue of very small time steps in multi-scale applications, we employ a local time-stepping (LTS) strategy that can greatly reduce the solution time. Secondly, we present an approach to incorporate a conformal Perfectly Matched Layer (PML) into our interior penalty discontinuous Galerkin time-domain (IPDGTD) framework. 
From a practical point of view, a conformal PML is easier to model compared to a Cartesian PML and can reduce the buffer space between the structure and the truncation boundary, thus potentially reducing the number of unkno (open full item for complete abstract)

    Committee: Jin-Fa Lee (Advisor); Teixeira Fernando (Committee Member); Krishnamurthy Ashok (Committee Member) Subjects: Electrical Engineering; Electromagnetics
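
    The payoff of the local time-stepping (LTS) strategy mentioned above can be sketched numerically. Under a CFL-type stability condition, each element's stable step scales with its size; LTS lets small elements take several substeps per global step instead of forcing the whole mesh onto the smallest stable step. The formula below is a generic illustration of that bookkeeping, not the dissertation's specific LTS scheme.

```python
import math

def substeps_per_global_step(element_size, global_dt, wave_speed=1.0):
    """Stable local step ~ element_size / wave_speed (CFL-type bound);
    the element then takes ceil(global_dt / local_dt) substeps."""
    local_dt = element_size / wave_speed
    return max(1, math.ceil(global_dt / local_dt))

sizes = [1.0, 0.5, 0.25]   # a multi-scale mesh: coarse to fine elements
print([substeps_per_global_step(h, 1.0) for h in sizes])  # [1, 2, 4]
```

    Only the few fine elements pay for extra substeps, which is why LTS "can greatly reduce the solution time" relative to a globally uniform step.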
  • 7. Nagavaram, Ashish Cloud Based Dynamic Workflow with QOS For Mass Spectrometry Data Analysis

    Master of Science, The Ohio State University, 2011, Computer Science and Engineering

    Lately, there is growing interest in the use of cloud computing for scientific applications, including scientific workflows. Key attractions of the cloud include the pay-as-you-go model and elasticity. While the elasticity offered by clouds can be beneficial for many applications and use scenarios, it also imposes significant challenges in the development of applications or services. For example, no general framework exists that can enable a scientific workflow to execute in a dynamic fashion with QOS (Quality of Service) support, i.e., exploiting the elasticity of clouds and automatically allocating and de-allocating resources to meet time and/or cost constraints while providing the quality of results the user desires. This thesis presents a case study in creating a dynamic cloud workflow implementation, with QOS support, for a scientific application. We work with MassMatrix, an application that searches proteins and peptides in tandem mass spectrometry data. In order to use cloud resources, we first parallelize the search method used in this algorithm. Next, we create a flexible workflow using the Pegasus Workflow Management System from ISI. We then add a new dynamic resource allocation module, which can use a smaller or larger number of resources based on a time constraint specified by the user. Finally, we extend this to include QOS support, providing the user with the desired quality of results. We use the desired quality metric to calculate the values of the application parameters; the desired quality metric refers to the parameters that are computed to maximize the user-specified benefit function while meeting the time constraint. We evaluate our implementation using several different data sets and show that the application scales quite well. Our implementation effectively allocates resources adaptively, and the parameter prediction scheme is successful in choosing parameters that help meet the time constraint.

    Committee: Gagan Agrawal PhD (Advisor); Rajiv Ramnath PhD (Committee Member); Michael Freitas PhD (Committee Member) Subjects: Bioinformatics; Biomedical Engineering; Biomedical Research; Computer Engineering; Computer Science
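
    The core decision a dynamic allocation module like the one described above must make can be sketched in a few lines: pick the smallest worker count whose estimated parallel runtime fits the user's time constraint. The scaling model and parameter names here are illustrative assumptions, not the thesis's actual module.

```python
def workers_needed(serial_seconds, deadline_seconds, efficiency=0.9, max_workers=64):
    """Smallest worker count whose estimated runtime meets the deadline,
    assuming near-linear scaling damped by a parallel-efficiency factor."""
    for n in range(1, max_workers + 1):
        if serial_seconds / (n * efficiency) <= deadline_seconds:
            return n
    return max_workers  # deadline unreachable within the cap; allocate the cap

# A job estimated at 1 hour serially, with a 10-minute deadline:
print(workers_needed(3600, 600))  # 7
```

    In a pay-as-you-go setting, returning the smallest sufficient count rather than the maximum is what keeps cost down while still meeting the constraint.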
  • 8. Singri, Arjun Compile-Time Characterization of Recurrent Patterns in Irregular Computations

    Master of Science, The Ohio State University, 2010, Computer Science and Engineering

    Compiler techniques exist that can generate efficient communication statements from user-supplied data distributions. These techniques work for regular array access functions, but many applications use irregular access functions. It is difficult to generate efficient communication statements for irregular accesses because compile-time characterization of the computation structure of such applications is infeasible. In many applications, however, the irregular computational patterns recur a number of times during execution. Such a recurring section of an application can be analyzed once at run time to determine its computation structure, and the collected information can then be used to generate an efficient communication schedule for it. This model of compiling a section of a program is an example of inspector-executor compilation. This thesis aims to identify such sections in a program automatically at compile time. The algorithms needed for this are implemented using the Low Level Virtual Machine (LLVM) compiler framework and are used to detect such sections in the SPEC benchmarks. The relative running time of all such sections, with respect to that of the entire program, is calculated using HPCToolkit.

    Committee: Ponnuswamy Sadayappan PhD (Advisor); Atanas Rountev PhD (Advisor) Subjects: Computer Science
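
    The inspector-executor model the abstract refers to can be sketched compactly: the irregular access pattern (an index array) is analyzed once (the "inspector") to build a reusable schedule, which the repeated compute phase (the "executor") then replays cheaply. The block-based schedule below is an illustrative stand-in for the communication schedule a real compiler would generate.

```python
def inspector(index_array, block_size):
    """One-time run-time analysis: record which block of the data each
    irregular access touches, grouped by block."""
    schedule = {}
    for pos, idx in enumerate(index_array):
        schedule.setdefault(idx // block_size, []).append(pos)
    return schedule

def executor(data, index_array, schedule, block_size):
    """Replay the precomputed schedule: gather values block by block,
    amortizing the inspection cost over every recurrence of the pattern."""
    out = [0.0] * len(index_array)
    for block, positions in schedule.items():
        for pos in positions:
            out[pos] = data[index_array[pos]]
    return out

data = [float(i) for i in range(100)]
idx = [3, 97, 42, 3]                 # irregular, only known at run time
sched = inspector(idx, block_size=10)
print(executor(data, idx, sched, 10))  # [3.0, 97.0, 42.0, 3.0]
```

    The key property is that `inspector` runs once while `executor` runs every iteration, which only pays off when the pattern recurs, exactly the sections this thesis detects at compile time.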
  • 9. Hartono, Albert Tools for Performance Optimizations and Tuning of Affine Loop Nests

    Doctor of Philosophy, The Ohio State University, 2009, Computer Science and Engineering

    Multicore processors have become mainstream, and the number of cores in a chip will continue to increase every year. Programming these architectures to effectively exploit their very high computational power is a non-trivial task. First, an application program needs to be explicitly restructured using a set of code transformation techniques to optimize for specific architectural features, especially for parallelism and data locality. Then a significant amount of time is spent tuning the optimized code to find the best optimization parameter values. However, high performance often means lower productivity, as the optimized codes become difficult to understand, maintain, and modify. In this dissertation, we present techniques to address these issues through automatic generation of efficient parallel programs and through the use of empirical search for tuning. The research from this dissertation has been implemented and made publicly available as two useful software tools: one for parameterized tiled loop generation, and one for empirical performance tuning using annotations. Tiling is a critical loop transformation that optimizes both for data locality enhancement and for coarse-grained parallel execution on many-core systems. Parameterized tiled code refers to tiled loops where the tile sizes are not fixed compile-time constants. It is important for auto-tuning systems, since they often execute a large number of runs with dynamically varied tile sizes. Multi-level tiling is essential for maximizing data locality in systems with deep memory hierarchies. Previous approaches to tiled code generation have addressed parametric tiling but have been restricted to perfect loop nests and sequential execution. Prior parallel tiling solutions can handle imperfect loop nests but are only applicable to compile-time constant tile sizes. 
We develop systematic solutions to parameterized multi-level tiling of arbitrary imperfectly nested affine loops for both sequential and parallel execution. (open full item for complete abstract)

    Committee: P. Sadayappan (Advisor); Atanas Rountev (Committee Member); Dhabaleswar K. Panda (Committee Member); J. Ramanujam (Committee Member) Subjects: Computer Science
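
    The distinction the abstract draws, tile sizes as runtime parameters rather than compile-time constants, can be shown in miniature. In the sketch below the tile size T is an ordinary runtime value, which is what lets an auto-tuner sweep tile sizes without regenerating code; the example is illustrative and much simpler than the imperfectly nested affine loops the dissertation handles.

```python
def tiled_sum(a, T):
    """Sum a sequence in tiles of runtime-chosen size T. The result is
    unchanged; the tile loop is where locality and coarse-grained
    parallelism would be exploited in a real tiled kernel."""
    total = 0.0
    n = len(a)
    for t0 in range(0, n, T):                 # inter-tile loop
        for i in range(t0, min(t0 + T, n)):   # intra-tile loop
            total += a[i]
    return total

a = list(range(10))
# An auto-tuner can now vary T across runs of the same binary:
print([tiled_sum(a, T) for T in (1, 3, 4, 10, 16)])  # five times 45.0
```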
  • 10. Bondhugula, Uday Kumar Effective Automatic Parallelization and Locality Optimization Using The Polyhedral Model

    Doctor of Philosophy, The Ohio State University, 2008, Computer Science and Engineering

    Multicore processors have now become mainstream. The difficulty of programming these architectures to effectively tap the potential of multiple processing units is well known. Among the several ways of addressing this issue, one of the most promising, and simultaneously hardest, approaches is Automatic Parallelization. This approach requires no effort on the part of the programmer in the process of parallelizing a program. The Polyhedral model for compiler optimization is a powerful mathematical framework based on parametric linear algebra and integer linear programming. It provides an abstraction for representing nested loop computations and their data dependences using integer points in polyhedra. Complex execution reordering, which can improve performance through parallelization as well as locality enhancement, is captured by affine transformations in the polyhedral model. With several recent advances, the polyhedral model has reached a level of maturity in various aspects: in particular, as a powerful intermediate representation for performing transformations, and for code generation after transformations. However, an approach to automatically find good transformations for communication-optimized coarse-grained parallelization together with locality optimization has been a key missing link. This dissertation presents a new automatic transformation framework that solves the above problem. Our approach works by finding good affine transformations through a powerful and practical linear cost function that enables efficient tiling and fusion of sequences of arbitrarily nested loops. This in turn allows simultaneous optimization for coarse-grained parallelism and locality. Synchronization-free parallelism and pipelined parallelism at various levels can be extracted. The framework can be targeted to different parallel architectures, such as general-purpose multicores, the Cell processor, GPUs, and embedded multiprocessor SoCs. 
The proposed framework has been impl (open full item for complete abstract)

    Committee: Prof. P Sadayappan (Advisor); Dr. Atanas Rountev (Committee Member); Prof. Gagan Agrawal (Committee Member); Prof. J Ramanujam (Committee Member) Subjects: Computer Science
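
    The polyhedral abstraction described above can be made concrete with a toy example: a triangular loop nest's iterations are exactly the integer points of a polyhedron, and an affine transformation such as skewing reorders the schedule without dropping or duplicating any iteration. This is an illustrative sketch of the representation only, not the dissertation's cost-function-driven transformation search.

```python
def iteration_domain(N):
    """Integer points of the polyhedron {(i, j) : 0 <= i < N, 0 <= j <= i},
    i.e. the iterations of:  for i in range(N): for j in range(i + 1): ..."""
    return [(i, j) for i in range(N) for j in range(i + 1)]

def skew(points):
    """Affine transform (i, j) -> (i + j, j): a 'skewing' that changes the
    execution order (and can expose wavefront parallelism) while remaining
    a bijection on the iteration domain."""
    return [(i + j, j) for (i, j) in points]

dom = iteration_domain(4)
print(len(dom), len(set(skew(dom))))  # 10 10 : no iteration lost or duplicated
```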
  • 11. Krishnamoorthy, Sriram Optimizing locality and parallelism through program reorganization

    Doctor of Philosophy, The Ohio State University, 2008, Computer and Information Science

    Development of scalable application codes requires understanding and exploiting the locality and parallelism in the computation. This is typically achieved through programmer optimizations that match the application's characteristics to the architectural features exposed by the parallel programming model. Partitioned address space programming models such as MPI impose a process-centric view of the parallel system, increasing the complexity of parallel programming. Typical global address space models provide a shared-memory view that greatly simplifies programming, but these simplified models abstract away locality information, precluding optimized implementations. In this work, we present techniques to reorganize program execution to optimize locality and parallelism with little effort from the programmer. For regular loop-based programs operating on dense multi-dimensional arrays, we propose an automatic parallelization technique that attempts to determine a parallel schedule in which all processes can start execution in parallel. When the concurrent tiled iteration space inhibits such execution, we present techniques to re-enable it. This is an alternative to incurring the pipelined startup overhead in schedules generated by prevalent approaches. For less structured programs, we propose a programming model that exposes multiple levels of abstraction to the programmer. These abstractions enable quick prototyping coupled with incremental optimization. The data abstraction provides a global view of distributed data organized as blocks, where a block is a subset of the data stored contiguously in a single process's address space. The computation is specified as a collection of tasks operating on the data blocks, with parallelism and dependences specified between them. When the blocking of the data does not match the access pattern required by the computation, the data needs to be reblocked to improve spatial locality. 
We develop efficient data layout transformati (open full item for complete abstract)

    Committee: P Sadayappan (Advisor) Subjects: Computer Science
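
    The reblocking step this abstract ends on can be sketched in its simplest form: data stored as blocks of one size is reorganized into blocks of another size to match the computation's access pattern. The function names are illustrative, not the thesis's API, and a real implementation would move distributed blocks between processes rather than lists in memory.

```python
def to_blocks(data, size):
    """Partition a flat sequence into contiguous blocks of a given size."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def reblock(blocks, new_size):
    """Reorganize existing blocks into blocks of a new size; the element
    order (and hence the global view of the data) is preserved."""
    flat = [x for b in blocks for x in b]
    return to_blocks(flat, new_size)

blocks = to_blocks(list(range(12)), 4)   # stored as three blocks of 4
print(reblock(blocks, 3))                # accessed as four blocks of 3
```

    When tasks consume blocks of a size different from the stored layout, doing this reorganization once, with an efficient layout transformation, is cheaper than scattering each task's accesses across many stored blocks.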