Search Results

(Total results 169)

  • 1. Khan, Mahfizur Rahman Distributed UAV-Based Wireless Communications Using Multi-Agent Deep Reinforcement Learning

    Master of Science, Miami University, 2024, Electrical and Computer Engineering

    In this thesis, a thorough investigation into optimizing user connectivity in ad hoc communication networks through robust policy creation and intelligent UAV placement in stochastic environments is presented. To handle the dynamic and decentralized character of ad hoc networks, we identified optimal UAV positions by applying a multi-agent deep Q-learning technique. To train policies that adapt to stochastic environments, a simple, novel algorithm was devised, with an emphasis on the usefulness of these policies under different scenarios. Through an empirical investigation, the study offered insight into the generalizability and adaptability of learnt behaviors by examining how well policies trained on one distribution of settings performed when applied to different, unseen distributions. In this thesis, we also explored the resilience of UAV networks against jamming attempts and proposed a method for unaffected UAVs to self-adjust their placements, ensuring optimal user coverage even in adversarial situations. By demonstrating the potential of machine learning techniques to maximize network performance and enhance user connectivity in the face of environmental uncertainties and security risks, these contributions collectively advance the field of UAV-assisted communication.

    Committee: Dr. Bryan Van Scoy (Advisor); Dr. Mark Scott (Committee Member); Dr. Veena Chidurala (Committee Member) Subjects: Computer Engineering; Electrical Engineering
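
The multi-agent deep Q-learning approach described above is not reproduced here, but the following minimal sketch illustrates the general idea of independent Q-learning agents choosing UAV grid positions to maximize user coverage. The toy grid world, the Chebyshev-distance coverage reward, and all constants are assumptions for illustration only, not the thesis's code.

```python
# Illustrative multi-agent Q-learning sketch for UAV placement on a toy grid.
import numpy as np

GRID = 5                 # 5x5 grid of candidate UAV positions
N_UAVS = 2
ACTIONS = 5              # stay, up, down, left, right
MOVES = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]
rng = np.random.default_rng(0)
users = rng.integers(0, GRID, size=(20, 2))          # random user locations (assumption)

def coverage_reward(positions, radius=1):
    """Fraction of users within Chebyshev distance `radius` of any UAV."""
    covered = 0
    for u in users:
        if any(max(abs(u[0] - p[0]), abs(u[1] - p[1])) <= radius for p in positions):
            covered += 1
    return covered / len(users)

# One independent Q-table per agent, indexed by that agent's own (row, col) position.
Q = [np.zeros((GRID, GRID, ACTIONS)) for _ in range(N_UAVS)]
alpha, gamma, eps = 0.1, 0.9, 0.2

for episode in range(2000):
    pos = [tuple(rng.integers(0, GRID, size=2)) for _ in range(N_UAVS)]
    for step in range(20):
        acts = []
        for i, p in enumerate(pos):
            if rng.random() < eps:
                acts.append(int(rng.integers(ACTIONS)))       # explore
            else:
                acts.append(int(np.argmax(Q[i][p[0], p[1]])))  # exploit
        new_pos = []
        for p, a in zip(pos, acts):
            r, c = p[0] + MOVES[a][0], p[1] + MOVES[a][1]
            new_pos.append((min(max(r, 0), GRID - 1), min(max(c, 0), GRID - 1)))
        reward = coverage_reward(new_pos)                      # shared team reward
        for i, (p, a, p2) in enumerate(zip(pos, acts, new_pos)):
            target = reward + gamma * np.max(Q[i][p2[0], p2[1]])
            Q[i][p[0], p[1], a] += alpha * (target - Q[i][p[0], p[1], a])
        pos = new_pos
```

In a deep variant, each Q-table would be replaced by a neural network over richer state inputs, but the update structure is the same.
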
  • 2. Goldblatt, John Model-Free Reinforcement Learning for Hierarchical OO-MDPs

    Master of Sciences, Case Western Reserve University, 2022, EECS - Computer and Information Sciences

    This thesis studies Object-Oriented Markov Decision Processes (OO-MDPs), which extend MDPs with prior knowledge about the shared dynamics of similar objects in the environment. Existing work presents model-based algorithms that leverage the properties of OO-MDPs and adhere to the Knows What It Knows (KWIK) framework. In practice, models may not be easy to estimate and the KWIK framework may still lead to slow performance in a reinforcement learning context. In this thesis, I first introduce a new model-free learning algorithm for OO-MDPs based on Q-Learning. Though my approach is not KWIK, I show empirically that it exhibits significantly faster convergence than the KWIK and flat baselines. Next, I extend hierarchical reinforcement learning (HRL) to use OO-MDPs in the same manner. HRL uses a task hierarchy as prior information to reduce the overall problem into a set of smaller tasks. I show that HRL and OO-MDPs have a natural synergy, and I propose a novel model-free OO-HRL algorithm. I show empirically that this algorithm has better sample complexity than either HRL or OO-MDP algorithms alone.

    Committee: Soumya Ray (Advisor); Michael Lewicki (Committee Member); Harold Connamacher (Committee Member); M. Cenk Cavusoglu (Committee Member) Subjects: Artificial Intelligence; Computer Science
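
As a rough illustration of the model-free direction the abstract above describes (not the thesis's algorithm), the sketch below keys a Q-table on an object-oriented state abstraction, so that objects of the same class with the same relevant attributes share value estimates and hence share updates. The `abstract_state` encoding and the environment interface are hypothetical.

```python
# Minimal sketch of model-free Q-learning over an object-oriented state abstraction.
from collections import defaultdict
import random

Q = defaultdict(float)   # keyed by (abstract_state, action)

def abstract_state(objects):
    """Collapse concrete objects into class-level attribute tuples so that similar
    objects (same class, same relevant attributes) map to the same abstract state."""
    return tuple(sorted((o["class"], o["attrs"]) for o in objects))

def q_update(objects, action, reward, next_objects, actions, alpha=0.1, gamma=0.95):
    """Standard Q-learning backup, but on abstract states."""
    s, s2 = abstract_state(objects), abstract_state(next_objects)
    best_next = max(Q[(s2, a)] for a in actions)
    Q[(s, action)] += alpha * (reward + gamma * best_next - Q[(s, action)])

def epsilon_greedy(objects, actions, eps=0.1):
    s = abstract_state(objects)
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(s, a)])
```
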
  • 3. Miller, Eric Biased Exploration in Offline Hierarchical Reinforcement Learning

    Master of Sciences, Case Western Reserve University, 2021, EECS - Computer and Information Sciences

    A way of giving prior knowledge to a reinforcement learning agent is through a task hierarchy. When collecting data for offline learning with a task hierarchy, the structure of the hierarchy determines the distribution of data. In some cases, the hierarchy structure causes the data distribution to be skewed so that learning an effective policy from the collected data requires many samples. In this thesis, we address this problem. First, we determine the conditions when the hierarchy structure will cause some actions to be sampled with low probability, and describe when this sampling distribution will delay convergence. Second, we present three biased sampling algorithms to address the problem. These algorithms employ the novel strategy of exploring a different hierarchical MDP than the one in which the policy is to be learned. Exploring in these new MDPs improves the sampling distribution and the rate of convergence of the learned policy to optimal in the original MDP. Finally, we evaluate all of our methods and several baselines on several different reinforcement learning problems. Our experiments show that our methods outperform the baselines, often significantly, when the hierarchy has a problematic structure. Furthermore, they identify trade-offs between the proposed methods and suggest scenarios when each method should be used.

    Committee: Soumya Ray (Advisor); Cenk Cavusoglu (Committee Member); Michael Lewicki (Committee Member); Harold Connamacher (Committee Member) Subjects: Artificial Intelligence; Computer Science
  • 4. Turkoglu, Altan Multi-Agent Reinforcement Learning and Information Sharing

    Master of Science, The Ohio State University, 2023, Electrical and Computer Engineering

    Multi-Agent Reinforcement Learning (MARL) faces numerous challenges in partially observable environments due to limited information and the non-stationarity of the learning process. This study aims to determine the optimal rate and type of information sharing in MARL algorithms to improve performance in such environments. We benchmark the Independent Q-Learning (IQL) and Independent Proximal Policy Optimization (IPPO) algorithms, modifying them to facilitate various forms of information sharing in the Level-Based Foraging environment. Our experimental results reaffirm the theorized downsides of naïve information sharing reported in the literature and demonstrate that the sharing rate can be treated as an additional hyperparameter for optimization: notably, when agents choose when to share information, models perform competitively while converging to sharing rates as low as 10% and achieving improved short-term returns. Additionally, we find that pairing information about agents' actions with their observations yields the highest returns, possibly allowing agent networks to parameterize each other's policies.

    Committee: Jia Liu (Committee Member); Parinaz Naghizadeh (Advisor) Subjects: Computer Engineering; Computer Science; Electrical Engineering
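
The following sketch illustrates one way the sharing rate discussed above could be exposed as a hyperparameter: each independent learner's input is augmented with the other agents' observations and last actions only with probability `share_rate`. The padding scheme and the flat list representation are assumptions, not the modified IQL/IPPO implementations from the thesis.

```python
# Sketch of observation/action sharing between independent learners at a tunable rate.
import random

def build_inputs(observations, last_actions, share_rate=0.1, pad=-1.0):
    """For each agent, append the other agents' (observation, action) pairs with
    probability `share_rate`; otherwise append same-sized padding so the input
    dimensionality stays fixed."""
    n = len(observations)
    inputs = []
    for i in range(n):
        augmented = list(observations[i])
        for j in range(n):
            if j == i:
                continue
            if random.random() < share_rate:
                augmented.extend(observations[j])
                augmented.append(float(last_actions[j]))
            else:
                augmented.extend([pad] * (len(observations[j]) + 1))
        inputs.append(augmented)
    return inputs

# Example: two agents with 3-dimensional observations
obs = [[0.2, 0.5, 0.1], [0.9, 0.3, 0.7]]
acts = [2, 0]
print(build_inputs(obs, acts, share_rate=0.5))
```
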
  • 5. Regatti, Jayanth Reddy Learning at the Edge under Resource Constraints

    Doctor of Philosophy, The Ohio State University, 2023, Electrical and Computer Engineering

    Recent decades have seen a huge increase in the number of personal devices, wearables, and edge devices, which has led to increased data collection and connectivity at the edge. The collected data can be used to draw insights about health, the economy, and business, and to help us make better decisions at the individual, organizational, and global levels. With the proliferation of these devices, there are also numerous challenges in using the devices and their data to train useful models. These challenges may stem from privacy regulations or other constraints imposed by the particular learning setup, and they make it difficult to extract the required insights from the data and the edge systems. The goal of this thesis is to understand these challenges and resource constraints and to develop efficient algorithms that enable us to train models while adhering to the constraints. This thesis makes the following contributions: 1. We propose an efficient algorithm, FedCMA, for model-heterogeneous Federated Learning under resource constraints, show its convergence and generalization properties, and demonstrate its efficacy against state-of-the-art algorithms in the model-heterogeneity setting. 2. We propose a two-timescale aggregation algorithm that does not require knowledge of the number of adversaries when defending against Byzantine adversaries in the distributed setup, prove its convergence, and demonstrate its defense against state-of-the-art attacks. 3. We highlight the challenges posed by resource constraints in the Offline Reinforcement Learning setup, where the observation space during inference differs from the observation space during training; we propose a simple algorithm, STPI (Simultaneous Transfer Policy Iteration), to train the agent to adapt to changes in the observation space and demonstrate its effectiveness on MuJoCo environments against simple baselines.

    Committee: Abhishek Gupta (Advisor); Ness Shroff (Advisor) Subjects: Computer Engineering; Computer Science; Electrical Engineering; Engineering
  • 6. Zhu, Tianxing Deep Reinforcement Learning for Open Multiagent System

    BA, Oberlin College, 2022, Computer Science

    In open multiagent systems, multiple agents work together or compete to reach a goal while the membership of the group changes over time. For example, intelligent robots collaborating to put out wildfires may run out of suppressant and have to leave the area to recharge; the remaining robots may need to change their behaviors accordingly to better control the fires. Openness therefore requires agents to predict not only the behaviors of other agents but also their presence. We present a deep reinforcement learning method that adapts the proximal policy optimization algorithm to learn the optimal actions of an agent in open multiagent environments, and we demonstrate how openness can be incorporated into state-of-the-art reinforcement learning algorithms. Simulations of wildfire suppression problems show that our approach enables the agents to learn the legal actions.

    Committee: Adam Eck (Advisor) Subjects: Artificial Intelligence; Computer Science
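
Since the abstract above builds on proximal policy optimization and on restricting agents to legal actions, the sketch below shows the standard PPO clipped surrogate objective combined with a simple action mask, in plain NumPy. Shapes, names, and the masking scheme are illustrative assumptions rather than the thesis's implementation.

```python
# Minimal NumPy sketch: PPO clipped surrogate objective with an action mask,
# so probability mass is placed only on legal actions.
import numpy as np

def masked_softmax(logits, legal_mask):
    """Drive illegal-action logits to -inf (approximately) before normalizing."""
    logits = np.where(legal_mask, logits, -1e9)
    z = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)

def ppo_clip_loss(new_logits, old_probs, actions, advantages, legal_mask, eps=0.2):
    """Clipped surrogate objective (Schulman et al., 2017), returned as a loss."""
    new_probs = masked_softmax(new_logits, legal_mask)
    idx = np.arange(len(actions))
    ratio = new_probs[idx, actions] / old_probs[idx, actions]
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1 - eps, 1 + eps) * advantages
    return -np.mean(np.minimum(unclipped, clipped))   # minimized by the optimizer

# Toy batch: 2 states, 3 actions, the last action illegal in state 0
logits = np.array([[0.1, 0.4, -0.2], [0.3, 0.0, 0.5]])
mask = np.array([[True, True, False], [True, True, True]])
old = masked_softmax(logits, mask)
print(ppo_clip_loss(logits + 0.05, old, np.array([1, 2]), np.array([1.0, -0.5]), mask))
```
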
  • 7. Fettes, Quintin Optimizing Power Consumption, Resource Utilization, and Performance for Manycore Architectures using Reinforcement Learning

    Doctor of Philosophy (PhD), Ohio University, 2022, Electrical Engineering & Computer Science (Engineering and Technology)

    As process technology and transistor size continue to shrink into the sub-nanometer regime, the number of cores that can be integrated on-chip continues to increase exponentially. Due to the power wall, which has stalled increases in processor clock frequency indefinitely, computer architects have realized that improving performance with billions of transistors is possible only with manycores operating at lower clock frequencies. Realizing the full performance and efficiency benefits offered by manycore architectures often requires solving a host of relatively complex resource management issues that arise only when computation is parallelized. However, the complexity of these resource management tasks increases with the parallelism of the architectures themselves, and traditional, manually engineered algorithms leave an ever-increasing amount of performance and efficiency on the table. With the goal of exploiting maximum parallelism in manycore architectures to improve execution time, reduce energy consumption, and efficiently utilize hardware resources, this dissertation applies reinforcement learning to three critical resource allocation problems in the computing stack: network-on-chip-level dynamic voltage and frequency scaling, thread migration to reduce on-chip data movement, and CPU core allocation to microservices. These data-driven solutions scale out more easily because they automatically learn resource management policies that would be too time-consuming or impractical for human engineers to design. At the network-on-chip level, I propose training a low-overhead, offline reinforcement learning algorithm to change the frequency of links and routers so that data is sent along the network-on-chip using as little energy as possible without negatively affecting throughput. At the chip level, I propose using low-overhead reinforcement learning to learn a thread migration policy. This (open full item for complete abstract)

    Committee: Razvan Bunescu (Advisor); Avinash Karanth (Advisor); Martin Mohlenkamp (Committee Member); Jundong Liu (Committee Member); Wei Lin (Committee Member); David Chelberg (Committee Member) Subjects: Artificial Intelligence; Computer Engineering; Computer Science
  • 8. Sah, Suba Nuclear Renewable Integrated Energy System Power Dispatch Optimization for Tightly Coupled Co-Simulation Environment using Deep Reinforcement Learning

    Master of Science, University of Toledo, 2021, Engineering (Computer Science)

    To reduce carbon emissions, researchers are looking for new ways to connect energy systems and enhance efficiency. A large influx of variable and distributed energy resources in the U.S. electricity market has significantly altered net electricity demand, and since traditional nuclear power generation is inflexible, its supply cannot be aligned with demand, which in turn impacts its economic viability. The Nuclear Renewable Integrated Energy System (NR-IES) is a leading solution that integrates nuclear power plants, renewable energy, hydrogen generation plants, and energy storage systems so that thermal and electrical power can be dispatched to meet grid flexibility requirements while also producing hydrogen and maximizing revenue. This thesis introduces a Deep Reinforcement Learning (DRL) based framework to address the challenging decision-making problems in NR-IES. The goal is to maximize revenue by concurrently producing and selling hydrogen and electricity at different prices while keeping the energy flows among subsystems balanced. To enable an efficient and flexible computational framework for DRL research and development, an FMI/FMU-based co-simulation environment for NR-IES simulation has been developed that integrates OpenAI Gym and Ray/RLlib. Two state-of-the-art DRL algorithms, Soft Actor-Critic (SAC) and Proximal Policy Optimization (PPO), have been investigated to demonstrate DRL's effectiveness in controlling the NR-IES.

    Committee: Dr. Raghav Khanna (Committee Chair); Dr. Devinder Kaur (Committee Member); Dr. Ahmad Javaid (Committee Co-Chair) Subjects: Computer Engineering; Computer Science; Energy; Sustainability
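
The abstract above mentions wrapping the co-simulation as an OpenAI Gym environment for Ray/RLlib. The toy skeleton below follows the Gym reset()/step() convention for a drastically simplified dispatch problem; the prices, the conversion loss, and the single continuous action (the fraction of capacity routed to hydrogen) are invented for illustration and have no connection to the thesis's FMU models.

```python
# Gym-style environment skeleton for a toy power-dispatch problem.
import numpy as np

class ToyDispatchEnv:
    """Follows the Gym reset()/step() convention so it could be wrapped for an RL library."""

    def __init__(self, horizon=24, capacity_mw=100.0, seed=0):
        self.horizon = horizon
        self.capacity = capacity_mw
        self.rng = np.random.default_rng(seed)

    def reset(self):
        self.t = 0
        self.elec_price = 40.0      # $/MWh, toy value
        self.h2_price = 55.0        # $/MWh-equivalent, toy value
        return self._obs()

    def _obs(self):
        return np.array([self.t / self.horizon, self.elec_price, self.h2_price])

    def step(self, action):
        # action in [0, 1]: fraction of capacity routed to hydrogen production
        frac = float(np.clip(action, 0.0, 1.0))
        revenue = (1 - frac) * self.capacity * self.elec_price \
                  + frac * self.capacity * self.h2_price * 0.8   # toy conversion loss
        # random walk on the electricity price to mimic market variability
        self.elec_price = max(5.0, self.elec_price + self.rng.normal(0, 3))
        self.t += 1
        done = self.t >= self.horizon
        return self._obs(), revenue / 1000.0, done, {}

env = ToyDispatchEnv()
obs, done = env.reset(), False
while not done:
    obs, reward, done, info = env.step(0.3)   # fixed policy just to exercise the loop
```
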
  • 9. Baheri, Betis MARS: Multi-Scalable Actor-Critic Reinforcement Learning Scheduler

    MS, Kent State University, 2020, College of Arts and Sciences / Department of Computer Science

    In this thesis, we introduce MARS, a new scheduling algorithm based on a cost-aware, multi-scalable reinforcement learning approach that serves as an intermediate layer between the HPC resource manager and the user's application workflow. MARS ensembles pre-generated models from users' workflows and decides on the most suitable optimization strategy. A whole workflow application is split into several optimized sub-tasks; then, based on a pre-defined resource management plan, a reward is generated after executing each scheduled task, and MARS updates its Deep Neural Network (DNN) model for future use. MARS is designed to optimize existing models through this reinforcement mechanism, and it can adapt to a shortage of training samples by combining small tasks or by switching between pre-built scheduling strategies such as backfilling and shortest-job-first (SJF), choosing the most suitable approach. Tests of MARS on several real-world workflow traces show that it achieves 5%-60% better performance than the other approaches.

    Committee: Qiang Guan Dr. (Advisor); Feodor Dragan Dr. (Committee Member); Rouming Jin Dr. (Committee Member) Subjects: Computer Science
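
MARS itself is a multi-scalable actor-critic scheduler; as a much simpler illustration of the idea of switching between pre-built scheduling strategies based on observed reward, the sketch below uses a bandit-style running value estimate over two toy strategies. The job model and the waiting-time reward are assumptions, not MARS.

```python
# Sketch of an RL-style selector choosing between pre-built scheduling strategies.
import random

STRATEGIES = {
    "sjf": lambda jobs: sorted(jobs),    # shortest job first
    "fcfs": lambda jobs: list(jobs),     # arrival order
}
value = {name: 0.0 for name in STRATEGIES}
counts = {name: 0 for name in STRATEGIES}

def makespan_reward(ordered_jobs):
    """Negative mean waiting time on a single resource (higher is better)."""
    t, waits = 0.0, []
    for runtime in ordered_jobs:
        waits.append(t)
        t += runtime
    return -sum(waits) / len(waits)

for episode in range(500):
    jobs = [random.uniform(1, 10) for _ in range(8)]
    # epsilon-greedy choice of strategy
    name = random.choice(list(STRATEGIES)) if random.random() < 0.1 else \
           max(value, key=value.get)
    r = makespan_reward(STRATEGIES[name](jobs))
    counts[name] += 1
    value[name] += (r - value[name]) / counts[name]   # running-average value estimate

print(max(value, key=value.get))   # strategy the selector has learned to prefer
```
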
  • 10. Yang, Zhaoyuan Adversarial Reinforcement Learning for Control System Design: A Deep Reinforcement Learning Approach

    Master of Science, The Ohio State University, 2018, Electrical and Computer Engineering

    We adapt the idea of adversarial reinforcement learning to the numerical state inputs of controllers. We propose generating adversarial noise for controller inputs using deep reinforcement learning, and we propose using a reinforcement learning agent as an observer to reduce the effect of adversarial noise; such an observer may also help transfer knowledge from simulation to the real world. We performed a sequence of analyses of adversarial reinforcement learning and deep reinforcement learning. Through this analysis, we find that a deep reinforcement learning agent trained in an ideal environment is not robust to adversarial noise, whereas training in an adversarial environment makes the agent robust in both adversarial and non-adversarial environments. We make several conjectures about the phenomena we observe, propose how a deep reinforcement learning agent can better use state information, and propose using a neural network to automatically find policies that optimize a cost objective. Finally, we discuss possible future work.

    Committee: Abhishek Gupta (Advisor); Wei Zhang (Committee Member) Subjects: Artificial Intelligence; Computer Science; Electrical Engineering; Engineering
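
As a concrete, if highly simplified, picture of adversarial noise on a controller's numerical state input, the sketch below perturbs the observation seen by a proportional controller within a small budget, choosing the perturbation by random search rather than by a learned adversary as in the thesis. The plant dynamics and gains are invented for illustration.

```python
# Toy illustration: a bounded perturbation of the observed state degrades tracking.
import numpy as np

rng = np.random.default_rng(0)

def rollout(noise, k_p=0.8, target=1.0, steps=50):
    """Integrator plant x' = u, with the controller seeing x + noise[t]."""
    x, err = 0.0, 0.0
    for t in range(steps):
        observed = x + noise[t]
        u = k_p * (target - observed)     # proportional control on the noisy observation
        x = x + 0.1 * u
        err += (target - x) ** 2
    return err

steps, budget = 50, 0.2
clean_error = rollout(np.zeros(steps))
worst = max(rollout(rng.uniform(-budget, budget, size=steps)) for _ in range(200))
print(f"clean tracking error {clean_error:.3f}, worst perturbed error {worst:.3f}")
```
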
  • 11. Crawford, Marianna The student teacher's use of verbal reinforcement in the classroom.

    Master of Science, The Ohio State University, 1972, Graduate School

    Committee: Not Provided (Other) Subjects:
  • 12. Inks, Lawrence An examination of the effects of feedback sharing and self-assessment knowledge on distortion in performance evaluations under conditions of high and low outcome dependence /

    Master of Arts, The Ohio State University, 1986, Graduate School

    Committee: Not Provided (Other) Subjects:
  • 13. Short, Jo. The effects of reinforcing and modeling related verbal behaviors on mealtime behaviors of emotionally disturbed children.

    Master of Arts, The Ohio State University, 1982, Graduate School

    Committee: Not Provided (Other) Subjects:
  • 14. Holsinger, Adam Deep Learning-Driven Innovations in PPG Sensing Technology

    Master of Computer Science (M.C.S.), University of Dayton, 2024, Computer Science

    Photoplethysmogram (PPG) signals are highly versatile biological signals that can be extracted from a subject and used for several purposes, including health monitoring and continuous authentication. Interest in PPG extraction techniques has been growing recently, especially in the field of remote PPG (rPPG), which seeks to extract PPG signals from a subject without a contact device. rPPG models process video in various ways to distill its most relevant information for cardiac signal extraction, but artifacts caused by subject motion and lighting conditions remain a problem: they distort the waveform and make heart rate extraction difficult. In the first project of this thesis, we propose a noise-aware post-processor network that takes position, head pose, and luminance information generated from the video as noise-correlating signals and uses them to denoise an rPPG signal produced by an existing base network. We show effective results on two separate datasets, PURE and MMPD, using two different representative base models, DeepPhys and PhysNet, in conjunction with our post-processor, reducing mean absolute error by an average of 26% across all tests. Additionally, we develop novel techniques for contact PPG sensor manipulation and control. The ability to control a physical PPG sensor has wide-ranging applications in the PPG space, from testing medical devices or continuous authentication systems to performing presentation attacks. In the second project of this thesis, we devise a system to teach a reinforcement learning agent to manipulate a physical PPG sensor to match a target signal using Proximal Policy Optimization (PPO). We show strong results on three representative signals and discuss effective training and reward strategies to overcome the difficulties presented by each signal.

    Committee: Tianming Zhao (Committee Chair); David Kapp (Committee Member); Temesgen Kebede (Committee Member); Zhongmei Yao (Committee Member) Subjects: Computer Science
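
The post-processor described above is a neural network; as a simpler illustration of the same underlying idea (using noise-correlated side signals such as head pose and luminance to clean up an rPPG estimate), the sketch below regresses those channels out of a synthetic signal with ordinary least squares. All signals and coefficients are synthetic assumptions.

```python
# Regress noise-correlated side channels out of a synthetic rPPG estimate.
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0, 10, 300)
pulse = np.sin(2 * np.pi * 1.2 * t)                    # ~72 bpm ground-truth pulse
motion = rng.normal(0, 1, size=t.size).cumsum() / 20   # drifting head-pose proxy
luminance = np.sin(2 * np.pi * 0.2 * t)                # slow lighting change
rppg = pulse + 0.8 * motion + 0.5 * luminance + rng.normal(0, 0.1, t.size)

# Fit the noisy estimate on the noise channels and subtract the explained part.
X = np.column_stack([motion, luminance, np.ones_like(t)])
coef, *_ = np.linalg.lstsq(X, rppg, rcond=None)
denoised = rppg - X @ coef

print("correlation with true pulse before:", np.corrcoef(rppg, pulse)[0, 1].round(3))
print("correlation with true pulse after: ", np.corrcoef(denoised, pulse)[0, 1].round(3))
```
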
  • 15. Rugerinyange, Aime Regis Enhancing Compressive Properties Of Sls-Printed Nylon Lattice Structures Using Thermoset Reinforcement Coatings And Graphene Nanofillers Integration

    Master of Science, Miami University, 2024, Mechanical and Manufacturing Engineering

    The selective laser sintering (SLS) technique has emerged as an important additive manufacturing method, facilitating the fabrication of complex lattice structures known for their high stiffness-to-weight ratios. However, these structures face mechanical limitations, such as low compressive strength and energy absorption, restricting their use in demanding industries like aerospace and automotive. This study addresses these challenges by reinforcing SLS-printed Nylon 12 (Polyamide 12, PA12) lattice structures with thermoset resins (Bisphenol A, BPA epoxy), forming layered composites that significantly improve compressive performance. A continuous-rotation coating technique was introduced to overcome the uneven reinforcement observed with traditional dip-coating methods, achieving a uniform resin distribution. The optimized coating method resulted in a 13% improvement in compressive yield strength compared to dip-coated samples, contributing to an overall 139% increase relative to unreinforced PA12. Further enhancement was achieved by incorporating functionalized graphene nanofillers into the PA12/thermoset matrix, with the optimal configuration (a 68:32 PA12-to-BPA-epoxy ratio with 0.1 wt% graphene) yielding a 201% increase in compressive yield strength and a 154% increase in specific energy absorption. Image analysis confirmed improved adhesion and structural integrity in the samples with the optimal configuration. The findings provide a pathway toward industrial applications of SLS-printed lattice structures, enabling lightweight, high-strength components for the aerospace and automotive industries.

    Committee: Muhammad Jahan (Advisor); Kumar Singh (Committee Member); Jinjuan She (Committee Member); Yingbin Hu (Committee Member) Subjects: Aerospace Materials; Automotive Materials; Engineering; Materials Science; Mechanical Engineering
  • 16. Hemmelgarn, Jessica A components analysis of video modeling and reinforcement of social interaction during game playing of children with autism /

    Master of Arts, The Ohio State University, 2006, Graduate School

    Committee: Not Provided (Other) Subjects:
  • 17. Morisano, Elaine Light-increment as a sensory reinforcer in the redwinged blackbird /

    Master of Arts, The Ohio State University, 1971, Graduate School

    Committee: Not Provided (Other) Subjects:
  • 18. Rodriguez, Vanessa Evaluating the effects of play skills training on stimulus preference in individuals with severe to profound mental retardation /

    Master of Arts, The Ohio State University, 2005, Graduate School

    Committee: Not Provided (Other) Subjects:
  • 19. Casper, Karianne A comparison on the effects of a job card procedure and contingent reinforcement on the transition behaviors of preschoolers with special needs /

    Master of Arts, The Ohio State University, 2006, Graduate School

    Committee: Not Provided (Other) Subjects:
  • 20. Schad, Robert Building Neural Maps of Motor Primitives via Self-Organization and Reinforcement Learning

    MS, University of Cincinnati, 2024, Engineering and Applied Science: Computer Science

    Complex animals such as mammals are thought to construct high level movement via a motor control hierarchy, where more complex movements are constructed by combining modular motor primitives at lower levels. The primitives at the lowest level are likely to be short sequences of simple movements. Learning the primitives and a control hierarchy is a complex task in a high degree-of-freedom (DOF) system. Experimental results have shown evidence for a spatially-organized repertoire of movements in the mammalian cortex. Drawing inspiration from this idea, this thesis presents a simple, biologically-inspired neural network model for learning a motor primitive map using reinforcement learning rather than explicit supervised learning. The map self-organizes static code vectors that, via a recurrently-connected output layer, are decoded into spatiotemporal behavior representing primitive movements of an organism, which can be “chained” together to generate a wide variety of complex, sophisticated movements.

    Committee: Ali Minai Ph.D. (Committee Chair); John Gallagher Ph.D. (Committee Member); Raj Bhatnagar Ph.D. (Committee Member) Subjects: Computer Science
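
The abstract above combines self-organization with reinforcement learning; the sketch below shows only the map-formation step, a standard self-organizing map (SOM) update over synthetic code vectors. The reward modulation and the recurrent decoding layer described in the thesis are omitted, and all dimensions are illustrative.

```python
# Minimal self-organizing map (SOM) update sketch over synthetic code vectors.
import numpy as np

rng = np.random.default_rng(0)
MAP_SIZE, CODE_DIM = 8, 4                     # 8x8 map of 4-dimensional code vectors
weights = rng.uniform(0, 1, size=(MAP_SIZE, MAP_SIZE, CODE_DIM))
rows, cols = np.indices((MAP_SIZE, MAP_SIZE))

def som_update(x, lr=0.3, sigma=1.5):
    """Move the best-matching unit and its map neighborhood toward the input code x."""
    dists = np.linalg.norm(weights - x, axis=-1)
    bmu = np.unravel_index(np.argmin(dists), dists.shape)
    grid_dist2 = (rows - bmu[0]) ** 2 + (cols - bmu[1]) ** 2
    neighborhood = np.exp(-grid_dist2 / (2 * sigma ** 2))[..., None]
    weights[:] = weights + lr * neighborhood * (x - weights)
    return bmu

for step in range(2000):
    som_update(rng.uniform(0, 1, size=CODE_DIM))
```
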