Over the past decade, the pervasive use of object-oriented languages and the increasing complexity of problems solved by computer software have led to the proliferation of large-scale framework-intensive applications. These applications are typically built by combining standard packages (e.g., J2EE application server frameworks), third-party layers for domain-specific functionality, and in-house solutions. While employing libraries and frameworks eases the development effort, it can be expensive to invoke the general APIs provided by these libraries, especially when they are used for simple tasks. As a result, many applications suffer from excessive memory footprint caused by chronic runtime bloat that significantly impacts scalability and performance. In addition, programmers are taught to pay more attention to abstractions and models, and to leave the performance optimization to the runtime system. While a few redundant objects, calls, and field copies may seem insignificant, the problem quickly gets exacerbated due to nesting and layering. At some point, these inefficiencies accumulate, and compilers (e.g., the JIT in a JVM) can no longer eliminate them, since layers of abstractions grow to be deep, dynamic, and well beyond the capabilities of compiler analysis.
In this dissertation, the term bloat is used to refer to the general phenomenon of using excessive work and memory to achieve seemingly simple tasks. Bloat exists commonly in large-scale object-oriented programs and it is a major challenge that stands in the way of bridging the productivity-performance gap between managed and unmanaged languages. The overarching goal of our work is to find large bloat-related optimization opportunities, with a small amount of developer time. As a fundamental methodology to achieve this goal, we advocate tool-assisted manual optimization, in order to combine developer insight with the automated tool support.
In particular, we have designed, implemented, and evaluated code checking techniques accomplished by bytecode-level static analysis, and runtime optimization techniques achieved by VM-level dynamic analysis, for detecting and removing runtime bloat in large framework-intensive Java applications. These techniques can be used to (1) help a programmer detect the root cause if a problem is seen; (2) remove frequently-occurring bloat patterns; and (3) prevent bloat from occurring in an early stage of development.
Incorporating these techniques in an existing Java compiler or virtual machine will lead to increased developer productivity by helping programmers quickly locate bloat-related performance bottlenecks, as well as improved scalability of large programs by optimizing away bloat at run time. In addition, it is possible to generalize these approaches to handle similar problems in programs written in other object-oriented languages (e.g., C#), so that non-Java communities would also benefit from them. While these techniques are designed specifically to solve the bloat problem, many of them are applications of more general (theoretical) frameworks developed in the dissertation that can be instantiated to solve other problems.
Bloat contains wasteful operations that, while not strictly necessary for the forward progress, are executed nevertheless. We propose novel dynamic analysis techniques to detect such wasteful operations and to produce information that is necessary for the programmer to pinpoint the performance bottlenecks. One such analysis, as the first contribution of this dissertation, is copy profiling. This analysis is designed based on an important observation that the wasteful operations often consist of data copy activities that move data among heap locations without any useful computation. By profiling copies, this analysis looks for program regions containing large volumes of copies and data structures whose construction involves data copied frequently from other data structures.
Different from this “from-symptom-to-cause” approach that finds bloat from the symptoms through which it manifests, the second dynamic analysis this dissertation advocates attempts to capture directly the center of bloat, which is the set of operations that, while expensive to execute, produce values of little benefit. Detecting the operations that have high cost-benefit rates is, thus, an important step towards tracking down the causes of performance problems. In order to compute cost and benefit efficiently, we propose a novel technique, called abstract dynamic thin slicing, that performs dynamic thin slicing over bounded abstract domains. We demonstrate, using real-world examples, that this technique can also be adopted to solve a range of backward data flow problems efficiently. With the help of a variety of data aggregation approaches, these analyses can help a programmer quickly find potential performance problems.
The accumulation of bloat effects may result in memory leaks, which occur when object references that are no longer needed are unnecessarily maintained. Existing dynamic analyses for leak detection track fine-grained information about individual objects, producing results that are hard to interpret and lack precision. The third contribution of the dissertation is a novel container-based heap-tracking technique, based on the observation that many memory leaks in Java programs occur due to containers that keep references to unused data entries. By profiling containers and understanding their semantics, it is much easier to track down the causes of memory leak problems, compared to existing leak detection approaches based on the tracking of arbitrary objects.
Although container profiling is effective in detecting container-induced memory leaks, it has difficulties dealing with leaks that are caused by general unnecessary references instead of containers. In order to help programmers identify the root causes of these general leaks, we propose a specification-based dynamic technique called LeakChaser, as the fourth contribution of this dissertation. LeakChaser brings high-level application semantics into low-level leak detection by allowing programmers to specify and infer object liveness properties. This new technique exploits object lifetime relationships and uses varying levels of abstraction to help both experts and novices quickly explore the leaky behavior to pinpoint the leak cause.
Using these four dynamic analyses, we have found many interesting bloat patterns that can be regularly observed in the execution of large Java programs. A further step to avoid bloat is to develop static analyses that can find and remove such patterns during application development, so that small performance issues can be prevented before they accumulate and become significant.
One interesting pattern is the inefficient use of Java containers. The fifth contribution of this dissertation is a static analysis that identifies inefficiencies in the use of containers, regardless of inputs and runs. Specifically, this static analysis detects underutilized and overpopulated containers by employing a context-free-language (CFL)-reachability formulation of container operations, and by exploiting container-specific properties. The analysis is client-driven and demand-driven. It always generates highly-precise reports, but trades soundness for scalability. We show that this analysis exhibits small false positive rates, and large optimization opportunities can be found by inspecting the generated reports.
Another bloat pattern that we have regularly observed is constructing and initializing data structures that are invariant across loop iterations. As the sixth contribution of this dissertation, we develop a static analysis that uses a type and effect system to help programmers find such loop-invariant data structures. Instead of transforming a program to hoist these data structures automatically (which must be over-conservative and thus becomes ineffective in practice), we advocate a semi-automated approach that uses the static analysis to compute hoistability measurements. These measurements indicate how likely it is that these data structures can be hoisted, and are presented to the user for manual inspection. Our experimental results indicate that the proposed technique can be useful both in the development phase (for finding small performance issues before they accumulate) and in performance tuning (for identifying significant performance bottlenecks).
In conclusion, despite the existence of a number of profiling tools, this dissertation attempts to establish more systematic ways of identifying run-time inefficiencies. The dynamic analyses presented in the dissertation are implemented in J9 (a commercial JVM developed by IBM) and Jikes RVM (an open-source JVM written in Java). The static analyses are implemented in Soot, a popular Java program analysis framework. All of the proposed analyses have been shown to scale to real-world Java applications. Our experimental results strongly suggest that these techniques can be used in practice to find and remove bloat in large-scale applications, leading to performance gains. We hope that with the help of the analyses we have developed,
performance tuning could be made much easier and will no longer be a daunting task that requires special skills and experience. Developers should be able to easily understand performance and perform optimizations, when they are assisted by good tools and do not need to focus on every low-level detail of the execution behavior and the analysis process. The productivity-performance gap between managed languages and unmanaged languages could be further reduced by using these techniques and tools so that performance would no longer be an issue that stands in the way of using object-oriented languages to implement performance-critical systems. Furthermore, we hope that the examples and patterns discovered by this dissertation can be used to raise the awareness of bloat in real-world software development—developers should understand the performance impact of their decisions and should try to avoid these bloat patterns in order to have high-performance implementations.