Multiple-Instance Learning from Distributions
Doran, Gary Brian, Jr.

2015, Doctor of Philosophy, Case Western Reserve University, EECS - Computer and Information Sciences.
I propose a new theoretical framework for analyzing the multiple-instance learning (MIL) setting. In MIL, training examples are provided to a learning algorithm in the form of labeled sets, or "bags," of instances. Applications of MIL include 3-D quantitative structure-activity relationship prediction for drug discovery and content-based image retrieval for web search. The goal of an algorithm is to learn a function that correctly labels new bags or a function that correctly labels new instances. I propose that bags should be treated as latent distributions from which samples are observed. I show that, under weak assumptions, it is possible to learn accurate instance- and bag-labeling functions in this setting, as well as functions that correctly rank bags or instances. Additionally, my theoretical results suggest that it is possible to learn to rank efficiently using traditional, well-studied "supervised" learning approaches. These results also indicate that supervised approaches for learning from distributions can be used to directly learn bag-labeling functions efficiently. I perform an extensive empirical evaluation that supports the theoretical predictions entailed by the new framework. In addition to showing how supervised approaches can be applied to MIL, I prove new hardness results on using MI-specific algorithms to learn hyperplane labeling functions for instances. Finally, I propose a new resampling approach for MIL, analyze it under the new theoretical framework, and show that it can improve the performance of MI classifiers when training set sizes are small. In summary, the proposed theoretical framework leads to a better understanding of the relationship between the MI and standard supervised learning settings, and it provides new methods for learning from MI data that are more accurate, more efficient, and have better-understood theoretical properties than existing MI-specific algorithms.
Soumya Ray (Advisor)
Harold Connamacher (Committee Member)
Michael Lewicki (Committee Member)
Stanislaw Szarek (Committee Member)
Kiri Wagstaff (Committee Member)
248 p.

Recommended Citations

APA Citation

Doran, G. (2015). Multiple-Instance Learning from Distributions. (Electronic Thesis or Dissertation). Retrieved from https://etd.ohiolink.edu/

MLA Citation

Doran, Gary. "Multiple-Instance Learning from Distributions." Electronic Thesis or Dissertation. Case Western Reserve University, 2015. OhioLINK Electronic Theses and Dissertations Center. 16 Dec 2017.

Chicago Citation

Doran, Gary. "Multiple-Instance Learning from Distributions." Electronic Thesis or Dissertation. Case Western Reserve University, 2015. https://etd.ohiolink.edu/

Files

gdoran_dissertation.pdf (2.86 MB)