Files
Thesis-revised.pdf (1.27 MB)
Long Document Understanding using Hierarchical Self Attention Networks
Author
Kekuda, Akshay
ORCID® Identifier
http://orcid.org/0000-0001-6124-2319
Permalink:
http://rave.ohiolink.edu/etdc/view?acc_num=osu1669745538050042
Abstract Details
Year and Degree
2022, Master of Science, Ohio State University, Computer Science and Engineering.
Abstract
Natural Language Processing (NLP) techniques are widely used in industry to solve a variety of business problems. In this work, we apply NLP techniques to the task of understanding call interactions between customers and customer service representatives and extracting useful insights from these conversations. We focus on understanding the call transcripts of these interactions, which fall under the category of long document understanding. Existing work on text encoding typically addresses short-form text. Deep learning models such as the vanilla Transformer, BERT, and DistilBERT have achieved state-of-the-art performance on a variety of tasks involving short-form text but perform poorly on long documents. To address this issue, modifications of the Transformer model have been released in the form of Longformer and BigBird. However, all of these models require heavy computational resources, which are often unavailable to small companies operating under budget constraints. To address these concerns, we survey a variety of explainable and lightweight text encoders that can be trained easily in a resource-constrained environment. We also propose Hierarchical Self Attention based models that outperform DistilBERT, Doc2Vec, and single-layer self-attention networks on downstream tasks such as text classification. The proposed architecture has been put into production at the local industry organization that sponsored the research (SafeAuto Inc.) and helps the company monitor the performance of its customer service representatives.
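For context, the following is a minimal sketch (in PyTorch) of the hierarchical self-attention idea the abstract describes: tokens within each segment of a long transcript are contextualized and pooled into segment vectors, which are in turn attended over to form a document vector for classification. All names, dimensions, and the specific two-level structure here are illustrative assumptions, not the thesis's exact architecture.

# Hypothetical illustration: a two-level hierarchical self-attention encoder.
# Words within each segment are attended over and pooled into a segment
# vector; segment vectors are then attended over and pooled into a document
# vector. This sketches the general idea, not the author's exact model.
import torch
import torch.nn as nn

class SelfAttentionPool(nn.Module):
    # One self-attention layer followed by additive attention pooling.
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.score = nn.Linear(dim, 1)

    def forward(self, x):                        # x: (batch, seq, dim)
        h, _ = self.attn(x, x, x)                # contextualize the sequence
        w = torch.softmax(self.score(h), dim=1)  # pooling weights: (batch, seq, 1)
        return (w * h).sum(dim=1)                # weighted sum -> (batch, dim)

class HierarchicalSelfAttention(nn.Module):
    def __init__(self, vocab_size, dim=128, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim, padding_idx=0)
        self.word_level = SelfAttentionPool(dim)  # attention within a segment
        self.seg_level = SelfAttentionPool(dim)   # attention across segments
        self.classifier = nn.Linear(dim, n_classes)

    def forward(self, tokens):
        # tokens: (batch, n_segments, seg_len) ids; the long transcript is
        # assumed pre-split into fixed-length segments (e.g., utterances).
        b, s, l = tokens.shape
        words = self.embed(tokens.view(b * s, l))         # (b*s, l, dim)
        seg_vecs = self.word_level(words).view(b, s, -1)  # (b, s, dim)
        doc_vec = self.seg_level(seg_vecs)                # (b, dim)
        return self.classifier(doc_vec)

# Usage: classify 2 transcripts, each split into 8 segments of 32 tokens.
model = HierarchicalSelfAttention(vocab_size=10000)
logits = model(torch.randint(1, 10000, (2, 8, 32)))      # -> shape (2, 2)

Because attention at the word level is confined to each segment, the cost grows linearly with the number of segments rather than quadratically with the full transcript length, which is what makes such models feasible in a resource-constrained setting.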
Committee
Eric Fosler-Lussier (Committee Chair)
Rajiv Ramnath (Advisor)
Pages
71 p.
Subject Headings
Artificial Intelligence; Computer Science
Keywords
NLP, Attention Networks; BERT; Transformer; Long Document; LSTM; RNN; Self Attention; Hierarchical Self Attention; Call Transcripts
Recommended Citations
APA Style (7th edition)
Kekuda, A. (2022). Long Document Understanding using Hierarchical Self Attention Networks [Master's thesis, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1669745538050042

MLA Style (8th edition)
Kekuda, Akshay. Long Document Understanding using Hierarchical Self Attention Networks. 2022. Ohio State University, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1669745538050042.

Chicago Manual of Style (17th edition)
Kekuda, Akshay. "Long Document Understanding using Hierarchical Self Attention Networks." Master's thesis, Ohio State University, 2022. http://rave.ohiolink.edu/etdc/view?acc_num=osu1669745538050042
Document number:
osu1669745538050042
Download Count:
106
Copyright Info
© 2022, all rights reserved.
This open access ETD is published by The Ohio State University and OhioLINK.