Performance Evaluation of Analytical Queries on a Stand-alone and Sharded Document Store

Raghavendra, Aarthi

2015, MS, University of Cincinnati, Engineering and Applied Science: Computer Science.
Numerous organizations perform data analytics on relational databases by executing data mining queries, which include complex joins and aggregate functions. However, due to an explosion of data in terms of volume, variety, veracity, velocity, and value, known as Big Data [1], many organizations such as Foursquare, Adobe, and Bosch have migrated to NoSQL databases [2] such as MongoDB [3] and Cassandra [4]. We intend to demonstrate the performance impact an organization can expect for analytical queries on a NoSQL document store. In this thesis, we benchmark the performance of MongoDB [3], a cross-platform document-oriented database, for datasets of 1GB and 5GB in a stand-alone environment and a sharded environment. The stand-alone MongoDB environment is the same for all datasets, whereas the configuration of the MongoDB cluster varies with the dataset size. The TPC-DS benchmark [5] is used to generate data at different scales, and selected data mining queries are executed in both environments. Our experimental results show that, along with the choice of environment, data modeling in MongoDB has a significant impact on query execution times. MongoDB is an appropriate choice when the data has a flexible structure, and analytical query performance is best when data is stored in a denormalized fashion. When the data is sharded, the multiple query predicates in an analytical query make aggregating data from a few or all nodes an expensive process, so these queries perform poorly compared with executing the same queries in a stand-alone environment.
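To make the denormalization point concrete, the following is a minimal sketch in plain Python. It is not the thesis's schema or code. The field names are borrowed from TPC-DS-style `store_sales` and `item` tables purely for illustration, and the helper functions `denormalize` and `revenue_by_category` are hypothetical. It shows dimension records being embedded in each fact document so that an aggregate can be computed without a join at query time.

```python
from collections import defaultdict

# Normalized layout (illustrative data): sales reference items by key.
items = {
    1: {"i_item_sk": 1, "i_category": "Books"},
    2: {"i_item_sk": 2, "i_category": "Music"},
}
store_sales = [
    {"ss_item_sk": 1, "ss_sales_price": 10.0},
    {"ss_item_sk": 2, "ss_sales_price": 4.5},
    {"ss_item_sk": 1, "ss_sales_price": 7.5},
]


def denormalize(sales, item_lookup):
    """Embed a copy of each sale's item record inside the sale document."""
    return [
        {
            "ss_sales_price": sale["ss_sales_price"],
            "item": dict(item_lookup[sale["ss_item_sk"]]),
        }
        for sale in sales
    ]


def revenue_by_category(denormalized_sales):
    """Sum sales price per item category using only the embedded data."""
    totals = defaultdict(float)
    for doc in denormalized_sales:
        totals[doc["item"]["i_category"]] += doc["ss_sales_price"]
    return dict(totals)


docs = denormalize(store_sales, items)
result = revenue_by_category(docs)
```

The trade-off in this sketch is the usual one for embedding: each sale document carries a duplicate of its item record, which costs storage and complicates updates, in exchange for an aggregation that reads a single collection.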
Karen Davis, Ph.D. (Committee Chair)
Raj Bhatnagar, Ph.D. (Committee Member)
Paul Talaga, Ph.D. (Committee Member)
73 p.

Recommended Citations

  • Raghavendra, A. (2015). Performance Evaluation of Analytical Queries on a Stand-alone and Sharded Document Store [Master's thesis, University of Cincinnati]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1447688210

    APA Style (7th edition)

  • Raghavendra, Aarthi. Performance Evaluation of Analytical Queries on a Stand-alone and Sharded Document Store. 2015. University of Cincinnati, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=ucin1447688210.

    MLA Style (8th edition)

  • Raghavendra, Aarthi. "Performance Evaluation of Analytical Queries on a Stand-alone and Sharded Document Store." Master's thesis, University of Cincinnati, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1447688210

    Chicago Manual of Style (17th edition)