Search Results (1 - 4 of 4 Results)


Subramanian, Nandita. Analysis of Rank Distance for Malware Classification
MS, University of Cincinnati, 2016, Engineering and Applied Science: Computer Science
Malicious cyber adversaries may compromise the security of a system by denying access to legitimate users. This is often coupled with immeasurable loss of confidential data, which leads to hefty losses for a corporation in both finances and trustworthiness. Malware exploits key vulnerabilities in applications, causing problems such as identity theft and unapproved software installations. Despite the abundance of malware detection and removal techniques, current tools remain relatively inefficient at detecting malicious software. Currently available techniques detect software that carries known signatures. These methods are efficient for known malware; however, most malware writers are aware of signature-based detection and are working towards bypassing it. Machine learning based systems for malware classification and detection have been tested and shown to be more efficient than standard signature-based systems. A key justification for using machine learning techniques is that even previously unseen malware can be detected, reducing detection failures and providing very high success rates. Our method uses efficient machine learning techniques for classification and detection of portable executable (PE) files of various malware classes commonly found on computers running Windows operating systems. For malicious files, computing the distance between two files should indicate how similar they are. On this basis, this thesis analyses different approaches to classifying malicious files using a method known as rank distance. This distance measure is combined with a feature extraction method based on mutual information, which analyses the opcode n-gram sequences extracted from the PE files and selects the most relevant opcodes among them.
The most relevant opcodes, thus obtained, are used as features to identify which class a given file belongs to. An opcode relevance profile generated based on mutual information and the unclassified file are compared and assigned the respective rank distances for every class. Using these ranks, a distance between the two files is obtained. The class which has the least distance to the file is concluded to be the class of the file under scrutiny.
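
The nearest-profile idea described in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes one common formulation of rank distance (items scored from the top of each ranking downward, absent items scored 0), ranks a file's opcode n-grams by raw frequency, and takes the per-class profiles as given rather than deriving them from mutual information. The names `ngrams`, `ranking`, `rank_distance`, `classify`, and the example profiles are all hypothetical.

```python
from collections import Counter


def ngrams(opcodes, n):
    """Join each run of n consecutive opcodes into one n-gram string."""
    return [" ".join(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]


def ranking(tokens, top_k=None):
    """Order distinct tokens by descending frequency (ties broken alphabetically)."""
    counts = Counter(tokens)
    ordered = sorted(counts, key=lambda t: (-counts[t], t))
    return ordered[:top_k] if top_k else ordered


def rank_distance(r1, r2):
    """Sum of absolute score differences over the union of two rankings.

    Assumed scoring: the first item of a ranking of length k scores k,
    the last scores 1, and an item missing from a ranking scores 0.
    """
    s1 = {x: len(r1) - i for i, x in enumerate(r1)}
    s2 = {x: len(r2) - i for i, x in enumerate(r2)}
    return sum(abs(s1.get(x, 0) - s2.get(x, 0)) for x in set(s1) | set(s2))


def classify(file_ranking, class_profiles):
    """Return the class whose profile has the smallest rank distance to the file."""
    return min(class_profiles,
               key=lambda c: rank_distance(file_ranking, class_profiles[c]))


# Hypothetical class profiles (most relevant opcode 2-grams first).
PROFILES = {
    "worm": ["mov push", "push call", "call ret"],
    "trojan": ["xor xor", "jmp jmp", "mov push"],
}
```

For a file whose ranked 2-grams are `["mov push", "call ret", "push call"]`, `classify` picks the profile with the smaller distance; in a real pipeline the profiles would come from the mutual-information step the abstract describes.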

Committee:

Anca Ralescu, Ph.D. (Committee Chair); Chia Han, Ph.D. (Committee Member); Dan Ralescu, Ph.D. (Committee Member)

Subjects:

Computer Science

Keywords:

Rank Distance;Malware Classification;Mutual Information;Text Mining;Similarity Measures;Windows Malware

Bani-Ahmad, Sulieman Ahmad. RESEARCH-PYRAMID BASED SEARCH TOOLS FOR ONLINE DIGITAL LIBRARIES
Doctor of Philosophy, Case Western Reserve University, 2008, Computing and Information Science

In any online literature digital library, findability precedes usability: users cannot use what they cannot find. Four research directions that support better findability in digital libraries are:

(a) Accurate scoring functions to assign importance/prestige scores to publications,

(b) Accurate similarity measures for publications to locate publications similar to a given publication,

(c) Accurate ranking measures to order search results based on their importance and relevance to users' interests, and

(d) Helping users develop search keywords that lead to successful searches.

The contributions of this thesis are as follows.

1. Propose and comparatively evaluate score functions for publications, authors, and publication venues, as well as similarity measures for publications, towards research direction items a and b.

2. Validate a new model for the evolution of research and citation behavior, namely, the Research Pyramid Model. Then, propose and evaluate two algorithms for identifying research pyramid structures in publication citation graphs, and for research-pyramid-based publication score generation, towards research direction item a.

3. Propose and evaluate a citation-based publication popularity growth and decay model, towards research direction item c.

4. Using the Research Pyramid Model and the identified research pyramid structures, develop two literature digital library searching and ranking tools:

a. A Research-Pyramid-based ranking tool that assigns accurate scores for publications, and

b. A scalable content-driven Search-Keyword Suggester that helps users to put together query search terms effectively.

Committee:

Gultekin Ozsoyoglu, Prof (Advisor); Mehmet Koyuturk, Asst. Prof. (Committee Member); G. Q. Zhang, Assoc. Prof. (Committee Member); Frank Merat, Prof. (Committee Member); Meral Ozsoyoglu, Prof. (Committee Member)

Subjects:

Computer Science

Keywords:

Literature Digital Libraries; the research-pyramid model; publication score and similarity measures.

Goyal, Vivek. A Recommendation System Based on Multiple Databases.
MS, University of Cincinnati, 2013, Engineering and Applied Science: Computer Science
Recommendation systems have long served the e-commerce industry with recommendations for movies, books, travel packages, et cetera. A user's activity or past purchase history is used to generate predictions for that user. YouTube's video recommendation system, Amazon's "You may also like..." and Pandora's music recommendation system are a few very popular examples. Both explicit and implicit feedback are used to predict a customer's preferences and recommend items. As recommendation systems have evolved, two types have become predominant: content-based and collaborative-filtering-based recommendation systems. Content-based recommendation systems are designed to recommend items similar to those a user has liked in the past. Recommendation systems based on collaborative filtering recommend items liked by similar users: users who have liked similar items are identified, and items highly liked by those users are recommended. For both content-based and collaborative-filtering-based recommendation systems to predict a rating, it is essential to establish a similarity between items. We have explored correlation and clustering to establish similarity. It was observed that correlation captured similarity better than clustering alone. With the intuition that clustering items into similar groups and then employing correlation to determine similarities could improve predictions, we developed an algorithm that combines clustering and correlation to generate a prediction for an item rating. We have also experimented with adding contextual information to generate better predictions. Our results suggest that predictions generated by clustering alone improved when clustering was replaced with correlation. Further, the combination of both improved the predictions over clustering alone, but correlation still delivered the best results overall.
It was established that bringing in more information may not always help. In this thesis we compare these three algorithms and present our analysis with results.
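
As a rough illustration of correlation-based item similarity, the sketch below predicts a rating as a similarity-weighted average of the user's ratings on positively correlated items. It assumes Pearson correlation over co-rating users, which the abstract does not specify; the clustering and context steps are omitted, and the names `pearson`, `predict`, and the toy `RATINGS` data are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two items over the users who rated both.

    a and b map user -> rating. Returns 0.0 when fewer than two users
    overlap or when either item has zero variance on the overlap.
    """
    common = [u for u in a if u in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[u] for u in common) / len(common)
    mean_b = sum(b[u] for u in common) / len(common)
    num = sum((a[u] - mean_a) * (b[u] - mean_b) for u in common)
    den = (sqrt(sum((a[u] - mean_a) ** 2 for u in common))
           * sqrt(sum((b[u] - mean_b) ** 2 for u in common)))
    return num / den if den else 0.0


def predict(user, target, ratings):
    """Predict user's rating of target from positively correlated items.

    ratings maps item -> {user: rating}. Returns None when the user has
    rated no item that correlates positively with the target.
    """
    num = den = 0.0
    for item, by_user in ratings.items():
        if item == target or user not in by_user:
            continue
        weight = pearson(ratings[target], by_user)
        if weight > 0:
            num += weight * by_user[user]
            den += weight
    return num / den if den else None


# Toy data: item B tracks item A closely, item C runs opposite to it.
RATINGS = {
    "A": {"u1": 5, "u2": 3, "u3": 1},
    "B": {"u1": 5, "u2": 3, "u3": 1, "u4": 4},
    "C": {"u1": 1, "u2": 3, "u3": 5, "u4": 2},
}
```

Calling `predict("u4", "A", RATINGS)` uses only item B, since item C is negatively correlated with A. A combined approach like the one the abstract describes would presumably restrict the candidate items to the target's cluster before computing correlations.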

Committee:

Raj Bhatnagar, Ph.D. (Committee Chair); Prabir Bhattacharya, Ph.D. (Committee Member); Karen Davis, Ph.D. (Committee Member)

Subjects:

Computer Science

Keywords:

Collaborative Filtering;Similarity measures;Recommendation System;Neighborhood Model;Fuzzy Clustering;Data Mining

Yu, Xinran. Mathematical and Experimental Investigation of Ontological Similarity Measures and Their Use in Biomedical Domains
Master of Computer Science, Miami University, 2010, Computer Science and Systems Analysis
Similarity measurement is an important notion. In the context of ontologies, similarity measures are used to determine how similar one concept is to another. Because graph models have been used to represent ontologies, a variety of algorithms have been proposed for calculating the similarity between the graph nodes that represent ontological concepts. This thesis reviews existing ontological similarity measures and investigates a wide range of them mathematically and experimentally. The objective is not to assess performance against a gold standard of similarity judgment but to develop a better understanding of the relationships among these measures by comparing their results when applied to the Gene Ontology. The experimental results show that some ontological similarity measures, especially information content-based measures, are highly correlated. The results of experiments comparing corpus-based to ontology-based information content measures for the Gene Ontology support previous experimental results using WordNet, which demonstrated little difference between the two approaches.
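
To make "information content-based measure" concrete, the sketch below computes a corpus-based information content, IC(c) = -log p(c), and the Resnik similarity (the IC of the most informative common ancestor) on a toy ontology. Resnik is used only as a familiar example of this family; the abstract does not say which measures the thesis evaluates. The term names, the annotation counts, and the single-root assumption are hypothetical.

```python
import math

# Hypothetical toy ontology: term -> list of parent terms (single root).
PARENTS = {
    "root": [],
    "metabolism": ["root"],
    "signaling": ["root"],
    "lipid_metabolism": ["metabolism"],
    "sugar_metabolism": ["metabolism"],
}

# Hypothetical number of annotations attached directly to each term.
DIRECT_COUNTS = {
    "metabolism": 2,
    "signaling": 4,
    "lipid_metabolism": 1,
    "sugar_metabolism": 1,
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(c) = -log p(c), where p(c) counts annotations of c and its descendants."""
    freq = dict.fromkeys(PARENTS, 0)
    for term, n in DIRECT_COUNTS.items():
        for a in ancestors(term):
            freq[a] += n
    total = freq["root"]  # assumes every annotation reaches the single root
    return {t: -math.log(f / total) for t, f in freq.items() if f > 0}


def resnik(t1, t2, ic):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(t1) & ancestors(t2)
    return max(ic[c] for c in common if c in ic)


IC = information_content()
```

Here `resnik("lipid_metabolism", "sugar_metabolism", IC)` equals the IC of `metabolism`, while two terms that share only the root have similarity 0. An ontology-based variant would estimate p(c) from the graph structure (for example, descendant counts) instead of annotation counts.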

Committee:

Valerie Cross, PhD (Advisor); Alton Sanders, PhD (Committee Member); Eric Bachmann, PhD (Committee Member)

Subjects:

Computer Science

Keywords:

ontology; ontological similarity measures; Gene Ontology; information content