Department: Management Science
One match in the database.

1.
Elkalifa, Elsuni Sidahmed.
The effect of collection homogeneity on term association as a method of request expansion in information retrieval.
Degree: PhD, Management Science, 1991, Case Western Reserve University
Statistical techniques have been proposed as alternatives to traditional methods of request expansion or feedback mechanisms. These statistical measures are derived from formulas which attempt to correlate two given index terms on the basis of their frequency of co-occurrence in the documents of a given collection. These techniques attempt to relax the retrieval requirement that the request terms should exactly match the document descriptors before the documents can be judged relevant to the request. Simple though the concept seems, the complexity of natural language and the irregularities that govern its syntactic and semantic structure make the application of such techniques rather complicated. As a result, most previous investigations failed to produce any efficient alternatives to traditional information retrieval systems. A major problem is false or spurious association between semantically and conceptually independent terms. It is believed that the failure of these studies is mainly due to the heterogeneity of the collections used rather than to the inefficiency of the techniques themselves. A combination of two techniques is used to create a more powerful request expansion technique. Cluster analysis techniques are used to subdivide the document collection into smaller, more homogeneous collections; then term association techniques are applied to determine which terms could be used to expand the original request. The method used to compute the degree of association between original request terms and document descriptors is based on the formula $R_{j1} = \frac{N(W_j W_1)}{N(W_j) + N(W_1) - N(W_j W_1)}$, where: $R_{j1}$ is the coefficient of association between term $W_j$ and term $W_1$; $N(W_j W_1)$ = number of documents in which both term $W_j$ and term $W_1$ appeared; $N(W_j)$ = number of documents in which term $W_j$ occurred; $N(W_1)$ = number of documents in which term $W_1$ occurred.
The document file consisted of the significant words in the titles, abstracts, and identifiers of 150 documents. Three search strategies were formulated for each request: the first consisted of the original search terms; the second included terms extracted from the entire collection; and the third consisted of a combination of terms extracted from specific clusters and the original request terms. Results indicate that statistical term association techniques are effective methods of request expansion. (Abstract shortened with permission of author.)
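The association coefficient described in the abstract can be illustrated with a minimal Python sketch. This is not the author's implementation: the function names `association` and `expand_request`, the `threshold` cutoff, and the toy documents are all assumptions made for illustration, and the abstract does not say how associated terms were selected for expansion. Each document is modeled simply as a set of index terms.

```python
from itertools import chain


def association(docs, term_a, term_b):
    """Coefficient N(a,b) / (N(a) + N(b) - N(a,b)) over a list of term sets."""
    n_a = sum(1 for d in docs if term_a in d)
    n_b = sum(1 for d in docs if term_b in d)
    n_ab = sum(1 for d in docs if term_a in d and term_b in d)
    denom = n_a + n_b - n_ab
    # If neither term occurs anywhere, treat the association as zero.
    return n_ab / denom if denom else 0.0


def expand_request(docs, request_terms, threshold=0.5):
    """Add every collection term associated with some request term.

    The threshold is a hypothetical cutoff, not a value from the thesis.
    """
    request_terms = set(request_terms)
    vocabulary = set(chain.from_iterable(docs))
    expanded = set(request_terms)
    for candidate in vocabulary - request_terms:
        if any(association(docs, t, candidate) >= threshold
               for t in request_terms):
            expanded.add(candidate)
    return expanded


# Toy collection (invented for illustration): one term set per document.
docs = [
    {"retrieval", "indexing", "query"},
    {"retrieval", "query"},
    {"clustering", "documents"},
    {"retrieval", "indexing"},
]

print(association(docs, "retrieval", "query"))
print(sorted(expand_request(docs, {"retrieval"})))
```

To mirror the cluster-based strategy in the abstract, the same `expand_request` call could be applied to the documents of a single cluster instead of the whole collection; how the clusters are formed is outside this sketch.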
Advisors/Committee Members: Saracevic, Tefko.
Subjects: Information Science
Keywords: collection homogeneity term association request expansion information retrieval