Clustering and Relevance
The cluster hypothesis, proposed by C. J. van Rijsbergen in 1979, asserts that two documents that are similar to each other have a high likelihood of being relevant to the same information need. With respect to the embedding similarity space, the cluster hypothesis can be interpreted globally or locally. The global interpretation assumes that there exists some fixed set of underlying topics derived from inter-document similarity. These global clusters or their representatives can then be used to relate the relevance of two documents (e.g. two documents in the same cluster should both be relevant to the same request). Methods in this spirit include:
- cluster-based information retrieval
- cluster-based document expansion such as latent semantic analysis or its language modeling equivalents
Global methods require that the clusters, either in isolation or in combination, successfully model the set of possible relevant documents.
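As an illustration of the global interpretation, the following is a minimal sketch of cluster-based retrieval, not a reference implementation. It assumes documents are already embedded as small vectors and already assigned to a fixed set of clusters; the function names (`cosine`, `centroid`, `cluster_retrieve`), the toy vectors, and the choice of cosine similarity against cluster centroids are all assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cluster_retrieve(query, docs, assignments):
    """Return the documents of the cluster whose centroid best matches the query.

    docs: dict mapping document id -> vector (hypothetical embeddings)
    assignments: dict mapping document id -> cluster label (assumed given)
    """
    clusters = {}
    for doc_id, label in assignments.items():
        clusters.setdefault(label, []).append(doc_id)

    # Pick the cluster whose centroid is most similar to the query.
    best = max(
        clusters,
        key=lambda label: cosine(query, centroid([docs[d] for d in clusters[label]])),
    )
    # Rank the documents inside that cluster by their own similarity to the query.
    return sorted(clusters[best], key=lambda d: cosine(query, docs[d]), reverse=True)


# Toy data: two clusters of two documents each.
docs = {"d1": [1.0, 0.0], "d2": [0.9, 0.1], "d3": [0.0, 1.0], "d4": [0.1, 0.9]}
assignments = {"d1": "A", "d2": "A", "d3": "B", "d4": "B"}
ranked = cluster_retrieve([1.0, 0.2], docs, assignments)
print(ranked)
```

In this sketch every document in the selected cluster is returned, which reflects the global assumption that membership in the same cluster implies relevance to the same request. If the clusters model the collection poorly, relevant documents in other clusters are missed entirely.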
A second interpretation, most notably advanced by Ellen Voorhees, focuses on the local relationships between documents. The local interpretation avoids having to model the number or size of clusters in the collection and allows relevance at multiple scales. Methods in this spirit include:
- multiple cluster retrieval
- spreading activation and relevance propagation methods
- local document expansion
- score regularization
Local methods require an accurate and appropriate document similarity measure.
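The local interpretation can be illustrated with a small score-smoothing sketch in the spirit of score regularization. This is a simplified assumption, not a specific published algorithm: each document's retrieval score is repeatedly blended with the similarity-weighted average score of its neighbours. The function name `regularize_scores`, the mixing weight `alpha`, the iteration count, and the toy similarity matrix are hypothetical.

```python
def regularize_scores(scores, sim, alpha=0.5, iterations=20):
    """Smooth initial retrieval scores over a document similarity matrix.

    scores: list of initial scores, one per document
    sim: square matrix (list of lists) of non-negative pairwise similarities
    alpha: weight given to neighbours' scores (assumed value)
    """
    n = len(scores)

    # Row-normalise the similarities, ignoring each document's self-similarity.
    weights = []
    for i in range(n):
        row = [sim[i][j] if j != i else 0.0 for j in range(n)]
        total = sum(row)
        weights.append([w / total if total else 0.0 for w in row])

    current = list(scores)
    for _ in range(iterations):
        # Blend each original score with the weighted average of neighbours' scores.
        current = [
            (1 - alpha) * scores[i]
            + alpha * sum(weights[i][j] * current[j] for j in range(n))
            for i in range(n)
        ]
    return current


# Toy data: document 1 is similar to the high-scoring document 0;
# document 2 has no similar neighbours.
initial = [1.0, 0.0, 0.0]
similarity = [
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
smoothed = regularize_scores(initial, similarity)
print(smoothed)
```

Document 1 gains score from its similar neighbour while the isolated document 2 does not, so the outcome depends entirely on the similarity matrix. No cluster count or cluster size has to be chosen.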