Definitions
Documents and queries are represented as vectors.
Each dimension corresponds to a separate term. If a term occurs in the document, its value in the vector is non-zero. Several different ways of computing these values, also known as (term) weights, have been developed. One of the best-known schemes is tf-idf weighting (see the example below).
The definition of a term depends on the application. Typically, terms are single words, keywords, or longer phrases. If words are chosen as the terms, the dimensionality of the vector is the number of words in the vocabulary (the number of distinct words occurring in the corpus).
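As a rough illustration of the representation described above, the following sketch builds a vocabulary from a tiny corpus and maps each document to a vector with one dimension per word. The function names (`build_vocabulary`, `to_vector`), the whitespace tokenization, and the use of raw term counts as weights are assumptions made for this example; other weighting schemes such as tf-idf would replace the counts.

```python
from collections import Counter


def build_vocabulary(corpus):
    """Return the distinct words of the corpus in sorted order (hypothetical helper)."""
    return sorted({word for text in corpus for word in text.lower().split()})


def to_vector(text, vocabulary):
    """Map a text to a term-count vector with one dimension per vocabulary word."""
    counts = Counter(text.lower().split())
    # A word that does not occur in the text gets weight 0.
    return [counts[word] for word in vocabulary]


# Toy corpus, invented for illustration only.
corpus = ["the cat sat", "the dog sat down"]
vocabulary = build_vocabulary(corpus)
vectors = [to_vector(text, vocabulary) for text in corpus]

print(vocabulary)  # the dimensionality equals len(vocabulary)
print(vectors)
```

Here every vector has as many components as there are distinct words in the corpus, and a component is non-zero exactly when the corresponding word occurs in the document.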
Vector operations, such as computing the cosine of the angle between two vectors, can be used to compare documents with queries.
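One commonly used comparison is cosine similarity, which measures the angle between a query vector and a document vector. The sketch below is a minimal example under stated assumptions: the function name `cosine_similarity`, the document labels `d1` and `d2`, and the toy vectors are all invented for illustration, and returning 0.0 for an all-zero vector is a convention chosen here, not something the model prescribes.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Assumed convention: a vector with no terms matches nothing.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical term-weight vectors over a shared five-word vocabulary.
documents = {
    "d1": [1, 0, 0, 1, 1],
    "d2": [0, 1, 1, 1, 1],
}
query = [1, 0, 0, 0, 1]

# Rank documents by decreasing similarity to the query.
ranked = sorted(
    documents,
    key=lambda name: cosine_similarity(query, documents[name]),
    reverse=True,
)
print(ranked)
```

Because the cosine depends only on direction, a long document and a short one with the same proportions of terms receive the same score against a given query.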