Vector Space Model - Limitations

Limitations

The vector space model has the following limitations:

  1. Long documents are poorly represented because they have poor similarity values (a small scalar product and a large dimensionality)
  2. Search keywords must precisely match document terms; word substrings might result in a "false positive match"
  3. Semantic sensitivity; documents with similar context but different term vocabulary won't be associated, resulting in a "false negative match".
  4. The order in which the terms appear in the document is lost in the vector space representation.
  5. Assumes terms are statistically independent
  6. Weighting is intuitive but not very formal

Many of these difficulties can, however, be overcome by the integration of various tools, including mathematical techniques such as singular value decomposition and lexical databases such as WordNet.

Read more about this topic:  Vector Space Model

Famous quotes containing the word limitations:

    The limitations of pleasure cannot be overcome by more pleasure.
    Mason Cooley (b. 1927)

    The motion picture made in Hollywood, if it is to create art at all, must do so within such strangling limitations of subject and treatment that it is a blind wonder it ever achieves any distinction beyond the purely mechanical slickness of a glass and chromium bathroom.
    Raymond Chandler (1888–1959)