Vector Space Model - Limitations

Limitations

The vector space model has the following limitations:

  1. Long documents are poorly represented because they have poor similarity values (a small scalar product and a large dimensionality)
  2. Search keywords must precisely match document terms; word substrings might result in a "false positive match"
  3. Semantic sensitivity; documents with similar context but different term vocabulary won't be associated, resulting in a "false negative match".
  4. The order in which the terms appear in the document is lost in the vector space representation.
  5. Assumes terms are statistically independent
  6. Weighting is intuitive but not very formal

Many of these difficulties can, however, be overcome by the integration of various tools, including mathematical techniques such as singular value decomposition and lexical databases such as WordNet.

Read more about this topic:  Vector Space Model

Famous quotes containing the word limitations:

    ... art transcends its limitations only by staying within them.
    Flannery O’Connor (1925–1964)

    To note an artist’s limitations is but to define his talent. A reporter can write equally well about everything that is presented to his view, but a creative writer can do his best only with what lies within the range and character of his deepest sympathies.
    Willa Cather (1876–1947)

    That all may be so, but when I begin to exercise that power I am not conscious of the power, but only of the limitations imposed on me.
    William Howard Taft (1857–1930)