Web Archiving - Aspects of Web Curation

Aspects of Web Curation

Web curation, like any digital curation, entails:

  • Certification of the trustworthiness and integrity of the collection content
  • Collecting verifiable Web assets
  • Providing Web asset search and retrieval
  • Semantic and ontological continuity and comparability of the collection content
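
The integrity requirement in the list above is often met in practice with fixity checks: a digest is recorded when an asset is collected and recomputed later to detect change. The following is a minimal sketch of that idea, not a method taken from any particular curation tool. The function names `build_manifest` and `verify`, the choice of SHA-256, and the example URIs are assumptions made for illustration.

```python
import hashlib


def sha256_digest(data):
    """Return the hex SHA-256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(assets):
    """Map each asset URI to the digest of its content at collection time.

    `assets` is a hypothetical in-memory collection: {uri: content_bytes}.
    """
    return {uri: sha256_digest(body) for uri, body in assets.items()}


def verify(assets, manifest):
    """Return the sorted URIs that are missing or whose content has changed."""
    return sorted(
        uri
        for uri, digest in manifest.items()
        if uri not in assets or sha256_digest(assets[uri]) != digest
    )


# Record digests at collection time (example data, not real captures).
collection = {
    "http://example.org/": b"<html>home</html>",
    "http://example.org/about": b"<html>about</html>",
}
manifest = build_manifest(collection)

# Later: one asset has been altered, so verification reports it.
collection["http://example.org/about"] = b"<html>tampered</html>"
changed = verify(collection, manifest)
```

A real archive would store the manifest separately from the assets, since a digest kept alongside the content it protects can be altered together with it.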

Besides methods of collecting the Web, a discussion of Web curation must therefore also cover methods of providing access, certifying content, and organizing the collection. A set of popular tools addresses these curation steps:

A suite of tools for Web curation by the International Internet Preservation Consortium:

  • Heritrix - collects Web assets
  • NutchWAX - searches Web archive collections
  • Wayback (open source Wayback Machine) - searches and navigates Web archive collections using NutchWAX
  • Web Curator Tool - selection and management of Web collections

Other open source tools for manipulating web archives:

  • WARC Tools - for creating, reading, parsing, and manipulating web archives programmatically
  • Search Tools - for indexing and searching full-text and metadata within web archives
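
To give a sense of what creating and reading a web archive programmatically involves, here is a minimal sketch that writes and parses uncompressed WARC/1.0 records using only the Python standard library. It is not the API of WARC Tools or of any other package. The helper names `write_record` and `read_records` are hypothetical, and the sketch assumes one header per line with no gzip compression, which real archives commonly use.

```python
import io
import uuid
from datetime import datetime, timezone


def write_record(stream, target_uri, payload,
                 record_type="resource", content_type="text/plain"):
    """Append one uncompressed WARC/1.0 record to a binary stream."""
    headers = [
        ("WARC-Type", record_type),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("WARC-Date",
         datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", target_uri),
        ("Content-Type", content_type),
        ("Content-Length", str(len(payload))),
    ]
    stream.write(b"WARC/1.0\r\n")
    for name, value in headers:
        stream.write(f"{name}: {value}\r\n".encode("utf-8"))
    stream.write(b"\r\n")      # blank line ends the header block
    stream.write(payload)      # content block, exactly Content-Length bytes
    stream.write(b"\r\n\r\n")  # two CRLFs terminate the record


def read_records(stream):
    """Yield (headers, payload) for each record in an uncompressed stream."""
    while True:
        version = stream.readline()
        if not version:
            return  # clean end of file
        if not version.startswith(b"WARC/"):
            raise ValueError(f"expected a WARC version line, got {version!r}")
        headers = {}
        for line in iter(stream.readline, b"\r\n"):
            if not line:
                raise ValueError("file ended inside a record header")
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip()] = value.strip()
        payload = stream.read(int(headers["Content-Length"]))
        stream.read(4)  # skip the CRLF CRLF record terminator
        yield headers, payload


# Round trip through an in-memory buffer (example data, not real captures).
buffer = io.BytesIO()
write_record(buffer, "http://example.org/", b"first page\r\nwith two lines")
write_record(buffer, "http://example.org/about", b"second page")
buffer.seek(0)
records = list(read_records(buffer))
```

Reading the content block by its declared `Content-Length` rather than scanning for a delimiter is what lets a payload contain blank lines of its own. For production archives, a maintained WARC library is a safer choice than a hand-written parser.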
