Internet Media Type - Limitations

Internet media types are often used as part of a communication protocol between two applications (the source and the destination). In this context, internet media type specifiers run into two main problems.

The first problem is the ability of the source application (e.g. a web server or email client) to determine the correct internet media type for a piece of content. Many applications classify a file heuristically, using its filename extension or magic numbers. Neither approach is perfect, and either may misclassify the content:

  • Incorrect filename extension: a filename extension classifier will report an incorrect media type. For instance, some applications incorrectly give Rich Text Format files the .doc file extension instead of the correct .rtf extension.
  • No filename extension: a filename extension classifier will report no media type, or will (incorrectly) report a catch-all type such as application/octet-stream. Files without an extension are common on Unix systems.
  • Filename extension collisions: when multiple formats use the same filename extension, a filename extension classifier will choose one media type arbitrarily. For instance, both Microsoft Word templates and Graphviz graph files use the extension .dot.
  • Ambiguous container formats: a magic number classifier may give a correct but non-specific media type, which prevents a meaningful interpretation of the content. For instance, Office Open XML (.docx) documents and Java archives (.jar) are both ZIP archives internally. A magic number system may classify such files as application/zip instead of the more specific type. Similar problems occur between XML and application formats implemented on top of XML.
  • Ambiguous magic numbers: an attacker can create a file that is identified simultaneously as two separate internet media types. For instance, the internal structure of a GIFAR makes it both a valid GIF image and a valid Java archive.
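
Several of the failure modes above can be reproduced with a minimal sketch of the two kinds of classifier. The lookup tables and the function names classify_by_extension and classify_by_magic are hypothetical and far smaller than anything a real application would use; they are chosen only to illustrate the cases in the list.

```python
import os

# Hypothetical, deliberately tiny lookup tables (illustration only).
EXTENSION_TYPES = {
    ".rtf": "application/rtf",
    ".doc": "application/msword",
    # Collision: .dot is also used by Graphviz (text/vnd.graphviz),
    # but a table like this can hold only one answer per extension.
    ".dot": "application/msword",
    ".gif": "image/gif",
    ".zip": "application/zip",
}

# (leading bytes, media type) pairs, checked in order.
MAGIC_NUMBERS = [
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"PK\x03\x04", "application/zip"),
    (b"{\\rtf", "application/rtf"),
]

FALLBACK = "application/octet-stream"


def classify_by_extension(filename):
    """Guess a media type from the filename extension alone."""
    _, ext = os.path.splitext(filename)
    return EXTENSION_TYPES.get(ext.lower(), FALLBACK)


def classify_by_magic(data):
    """Guess a media type from the leading bytes of the content alone."""
    for prefix, media_type in MAGIC_NUMBERS:
        if data.startswith(prefix):
            return media_type
    return FALLBACK


# Incorrect extension: RTF content saved as .doc.
rtf_bytes = b"{\\rtf1\\ansi Hello}"
print(classify_by_extension("letter.doc"), "vs", classify_by_magic(rtf_bytes))

# No extension: only the catch-all type is available.
print(classify_by_extension("Makefile"))

# Extension collision: a Graphviz file is reported as a Word template.
print(classify_by_extension("graph.dot"))

# Ambiguous container: a .docx or .jar starts with the ZIP signature.
print(classify_by_magic(b"PK\x03\x04" + b"\x00" * 26))
```

In this sketch the two classifiers disagree about the RTF file, and neither can report anything more specific than application/zip for a ZIP-based container without inspecting the archive's contents.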

The second problem is the destination application's ability to trust the internet media type reported by the sender. As above, the internet media type is incorrect in some circumstances, and must be treated with skepticism. As early as 2002, the W3C unambiguously warned that it is a "serious error" for the internet media type to be incorrect, and that software should not attempt to guess a correct media type. Nonetheless, software engineering principles encourage software that forgives a certain degree of malformed input, and user experience suffers when software fails to interpret the content correctly. Consequently, many destination applications are designed to attempt recovery from such errors and identify a correct media type.

The destination application has no more knowledge of the content than the source application, so inferring the media type at the destination is just as difficult. This can lead to incompatibilities between source and destination applications and, in the worst case, to security vulnerabilities such as the GIFAR attack or cross-site scripting attacks. Advanced content sniffing approaches have been proposed to balance interoperability and security in such situations.
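
The trade-off can be illustrated with a simplified sketch of a conservative destination-side policy. This is not one of the proposed content sniffing algorithms; the function name resolve_media_type, the signature table, and the rule itself are assumptions made for illustration. The idea is to trust any informative declared type, sniff only when the declaration is missing or a catch-all, and never let sniffing produce a scriptable type such as text/html.

```python
# Hypothetical signature table (illustration only). It deliberately contains
# no HTML or script types, so sniffing can never turn untyped content into
# something the destination would execute.
SIGNATURES = [
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"%PDF-", "application/pdf"),
]

# Declared types that carry no real information about the content.
UNINFORMATIVE = {None, "", "application/octet-stream"}


def resolve_media_type(declared, body):
    """Pick the media type the destination will act on.

    declared: the media type string reported by the source, or None.
    body: the leading bytes of the content.
    """
    if declared not in UNINFORMATIVE:
        # An informative declaration is respected, even if it looks wrong.
        return declared
    for prefix, media_type in SIGNATURES:
        if body.startswith(prefix):
            return media_type
    return "application/octet-stream"


# A declared type is never overridden, so script-like bytes stay plain text.
print(resolve_media_type("text/plain", b"<script>alert(1)</script>"))

# A missing type is recovered from the content when a signature matches.
print(resolve_media_type(None, b"GIF89a\x01\x00\x01\x00"))

# Unlabelled HTML is not upgraded to a scriptable type.
print(resolve_media_type("application/octet-stream", b"<html><body>hi</body></html>"))
```

A policy like this gives up some interoperability (mislabelled content is not corrected) in exchange for not escalating content to a more dangerous type.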
