ISO/IEC 8859-1 - Similar Character Sets

Similar Character Sets

ISO-8859-1 was incorporated as the first 256 code points of ISO/IEC 10646 and Unicode.
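As a minimal sketch of what this means in practice, the following Python snippet checks that each byte value decodes to the Unicode code point with the same number. It assumes Python's built-in "iso-8859-1" codec, which also maps the C0 and C1 control ranges one-to-one; the variable names are illustrative only.

```python
# All 256 possible byte values, in order.
data = bytes(range(256))

# Decode them with Python's built-in ISO-8859-1 codec.
text = data.decode("iso-8859-1")

# Each decoded character should have the same code point as its byte value.
same_numbering = [ord(ch) for ch in text] == list(range(256))

# Encoding the text again should reproduce the original bytes.
round_trip_ok = text.encode("iso-8859-1") == data

print(same_numbering, round_trip_ok)
```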

The lower range 32 to 126 (hex 20 to 7E, the G0 subset) maps exactly to the same coded G0 subset of the ISO 646 US variant (commonly known as ASCII), whose ISO 2022 standard switch sequence is "ESC ( B". The higher range 160 to 255 (hex A0 to FF, the G1 subset) maps exactly to the same subset initiated by the ISO 2022 standard switch sequence "ESC . A".

ISO/IEC 8859-1 lacks the euro sign and some characters needed for French and Finnish text. To provide some of these characters, ISO/IEC 8859-15 was developed as an update of ISO/IEC 8859-1. This required, however, the removal of some infrequently used characters from ISO/IEC 8859-1, including fraction symbols and letter-free diacritics: ¤, ¦, ¨, ´, ¸, ¼, ½, and ¾.
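The byte positions that changed can be listed by decoding every high byte with both encodings and comparing the results. This is a rough sketch that relies on Python's built-in "iso-8859-1" and "iso-8859-15" codecs; the helper name differing_bytes is hypothetical.

```python
def differing_bytes(codec_a="iso-8859-1", codec_b="iso-8859-15"):
    """Return (byte, char_a, char_b) for each high byte the codecs decode differently."""
    diffs = []
    for value in range(0xA0, 0x100):
        raw = bytes([value])
        char_a = raw.decode(codec_a)
        char_b = raw.decode(codec_b)
        if char_a != char_b:
            diffs.append((value, char_a, char_b))
    return diffs


# Print one line per changed position: byte, old character, new character.
for value, old, new in differing_bytes():
    print(f"0x{value:02X}: {old!r} -> {new!r}")
```

Only the positions listed above differ; every other byte decodes identically in both encodings.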

The popular Windows-1252 character set adds all the missing characters provided by ISO/IEC 8859-15, plus a number of typographic symbols, by replacing the rarely used C1 controls in the range 128 to 159 (hex 80 to 9F). Text data is very commonly mislabeled with the charset label ISO-8859-1 even though it is really Windows-1252 encoded. To accommodate such mislabeling, many web browsers and e-mail clients interpret ISO-8859-1 control codes as Windows-1252 characters. This is not standard behaviour, however, and care should be taken to avoid generating these characters in ISO-8859-1 labeled content.
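The effect of the mislabeling can be illustrated with a short Python sketch. The function decode_mislabeled is a hypothetical helper, not a standard API: it decodes each byte as Windows-1252 and falls back to ISO-8859-1 for the few byte values that Python's "cp1252" codec leaves undefined, which roughly imitates the lenient behaviour described above.

```python
def decode_mislabeled(raw: bytes) -> str:
    """Decode bytes labeled ISO-8859-1 as if they were Windows-1252 (illustrative only)."""
    chars = []
    for value in raw:
        single = bytes([value])
        try:
            chars.append(single.decode("cp1252"))
        except UnicodeDecodeError:
            # A few bytes (such as 0x81) are undefined in Python's cp1252 codec;
            # keep them as the corresponding C1 control character instead.
            chars.append(single.decode("iso-8859-1"))
    return "".join(chars)


raw = b"\x93quoted\x94 \x96 \x80 5"

strict = raw.decode("iso-8859-1")   # bytes 0x80-0x9F become invisible C1 controls
lenient = decode_mislabeled(raw)    # same bytes become quotes, a dash and a euro sign

print(repr(strict))
print(repr(lenient))
```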

The Apple Macintosh computer introduced a character encoding called Mac Roman in 1984. It was meant to be suitable for Western European desktop publishing. Like ISO-8859-1, it is a superset of ASCII, and it has most of the characters that are in ISO-8859-1, but in a totally different arrangement. A later version, registered with IANA as "Macintosh", replaced the generic currency sign ¤ with the euro sign €. The few printable characters that are in ISO-8859-1 but not in this set are often a source of trouble when editing text on websites using older Macintosh browsers (including the last version of Internet Explorer for Mac). However, most of the extra characters that Windows-1252 has in the C1 code point range are supported in Mac Roman; the exceptions are Š, š, Ž, and ž.
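The ISO-8859-1 characters that Mac Roman lacks can be found by trying to encode each one. The sketch below assumes Python's built-in "mac_roman" codec, which corresponds to the later, euro-sign version of the encoding; missing_from is a hypothetical helper name.

```python
def missing_from(codec: str, first: int = 0xA0, last: int = 0xFF) -> list:
    """Return the characters in the given code point range that the codec cannot encode."""
    missing = []
    for code_point in range(first, last + 1):
        ch = chr(code_point)
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            missing.append(ch)
    return missing


# Printable ISO-8859-1 characters (hex A0 to FF) with no Mac Roman equivalent.
not_in_mac_roman = missing_from("mac_roman")

for ch in not_in_mac_roman:
    print(f"U+{ord(ch):04X} {ch!r}")
```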

DOS had code page 850, which had all printable characters that ISO-8859-1 had (albeit in a totally different arrangement) plus the most widely used graphic characters from code page 437.
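A quick way to check this claim is to encode every printable ISO-8859-1 character with code page 850 and compare byte values. The snippet is a sketch that assumes Python's built-in "cp850" codec matches the code page described here; the names are illustrative.

```python
# The printable upper half of ISO-8859-1 (hex A0 to FF).
latin1_printable = [chr(cp) for cp in range(0xA0, 0x100)]


def can_encode(ch: str, codec: str) -> bool:
    """Return True if the codec has a byte for this character."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Characters of ISO-8859-1 that code page 850 cannot represent.
unsupported = [ch for ch in latin1_printable if not can_encode(ch, "cp850")]

# Characters that sit at a different byte value in the two encodings.
moved = [
    ch for ch in latin1_printable
    if can_encode(ch, "cp850") and ch.encode("cp850") != ch.encode("iso-8859-1")
]

print(len(unsupported), len(moved))
```

For instance, the letter é is byte E9 in ISO-8859-1 but byte 82 in code page 850.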
