Han Unification - Rationale and Controversy

Rules for Han unification are given in the East Asian Scripts chapter of the various versions of the Unicode Standard (Chapter 12 in Unicode 6.0). The Ideographic Rapporteur Group (IRG), made up of experts from the Chinese-speaking countries, North and South Korea, Japan, Vietnam, and other countries, is responsible for the process.

One possible rationale is the desire to limit the size of the full Unicode character set, since CJK characters, if represented as discrete ideograms, may approach or exceed 100,000 (while those required for ordinary literacy in any one language are probably under 3,000). Version 1 of Unicode was designed to fit into 16 bits, and only 20,940 code points (32%) of the possible 65,536 were reserved for the CJK Unified Ideographs. Unicode was later extended to 21 bits, allowing many more CJK characters (75,960 are assigned, with room for more).
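The arithmetic behind these figures can be checked with a short Python sketch. The counts are taken from the paragraph above; the variable names are illustrative, and the upper limit U+10FFFF is the end of the current Unicode code space.

```python
# Size of the original 16-bit code space.
BMP_SIZE = 2 ** 16

# Code points reserved for CJK Unified Ideographs in Unicode 1 (figure from the text).
cjk_reserved = 20_940
share = cjk_reserved / BMP_SIZE

# The extended code space runs from U+0000 to U+10FFFF.
MAX_CODE_POINT = 0x10FFFF
code_space = MAX_CODE_POINT + 1
bits_needed = MAX_CODE_POINT.bit_length()

print(f"16-bit code space: {BMP_SIZE} code points")
print(f"CJK share of that space: {share:.1%}")
print(f"Extended code space: {code_space} code points ({bits_needed} bits)")
```

The 32% quoted in the text is this share rounded to the nearest percent.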

The article "The secret life of Unicode", published on IBM developerWorks, attempts to illustrate part of the motivation for Han unification:

The problem stems from the fact that Unicode encodes characters rather than "glyphs," which are the visual representations of the characters. There are four basic traditions for East Asian character shapes: traditional Chinese, simplified Chinese, Japanese, and Korean. While the Han root character may be the same for CJK languages, the glyphs in common use for the same characters may not be, and new characters were invented in each country. For example, the traditional Chinese glyph for "grass" uses four strokes for the "grass" radical 艹, whereas the simplified Chinese, Japanese, and Korean glyphs use three. But there is only one Unicode point for the grass character (U+8349) regardless of writing system. Another example is the ideograph for "one" (壹, 壱, or 一), which is different in Chinese, Japanese, and Korean. Many people think that the three versions should be encoded differently.

In fact, the three ideographs for "one" are encoded separately in Unicode, as they are not considered national variants. The first and second are used on financial instruments to prevent tampering (they may be considered variants), while the third is the common form in all three countries.
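The difference between the two cases can be inspected directly. The following Python sketch, which uses only the standard `unicodedata` module, prints the code point of each form of "one" and of the unified "grass" character; the helper name `describe` is illustrative.

```python
import unicodedata


def describe(ch):
    """Return the code point (as U+XXXX) and the Unicode name of one character."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch)


# The three ideographs for "one" each have their own code point.
one_forms = ["壹", "壱", "一"]
code_points = {ch: describe(ch)[0] for ch in one_forms}

# The "grass" character has a single code point; whether its radical is drawn
# with three or four strokes is decided by the font, not by the encoding.
grass = describe("草")

for ch, cp in code_points.items():
    print(ch, cp)
print("草", *grass)
```

Because the "grass" variants share one code point, plain text alone cannot record which regional glyph is intended; that choice is left to the font or to language tagging in the surrounding document.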

However, Han unification has also caused considerable controversy, particularly among the Japanese public, who, together with the nation's literati, have a history of protesting the culling of historically and culturally significant variants. (See the "Orthographic reform and lists of kanji" section of the Kanji article. Today, the list of characters officially recognized for use in proper names continues to expand at a modest pace.)
