Character Encoding Translation

Because many character encoding schemes are in use, and archived data must remain readable for backward compatibility, many programs have been developed to translate data between encodings. Some of them are listed below.

Cross-platform:

  • Web browsers – most modern web browsers feature automatic character encoding detection. On Firefox 3, for example, see the View/Character Encoding submenu.
  • iconv – program and standardized API to convert encodings
  • convert_encoding.py – Python-based utility to convert text files between arbitrary encodings and line endings.
  • decodeh.py – algorithm and module to heuristically guess the encoding of a string.
  • International Components for Unicode – a set of C and Java libraries to perform charset conversion. The uconv command-line tool is available from ICU4C.
  • chardet – a Python port of Mozilla's automatic encoding-detection code.
  • file – newer versions of the Unix file command attempt basic detection of character encoding (also available on Cygwin and Mac).
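
Most of the converters above follow the same two-step idea: decode the bytes into Unicode text, then encode that text in the target scheme. The Python sketch below illustrates this with the standard library only, together with a very naive detection heuristic that simply tries a list of candidate encodings in order. The function names and the candidate list are assumptions made for illustration; this is not how iconv, chardet, or decodeh.py are actually implemented.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Re-encode bytes from one character encoding to another."""
    # Decode to an abstract Unicode string, then encode in the target scheme.
    return data.decode(source).encode(target)


def guess_encoding(data: bytes,
                   candidates=("ascii", "utf-8", "cp1252", "latin-1")) -> str:
    """Return the first candidate encoding that decodes the data cleanly.

    Hypothetical heuristic: the order of the candidates decides ties, and
    latin-1 accepts any byte sequence, so it acts as a last resort.
    """
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    raise ValueError("no candidate encoding matched")


latin1_bytes = "café".encode("latin-1")
utf8_bytes = transcode(latin1_bytes, "latin-1", "utf-8")

print(utf8_bytes)                    # b'caf\xc3\xa9'
print(guess_encoding(utf8_bytes))    # utf-8
print(guess_encoding(latin1_bytes))  # cp1252
```

A trial-decoding heuristic like this can only say which encodings are possible, not which one was intended; statistical detectors exist precisely because many single-byte encodings accept the same bytes.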

Linux:

  • cmv – simple tool for transcoding filenames.
  • convmv – convert a filename from one encoding to another.
  • cstocs – convert file contents from one encoding to another
  • enca – analyzes encodings for given text files.
  • recode – convert file contents from one encoding to another
  • utrac – convert file contents from one encoding to another.
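
Filename transcoding differs from content transcoding only in what gets converted: the name is treated as a byte string and re-encoded, while the file contents stay untouched. A minimal sketch in Python, assuming a hypothetical helper called plan_rename that only computes the new name and does not touch the file system (it does not reproduce the behavior of cmv or convmv):

```python
from typing import Optional


def plan_rename(raw_name: bytes, source: str,
                target: str = "utf-8") -> Optional[bytes]:
    """Return the re-encoded filename, or None when no rename is needed.

    Hypothetical helper for illustration; raises UnicodeDecodeError if the
    name is not valid in the assumed source encoding.
    """
    new_name = raw_name.decode(source).encode(target)
    # Pure-ASCII names are identical in most encodings, so nothing changes.
    return None if new_name == raw_name else new_name


old = "Grüße.txt".encode("latin-1")

print(plan_rename(old, "latin-1"))           # b'Gr\xc3\xbc\xc3\x9fe.txt'
print(plan_rename(b"notes.txt", "latin-1"))  # None
```

On POSIX systems the resulting byte strings could be passed to os.rename, which accepts bytes paths, but computing the plan first makes it easy to review the changes before applying them.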

Windows:

  • Encoding.Convert – .NET API
  • MultiByteToWideChar/WideCharToMultiByte – Win32 API functions that convert between ANSI (code page) strings and Unicode (UTF-16) strings
  • cscvt – character set conversion tool
  • enca – analyzes encodings for given text files.
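
The ANSI-to-Unicode round trip can be imitated with Python codecs to show what such a conversion involves. The sketch below assumes code page 1252 as the ANSI code page and little-endian UTF-16 as the wide-character form; the helper names are hypothetical, and the code does not call the Win32 functions themselves.

```python
ANSI_CODE_PAGE = "cp1252"    # assumption: Western European Windows code page
WIDE_ENCODING = "utf-16-le"  # assumption: UTF-16 code units, little-endian


def ansi_to_wide(data: bytes, code_page: str = ANSI_CODE_PAGE) -> bytes:
    """Convert code-page bytes to UTF-16 bytes."""
    return data.decode(code_page).encode(WIDE_ENCODING)


def wide_to_ansi(data: bytes, code_page: str = ANSI_CODE_PAGE) -> bytes:
    """Convert UTF-16 bytes back to code-page bytes.

    Characters missing from the code page are replaced with "?", so this
    direction can lose information.
    """
    return data.decode(WIDE_ENCODING).encode(code_page, errors="replace")


euro_ansi = b"\x80"                  # the euro sign in cp1252
euro_wide = ansi_to_wide(euro_ansi)  # U+20AC as UTF-16-LE

print(euro_wide.hex())                             # ac20
print(wide_to_ansi(euro_wide) == euro_ansi)        # True
print(wide_to_ansi("日".encode(WIDE_ENCODING)))    # b'?'
```

The last line shows why the direction matters: every code-page character has a Unicode equivalent, but most Unicode characters have no representation in a single-byte code page.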
