GB18030(5)

NAME

       GB18030,  gb18030  -  A Chinese character set that extends
       GBK by means of 4-byte code points

DESCRIPTION

       The GB18030-2000 character set,  defined  by  the  Chinese
       national standard organization, is an extension of the GBK
       character  set,  which  itself  is  an  extension  to  the
       GB2312-80  character set. (See the GBK(5) reference page.)

       GB18030 incorporates GBK support for all the Hanzi
       characters specified by the Unicode Version 3.0 and
       ISO/IEC 10646-2001 standards.

   GB18030 Code Space and Code Points
       The GB18030 character set has 1-byte, 2-byte,  and  4-byte
       encoding with the following structure:

       -----------------------------------------------------------------
       Number of Bytes   Code Space                   Total Code Points
       -----------------------------------------------------------------
       1-byte            0x00 to 0x7F                 128
       2-byte            Byte 1: 0x81 to 0xFE         23940
                         Byte 2: 0x40 to 0xFE
                                 (except 0x7F)
       4-byte            Byte 1: 0x81 to 0xFE         1587600
                         Byte 2: 0x30 to 0x39
                         Byte 3: 0x81 to 0xFE
                         Byte 4: 0x30 to 0x39
       -----------------------------------------------------------------
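
       The totals in the table follow from the sizes of the byte
       ranges. The following Python sketch is illustrative only and
       is not part of the operating system; the function name
       sequence_length is an assumption made for this example. It
       recomputes the totals and shows how the first two bytes
       determine the length of a sequence:

```python
# Sizes of the byte ranges listed in the table above.
LEAD = 0xFE - 0x81 + 1             # 126 values, 0x81 to 0xFE
TRAIL2 = (0xFE - 0x40 + 1) - 1     # 190 values, 0x40 to 0xFE except 0x7F
DIGIT = 0x39 - 0x30 + 1            # 10 values, 0x30 to 0x39

ONE_BYTE_TOTAL = 0x7F + 1                       # 0x00 to 0x7F
TWO_BYTE_TOTAL = LEAD * TRAIL2                  # byte 1 x byte 2
FOUR_BYTE_TOTAL = LEAD * DIGIT * LEAD * DIGIT   # bytes 1 through 4


def sequence_length(data, pos=0):
    """Return 1, 2, or 4 for the sequence that starts at data[pos].

    Only the first two bytes are inspected; bytes 3 and 4 of a
    4-byte sequence are left for the caller to validate.
    Raises ValueError if the bytes do not fit the code space.
    """
    first = data[pos]
    if first <= 0x7F:
        return 1
    if not 0x81 <= first <= 0xFE or pos + 1 >= len(data):
        raise ValueError("invalid or truncated GB18030 sequence")
    second = data[pos + 1]
    if 0x30 <= second <= 0x39:
        return 4
    if 0x40 <= second <= 0xFE and second != 0x7F:
        return 2
    raise ValueError("invalid second byte")
```

       The second byte alone separates 2-byte from 4-byte codes,
       because the range 0x30 to 0x39 does not overlap the 2-byte
       trail range 0x40 to 0xFE.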

       The GB18030 1-byte codes provide support for ASCII. The
       2-byte codes provide support for all the CJK (Chinese,
       Japanese, and Korean) characters defined in the Unicode
       Version 2.1 standard. The 4-byte codes provide support for
       the Unicode Version 3.0 additions to Version 2.1. The
       4-byte code space also leaves a large number of unassigned
       code points that are available for future use.
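
       As an illustration of the three code lengths, the following
       Python sketch encodes one character of each kind. It assumes
       that Python's built-in "gb18030" codec is an acceptable
       stand-in for the operating system's GB18030 support; the
       sample characters were chosen for this example and do not
       come from the standard itself:

```python
# Assumption: Python's "gb18030" codec stands in for the operating
# system's GB18030 support.
samples = {
    "A": "ASCII letter",
    "\u4e2d": "Hanzi defined in Unicode 2.1 (U+4E2D)",
    "\u3400": "Hanzi added in Unicode 3.0 (U+3400)",
}

lengths = {}
for ch, label in samples.items():
    encoded = ch.encode("gb18030")
    lengths[ch] = len(encoded)
    # Show the code point, a description, and the encoded bytes in hex.
    print(f"U+{ord(ch):04X} {label}: {encoded.hex()} ({len(encoded)} bytes)")
```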

       The GB18030 character set maps the invalid Unicode code
       points U+FFFE and U+FFFF to 4-byte codes. Because these
       two characters are invalid in UCS, this mapping can cause
       problems with round-trip character conversions.

       The GB18030 character set does not map any 4-byte code to
       the UCS surrogate area (U+D800 through U+DFFF).
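
       An application can check text for these code points before
       converting it. The following Python sketch is one possible
       pre-conversion check; the function name conversion_hazards
       and the reason strings are hypothetical and are not provided
       by the operating system:

```python
def conversion_hazards(text):
    """Return (index, code point, reason) for each risky character."""
    hazards = []
    for index, ch in enumerate(text):
        cp = ord(ch)
        if cp in (0xFFFE, 0xFFFF):
            # Mapped to 4-byte codes by GB18030 but invalid in UCS.
            hazards.append((index, cp, "invalid in UCS; round trip may fail"))
        elif 0xD800 <= cp <= 0xDFFF:
            # No 4-byte code maps to the surrogate area.
            hazards.append((index, cp, "surrogate area; no GB18030 code"))
    return hazards
```

       An empty result means that the text contains none of the
       code points described above.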

   Codeset Converters for GB18030
       The following codeset converter pairs are available for
       converting Simplified Chinese characters between GB18030
       and UCS formats. Refer to Unicode(5) for more information
       about the UTF-16, UCS-4, and UTF-8 encoding formats. Refer
       to iconv_intro(5) for an introduction to codeset
       conversion.

       UTF-16_GB18030, GB18030_UTF-16
              Converting from and to UTF-16 format

       UCS-4_GB18030, GB18030_UCS-4
              Converting from and to UCS-4 format

       UTF-8_GB18030, GB18030_UTF-8
              Converting from and to UTF-8 format
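
       The effect of each converter pair can be sketched in Python
       as a decode followed by an encode. This example does not
       call the operating system's converters. The codec names, the
       big-endian byte order chosen for UTF-16 and UCS-4, and the
       helper name convert are all assumptions made for this
       illustration:

```python
# Assumed mapping from the format names above to Python codec names.
# "utf-32-be" stands in for UCS-4; the byte order that the operating
# system's converters use may differ.
PYTHON_CODEC = {
    "GB18030": "gb18030",
    "UTF-8": "utf-8",
    "UTF-16": "utf-16-be",
    "UCS-4": "utf-32-be",
}


def convert(data, source, target):
    """Convert bytes from one format to another by way of Unicode text."""
    text = data.decode(PYTHON_CODEC[source])
    return text.encode(PYTHON_CODEC[target])


# The Hanzi U+4E2D in GB18030, converted to each UCS format and back.
gb = b"\xd6\xd0"
for fmt in ("UTF-8", "UTF-16", "UCS-4"):
    ucs = convert(gb, "GB18030", fmt)
    assert convert(ucs, fmt, "GB18030") == gb
```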


   Fonts for GB18030
       The operating system provides the following Simplified
       Chinese TrueType fonts for GB18030:

       -css_dongwen-fangsong-medium-r-normal--0-0-0-0-c-0-iso8859-1
       -css_dongwen-fangsong-medium-r-normal--0-0-0-0-c-0-iso10646-1

       -css_dongwen-heiti-medium-r-normal--0-0-0-0-c-0-iso8859-1
       -css_dongwen-heiti-medium-r-normal--0-0-0-0-c-0-iso10646-1

       -css_dongwen-kaiti-medium-r-normal--0-0-0-0-c-0-iso8859-1
       -css_dongwen-kaiti-medium-r-normal--0-0-0-0-c-0-iso10646-1

       -css_dongwen-songti-medium-r-normal--0-0-0-0-c-0-iso8859-1
       -css_dongwen-songti-medium-r-normal--0-0-0-0-c-0-iso10646-1

       These  fonts  can  be  used for printing with Chinese text
       printers. The operating system uses Unicode fonts and  the
       SongTi  font  style  as  the  default  screen font for the
       GB18030 codeset. See  wwpsof(8)  for  information  on  the
       PostScript print filter and TrueType fonts.

SEE ALSO

      
      
       Commands: locale(1)

       Others: ascii(5), big5(5), Chinese(5), dechanyu(5),
       dechanzi(5), eucTW(5), GBK(5), i18n_intro(5),
       i18n_printing(5), l10n_intro(5), sbig5(5), telecode(5)



                                                       GB18030(5)