 eucset(1)                                                         eucset(1)




 NAME
      eucset - sets and gets code widths for ldterm

 SYNOPSIS
      eucset [-p]

      eucset [-c HP15-codeset | -c UTF8 | -c GB18030 | cswidth]

 DESCRIPTION
      The eucset command sets or gets (reports) the encoding and display
      widths of Extended UNIX Code (EUC), UCS Transformation Format (UTF8),
      and GB18030 characters processed by the current input terminal.  EUC
      is an encoding method for codesets composed of single or multiple
      bytes.  It permits applications and the terminal hardware to use the
      7-bit US ASCII code and up to three single byte or multibyte codesets
      simultaneously.

      Without any options, the eucset command first tries to set the codeset
      to one of the four HP15 codesets.  If unsuccessful, 7-bit US ASCII is
      used as the default codeset.  This command must be used to specify any
      other EUC codesets, whether they are single byte or multibyte.  See
      the WARNINGS section for special warnings on the values of the
      cswidth argument.

      For GB18030 or UTF8 setting, use the -c option.

    Options
      The eucset command recognizes the following options and arguments:

           -p        Displays the current settings of the EUC character
                     widths for the terminal.

           -c        Sets the code widths to those of the specified codeset:
                     one of the four HP15 codesets, UTF8, or GB18030.  The
                     HP15 codesets supported are SJIS, CCDC, GB, and BIG5.

    EUC Code Set Classes
      EUC divides codesets into four classes.  Each codeset has two
      characteristics: the number of bytes for encoding the characters in
      the codeset, and the number of display columns to display the
      characters in the codeset.  All characters within a codeset possess
      the same characteristics.

           +  Codeset 0 consists of all 7-bit, single byte ASCII characters.
              The most significant bit of each of these characters is 0
              (zero).  Characters in codeset 0 require one byte for
              encoding, and occupy one display column.  These values are
              fixed for codeset 0 (zero).  The 7-bit US ASCII code is the
              primary EUC codeset, which is available to users without
              direct specification.




 Hewlett-Packard Company            - 1 -   HP-UX 11i Version 2: August 2003






           +  Codeset 1 is a supplementary EUC codeset.  Codeset 1
              characters have an initial byte whose most significant bit is
              1.  Characters in codeset 1 may require more than one byte for
              encoding, and may require more than one display column.  The
              eucset command must be used to set the characteristics for
              codeset 1.

           +  Codesets 2 and 3 are supplementary EUC codesets.  Characters
              in these codesets have an initial byte of SS2 or SS3,
              respectively.  They require more than one byte for encoding,
              and may require more than one display column.  The eucset
              command must be used to set the characteristics for codesets 2
              and 3.
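
      The class rules above can be sketched in Python.  This is only an
      illustration, not the ldterm implementation.  The function name
      euc_codeset_class is hypothetical, and the byte values used for SS2
      (0x8E) and SS3 (0x8F) are the conventional EUC single-shift values,
      which this page does not state explicitly.

```python
SS2 = 0x8E  # conventional EUC single-shift 2 value (assumed, not given on this page)
SS3 = 0x8F  # conventional EUC single-shift 3 value (assumed, not given on this page)


def euc_codeset_class(first_byte: int) -> int:
    """Return the EUC codeset class (0 to 3) selected by a character's first byte."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("first_byte must be in the range 0..255")
    if first_byte < 0x80:
        return 0  # most significant bit is 0: 7-bit ASCII
    if first_byte == SS2:
        return 2  # SS2 introduces a codeset 2 character
    if first_byte == SS3:
        return 3  # SS3 introduces a codeset 3 character
    return 1      # any other byte with the most significant bit set


if __name__ == "__main__":
    for b in (0x41, 0xA4, SS2, SS3):
        print(f"0x{b:02X} -> codeset {euc_codeset_class(b)}")
```

      The SS2 and SS3 checks come before the generic high-bit check because
      both single-shift bytes also have their most significant bit set.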

      The cswidth argument in the eucset command line is a character string
      that describes the character widths for codesets 1 through 3.  This
      command does not allow the user to modify the settings for codeset 0.
      The character string is of the following format:

           X1[:Y1],X2[:Y2],X3[:Y3]

      The value X1 is the number of bytes required to encode a character in
      codeset class 1.  Y1 is the number of display columns needed to
      display characters in this class.  X2 is the number of bytes required
      to encode a character in codeset 2, not counting the SS2 byte, and Y2
      is the number of display columns for codeset 2 characters.  X3 is the
      number of bytes needed to encode characters in codeset 3, not counting
      the SS3 byte, and Y3 is the number of display columns required for
      these characters.  The values for the column widths may be omitted if
      they are equal to the number of encoding bytes.  If the encoding value
      of any of the EUC codesets is set to 0 (zero), this indicates that the
      codeset does not exist.  See the WARNINGS section for special warnings
      on the values of the cswidth argument.
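
      As an illustration of the format, the following Python sketch parses
      a cswidth string into (encoding bytes, display columns) pairs for
      codesets 1 through 3.  The function name parse_cswidth is
      hypothetical.  Treating omitted trailing fields as 0:0 is an
      assumption suggested by the shorter strings in the EXAMPLES section,
      not a documented rule.

```python
from typing import List, Tuple


def parse_cswidth(spec: str) -> List[Tuple[int, int]]:
    """Parse 'X1[:Y1],X2[:Y2],X3[:Y3]' into [(bytes, columns)] for codesets 1..3."""
    fields = spec.split(",")
    if len(fields) > 3:
        raise ValueError("at most three codeset fields are allowed")

    widths: List[Tuple[int, int]] = []
    for field in fields:
        encoding, sep, columns = field.partition(":")
        nbytes = int(encoding)
        # An omitted column width equals the number of encoding bytes.
        ncols = int(columns) if sep else nbytes
        if nbytes < 0 or ncols < 0:
            raise ValueError("widths must not be negative")
        widths.append((nbytes, ncols))

    # Assumption: fields left off the end mean the codeset does not exist.
    while len(widths) < 3:
        widths.append((0, 0))
    return widths


if __name__ == "__main__":
    print(parse_cswidth("2,2,0"))        # same result as "2:2,2:2,0:0"
    print(parse_cswidth("2:2,1:1,2:2"))
```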

      If no cswidth argument is supplied, the eucset command uses the value
      of the CSWIDTH environment variable.  If this variable is not present,
      the following default string is substituted:

           1:1,0:0,0:0

      This default string designates that the environment uses a single byte
      EUC codeset that has characters in the EUC codeset 1 format.  If the
      environment uses a multibyte EUC codeset in the codeset 1 format,
      single byte or multibyte EUC codesets in the codeset 2 or 3 format, or
      both, the default setting cannot be used.
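
      The fallback order described above (argument, then the CSWIDTH
      environment variable, then the default string) can be sketched as
      follows.  The function name effective_cswidth is hypothetical, and
      the sketch models only the selection order, not what eucset does
      with the result.

```python
import os
from typing import Mapping, Optional

DEFAULT_CSWIDTH = "1:1,0:0,0:0"  # default string given in this page


def effective_cswidth(arg: Optional[str] = None,
                      env: Mapping[str, str] = os.environ) -> str:
    """Pick the cswidth string: explicit argument, else CSWIDTH, else the default."""
    if arg is not None:
        return arg
    # Only a missing variable falls back to the default; how eucset treats an
    # empty CSWIDTH value is not described in this page.
    return env.get("CSWIDTH", DEFAULT_CSWIDTH)


if __name__ == "__main__":
    print(effective_cswidth("2,2,0", {}))
    print(effective_cswidth(None, {"CSWIDTH": "2:2,1:1,2:2"}))
    print(effective_cswidth(None, {}))
```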

 EXTERNAL INFLUENCES
    Environment Variables
      LANG                Provides a default value for the
                          internationalization variables that are unset or
                          null.  If LANG is not specified or is set to the
                          empty string, a default of C (see lang(5)) is used
                          instead of LANG.  If any of the
                          internationalization variables contains an invalid
                          setting, eucset behaves as if all
                          internationalization variables are set to C.  See
                          environ(5).

      LC_ALL              If set to a nonempty string value, overrides the
                          values of all other internationalization
                          variables.

      LC_MESSAGES         Determines the locale that should be used to
                          affect the format and contents of diagnostic
                          messages written to standard error and informative
                          messages written to standard output.

      NLSPATH             Determines the location of message catalogs for
                          the processing of LC_MESSAGES.

 EXAMPLES
      To display the encoding and display widths for the EUC codesets 1 to 3
      in your environment, enter:

           eucset -p

      Assuming eucset was previously used to set the widths for
      ja_JP.eucJP, this command produces the following output:

           cswidth 2:2,1:1,2:2

      To change the current settings of the encoding and display widths for
      the EUC characters in codesets 1 and 2 to two bytes each, enter one of
      the following:

           eucset 2:2,2:2,0:0

           eucset 2,2,0

      To set the encoding and display widths for the EUC characters in the
      locale ja_JP.eucJP, enter:

           eucset 2:2,1:1,2:2
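
      To show what the ja_JP.eucJP widths mean in practice, the following
      Python sketch counts the display columns of a byte string under
      2:2,1:1,2:2.  It is an illustration only, not ldterm's algorithm.
      The name display_columns and the sample bytes are hypothetical, and
      SS2 (0x8E) and SS3 (0x8F) are the conventional EUC values, which
      this page does not state.

```python
SS2, SS3 = 0x8E, 0x8F  # conventional EUC single-shift values (assumed)

# (encoding bytes, display columns) per codeset, from cswidth 2:2,1:1,2:2.
# The encoding byte counts exclude the SS2 or SS3 byte.
EUCJP_WIDTHS = {1: (2, 2), 2: (1, 1), 3: (2, 2)}


def display_columns(data: bytes, widths=EUCJP_WIDTHS) -> int:
    """Count display columns for EUC-encoded bytes under the given widths."""
    columns = 0
    i = 0
    while i < len(data):
        first = data[i]
        if first < 0x80:            # codeset 0: one byte, one column
            columns += 1
            i += 1
            continue
        if first == SS2:
            codeset, prefix = 2, 1  # skip the SS2 byte itself
        elif first == SS3:
            codeset, prefix = 3, 1  # skip the SS3 byte itself
        else:
            codeset, prefix = 1, 0
        nbytes, ncols = widths[codeset]
        if nbytes == 0:
            raise ValueError(f"codeset {codeset} is not defined")
        end = i + prefix + nbytes
        if end > len(data):
            raise ValueError("truncated multibyte character")
        columns += ncols
        i = end
    return columns


if __name__ == "__main__":
    # One ASCII byte, one codeset 1 character, one codeset 2 character.
    sample = b"A" + b"\xa4\xa2" + b"\x8e\xb1"
    print(display_columns(sample))
```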

      For zh_TW.eucTW, enter:

           eucset 2:2,3:2

      For ko_KR.eucKR, enter:

           eucset 2:2




      To set the code width to that of UTF8, enter:

           eucset -c UTF8

      To set the code width to that of GB18030, enter:

           eucset -c GB18030

 WARNINGS
      The cswidth argument does not include the SS2 or SS3 bytes in the byte
      width values.

      This command is not specified by standards, may not be available on
      other vendors' systems, and may be subject to change or obsolescence
      in a future release.

 AUTHOR
      eucset was developed by OSF and HP.

 SEE ALSO
      dtterm(1), ldterm(1).

