charmap(4)

NAME

       charmap - Defines character symbols as character encodings

DESCRIPTION

       The character set description (charmap) file defines
       character symbols as character encodings. This file is the
       source file for a coded character set, or codeset. All
       supported codesets have the Portable Character Set (PCS)
       as a proper subset. The PCS consists of the following
       character symbols (listed by their standardized symbolic
       names) and hexadecimal encodings:

       -------------------------------------------
       Symbol Name           Hexadecimal Encoding
       -------------------------------------------
       <NUL>                 \x00
       <SOH>                 \x01
       <STX>                 \x02
       <ETX>                 \x03
       <EOT>                 \x04
       <ENQ>                 \x05
       <ACK>                 \x06
       <alert>               \x07
       <backspace>           \x08
       <tab>                 \x09
       <newline>             \x0A
       <vertical-tab>        \x0B
       <form-feed>           \x0C
       <carriage-return>     \x0D
       <SO>                  \x0E
       <SI>                  \x0F
       <DLE>                 \x10
       <DC1>                 \x11
       <DC2>                 \x12
       <DC3>                 \x13
       <DC4>                 \x14
       <NAK>                 \x15
       <SYN>                 \x16
       <ETB>                 \x17
       <CAN>                 \x18
       <EM>                  \x19
       <SUB>                 \x1A
       <ESC>                 \x1B
       <IS4>                 \x1C
       <IS3>                 \x1D
       <IS2>                 \x1E
       <IS1>                 \x1F
       <space>               \x20
       <exclamation-mark>    \x21
       <quotation-mark>      \x22
       <number-sign>         \x23
       <dollar-sign>         \x24
       <percent>             \x25
       <ampersand>           \x26
       <apostrophe>          \x27
       <left-parenthesis>    \x28
       <right-parenthesis>   \x29
       <asterisk>            \x2A
       <plus-sign>           \x2B
       <comma>               \x2C
       <hyphen>              \x2D
       <period>              \x2E
       <slash>               \x2F
       <zero>                \x30
       <one>                 \x31
       <two>                 \x32
       <three>               \x33
       <four>                \x34
       <five>                \x35
       <six>                 \x36
       <seven>               \x37
       <eight>               \x38
       <nine>                \x39
       <colon>               \x3A
       <semi-colon>          \x3B
       <less-than>           \x3C
       <equal-sign>          \x3D
       <greater-than>        \x3E
       <question-mark>       \x3F
       <commercial-at>       \x40
       <A>                   \x41
       <B>                   \x42
       <C>                   \x43
       <D>                   \x44
       <E>                   \x45
       <F>                   \x46
       <G>                   \x47
       <H>                   \x48
       <I>                   \x49
       <J>                   \x4A
       <K>                   \x4B
       <L>                   \x4C
       <M>                   \x4D
       <N>                   \x4E
       <O>                   \x4F
       <P>                   \x50
       <Q>                   \x51
       <R>                   \x52
       <S>                   \x53
       <T>                   \x54
       <U>                   \x55
       <V>                   \x56
       <W>                   \x57
       <X>                   \x58
       <Y>                   \x59
       <Z>                   \x5A
       <left-bracket>        \x5B
       <backslash>           \x5C
       <right-bracket>       \x5D
       <circumflex>          \x5E
       <underscore>          \x5F
       <grave-accent>        \x60
       <a>                   \x61
       <b>                   \x62
       <c>                   \x63
       <d>                   \x64
       <e>                   \x65
       <f>                   \x66
       <g>                   \x67
       <h>                   \x68
       <i>                   \x69
       <j>                   \x6A
       <k>                   \x6B
       <l>                   \x6C
       <m>                   \x6D
       <n>                   \x6E
       <o>                   \x6F
       <p>                   \x70
       <q>                   \x71
       <r>                   \x72
       <s>                   \x73
       <t>                   \x74
       <u>                   \x75
       <v>                   \x76
       <w>                   \x77
       <x>                   \x78
       <y>                   \x79
       <z>                   \x7A
       <left-brace>          \x7B
       <vertical-line>       \x7C
       <right-brace>         \x7D
       <tilde>               \x7E
       <DEL>                 \x7F
       -------------------------------------------

       The charmap file has the following components:

       An optional special symbolic name declarations section

              Each declaration in this section consists of a special
              symbolic name, followed by one or more space or tab
              characters, and a value. The following list describes
              the special symbolic names that you can include in the
              declarations section:

              <code_set_name>
                     Specifies the name of the codeset for which the
                     charmap file is defined. This value determines
                     the value returned by the nl_langinfo(CODESET)
                     subroutine. If <code_set_name> is not declared,
                     the name for the Portable Character Set is used.

              <mb_cur_max>
                     Specifies the maximum number of bytes in a
                     character for the codeset. Valid values are 1
                     to 4. The default value is 1.

              <mb_cur_min>
                     Specifies the minimum number of bytes in a
                     character for the codeset. Since all supported
                     codesets have the Portable Character Set as a
                     proper subset, this value must be 1.

              <escape_char>
                     Specifies the escape character that indicates
                     encodings in hexadecimal or octal notation. The
                     default value is a \ (backslash).

              <comment_char>
                     Specifies the character used to indicate a
                     comment within a charmap file. The default
                     value is a # (number sign).

       The CHARMAP section header

              This header marks the beginning of the section that
              associates character symbols with encodings.

       Mapping statements for characters in the codeset

              Each statement lists a symbolic name for a character
              and its associated encoding. The format of a mapping
              statement is:

              <char_symbol> encoding

              A symbolic name begins with the < (left-angle
              bracket) character and ends with the > (right-angle
              bracket) character. The characters for char_symbol
              (between < and >) can be any characters from the
              Portable Character Set, except for control and
              space characters. The right-angle bracket (>) can
              occur within char_symbol as well as in the last
              position of the name. You must precede all >
              characters but the last one with the escape character
              (as specified by the <escape_char> special symbolic
              name).

              An encoding is specified as one or more character
              constants, with the maximum number of character
              constants specified by the <mb_cur_max> special
              symbolic name. The encoding may be listed as decimal,
              octal, or hexadecimal constants with the following
              formats:

              \xxx   where the first x is the literal letter x and
                     each following x is a hexadecimal digit

              \ooo or \oo
                     where o is an octal digit

              \dddd or \ddd
                     where the first d is the literal letter d and
                     each following d is a decimal digit

              Some examples of character symbol  definitions  are
              the following:

              <A>          \d65        #decimal constant
              <B>          \x42        #hexadecimal constant
              <j10101>     \x81\xA1    #multiple hexadecimal constants

              A range of symbolic names and corresponding encoded
              values may also be defined,  where  the  nonnumeric
              prefix  for  each  symbolic name is common, and the
              numeric portion of the  second  symbolic  name   is
              equal to or greater than the numeric portion of the
              first symbolic name.  In this  format,  a  symbolic
              name  value  consists  of  zero  or more nonnumeric
              characters followed by an integer of  one  or  more
              decimal   digits.   This format defines a series of
              symbolic   names.   For   example,    the    string
              <j0101>...<j0104>  is  interpreted  as the <j0101>,
              <j0102>, <j0103>, and <j0104>  symbolic  names,  in
              that order.

              In  statements  defining  ranges of symbolic names,
              the encoded value listed is the value for the first
              symbolic  name  in  the  range. Subsequent symbolic
              names have encoded values in increasing order.  For
              example:

              <j0101>...<j0104>        \d129\d254

              The preceding statement is interpreted as follows:

              <j0101>   \d129\d254
              <j0102>   \d129\d255
              <j0103>   \d130\d0
              <j0104>   \d130\d1

              Although you cannot assign  multiple  encodings  to
              one  symbolic  name,  you can create multiple names
              for one encoded value.   This  is  allowed  because
              some  characters  have  several  common names.  For
              example, the "."  character is called a  period  in
              some parts of the world, and a full stop in others.
              Both names may appear in the charmap.  For example:

              <period>        \x2e
              <full-stop>     \x2e

              If used, comments must begin with the character
              specified by the <comment_char> special symbolic
              name. When an entire line is a comment, you must
              specify <comment_char> in the first column of the
              line.

       The END CHARMAP trailer

              This entry denotes the end of character map statements.
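
       The encoding formats and the range rule described above can
       be illustrated with a short Python sketch. It is not part of
       localedef; the names parse_encoding and expand_range are
       hypothetical. It assumes the default escape character (\)
       and treats a multibyte encoding as one big-endian number
       that is incremented for each name in the range, which
       reproduces the <j0101>...<j0104> expansion shown earlier.

```python
import re

# One character constant using the default escape character (backslash):
# "\x" plus two hex digits, "\d" plus decimal digits, or two to three
# octal digits. One to three decimal digits are accepted here because
# the expansion example in the text uses both \d254 and \d0.
CONSTANT = re.compile(r"\\x([0-9A-Fa-f]{2})|\\d([0-9]{1,3})|\\([0-7]{2,3})")

# The name part of a range statement, such as <j0101>...<j0104>.
RANGE = re.compile(r"<([^0-9>]*)([0-9]+)>\.\.\.<([^0-9>]*)([0-9]+)>")


def parse_encoding(text):
    # Convert a run of character constants into a bytes object.
    values = []
    pos = 0
    while pos < len(text):
        match = CONSTANT.match(text, pos)
        if match is None:
            raise ValueError(f"bad character constant at {text[pos:]!r}")
        hex_digits, dec_digits, oct_digits = match.groups()
        if hex_digits is not None:
            values.append(int(hex_digits, 16))
        elif dec_digits is not None:
            values.append(int(dec_digits, 10))
        else:
            values.append(int(oct_digits, 8))
        pos = match.end()
    if not values:
        raise ValueError("empty encoding")
    return bytes(values)  # ValueError if a constant does not fit in a byte


def expand_range(names, encoding):
    # Expand "<prefixN>...<prefixM> encoding" into (name, bytes) pairs.
    match = RANGE.fullmatch(names)
    if match is None:
        raise ValueError(f"not a range of symbolic names: {names!r}")
    prefix, first, last_prefix, last = match.groups()
    if prefix != last_prefix or int(last) < int(first):
        raise ValueError("prefixes differ or range is descending")
    start = parse_encoding(encoding)
    base = int.from_bytes(start, "big")
    width = len(first)  # keep the leading zeros of the first name
    pairs = []
    for offset, number in enumerate(range(int(first), int(last) + 1)):
        name = f"<{prefix}{number:0{width}d}>"
        # OverflowError here means the range ran past the byte length.
        value = (base + offset).to_bytes(len(start), "big")
        pairs.append((name, value))
    return pairs


pairs = expand_range("<j0101>...<j0104>", r"\d129\d254")
```

       Here pairs holds four entries, one per symbolic name, in the
       same order as the expanded statements listed earlier.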


       The following example is a portion of a  possible  charmap
       file:

       CHARMAP
       <code_set_name>          "ISO8859-1"
       <mb_cur_max>             1
       <mb_cur_min>             1
       <escape_char>            \
       <comment_char>           #

       <NUL>                    \x00
       <SOH>                    \x01
       <STX>                    \x02
       <ETX>                    \x03
       <EOT>                    \x04
       <ENQ>                    \x05
       <ACK>                    \x06
       <alert>                  \x07
       <backspace>              \x08
       <tab>                    \x09
       <newline>                \x0a
       <vertical-tab>           \x0b
       <form-feed>              \x0c
       <carriage-return>        \x0d
       END CHARMAP
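
       To show how the components of a charmap file fit together,
       the following Python sketch reads a small charmap source
       into a dictionary of declarations and a dictionary of
       mappings. The names parse_charmap and decode are
       hypothetical, and the sketch makes simplifying assumptions:
       declarations are recognized by name wherever they appear,
       range statements are not expanded, and anything after the
       encoding on a line is ignored as a trailing comment.

```python
SPECIAL_NAMES = {
    "<code_set_name>", "<mb_cur_max>", "<mb_cur_min>",
    "<escape_char>", "<comment_char>",
}


def decode(encoding, escape):
    # Turn constants such as \x09, \d65 or \101 into bytes, using the
    # escape character that is currently in effect.
    if not encoding.startswith(escape):
        raise ValueError(f"encoding must start with {escape!r}")
    values = []
    for token in encoding.split(escape)[1:]:
        if token.startswith("x"):
            values.append(int(token[1:], 16))
        elif token.startswith("d"):
            values.append(int(token[1:], 10))
        else:
            values.append(int(token, 8))
    return bytes(values)


def parse_charmap(text):
    # Defaults follow the declarations described in the text.
    declarations = {
        "<mb_cur_max>": "1",
        "<mb_cur_min>": "1",
        "<escape_char>": "\\",
        "<comment_char>": "#",
    }
    mappings = {}
    in_charmap = False
    for raw_line in text.splitlines():
        line = raw_line.rstrip()
        # Skip blank lines and lines with the comment character in column 1.
        if not line.strip() or line.startswith(declarations["<comment_char>"]):
            continue
        if line.strip() == "CHARMAP":
            in_charmap = True
            continue
        if line.strip() == "END CHARMAP":
            break
        fields = line.split(None, 2)
        if len(fields) < 2:
            raise ValueError(f"incomplete statement: {line!r}")
        name, value = fields[0], fields[1]
        if name in SPECIAL_NAMES:
            declarations[name] = value.strip('"')
        elif in_charmap:
            encoded = decode(value, declarations["<escape_char>"])
            if len(encoded) > int(declarations["<mb_cur_max>"]):
                raise ValueError(f"{name} is longer than <mb_cur_max>")
            mappings[name] = encoded
        else:
            raise ValueError(f"mapping outside CHARMAP section: {line!r}")
    return declarations, mappings


SAMPLE = r"""
<code_set_name>  "ISO8859-1"
<mb_cur_max>     1
<mb_cur_min>     1
<escape_char>    \
<comment_char>   #

CHARMAP
# whole-line comment
<NUL>            \x00
<tab>            \x09
<newline>        \x0a
END CHARMAP
"""

declarations, mappings = parse_charmap(SAMPLE)
```

       After the call, declarations holds the five special symbolic
       names and mappings associates each character symbol with its
       encoded bytes.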

FILES

       Character set description (charmap) source files for
       supported locales reside in the /usr/lib/nls/loc/charmaps
       directory. This directory does not exist when source files
       for installed locales are not provided.

SEE ALSO

      
      
       Commands: locale(1), localedef(1)

       Files: locale(4)



                                                       charmap(4)