locale - Tru64

locale(4)

NAME

       locale  -  Contains one or more categories that describe a
       locale

DESCRIPTION

       A locale definition source file contains one or more
       categories that describe a locale.  You can convert a
       locale definition source file into a locale by using the
       localedef command.  Locales can be modified only by
       editing a locale definition source file and then using the
       localedef command again on the new source file.

       Each  locale  source  file  section  defines a category of
       locale data.  A source file cannot contain more  than  one
       section for the same category.

       The following standard categories are supported:

       LC_COLLATE
              Defines character or string collation information

       LC_CTYPE
              Defines character classification, case conversion,
              and other character properties or attributes

       LC_MESSAGES
              Defines the format for affirmative and negative
              responses

       LC_MONETARY
              Defines rules and symbols for formatting monetary
              numeric information

       LC_NUMERIC
              Defines a list of rules and symbols for formatting
              nonmonetary numeric information

       LC_TIME
              Defines a list of rules and symbols for formatting
              time and date information

       You can include optional declarations at the beginning of
       your locale source file to override the default comment
       and escape characters used in locale category definitions:

       Escape character
              The escape character is used in decimal or
              hexadecimal constants when these are specified in
              the locale file.  The default escape character is
              the backslash (\).  To define another escape
              character, include a line with the following format:

              escape_char  <char_symbol>

       Comment character
              The comment character is the first character of any
              comment entries in the locale file.  The default
              comment character is the number sign (#).  To define
              another comment character, use the following format:

              comment_char  <char_symbol>

       In the preceding formats, <char_symbol> is the character's
       symbolic name as defined in the charmap file used to build
       the  locale's  codeset.   One  or  more  blank  characters
       (spaces or tabs) must separate escape_char or comment_char
       from <char_symbol>.

       Each category source definition consists of the following:

              The category header (category_name)

              The associated keyword/value pairs that comprise the
              category body

              The category trailer (END category_name)

       For example:

       LC_CTYPE
       <source for LC_CTYPE category>
       END LC_CTYPE
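
       The header/body/trailer layout can be illustrated with a
       short Python sketch.  It is not part of localedef; the
       function name split_categories and the SAMPLE text are
       made up for illustration, and the sketch ignores comments
       and continuation lines.  It also rejects a second section
       for the same category, as required above.

```python
import re


def split_categories(source):
    """Return {category_name: [body lines]} for each header/trailer pair."""
    categories = {}
    current = None  # name of the category being read, if any
    for line in source.splitlines():
        text = line.strip()
        if current is None:
            # A header is the category name on a line of its own.
            if re.fullmatch(r"LC_[A-Z]+", text):
                if text in categories:
                    raise ValueError("more than one section for " + text)
                current = text
                categories[current] = []
        elif text.split() == ["END", current]:
            current = None  # trailer closes the category
        elif text:
            categories[current].append(text)
    if current is not None:
        raise ValueError("missing trailer: END " + current)
    return categories


# Hypothetical two-category source file.
SAMPLE = """\
LC_MESSAGES
yesstr "<y><e><s>"
nostr  "<n><o>"
END LC_MESSAGES
LC_CTYPE
upper <A>;<B>;<C>
END LC_CTYPE
"""

for name, body in split_categories(SAMPLE).items():
    print(name, body)
```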

       The source for all of the categories is specified using
       keywords, strings, character literals, and character
       symbols.  Each keyword identifies either a definition or a
       rule.  The remainder of the statement containing the
       keyword contains the operands to the keyword.  Operands are
       separated from the keyword by one or more blank characters
       (spaces or tabs).  A statement may be continued on the next
       line by placing a \ (backslash) as the last character
       before the newline character that terminates the line.
       Lines containing the # (comment character) in the first
       column are treated as comment lines.
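
       The continuation and comment rules can be sketched in
       Python as follows.  This is an illustration, not the
       localedef implementation: the function name logical_lines
       is hypothetical, the default comment and escape characters
       are assumed, and dropping the leading blanks of a
       continuation line is a choice made here rather than
       something this page specifies.

```python
def logical_lines(source, comment_char="#", escape_char="\\"):
    """Yield statements with comment lines dropped and continuations joined."""
    pending = ""  # text collected from lines that ended with the escape character
    for raw in source.splitlines():
        if not pending and raw.startswith(comment_char):
            continue  # comment character in the first column
        # Assumption: leading blanks on a continuation line are not significant.
        piece = raw.lstrip() if pending else raw
        if piece.endswith(escape_char):
            pending += piece[:-1]  # drop the escape character and keep reading
            continue
        statement = (pending + piece).strip()
        pending = ""
        if statement:
            yield statement


# Raw string so that the trailing backslash reaches the parser unchanged.
SAMPLE = r"""# upper-case letters
upper   <A>;<B>;\
        <C>;<D>
lower   <a>;<b>
"""

for statement in logical_lines(SAMPLE):
    print(statement)
```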

       A symbolic name begins with the < (left-angle bracket)
       character and ends with the > (right-angle bracket)
       character.  The characters between the < and the > can be
       any characters from the Portable Character Set, except for
       control and space characters.  For example, <A-diaeresis>
       could be a symbolic name for a character.  Any symbolic
       name referenced in the locale source file must be defined
       in the Portable Character Set or in the character set
       description (charmap) file for that locale.

       A character literal is the character itself, or else a
       decimal, hexadecimal, or octal constant.  Each constant
       begins with the escape character (\ by default).  A decimal
       constant is of the following form:

       \dddd or \ddd

       where the first d is the literal letter d and each
       remaining d is a decimal digit.

       A hexadecimal constant is of the following form:

       \xxx

       where the first x is the literal letter x and each
       remaining x is a hexadecimal digit.

       An octal constant is of the following form:

       \ooo or \oo

       where o is an octal digit.
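
       As an illustration, the following Python sketch converts
       the three constant forms to a numeric value.  It reads the
       forms above as \d plus two or three decimal digits, \x
       plus two hexadecimal digits, and \ plus two or three octal
       digits, and it assumes the default backslash escape
       character.  The names CONSTANT and constant_value are
       hypothetical.

```python
import re

# Assumes the default escape character, the backslash.
CONSTANT = re.compile(
    r"\\(?:d(?P<dec>[0-9]{2,3})"   # \d then two or three decimal digits
    r"|x(?P<hex>[0-9A-Fa-f]{2})"   # \x then two hexadecimal digits
    r"|(?P<oct>[0-7]{2,3}))"       # \ then two or three octal digits
)


def constant_value(literal):
    """Return the integer value encoded by a numeric character constant."""
    match = CONSTANT.fullmatch(literal)
    if match is None:
        raise ValueError("not a numeric character constant: " + literal)
    for group, base in (("dec", 10), ("hex", 16), ("oct", 8)):
        digits = match.group(group)
        if digits is not None:
            return int(digits, base)


# Three spellings of the same value.
for text in (r"\d065", r"\x41", r"\101"):
    print(text, constant_value(text))
```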

       The explicit definition of each category in a locale
       definition source file is not required.  When a category is
       undefined in a locale definition source file, the category
       value defaults to the value in the C locale definition.

   The LC_COLLATE Category
       The LC_COLLATE category defines the relative order between
       collating elements.

       A collating element is the unit of comparison for
       collation.  A collating element may be a character or a
       sequence of characters.  Every collating element in the
       locale has a set of weights, which determine whether the
       collating element collates before, equal to, or after the
       other collating elements in the locale.  Each collating
       element is assigned collation weights by the localedef
       command when the locale definition source file is
       compiled.  These collation weights are then used by
       application programs that compare strings.

       Comparison of strings is performed by comparing the
       collation weights of each character in the string until
       either a difference is found or the strings are determined
       to be equal.  This comparison may be performed several
       times if the locale defines multiple collation orders.  For
       example, in the French locale, the strings are compared
       using a primary set of collation weights.  If they are
       equal on the basis of this comparison, they are compared
       again using a secondary set of collation weights.  The
       number of collation weights associated with a collating
       element is equal to the number of collation orders defined
       for the locale.
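
       The repeated comparison can be sketched in Python.  The
       WEIGHTS table is hypothetical (it gives uppercase and
       lowercase letters the same primary weight and different
       secondary weights), and the function name compare is made
       up; real weights are assigned by localedef.

```python
# Hypothetical weights: (primary, secondary) for each collating element.
WEIGHTS = {
    "a": (1, 1), "A": (1, 2),
    "b": (2, 1), "B": (2, 2),
}


def compare(first, second, weights=WEIGHTS, orders=2):
    """Return -1, 0, or 1, comparing one collation order at a time."""
    for level in range(orders):
        key1 = [weights[ch][level] for ch in first]
        key2 = [weights[ch][level] for ch in second]
        if key1 != key2:
            return -1 if key1 < key2 else 1
    return 0  # equal in every collation order


print(compare("ab", "Ab"))  # equal on primary weights, decided on secondary
print(compare("aB", "b"))   # decided on primary weights
```

       The secondary weights are consulted only when the primary
       weights of the two strings are identical.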

       Every character defined in the charmap file (or every
       character in the portable character set if no charmap file
       is specified) is itself a collating element.  Additional
       collating elements can be defined using the
       collating-element statement.  The syntax is as follows:

       collating-element <character_symbol> from <string>

       The LC_COLLATE category begins with the keyword LC_COLLATE
       and ends with the keyword END LC_COLLATE.

       The following keywords are recognized in the LC_COLLATE
       category:

       copy   Specifies the name of an existing locale to be used
              as the definition of this category.  If you specify
              a copy statement, you can specify no other keywords
              in the category.

       collating-element
              Specifies multicharacter collating elements.

              The character_symbol argument names a string of
              characters that is treated as a single collating
              element.  The character_symbol argument cannot
              duplicate any symbolic name in the current charmap
              file or any other symbolic name defined in this
              collation definition.  The string argument specifies
              a string of two or more characters that define the
              character_symbol argument.  The following are
              examples of the syntax for the collating-element
              statement:

              collating-element <ch> from "<c><h>"
              collating-element <e-acute> from "<acute><e>"
              collating-element <11> from "<1><1>"

              A character_symbol argument defined by the
              collating-element statement is recognized only
              within the LC_COLLATE category.

       collating-symbol
              Specifies collation symbols for use in collation
              sequence statements.

              The syntax for the collating-symbol statement is as
              follows:

              collating-symbol <collating_symbol>

              The collating_symbol argument cannot duplicate any
              symbolic name in the current charmap file or any
              other symbolic name defined in this collation
              definition.  The following are examples of
              collating-symbol statements:

              collating-symbol <UPPER_CASE>
              collating-symbol <HIGH>

              A collating_symbol argument defined by the
              collating-symbol statement is recognized only within
              the LC_COLLATE category.

       order_start
              Is followed by one or more collation order
              statements, assigning collation weights to collating
              elements.  This statement is mandatory.

              The syntax for the order_start statement is as
              follows:

              order_start <sort_rules>;<sort_rules>;...;<sort_rules>
              collation_order_statements
              order_end

              The sort_rules have the following syntax:

              keyword, keyword,...,keyword

              where keyword is the keyword forward, backward, or
              position.

              The sort_rules directives are optional.  If present,
              they define the rules to apply during string
              comparison.  The number of specified sort_rules
              directives defines the number of weights each
              collating element is assigned; that is, the
              directives define the number of collation orders in
              the locale.  If no sort_rules directives are
              present, one forward directive is assumed and
              comparisons are made on a character basis rather
              than a string basis.

              If directives are present, the first sort_rules
              directive applies when comparing strings that use
              the primary weight, the second when comparing
              strings that use the secondary weight, and so on.
              Each set of sort_rules directives is separated by a
              ; (semicolon).  A sort_rules directive consists of
              one or more comma-separated keywords.  The following
              keywords are supported:

              forward
                     Specifies that collation weight comparisons
                     proceed from the beginning of a string to
                     the end of the string.

              backward
                     Specifies that collation weight comparisons
                     proceed from the end of a string to the
                     beginning of the string.

              position
                     Specifies that collation weight comparisons
                     consider the relative position of nonignored
                     elements in the string.  That is, if strings
                     compare as equal, the element with the
                     shortest distance from the starting point of
                     the comparison collates first.

              The forward and backward keywords are mutually
              exclusive.  The following is an example of a
              sort_rules directive:

              order_start        forward;backward
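
              The effect of a forward;backward directive can be
              sketched in Python.  The PRIMARY and SECONDARY
              tables and the function name sort_key are
              hypothetical; accented letters are given the
              primary weight of their base letter and a higher
              secondary weight.  The position keyword is not
              modelled.

```python
O_CIRC = "\u00f4"   # o with circumflex
E_ACUTE = "\u00e9"  # e with acute accent

# Hypothetical weights: accented letters share the primary weight of the
# base letter and differ only in the secondary weight.
PRIMARY = {"c": 1, "e": 2, E_ACUTE: 2, "o": 3, O_CIRC: 3, "t": 4}
SECONDARY = {"c": 1, "e": 1, E_ACUTE: 2, "o": 1, O_CIRC: 2, "t": 1}


def sort_key(word, rules=("forward", "backward")):
    """Build one weight list per collation order, reversed for 'backward'."""
    levels = []
    for table, rule in zip((PRIMARY, SECONDARY), rules):
        weights = [table[ch] for ch in word]
        levels.append(weights[::-1] if rule == "backward" else weights)
    return levels


WORDS = ["c" + O_CIRC + "t" + E_ACUTE, "cot" + E_ACUTE, "c" + O_CIRC + "te", "cote"]

print(sorted(WORDS, key=sort_key))
print(sorted(WORDS, key=lambda w: sort_key(w, ("forward", "forward"))))
```

              With a backward secondary order, an accent near the
              end of a word outweighs an accent near the
              beginning, so the two calls order the middle two
              words differently.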


       The following syntax rules apply to collation order
       statements:

              Each collation order statement consists of a
              <character_symbol> specification, followed by white
              space and a set of collation orders.

              Characters in the character set can be explicitly
              specified in the collation orders or implicitly
              specified using the ellipsis symbol (...).

              A collation order statement that begins with the
              UNDEFINED special symbol specifies any characters
              that are in the character set and not explicitly or
              implicitly specified by other collation order
              statements.

       The optional operands for each collating element are used
       to define the primary, secondary, or subsequent weights
       for the collating element.  The special symbol IGNORE is
       used to indicate a collating element that is to be ignored
       when strings are compared.

       An ellipsis keyword appearing in place of a
       collating_element_list indicates the weights are to be
       assigned, for the characters in the identified range, in
       numerically increasing order from the weight for the
       character symbol on the left-hand side of the preceding
       statement.

       The use of the ellipsis keyword results in a locale that
       may collate differently when compiled with different
       character set description (charmap) source files.  For this
       reason, the localedef command will issue a warning when
       the ellipsis keyword is encountered.

       The UNDEFINED special symbol includes all coded character
       set values not specified explicitly or with an ellipsis
       symbol.  These characters are inserted in the character
       collation order at the point indicated by the UNDEFINED
       special symbol in the order of their character code set
       values.  If no UNDEFINED special symbol exists and the
       collation order does not specify all collating elements
       from the coded character set, a warning is issued and all
       undefined characters are placed at the end of the
       character collation order.

       The following is an example of a collation order statement
       in the LC_COLLATE locale definition source file category:

       order_start     forward;backward
       UNDEFINED       IGNORE;IGNORE
       <LOW>
       <space>         <LOW>;<space>
       ...             <LOW>;...
       <a>             <a>;<a>
       <a-acute>       <a>;<a-acute>
       <a-grave>       <a>;<a-grave>
       <A>             <a>;<A>
       <A-acute>       <a>;<A-acute>
       <A-grave>       <a>;<A-grave>
       <ch>            <ch>;<ch>
       <Ch>            <ch>;<Ch>
       <s>             <s>;<s>
       <ss>            <s><s>;<s><s>
       <eszet>         <s><s>;<eszet><eszet>
       ...             <HIGH>;...
       <HIGH>
       order_end

       This example is interpreted as follows:

              The UNDEFINED special symbol indicates that all
              characters not specified in the definition (either
              explicitly or by the ellipsis symbol) are ignored
              for collation purposes.

              All collating elements between <space> and <a> have
              the same primary equivalence class and individual
              secondary weights based on their coded character
              set values.

              All versions of the letter a (uppercase and
              lowercase, and with or without diacriticals) belong
              to the same primary collation class.

              The <c><h> multicharacter collating element is
              represented by the <ch> collating symbol and belongs
              to the same primary equivalence class as the <C><h>
              multicharacter collating element.

              The <eszet> character is collated as an <s><s>
              string.  That is, one <eszet> character is expanded
              to two characters before comparing.
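
       The last two points can be illustrated with a Python
       sketch that breaks a string into collating elements before
       any weights are looked up.  The MULTI and EXPAND tables
       are hypothetical stand-ins for the <ch>, <Ch>, and <eszet>
       entries in the example, and the function name
       collating_elements is made up.

```python
# Hypothetical tables modelled on the example above.
MULTI = {"ch": "<ch>", "Ch": "<Ch>"}   # two characters, one collating element
EXPAND = {"\u00df": ["<s>", "<s>"]}    # <eszet> collates as <s><s>


def collating_elements(text):
    """Split text into symbolic collating elements, longest match first."""
    elements = []
    index = 0
    while index < len(text):
        pair = text[index:index + 2]
        if pair in MULTI:
            elements.append(MULTI[pair])       # multicharacter element
            index += 2
        elif text[index] in EXPAND:
            elements.extend(EXPAND[text[index]])  # one character, two elements
            index += 1
        else:
            elements.append("<" + text[index] + ">")
            index += 1
    return elements


print(collating_elements("chat"))
print(collating_elements("ma\u00df"))
```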

   The LC_CTYPE Category
       The LC_CTYPE category of a locale definition  source  file
       defines  character  classification,  case  conversion, and
       other character attributes.  This category begins with  an
       LC_CTYPE  category  header  and  terminates  with  an  END
       LC_CTYPE category trailer.

       All operands for LC_CTYPE category statements are defined
       as lists of characters.  Each list consists of one or more
       semicolon-separated characters or symbolic character
       names.  An ellipsis (...) can represent a series of
       characters; for example, <a>;...;<z> represents the
       characters in the range a through z.
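
       A minimal Python sketch of ellipsis expansion follows.  To
       stay self-contained it works on literal single characters
       rather than symbolic names, and it fills the range by
       Python code point, which stands in for the coded character
       set values of a charmap.  The function name expand_list is
       hypothetical.

```python
def expand_list(operand):
    """Expand 'a;...;e' into the individual characters of the range."""
    items = operand.split(";")
    expanded = []
    for index, item in enumerate(items):
        if item != "...":
            expanded.append(item)
            continue
        if index == 0 or index == len(items) - 1:
            raise ValueError("an ellipsis needs a character on both sides")
        first, last = ord(items[index - 1]), ord(items[index + 1])
        # Fill in the characters strictly between the two endpoints.
        expanded.extend(chr(code) for code in range(first + 1, last))
    return expanded


print("".join(expand_list("a;...;z")))
```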

       There are multiple sets of property keywords that are
       recognized in the LC_CTYPE category.  One set contains
       property keywords and associated rules defined for locales
       by the XSH standard.  A keyword in this set can be defined
       in locales based on any codeset, assuming that the
       associated property applies to characters in the language
       supported by the locale.  Another set of property keywords
       is defined by the Unicode standard.  Define these keywords
       only in locales using one of the Unicode character
       encoding formats.  Some national language standards also
       define properties for characters.  Japanese locales define
       quite a few supplemental properties to conform with
       national standards.

       The following two subsections describe the sets of
       keywords as defined by XSH and Unicode.  See Japanese(5)
       for descriptions of properties defined in Japanese locales.

   Property Keywords Defined by the XSH Standard
       The following keywords defined by XSH are recognized in
       the LC_CTYPE category.  In the descriptions, the term
       "automatically included" means that an error does not
       occur if the referenced characters are included or
       omitted.  The characters will be provided if they are
       missing and will be accepted if they are present.

       copy   Specifies the name of an existing locale to be used
              as the definition of this category

              If you include a copy statement, no other keyword
              can be specified.

       upper  Defines uppercase letter characters

              No character defined by the cntrl, digit, punct, or
              space keyword can be specified.  If upper is not
              defined, A through Z default to upper.

       lower  Defines lowercase letter characters

              No character defined by the cntrl, digit, punct, or
              space keyword can be specified.  If lower is not
              defined, a through z default to lower.

       alpha  Defines all letter characters

              No character defined by the cntrl, digit, punct, or
              space keyword can be specified.  Characters defined
              by the upper and lower keywords are automatically
              included in this character class.

       digit  Defines numeric digit characters

              Only the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9
              can be specified.  If digit is not defined, 0
              through 9 default to digit.

       space  Defines white-space characters

              No character defined by the upper, lower, alpha,
              digit, graph, or xdigit keyword can be specified.
              If space is not defined, the space, formfeed,
              newline, carriage-return, tab, and vertical tab
              characters default to space.

       cntrl  Defines control characters

              No character defined by the upper, lower, alpha,
              digit, punct, graph, print, or xdigit keyword can
              be specified.

       punct  Defines punctuation characters

              The space character and characters defined by the
              upper, lower, alpha, digit, cntrl, or xdigit
              keywords cannot be specified.

       graph  Defines printable characters, excluding the space
              character

              If this keyword is not specified, characters
              defined by the upper, lower, alpha, digit, xdigit,
              and punct keywords are automatically included in
              this character class.  No character defined by the
              cntrl keyword can be specified.

       print  Defines printable characters, including the space
              character

              If this keyword is not specified, the space
              character and characters defined by the upper,
              lower, alpha, digit, xdigit, and punct keywords are
              automatically included in this character class.  No
              character defined by the cntrl keyword can be
              specified.

       xdigit Defines hexadecimal digit characters

              For the values 0 through 9, only the digits 0, 1,
              2, 3, 4, 5, 6, 7, 8, and 9 can be specified.  Any
              character can be specified for the hexadecimal
              values 10 through 15, however.  These alternate
              hexadecimal digits are not used by standard
              conversion routines when converting digit strings
              from hexadecimal to numeric quantities.  If xdigit
              is not defined, the numbers 0 through 9 and the
              letters A through F and a through f default to
              xdigit.

       blank  Defines blank characters

              If this keyword is not specified, the space and
              horizontal tab characters are included in this
              character class.  Any characters defined by this
              statement are automatically included in the space
              class.

       toupper
              Defines the mapping of lowercase characters to
              uppercase characters

              Operands for this keyword consist of
              comma-separated character pairs.  Each character
              pair is enclosed in () (parentheses) and separated
              from the next pair by a ; (semicolon).  The first
              character in each pair is considered a lowercase
              character; the second character is considered an
              uppercase character.  Only characters defined by
              the lower and upper keywords can be specified.  If
              toupper is not defined, a through z is mapped to A
              through Z by default.

       tolower
              Defines the mapping of uppercase characters to
              lowercase characters

              Operands for this keyword consist of
              comma-separated character pairs.  Each character
              pair is enclosed in () (parentheses) and separated
              from the next pair by a ; (semicolon).  The first
              character in each pair is considered an uppercase
              character; the second character is considered a
              lowercase character.  Only characters defined by
              the lower and upper keywords can be specified.

              The tolower keyword is optional.  If this keyword
              is not specified, the mapping defaults to the
              reverse mapping of the toupper keyword, if
              specified.  If the toupper and tolower keywords are
              both unspecified, the mapping for each defaults to
              that of the C locale.
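
              The default for tolower can be sketched in Python
              by parsing a toupper operand and reversing each
              pair.  The function names parse_pairs and
              default_tolower are hypothetical, and the sketch
              assumes that every character is written as a
              symbolic name.

```python
import re


def parse_pairs(operand):
    """Parse '(<a>,<A>);(<b>,<B>)' into [('<a>', '<A>'), ('<b>', '<B>')]."""
    return re.findall(r"\(\s*(<[^>]+>)\s*,\s*(<[^>]+>)\s*\)", operand)


def default_tolower(toupper_operand):
    """Derive the tolower mapping as the reverse of the toupper pairs."""
    return {upper: lower for lower, upper in parse_pairs(toupper_operand)}


TOUPPER = "(<a>,<A>);(<b>,<B>);(<c>,<C>)"
print(default_tolower(TOUPPER))
```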

       Additional keywords can be specified to define
       supplemental character classifications.  For example:

       charclass vowel
       vowel        <a>;<e>;<i>;<o>;<u>;<y>

       Within the context of the XSH standard, the Unicode
       character properties discussed in the next subsection fall
       into the category of supplemental property definitions.
       Note that a supplemental property definition can be
       accessed in programs only by using the wctype() and
       iswctype() interfaces.

       The LC_CTYPE category does not support multicharacter
       elements.  For example, the German Eszet character is
       traditionally classified as a lowercase letter.  There is
       no corresponding uppercase letter; in proper
       capitalization of German text, the Eszet character is
       replaced by the two characters SS.  This kind of
       conversion is outside of the scope of the toupper and
       tolower keywords.

       The following is an example of a possible LC_CTYPE
       category listed in a locale definition source file:

       LC_CTYPE
       #"alpha" is by default "upper" and "lower"
       #"alnum" is by definition "alpha" and "digit"
       #"print" is by default "alnum", "punct" and the space character
       #"graph" is by default "alnum" and "punct"
       #"tolower" is by default the reverse mapping of "toupper"
       #
       upper   <A>;<B>;<C>;<D>;<E>;<F>;<G>;<H>;<I>;<J>;<K>;<L>;<M>;\
               <N>;<O>;<P>;<Q>;<R>;<S>;<T>;<U>;<V>;<W>;<X>;<Y>;<Z>
       #
       lower   <a>;<b>;<c>;<d>;<e>;<f>;<g>;<h>;<i>;<j>;<k>;<l>;<m>;\
               <n>;<o>;<p>;<q>;<r>;<s>;<t>;<u>;<v>;<w>;<x>;<y>;<z>
       #
       digit   <zero>;<one>;<two>;<three>;<four>;<five>;<six>;\
               <seven>;<eight>;<nine>
       #
       space   <tab>;<newline>;<vertical-tab>;<form-feed>;\
               <carriage-return>;<space>
       #
       cntrl   <alert>;<backspace>;<tab>;<newline>;<vertical-tab>;\
               <form-feed>;<carriage-return>;<NUL>;<SOH>;<STX>;\
               <ETX>;<EOT>;<ENQ>;<ACK>;<SO>;<SI>;<DLE>;<DC1>;<DC2>;\
               <DC3>;<DC4>;<NAK>;<SYN>;<ETB>;<CAN>;<EM>;<SUB>;\
               <ESC>;<IS4>;<IS3>;<IS2>;<IS1>;<DEL>
       #
       punct   <exclamation-mark>;<quotation-mark>;<number-sign>;\
               <dollar-sign>;<percent-sign>;<ampersand>;<asterisk>;\
               <apostrophe>;<left-parenthesis>;<right-parenthesis>;\
               <plus-sign>;<comma>;<hyphen>;<period>;<slash>;\
               <colon>;<semicolon>;<less-than-sign>;<equals-sign>;\
               <greater-than-sign>;<question-mark>;<commercial-at>;\
               <left-square-bracket>;<backslash>;<circumflex>;\
               <right-square-bracket>;<underline>;<grave-accent>;\
               <left-curly-bracket>;<vertical-line>;<tilde>;\
               <right-curly-bracket>
       #
       xdigit  <zero>;<one>;<two>;<three>;<four>;<five>;<six>;\
               <seven>;<eight>;<nine>;<A>;<B>;<C>;<D>;<E>;<F>;\
               <a>;<b>;<c>;<d>;<e>;<f>
       #
       blank   <space>;<tab>
       #
       toupper (<a>,<A>);(<b>,<B>);(<c>,<C>);(<d>,<D>);(<e>,<E>);\
               (<f>,<F>);(<g>,<G>);(<h>,<H>);(<i>,<I>);(<j>,<J>);\
               (<k>,<K>);(<l>,<L>);(<m>,<M>);(<n>,<N>);(<o>,<O>);\
               (<p>,<P>);(<q>,<Q>);(<r>,<R>);(<s>,<S>);(<t>,<T>);\
               (<u>,<U>);(<v>,<V>);(<w>,<W>);(<x>,<X>);(<y>,<Y>);\
               (<z>,<Z>)
       #
       END LC_CTYPE


   Property Keywords Defined by the Unicode Standard
       Property keywords defined by the Unicode standard can be
       normative or informative.  For example, a normative
       property might tell you whether a character is a letter, a
       digit, or something else, while an informative property
       might tell you whether a letter is uppercase or lowercase.
       There is also a set of properties, all normative, that
       applies only to languages whose scripts are bidirectional
       (like Arabic and Hebrew).

       The general category properties are as follows:

       Mn     Mark, nonspacing
       Mc     Mark, spacing combining
       Me     Mark, enclosing
       Nd     Number, decimal digit
       Nl     Number, letter
       No     Number, other
       Zs     Separator, space
       Zl     Separator, line
       Zp     Separator, paragraph
       Cc     Other, control
       Cf     Other, format
       Cs     Other, surrogate
       Co     Other, private use
       Cn     Other, not assigned
       Lu     Letter, uppercase
       Ll     Letter, lowercase
       Lt     Letter, titlecase
       Lm     Letter, modifier
       Lo     Letter, other
       Pc     Punctuation, connector
       Pd     Punctuation, dash
       Pf     Punctuation, final quote
       Pi     Punctuation, initial quote
       Ps     Punctuation, open
       Pe     Punctuation, close
       Po     Punctuation, other
       Sm     Symbol, math
       Sc     Symbol, currency
       Sk     Symbol, modifier
       So     Symbol, other

       The bidirectional properties are as follows:

       L      Left-right; for most alphabetic, syllabic, and
              logographic characters (such as ideographs in Asian
              languages)
       R      Right-left; for Arabic, Hebrew, and punctuation in
              those languages
       EN     European number
       ES     European number separator
       ET     European number terminator
       AN     Arabic number
       CS     Common number separator
       B      Block separator
       S      Segment separator
       WS     Whitespace
       ON     Other neutrals: all other characters like
              punctuation and symbols

       For locales included with the Tru64 UNIX product, only the
       locales include Unicode property keywords in  addition  to
       those  specified in the XSH standard. Programmers who want
       to use specific Unicode keywords with locales to determine
       a  character's  classification  use the wctype() and iswctype()
 functions. Other  functions,  such  as  iswdigit(),
       iswalpha(),  and  toupper(),  access  only  definitions of
       properties specified in the XSH standard. When equivalence
       exists  between  an  XSH  property and one or more Unicode
       properties, locales support properties as defined by  both
       standards.  XSH property keywords can be mapped to Unicode
       property keywords as follows:

              Uppercase letter: maps to Lu
              Lowercase letter: maps to Ll
              Digit: maps to Nd, Nl, and No combined
              Hexadecimal digit: includes specific characters
              (0-9, a-f, and A-F)
              A control or format character: maps to Cc and Cf
              Any letter: maps to Lu, Ll, Lt, Lm, and Lo combined
              Any letter or number: maps to Lu, Ll, Lt, Lm, Lo,
              Nd, Nl, and No combined
              Any punctuation character: maps to Pc, Pd, Ps, Pe,
              Pi, Pf, and Po combined
              Any graphical character: maps to Lu, Ll, Lt, Lm,
              Lo, Nd, Nl, No, Pc, Pd, Ps, Pe, Pi, Pf, Po, Sm, Sc,
              Sk, and So combined
              Any printable character: maps to a combination of
              all Unicode properties with the exception of Cc,
              Cf, Cn, Co, and Cs
              A space separator: maps to Zs
              Any separator: maps to Zl, Zp, and Zs

       When  operating  in  a *.UTF-8 locale, functions that test
       for a property defined in the XSH standard implicitly test
       a  character for any of the Unicode properties that map to
       the XSH property. For  example,  the  iswdigit()  function
       implicitly  tests  for  the  Nd,  Nl, and No properties as
       defined by the Unicode standard.
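
       The mapping can be explored with a short Python sketch
       built on the standard unicodedata module.  The dictionary
       keys (upper, lower, digit, cntrl, alpha, punct) are the
       XSH keyword names assumed here to correspond to the
       descriptions in the mapping list, and has_property is a
       hypothetical helper.  The results reflect the Unicode
       database shipped with Python, not the tables in any Tru64
       UNIX locale.

```python
import unicodedata

# Assumed pairing of XSH keyword names with the Unicode properties
# listed in the mapping above.
XSH_TO_UNICODE = {
    "upper": {"Lu"},
    "lower": {"Ll"},
    "digit": {"Nd", "Nl", "No"},
    "cntrl": {"Cc", "Cf"},
    "alpha": {"Lu", "Ll", "Lt", "Lm", "Lo"},
    "punct": {"Pc", "Pd", "Ps", "Pe", "Pi", "Pf", "Po"},
}


def has_property(char, xsh_keyword):
    """Test a character against the Unicode properties mapped to an XSH keyword."""
    return unicodedata.category(char) in XSH_TO_UNICODE[xsh_keyword]


# U+2167 ROMAN NUMERAL EIGHT has the Nl property, so it counts as a
# digit under this mapping even though it is not one of 0 through 9.
for char in ("7", "\u2167", "a"):
    print(repr(char), unicodedata.category(char), has_property(char, "digit"))
```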



   The LC_MESSAGES Category
       The LC_MESSAGES category of a locale definition source
       file defines the format for affirmative and negative
       system responses.  This category begins with an
       LC_MESSAGES category header and terminates with an END
       LC_MESSAGES category trailer.

       All operands for the LC_MESSAGES category are defined as
       strings or extended regular expressions bounded by " "
       (double quotes).  These operands are separated from the
       keyword they define by one or more blank characters
       (spaces or tabs).  Two adjacent " " (double quotes)
       indicate an undefined value.

       The following keywords are recognized in the LC_MESSAGES
       category:

       copy   Specifies the name of an existing locale to be used
              as the definition of this category

              If you include a copy statement, you cannot include
              other keywords.

       yesexpr
              Specifies an extended regular expression that
              describes the acceptable affirmative response to a
              question expecting an affirmative or negative
              response

       noexpr Specifies an extended regular expression that
              describes the acceptable negative response to a
              question expecting an affirmative or negative
              response

       yesstr Specifies the locale's equivalent of an acceptable
              affirmative response

              This string is accessible to applications through
              the nl_langinfo subroutine as nl_langinfo(YESSTR).
              Note that yesstr is likely to be withdrawn from the
              XPG4 standard; yesexpr is the recommended
              alternative.

       nostr  Specifies the locale's equivalent of an acceptable
              negative response

              This string is accessible to applications through
              the nl_langinfo subroutine as nl_langinfo(NOSTR).
              Note that nostr is likely to be withdrawn from the
              XPG4 standard; noexpr is the recommended
              alternative.


       The following is an example of a possible LC_MESSAGES
       category listed in a locale definition source file:

       LC_MESSAGES
       #
       yesexpr "<circumflex><left-square-bracket><y><Y>\
               <right-square-bracket>"
       noexpr  "<circumflex><left-square-bracket><n><N>\
               <right-square-bracket>"
       yesstr  "<y><e><s>"
       nostr   "<n><o>"
       #
       END LC_MESSAGES
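
       Written out as characters, the yesexpr and noexpr values
       in this example are ^[yY] and ^[nN].  The following Python
       sketch shows how an application might use such expressions
       to classify a response.  The function name classify is
       hypothetical, and Python's re module is not a POSIX
       extended regular expression engine, although these two
       patterns mean the same thing in both.

```python
import re

# The expressions from the example above, spelled as characters.
YESEXPR = "^[yY]"
NOEXPR = "^[nN]"


def classify(response):
    """Return 'yes', 'no', or 'unknown' for a user's response."""
    if re.search(YESEXPR, response):
        return "yes"
    if re.search(NOEXPR, response):
        return "no"
    return "unknown"


for answer in ("Yes", "nope", "maybe"):
    print(answer, classify(answer))
```

       Matching against an expression rather than comparing with
       a fixed string accepts any response that begins with the
       right letter, which is one reason yesexpr and noexpr are
       recommended over yesstr and nostr.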


   The LC_MONETARY Category
       The LC_MONETARY category of a locale definition source
       file defines rules and symbols for formatting monetary
       numeric information.  This category begins with an
       LC_MONETARY category header and terminates with an END
       LC_MONETARY category trailer.

       All operands for the LC_MONETARY category keywords are
       defined as string or integer values. String values are
       bounded by double quotes (" "). All values are separated
       from the keyword they define by one or more blank
       characters (spaces or tabs). Two adjacent double quotes
       ("") indicate an undefined string value. A -1 (negative
       one) indicates an undefined integer value.

       The following keywords are recognized in the LC_MONETARY
       category:

       copy   Specifies the name of an existing locale to be used
              as the definition of this category.

              If you include a copy statement, no other keyword
              can be specified.

       int_curr_symbol
              Specifies the string used for the international
              currency symbol.

              The operand for the int_curr_symbol keyword is a
              4-character string. The first three characters
              contain the alphabetic international currency
              symbol. The fourth character specifies a character
              separator between the international currency symbol
              and a monetary quantity.

       currency_symbol
              Specifies the string used for the local currency
              symbol.

       mon_decimal_point
              Specifies the string used for the decimal delimiter
              that is used to format monetary quantities.

       mon_thousands_sep
              Specifies the character separator used for grouping
              digits to the left of the decimal delimiter in
              formatted monetary quantities.

       mon_grouping
              Specifies a string that defines the size of each
              group of digits in formatted monetary quantities.

              The operand for the mon_grouping keyword consists
              of a sequence of semicolon-separated integers.
              Each integer specifies the number of digits in a
              group. The initial integer defines the size of the
              group immediately to the left of the decimal
              delimiter. The subsequent integers define
              succeeding groups to the left of the previous
              group. If the last integer is not -1, grouping for
              any remaining digits is performed using that
              integer. If the last integer is -1, no further
              grouping is performed.

              The following is an example of the interpretation
              of the mon_grouping statement. Assuming the value
              to be formatted is 123456789 and the operand for
              the mon_thousands_sep keyword is ' (single
              quotation mark), the following results occur:

              mon_grouping    Formatted Value
              3;-1            123456'789
              3               123'456'789
              3;2;-1          1234'56'789
              3;2             12'34'56'789

       positive_sign
              Specifies the string used to indicate a
              nonnegative-valued formatted monetary quantity.

       negative_sign
              Specifies the string used to indicate a
              negative-valued formatted monetary quantity.

       int_frac_digits
              Specifies an integer value representing the number
              of fractional digits (those after the decimal
              delimiter) to be displayed in a formatted monetary
              quantity using the int_curr_symbol value.

       frac_digits
              Specifies an integer value representing the number
              of fractional digits (those after the decimal
              delimiter) to be displayed in a formatted monetary
              quantity using the currency_symbol value.

       p_cs_precedes
              Specifies an integer value indicating whether the
              int_curr_symbol or currency_symbol string precedes
              or follows the value for a nonnegative-formatted
              monetary quantity.

              The following integer values are recognized:

              0      Indicates that the currency symbol follows
                     the monetary quantity.

              1      Indicates that the currency symbol precedes
                     the monetary quantity.

       p_sep_by_space
              Specifies an integer value indicating whether the
              int_curr_symbol or currency_symbol string is
              separated by a space from a nonnegative-formatted
              monetary quantity.

              The following integer values are recognized:

              0      Indicates that no space separates the
                     currency symbol from the monetary quantity.

              1      Indicates that a space separates the
                     currency symbol from the monetary quantity.

              2      Indicates that a space separates the
                     currency symbol and the positive_sign
                     string, if adjacent.

       n_cs_precedes
              Specifies an integer value indicating whether the
              int_curr_symbol or currency_symbol string precedes
              or follows the value for a negative-formatted
              monetary quantity.

              The following integer values are recognized:

              0      Indicates that the currency symbol follows
                     the monetary quantity.

              1      Indicates that the currency symbol precedes
                     the monetary quantity.

       n_sep_by_space
              Specifies an integer value indicating whether the
              int_curr_symbol or currency_symbol string is
              separated by a space from a negative-formatted
              monetary quantity.

              The following integer values are recognized:

              0      Indicates that no space separates the
                     currency symbol from the monetary quantity.

              1      Indicates that a space separates the
                     currency symbol from the monetary quantity.

              2      Indicates that a space separates the
                     currency symbol and the negative_sign
                     string, if adjacent.

       p_sign_posn
              Specifies an integer value indicating the
              positioning of the positive_sign string for a
              nonnegative-formatted monetary quantity.

              The following integer values are recognized:

              0      Indicates that a left_parenthesis and
                     right_parenthesis symbol enclose both the
                     monetary quantity and the int_curr_symbol
                     or currency_symbol string.

              1      Indicates that the positive_sign string
                     precedes the quantity and the
                     int_curr_symbol or currency_symbol string.

              2      Indicates that the positive_sign string
                     follows the quantity and the
                     int_curr_symbol or currency_symbol string.

              3      Indicates that the positive_sign string
                     immediately precedes the int_curr_symbol or
                     currency_symbol string.

              4      Indicates that the positive_sign string
                     immediately follows the int_curr_symbol or
                     currency_symbol string.

       n_sign_posn
              Specifies an integer value indicating the
              positioning of the negative_sign string for a
              negative-formatted monetary quantity.

              The following integer values are recognized:

              0      Indicates that a left_parenthesis and
                     right_parenthesis symbol enclose both the
                     monetary quantity and the int_curr_symbol
                     or currency_symbol string.

              1      Indicates that the negative_sign string
                     precedes the quantity and the
                     int_curr_symbol or currency_symbol string.

              2      Indicates that the negative_sign string
                     follows the quantity and the
                     int_curr_symbol or currency_symbol string.

              3      Indicates that the negative_sign string
                     immediately precedes the int_curr_symbol or
                     currency_symbol string.

              4      Indicates that the negative_sign string
                     immediately follows the int_curr_symbol or
                     currency_symbol string.

       debit_sign
              Specifies the string used for the debit symbol (DB)
              to indicate a negative-formatted monetary quantity.

              The debit_sign keyword is an extension to the
              X/Open Portability Guide and may not be portable to
              all systems that conform to that standard.

       credit_sign
              Specifies the string used for the credit symbol
              (CR) to indicate a nonnegative-formatted monetary
              quantity.

              The credit_sign keyword is an extension to the
              X/Open Portability Guide and may not be portable to
              all systems that conform to that standard.

       left_parenthesis
              Specifies the character, equivalent to a ( (left
              parenthesis), used by the p_sign_posn and
              n_sign_posn statements to enclose a monetary
              quantity and currency symbol.

              The left_parenthesis keyword is an extension to the
              X/Open Portability Guide and may not be portable to
              all systems that conform to that standard.

       right_parenthesis
              Specifies the character, equivalent to a ) (right
              parenthesis), used by the p_sign_posn and
              n_sign_posn statements to enclose a monetary
              quantity and currency symbol.

              The right_parenthesis keyword is an extension to
              the X/Open Portability Guide and may not be
              portable to all systems that conform to that
              standard.


       A unique customized monetary format can be produced by
       changing the value of a single statement. For example,
       the following table shows the results of using all
       combinations of defined values for the p_cs_precedes,
       p_sep_by_space, and p_sign_posn statements:

       --------------------------------------------------------------------
                           p_sep_by_space =   2         1          0
       --------------------------------------------------------------------
       p_cs_precedes = 1   p_sign_posn = 0    ($1.25)   ($ 1.25)   ($1.25)
                           p_sign_posn = 1    + $1.25   +$ 1.25    +$1.25
                           p_sign_posn = 2    $1.25 +   $ 1.25+    $1.25+
                           p_sign_posn = 3    + $1.25   +$ 1.25    +$1.25
                           p_sign_posn = 4    $ +1.25   $+ 1.25    $+1.25
       p_cs_precedes = 0   p_sign_posn = 0    (1.25$)   (1.25 $)   (1.25$)
                           p_sign_posn = 1    +1.25 $   +1.25 $    +1.25$
                           p_sign_posn = 2    1.25$ +   1.25 $+    1.25$+
                           p_sign_posn = 3    1.25+ $   1.25 +$    1.25+$
                           p_sign_posn = 4    1.25$ +   1.25 $+    1.25$+
       --------------------------------------------------------------------
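
       The layout rules behind this table can be approximated in
       code. The Python sketch below is an illustration only,
       not the system's formatting routine: the function name
       format_nonnegative is hypothetical, the symbol and sign
       default to $ and +, and the placement of the space for
       p_sep_by_space = 2 when the sign and currency symbol are
       not adjacent (after the quantity) is inferred from the
       table rather than stated in the text.

```python
def format_nonnegative(quantity, symbol="$", sign="+",
                       cs_precedes=1, sep_by_space=0, sign_posn=1):
    """Lay out a nonnegative monetary quantity (given as a string)."""
    text = {"V": quantity, "C": symbol, "S": sign}

    # Order of value (V), currency symbol (C), and sign string (S).
    if sign_posn == 0:
        tokens = ["C", "V"] if cs_precedes else ["V", "C"]
    elif cs_precedes:
        tokens = {1: ["S", "C", "V"], 2: ["C", "V", "S"],
                  3: ["S", "C", "V"], 4: ["C", "S", "V"]}[sign_posn]
    else:
        tokens = {1: ["S", "V", "C"], 2: ["V", "C", "S"],
                  3: ["V", "S", "C"], 4: ["V", "C", "S"]}[sign_posn]

    v = tokens.index("V")
    gap = None  # index of the token that is followed by a space
    if sep_by_space == 1:
        # A space separates the symbol side from the quantity.
        gap = v - 1 if cs_precedes else v
    elif sep_by_space == 2 and sign_posn != 0:
        s, c = tokens.index("S"), tokens.index("C")
        # Adjacent sign and symbol get the space between them;
        # otherwise (assumption from the table) it follows the quantity.
        gap = min(s, c) if abs(s - c) == 1 else v

    out = ""
    for i, tag in enumerate(tokens):
        out += text[tag]
        if i == gap:
            out += " "
    return "(" + out + ")" if sign_posn == 0 else out
```

       For example, format_nonnegative("1.25", cs_precedes=1,
       sep_by_space=2, sign_posn=4) yields the string shown in
       the table for that combination, $ +1.25.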

       The following is an example of a possible LC_MONETARY
       category in a locale definition source file:

       LC_MONETARY
       #
       int_curr_symbol    "<U><S><D>"
       currency_symbol    "<dollar-sign>"
       mon_decimal_point  "<period>"
       mon_thousands_sep  "<comma>"
       mon_grouping       <3>
       positive_sign      "<plus-sign>"
       negative_sign      "<hyphen>"
       int_frac_digits    <2>
       frac_digits        <2>
       p_cs_precedes      <1>
       p_sep_by_space     <2>
       n_cs_precedes      <1>
       n_sep_by_space     <2>
       p_sign_posn        <3>
       n_sign_posn        <3>
       debit_sign         "<D><B>"
       credit_sign        "<C><R>"
       left_parenthesis   "<left-parenthesis>"
       right_parenthesis  "<right-parenthesis>"
       #
       END LC_MONETARY


   The LC_NUMERIC Category
       The LC_NUMERIC category of a locale definition source file
       defines rules and symbols for formatting nonmonetary
       numeric information. This category begins with an
       LC_NUMERIC category header and terminates with an END
       LC_NUMERIC category trailer.

       All operands for the LC_NUMERIC category keywords are
       defined as string or integer values. String values are
       bounded by double quotes (" "). All values are separated
       from the keyword they define by one or more blank
       characters (spaces or tabs). Two adjacent double quote
       characters ("") indicate an undefined string value. A -1
       (negative one) indicates an undefined integer value.

       The following keywords are recognized in the LC_NUMERIC
       category:

       copy   Specifies the name of an existing locale to be used
              as the definition of this category.

              If you include a copy statement, no other keyword
              can be specified.

       decimal_point
              Specifies the decimal delimiter string used to
              format nonmonetary numeric quantities.

              This keyword cannot be omitted and cannot be set to
              the undefined string value.

       thousands_sep
              Specifies the string separator used for grouping
              digits to the left of the decimal delimiter in
              formatted nonmonetary numeric quantities.

       grouping
              Defines the size of each group of digits in
              formatted nonmonetary numeric quantities.

              The operand for the grouping keyword consists of a
              sequence of semicolon-separated integers. Each
              integer specifies the number of digits in a group.
              The initial integer defines the size of the group
              immediately to the left of the decimal delimiter.
              The subsequent integers define succeeding groups to
              the left of the previous group. Grouping is
              performed for each integer specified for the
              grouping keyword. If the last integer is not -1,
              the last integer is used repeatedly to group any
              remaining digits. If the last integer is -1, no
              more grouping is performed.

       The following is an example of the interpretation of the
       grouping statement. Assuming the value to be formatted is
       123456789 and the operand for the thousands_sep keyword is
       ' (single quote), the following results occur:

       grouping    Formatted Value
       3;-1        123456'789
       3           123'456'789
       3;2;-1      1234'56'789
       3;2         12'34'56'789
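
       The grouping rule described above (which also applies to
       mon_grouping) can be sketched in a few lines of Python.
       This is an illustrative reading of the rule, not the
       system's implementation; the function names
       parse_grouping and group_digits are hypothetical, and the
       operand is assumed to be written as plain integers
       separated by semicolons, for example 3;2;-1.

```python
def parse_grouping(operand):
    """Turn an operand such as "3;2;-1" into a list of integers."""
    return [int(field) for field in operand.split(";")]


def group_digits(digits, operand, separator="'"):
    """Group the integer digits of a number, working right to left."""
    sizes = parse_grouping(operand)
    groups = []
    rest = digits
    index = 0
    while rest:
        # Past the end of the list, the last integer is reused.
        size = sizes[min(index, len(sizes) - 1)]
        if size == -1 or len(rest) <= size:
            groups.append(rest)  # -1 reached, or too few digits left
            break
        groups.append(rest[-size:])
        rest = rest[:-size]
        index += 1
    return separator.join(reversed(groups))
```

       Calling group_digits("123456789", "3;2") with the single
       quote as separator gives 12'34'56'789, matching the last
       result listed above.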

       The following is an example of a possible LC_NUMERIC
       category listed in a locale definition source file:

       LC_NUMERIC
       #
       decimal_point    "<period>"
       thousands_sep    "<comma>"
       grouping         <3>
       #
       END LC_NUMERIC


   The LC_TIME Category
       The LC_TIME category of a locale definition source file
       defines rules and symbols for formatting time and date
       information. This category begins with an LC_TIME
       category header and terminates with an END LC_TIME
       category trailer.

       All operands for the LC_TIME category keywords are defined
       as string or integer values. String values are bounded by
       double quotes (" "). All values are separated from the
       keyword they define by one or more blank characters
       (spaces or tabs). Two adjacent double quote characters
       ("") indicate an undefined string value.

       Field descriptors are used by commands and subroutines
       that query the LC_TIME category to represent elements of
       time and date formats. These field descriptors are
       described in this section, immediately following the
       descriptions of valid keywords.

       The following keywords are recognized in the LC_TIME
       category:

       copy   Specifies the name of an existing locale to be used
              as the definition of this category.

              If you include a copy statement, no other keyword
              can be specified.

       abday  Defines the abbreviated weekday names corresponding
              to the %a field descriptor.

              Recognized values consist of 7 semicolon-separated
              strings. The first string corresponds to the
              abbreviated name for the first day of the week (for
              example, Sun), the second to the abbreviated name
              for the second day of the week, and so on.

       day    Defines the full spelling of the weekday names
              corresponding to the %A field descriptor.

              Recognized values consist of 7 semicolon-separated
              strings. The first string corresponds to the full
              spelling of the name of the first day of the week
              (for example, Sunday), the second to the name of
              the second day of the week, and so on.

       abmon  Defines the abbreviated month names corresponding
              to the %b field descriptor.

              Recognized values consist of 12 semicolon-separated
              strings. The first string corresponds to the
              abbreviated name for the first month of the year
              (for example, Jan), the second to the abbreviated
              name for the second month of the year, and so on.

       mon    Defines the full spelling of the month names
              corresponding to the %B field descriptor.

              Recognized values consist of 12 semicolon-separated
              strings. The first string corresponds to the full
              spelling of the name for the first month of the
              year (for example, January), the second to the full
              spelling of the name for the second month of the
              year, and so on.

       d_t_fmt
              Defines the string used for the standard
              date-and-time format corresponding to the %c field
              descriptor.

              The string can contain any combination of
              characters and field descriptors.

       d_fmt  Defines the string used for the standard date
              format corresponding to the %x field descriptor.

              The string can contain any combination of
              characters and field descriptors.

       t_fmt  Defines the string used for the standard time
              format corresponding to the %X field descriptor.

              The string can contain any combination of
              characters and field descriptors.

       am_pm  Defines the strings used to represent a.m. (before
              noon) and p.m. (after noon) corresponding to the %p
              field descriptor.

              Recognized values consist of two
              semicolon-separated strings. The first string
              corresponds to the a.m. designation, the last
              string to the p.m. designation.

       t_fmt_ampm
              Defines the string used for the standard 12-hour
              time format that includes an am_pm value (%p field
              descriptor).

              This statement corresponds to the %r field
              descriptor. The string can contain any combination
              of characters and field descriptors. If the string
              is empty, the 12-hour format is not supported by
              the locale.

       era    Defines how the years are counted and displayed for
              each era in a locale, corresponding to the %E field
              descriptor modifier.

              For each era, there must be one string in the
              following format:

              direction:offset:start_date:end_date:name:format

              The variables for the era string format are defined
              as follows:

              direction
                     Specifies a - (minus) or + (plus) character.

                     The - character indicates that years count
                     in the negative direction when moving from
                     the start date to the end date. The +
                     character indicates that years count in the
                     positive direction when moving from the
                     start date to the end date.

              offset Specifies a number representing the first
                     year of the era.

              start_date
                     Specifies the starting date of the era in
                     yyyy/mm/dd format, where yyyy, mm, and dd
                     are the year, month, and day, respectively,
                     on the Gregorian calendar.

                     Years prior to the year AD 1 are represented
                     as negative numbers. For example, an era
                     beginning March 5th in the year 100 BC would
                     be represented as -100/03/05.

              end_date
                     Specifies the ending date of the era in the
                     same form used for the start_date variable
                     or one of the two special values -* or +*.
                     A -* value indicates that the ending date of
                     the era extends backward to the beginning of
                     time. A +* value indicates that the ending
                     date of the era extends forward to the end
                     of time. Therefore, the ending date can be
                     chronologically before or after the starting
                     date of the era. For example, the strings
                     for the Christian eras AD and BC would be
                     entered as follows:

                     +:0:0000/01/01:+*:AD:%o %N
                     +:1:-0001/12/31:-*:BC:%o %N

              name   Specifies a string representing the name of
                     the era that is substituted for the %N field
                     descriptor.

              format Specifies a strftime() format string to use
                     when formatting the %EY field descriptor.

                     This string can contain any strftime()
                     format control characters (except %EY) and
                     locale-dependent multibyte characters.

              An era value consists of one string (enclosed in
              quotes) for each era. If more than one era is
              specified, each era string is separated by a ;
              (semicolon).

       era_year
              Defines the string used to represent the year in
              alternate-era format corresponding to the %Ey field
              descriptor.

              The string can contain any combination of
              characters and field descriptors.

       era_d_fmt
              Defines the string used to represent the date in
              alternate-era format corresponding to the %Ex field
              descriptor.

              The string can contain any combination of
              characters and field descriptors.

       era_t_fmt
              Defines the locale's alternative time format, as
              represented by the %EX field descriptor for
              strftime().

       era_d_t_fmt
              Defines the locale's alternative date-and-time
              format, as represented by the %Ec field descriptor
              for strftime().

       alt_digits
              Defines alternate strings for digits corresponding
              to the %O field descriptor.

              Recognized values consist of a group of
              semicolon-separated strings. The first string
              represents the alternate string for 0 (zero), the
              second string represents the alternate string for
              1, and so on. A maximum of 100 alternate strings
              can be specified.

       m_d_recent
              Defines the string used to print out the
              month/date/time format for some commands (ls, find,
              who, ar).

              This format corresponds to the "%b %e %H:%M" format
              for the POSIX locale. (Optional) This format is an
              extension to the X/Open Portability Guide and may
              not be supported on all systems that conform to
              that standard.

       m_d_old
              Defines the string used to print out the
              month/date/year format for some commands (ls, find,
              who, ar).

              This format corresponds to the "%b %e %Y" format
              for the POSIX locale. (Optional) This format is an
              extension to the X/Open Portability Guide and may
              not be supported on all systems that conform to
              that standard.
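
       To make the era string layout concrete, the following
       Python sketch splits one era string of the form
       direction:offset:start_date:end_date:name:format into its
       six variables. It is a minimal illustration and not how
       localedef parses its input; the names Era and parse_era
       are hypothetical, and the dates are kept as plain strings
       rather than interpreted.

```python
from typing import NamedTuple


class Era(NamedTuple):
    direction: str   # "+" or "-"
    offset: int      # number of the first year of the era
    start_date: str  # yyyy/mm/dd, possibly with a negative year
    end_date: str    # yyyy/mm/dd, "-*", or "+*"
    name: str        # substituted for %N
    format: str      # format string used for %EY


def parse_era(era_string):
    """Split one era string into its six colon-separated variables."""
    # Split at most five times so that the format field stays intact.
    fields = era_string.split(":", 5)
    if len(fields) != 6:
        raise ValueError("expected 6 colon-separated fields")
    direction, offset, start_date, end_date, name, fmt = fields
    if direction not in ("+", "-"):
        raise ValueError("direction must be + or -")
    return Era(direction, int(offset), start_date, end_date, name, fmt)
```

       Applied to the AD example given for the era keyword,
       parse_era("+:0:0000/01/01:+*:AD:%o %N") returns a record
       whose end_date is +* and whose name is AD.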

       The LC_TIME locale definition source file uses field
       descriptors to represent elements of time and date
       formats. Combinations of these field descriptors create
       other field descriptors or create time and date format
       strings. When used in format strings that contain field
       descriptors and other characters, field descriptors are
       replaced by their current values. All other characters
       are copied without change. The following field
       descriptors are used by commands and subroutines that
       query the LC_TIME category for time formatting:

       %a     Represents the abbreviated weekday name (for
              example, Sun) defined by the abday statement.

       %A     Represents the full weekday name (for example,
              Sunday) defined by the day statement.

       %b     Represents the abbreviated month name (for example,
              Jan) defined by the abmon statement.

       %B     Represents the full month name (for example,
              January) defined by the mon statement.

       %c     Represents the date-and-time format defined by the
              d_t_fmt statement.

       %C     Represents the century as a decimal number (00 to
              99).

       %d     Represents the day of the month as a decimal number
              (01 to 31).

       %D     Represents the date in %m/%d/%y format (for
              example, 01/31/91).

       %e     Represents the day of the month as a decimal number
              (1 to 31).

              The %e field descriptor uses a 2-digit field. If
              the day of the month is not a 2-digit number, the
              leading digit is filled with a space character.

       %Ec    Specifies the locale's alternate appropriate
              date-and-time representation.

       %EC    Specifies the name of the base year (period) in the
              locale's alternate representation.

       %Ex    Specifies the locale's alternate date
              representation.

       %Ey    Specifies the offset from %EC (year only) in the
              locale's alternate representation.

       %EY    Specifies the full alternate year representation.

       %h     Represents the abbreviated month name (for example,
              Jan) defined by the abmon statement.

              This field descriptor is a synonym for the %b field
              descriptor.

       %H     Represents the 24-hour clock hour as a decimal
              number (00 to 23).

       %I     Represents the 12-hour clock hour as a decimal
              number (01 to 12).

       %j     Represents the day of the year as a decimal number
              (001 to 366).

       %m     Represents the month of the year as a decimal
              number (01 to 12).

       %M     Represents the minutes of the hour as a decimal
              number (00 to 59).

       %n     Specifies a newline character.

       %N     Represents the alternate era name.

       %o     Represents the alternate era year.

       %Od    Specifies the day of the month by using the
              locale's alternate numeric symbols.

       %Oe    Specifies the day of the month by using the
              locale's alternate numeric symbols.

       %OH    Specifies the hour (24-hour clock) by using the
              locale's alternate numeric symbols.

       %OI    Specifies the hour (12-hour clock) by using the
              locale's alternate numeric symbols.

       %Om    Specifies the month by using the locale's alternate
              numeric symbols.

       %OM    Specifies the minutes by using the locale's
              alternate numeric symbols.

       %OS    Specifies the seconds by using the locale's
              alternate numeric symbols.

       %OU    Specifies the week number of the year (Sunday as
              the first day of the week) by using the locale's
              alternate numeric symbols.

       %Ow    Specifies the weekday as a number in the locale's
              alternate representation (Sunday = 0).

       %OW    Specifies the week number of the year (Monday as
              the first day of the week) by using the locale's
              alternate numeric symbols.

       %Oy    Specifies the year (offset from %C) in alternate
              representation.

       %p     Represents the a.m. or p.m. string defined by the
              am_pm statement.

       %r     Represents the 12-hour clock time with a.m./p.m.
              notation as defined by the t_fmt_ampm statement.

       %S     Represents the seconds of the minute as a decimal
              number (00 to 59).

       %t     Specifies a tab character.

       %T     Represents 24-hour clock time in the format
              %H:%M:%S (for example, 16:55:15).

       %U     Represents the week of the year as a decimal number
              (00 to 53).

              Sunday, or its equivalent as defined by the day
              statement, is the first day of the week for
              calculating the value of this field descriptor.

       %w     Represents the day of the week as a decimal number
              (0 to 6).

              Sunday, or its equivalent as defined by the day
              statement, is 0 (zero) for calculating the value of
              this field descriptor.

       %W     Represents the week of the year as a decimal number
              (00 to 53).

              Monday, or its equivalent as defined by the day
              statement, is the first day of the week for
              calculating the value of this field descriptor.

       %x     Represents the date format defined by the d_fmt
              statement.

       %X     Represents the time format defined by the t_fmt
              statement.

       %y     Represents the year of the century (00 to 99).

       %Y     Represents the year as a decimal number (for
              example, 1989).

       %Z     Represents the time zone name, if one can be
              determined (for example, EST).

              No characters are displayed if a time zone cannot
              be determined.
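
       The statement that combinations of field descriptors
       create other field descriptors can be illustrated with
       the two composite descriptors whose expansions are given
       in this section: the date in %m/%d/%y format and the
       24-hour clock time in %H:%M:%S format (%D and %T in
       strftime()). The Python sketch below expands them by hand
       before formatting. The names COMPOSITES, expand, and
       format_time are hypothetical, and only locale-independent
       numeric descriptors are used.

```python
import re
from datetime import datetime

# Composite field descriptors and the descriptors they stand for.
COMPOSITES = {"%D": "%m/%d/%y", "%T": "%H:%M:%S"}


def expand(fmt):
    """Replace composite field descriptors with their components."""
    # Match "%" plus one character so that "%%" is consumed as a unit
    # and a literal "%%D" is not mistaken for the %D descriptor.
    return re.sub(r"%.", lambda m: COMPOSITES.get(m.group(0), m.group(0)), fmt)


def format_time(fmt, when):
    """Format a datetime after expanding composite descriptors."""
    return when.strftime(expand(fmt))
```

       With when = datetime(1991, 1, 31, 16, 55, 15), the call
       format_time("%D", when) produces 01/31/91 and
       format_time("%T", when) produces 16:55:15, the example
       values quoted for these two descriptors.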
