iconv_JEF - Tru64


iconv_JEF(5)

NAME

       iconv_JEF   -  Specification  for  controlling  conversion
       between Fujitsu JEF and Tru64 UNIX Japanese codesets

DESCRIPTION

       The iconv utility can convert the encoding of characters
       between Fujitsu JEF (Japanese processing Extended Feature)
       code and one of the following Tru64 UNIX codesets: DEC
       Kanji, Super DEC Kanji, Japanese EUC, or Shift JIS. You
       choose the type of conversion by specifying the appropriate
       values for the utility's from-code and to-code parameters,
       as follows:

       ------------------------------------------------
       Type of Code Conversion   from-code   to-code
       ------------------------------------------------
       JEF to DEC Kanji          JEF         deckanji
       JEF to Super DEC Kanji    JEF         sdeckanji
       JEF to Japanese EUC       JEF         eucJP
       JEF to Shift JIS          JEF         SJIS
       DEC Kanji to JEF          deckanji    JEF
       Super DEC Kanji to JEF    sdeckanji   JEF
       Japanese EUC to JEF       eucJP       JEF
       Shift JIS to JEF          SJIS        JEF
       ------------------------------------------------
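
       As a rough illustration, the table above can be turned into
       a small lookup that builds an iconv command line. The helper
       name iconv_argv is hypothetical; -f and -t are the standard
       iconv options for the from-code and to-code values. Whether
       the JEF converter is actually installed depends on the
       system.

```python
# Codeset names taken from the conversion table; JEF must be on one side.
TRU64_CODESETS = ("deckanji", "sdeckanji", "eucJP", "SJIS")


def iconv_argv(from_code, to_code, path):
    """Return an argv list for iconv, or raise ValueError for an unsupported pair."""
    pair_ok = (from_code == "JEF" and to_code in TRU64_CODESETS) or (
        to_code == "JEF" and from_code in TRU64_CODESETS
    )
    if not pair_ok:
        raise ValueError(f"unsupported conversion: {from_code} -> {to_code}")
    return ["iconv", "-f", from_code, "-t", to_code, path]


print(iconv_argv("JEF", "eucJP", "input.jef"))
```

       The returned list could be passed to a process launcher such
       as subprocess.run() on a system that provides the converter.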

       Conversion behavior for the following items is affected by
       the definition of environment variables or profile entries
       in the user's environment. For more information, see the
       "Environment Variables" and "Profile File" sections.

       The UDC (User-Defined Character) mapping table that is used
       for UDC conversion

              This table must be an ASCII text file that contains
              UDC mapping information. The table affects
              conversion of user-defined characters between the
              codesets.

       The EBCDIC to/from ISO code (ASCII, JIS Roman characters)
       mapping table that is used for conversion

              This table must be an ASCII text file that contains
              information on how to map characters between EBCDIC
              and ISO code.

       The K-shift code

              This is a one- or two-byte hexadecimal code that
              marks the beginning of Kanji mode.

       The A-shift code

              This is a one- or two-byte hexadecimal code that
              marks the beginning of EBCDIC mode.

       The status of the initial mode (Kanji or EBCDIC) at the
       time the iconv command starts, or the first time the
       iconv() function is called after the iconv_open() function
       initializes the converter in a program

              The status keywords are either kanji_mode or
              ebcdic_mode.

       How to treat undefined characters when these are detected
       in Kanji mode

              Specify this action by using a keyword. The actions
              are:

                 Stop codeset conversion (abort).

                 Output the undefined characters without any
                 processing and continue codeset conversion
                 (pass).

                 Output padding characters instead of the
                 undefined characters and continue codeset
                 conversion (replace).

                 Ignore the undefined characters and continue
                 codeset conversion.

       The two-byte padding character used in Kanji mode

              This value is meaningful when replace is chosen for
              the processing of undefined characters in Kanji
              mode. Specify the padding character by its
              hexadecimal value.

       How to treat undefined characters when these are detected
       in EBCDIC mode

              Specify this action by using a keyword. The actions
              are:

                 Stop codeset conversion (abort).

                 Output the undefined characters without any
                 processing and continue codeset conversion
                 (pass).

                 Output padding characters instead of the
                 undefined characters and continue codeset
                 conversion (replace).

                 Ignore the undefined characters and continue
                 codeset conversion.

       The one-byte padding character used in EBCDIC mode

              This value is meaningful when replace is chosen for
              the processing of undefined characters in EBCDIC
              mode. Specify the padding character by its
              hexadecimal value.
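
       To make the role of the two shift codes concrete, the
       following minimal sketch splits a JEF byte stream into
       Kanji-mode and EBCDIC-mode runs. It is not the converter's
       implementation. It assumes one-byte shift codes (0x28 and
       0x29, the defaults given on this page), two-byte characters
       in Kanji mode, and one-byte characters in EBCDIC mode. The
       function name split_by_mode is hypothetical.

```python
def split_by_mode(data, k_shift=0x28, a_shift=0x29, initial_state="ebcdic_mode"):
    """Split bytes into (mode, run) pairs; assumes one-byte shift codes."""
    runs = []
    mode = initial_state
    current = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        if byte in (k_shift, a_shift):
            # A shift code closes the current run and selects the next mode.
            if current:
                runs.append((mode, bytes(current)))
                current = bytearray()
            mode = "kanji_mode" if byte == k_shift else "ebcdic_mode"
            i += 1
        elif mode == "kanji_mode":
            current += data[i:i + 2]  # assumed: two bytes per Kanji-mode character
            i += 2
        else:
            current.append(byte)      # assumed: one byte per EBCDIC-mode character
            i += 1
    if current:
        runs.append((mode, bytes(current)))
    return runs


for mode, run in split_by_mode(b"\xc1\xc2\x28\xa4\xa2\x29\xc3"):
    print(mode, run.hex())
```

       With the default initial state, the sample input yields an
       EBCDIC run, a Kanji run after the K-shift code, and another
       EBCDIC run after the A-shift code.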

       When the to-code parameter for the conversion is JEF, you
       can also specify the following items for conversion
       behavior:

       Whether the initial shift code is output at the start of
       conversion if the status of the initial mode (Kanji or
       EBCDIC) is different from the mode of the first input
       character

              The start of conversion is the time the iconv
              utility starts processing, or when the iconv()
              function is called just after opening the converter
              with iconv_open(). Keyword values for this item are
              yes or no.

       Whether the utility outputs the last shift code when
       iconv() is called with a zero-length input string and the
       current mode (Kanji or EBCDIC) is different from the mode
       specified by the last status

              Keyword values for this item are yes or no.

       The last status (Kanji mode or EBCDIC mode)

              Specify kanji_mode or ebcdic_mode for this value.
              It is meaningful only when yes is the setting for
              whether the utility outputs the last shift code.
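
       One possible reading of these three items is sketched
       below. The function frame_jef is hypothetical and only
       models where shift codes would be placed when runs of
       already-converted bytes are joined into JEF output. It
       assumes one-byte shift codes and is not the converter's
       actual logic.

```python
def frame_jef(runs, initial_state="ebcdic_mode", initial_shift="yes",
              trailer_shift="yes", last_state="ebcdic_mode",
              k_shift=0x28, a_shift=0x29):
    """Join (mode, bytes) runs into one stream, inserting one-byte shift codes."""
    shift_for = {"kanji_mode": k_shift, "ebcdic_mode": a_shift}
    out = bytearray()
    mode = initial_state
    for index, (run_mode, payload) in enumerate(runs):
        if run_mode != mode:
            # Only the shift before the first run depends on the
            # "output initial shift code" item; later shifts are always written.
            if index > 0 or initial_shift == "yes":
                out.append(shift_for[run_mode])
            mode = run_mode
        out += payload
    # End of conversion: optionally shift back to the configured last status.
    if trailer_shift == "yes" and mode != last_state:
        out.append(shift_for[last_state])
    return bytes(out)


print(frame_jef([("kanji_mode", b"\xa4\xa2")]).hex())
```

       With the defaults, a single Kanji run is wrapped in a
       K-shift code and a trailing A-shift code. Setting both
       yes/no items to no leaves the run unframed.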

       If the items that control conversion behavior are
       specified by both environment variables and the profile
       file, values set by environment variables override values
       set by comparable entries in the profile. Note that values
       for all conversion control items are case-sensitive,
       whether they are set by environment variables or in the
       profile. The following table contains the default values
       for each conversion control item:

       ----------------------------------------------------
       Conversion Control Item               Default Value
       ----------------------------------------------------
       UDC mapping table                     None
       K shift code                          0x28
       A shift code                          0x29
       Initial state                         ebcdic_mode
       Processing for undefined characters
       in Kanji mode                         abort
       Processing for undefined characters
       in EBCDIC mode                        pass
       ----------------------------------------------------
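
       The precedence described above (environment variable, then
       profile entry, then default) can be modeled as layered
       lookups. The sketch below uses Python's
       collections.ChainMap. The dictionary keys are simply the
       item labels from the default-value table, and the value
       0x38 is an invented example, not a recommended setting.

```python
from collections import ChainMap

# Defaults copied from the table of conversion control items.
DEFAULTS = {
    "K shift code": "0x28",
    "A shift code": "0x29",
    "Initial state": "ebcdic_mode",
    "Processing for undefined characters in Kanji mode": "abort",
    "Processing for undefined characters in EBCDIC mode": "pass",
}


def effective_settings(from_environment, from_profile):
    """Environment values win over profile values, which win over defaults."""
    return ChainMap(from_environment, from_profile, DEFAULTS)


settings = effective_settings(
    {"Initial state": "kanji_mode"},                          # set in the environment
    {"Initial state": "ebcdic_mode", "K shift code": "0x38"},  # set in the profile
)
print(settings["Initial state"])   # kanji_mode (environment)
print(settings["K shift code"])    # 0x38 (profile)
print(settings["A shift code"])    # 0x29 (default)
```

       Because values are case-sensitive, a real lookup would
       compare keywords such as kanji_mode exactly as written.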

       The default padding characters  are  white  spaces,  whose
       code  values for each destination codeset are noted in the
       following table. These padding characters are output  when
       you specify replace for processing of undefined characters
       and do not explicitly specify the padding character.

       ---------------------------------------------------
       Mode          Default Value   Destination Codeset
       ---------------------------------------------------
       Kanji mode    0x4040          JEF
                     0xa1a1          deckanji, sdeckanji,
                                     or eucJP
                     0x8140          SJIS
       EBCDIC mode   0x40            JEF
                     0x20            deckanji, sdeckanji,
                                     eucJP, or SJIS
       ---------------------------------------------------

       The default EBCDIC-ISO mapping table is as follows:

       For conversion from JEF to other codesets:
              /usr/lib/nls/loc/iconv/data/kana_ebcdic.tbl

       For conversion from other codesets to JEF:
              /usr/lib/nls/loc/iconv/data/kana_ebcdic.tbl

       These mapping tables map both EBCDIC and ISO code, which
       includes JIS Roman characters. The kana_ebcdic.tbl mapping
       table also maps ISO lowercase characters to EBCDIC
       uppercase characters.

       The  following default values for conversion control items
       are meaningful when the iconv utility's to-code conversion
       parameter is JEF:

       ---------------------------------------------
       Conversion Control Item          Default
       ---------------------------------------------
       Output the initial shift code?   yes
       Output the last shift code?      yes
       Last status                      ebcdic_mode
       ---------------------------------------------


   Environment Variables
       This  section discusses the environment variables that you
       can set to control  conversion  behavior.  The  names  for
       these variables adhere to the following format:

       fromcode_tocode_controlitem

       The name segments for fromcode and tocode can each be one
       of the following keywords:


       ----------------------------
       For Codeset:      Use:
       ----------------------------
       Fujitsu JEF       JEF
       DEC Kanji         DECKANJI
       Super DEC Kanji   SDECKANJI
       Japanese EUC      EUCJP
       Shift JIS         SJIS
       ----------------------------

       The name segment for controlitem can be one of the
       following keywords:

       --------------------------------------------------------
       For Control Item:                    Use:
       --------------------------------------------------------
       UDC mapping table                    UDC_TABLE
       EBCDIC-ISO mapping table             EBCDIC_TABLE
       K shift code                         K_SHIFT_CODE
       A shift code                         A_SHIFT_CODE
       Initial state                        INITIAL_STATE
       Processing of undefined characters
       in Kanji mode                        KANJI_EXCEPT_PROC
       Processing of undefined characters
       in EBCDIC mode                       EBCDIC_EXCEPT_PROC
       Padding characters
       in Kanji mode                        PADDING_2BYTE_CHAR
       Padding characters
       in EBCDIC mode                       PADDING_1BYTE_CHAR
       Output initial
       shift code                           INITIAL_SHIFT_CODE
       Output last
       shift code                           TRAILER_SHIFT_CODE
       Last status                          LAST_STATE
       File path of the profile             PROFILE
       --------------------------------------------------------

       Following are examples of using the setenv C shell command
       to define environment variables to control conversion
       behavior. In these examples, the fromcode name segment
       indicates Japanese EUC and the tocode name segment
       indicates JEF:

       setenv EUCJP_JEF_UDC_TABLE            eucjp_jef_udc.tbl
       setenv EUCJP_JEF_EBCDIC_TABLE         ebcdic_kana.tbl
       setenv EUCJP_JEF_K_SHIFT_CODE         0x28
       setenv EUCJP_JEF_A_SHIFT_CODE         0x29
       setenv EUCJP_JEF_INITIAL_STATE        ebcdic_mode
       setenv EUCJP_JEF_KANJI_EXCEPT_PROC    replace
       setenv EUCJP_JEF_EBCDIC_EXCEPT_PROC   replace
       setenv EUCJP_JEF_PADDING_2BYTE_CHAR   0x4040
       setenv EUCJP_JEF_PADDING_1BYTE_CHAR   0x40
       setenv EUCJP_JEF_INITIAL_SHIFT_CODE   yes
       setenv EUCJP_JEF_TRAILER_SHIFT_CODE   yes
       setenv EUCJP_JEF_LAST_STATE           ebcdic_mode
       setenv EUCJP_JEF_PROFILE              .eucjp_jef_profile
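
       The fromcode_tocode_controlitem naming rule can be checked
       mechanically. The following sketch composes a variable name
       from the keyword tables in this section. The function name
       env_name is hypothetical.

```python
# Codeset name as passed to iconv -> name segment used in variable names.
SEGMENT = {
    "JEF": "JEF",
    "deckanji": "DECKANJI",
    "sdeckanji": "SDECKANJI",
    "eucJP": "EUCJP",
    "SJIS": "SJIS",
}

# controlitem keywords from the table of control items.
CONTROL_ITEMS = {
    "UDC_TABLE", "EBCDIC_TABLE", "K_SHIFT_CODE", "A_SHIFT_CODE",
    "INITIAL_STATE", "KANJI_EXCEPT_PROC", "EBCDIC_EXCEPT_PROC",
    "PADDING_2BYTE_CHAR", "PADDING_1BYTE_CHAR", "INITIAL_SHIFT_CODE",
    "TRAILER_SHIFT_CODE", "LAST_STATE", "PROFILE",
}


def env_name(from_code, to_code, control_item):
    """Build the environment variable name for one conversion control item."""
    if control_item not in CONTROL_ITEMS:
        raise ValueError(f"unknown control item: {control_item}")
    return f"{SEGMENT[from_code]}_{SEGMENT[to_code]}_{control_item}"


print(env_name("eucJP", "JEF", "K_SHIFT_CODE"))  # EUCJP_JEF_K_SHIFT_CODE
```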

   Directory Search Path
       When you specify a file name without a directory, the
       iconv utility searches the following directories and uses
       the first file found:

              1.  Current directory
              2.  Home directory
              3.  The iconv/data subdirectory of the directory
                  specified by the environment variable LOCPATH
              4.  /usr/lib/nls/loc/iconv/data
              5.  /usr/i18n/lib/nls/loc/iconv/data

       If  you  specify a relative directory path for a file, the
       utility searches these same directories in the same  order
       and uses the first file found.
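
       A minimal sketch of this first-match search follows. It is
       an illustration, not the utility's code. It assumes that
       LOCPATH names a single directory and that the home
       directory comes from the HOME variable. The function names
       search_dirs and find_file are hypothetical.

```python
import os


def search_dirs(environ=None):
    """Return the directories to try, in the order given in this section."""
    environ = os.environ if environ is None else environ
    dirs = [os.getcwd(), environ.get("HOME", os.path.expanduser("~"))]
    locpath = environ.get("LOCPATH")
    if locpath:  # assumed to hold one directory
        dirs.append(os.path.join(locpath, "iconv", "data"))
    dirs += ["/usr/lib/nls/loc/iconv/data", "/usr/i18n/lib/nls/loc/iconv/data"]
    return dirs


def find_file(name, dirs):
    """Return the first existing match for a bare or relative name, else None."""
    for directory in dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None


print(find_file("kana_ebcdic.tbl", search_dirs()))
```

       On a system without the Tru64 data directories the final
       call simply prints None.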

   Profile File
       Entry  lines  in  the profile file adhere to the following
       format:

       entry_name        string_value

       The entry_name and string_value fields are separated by
       spaces or tabs. Do not append a colon (:) after
       entry_name. The file can also include blank lines and
       comment entries, which begin with the # character.

       Following are the entry_name values for different
       conversion control items:

       ------------------------------------------------------------
       Conversion Control Item           entry_name
       ------------------------------------------------------------
       UDC mapping table                 udc_mapping_table
       EBCDIC-ISO mapping table          ebcdic_mapping_table
       K shift code                      k_shift_code
       A shift code                      a_shift_code
       Initial state                     initial_state
       Processing undefined characters
       in Kanji mode                     kanji_except_proc
       Processing undefined characters
       in EBCDIC mode                    ebcdic_except_proc
       Padding character
       in Kanji mode                     padding_2byte_char
       Padding character
       in EBCDIC mode                    padding_1byte_char
       Output initial
       shift code                        output_initial_shift_code
       Output last
       shift code                        output_trailer_shift_code
       Last state                        last_state
       ------------------------------------------------------------

       Following is a sample profile for converting from Japanese
       EUC to Fujitsu JEF:

       #
       #   sample profile for eucJP_JEF
       #
       udc_mapping_table            eucjp_jef_udc.tbl
       ebcdic_mapping_table         kana_ebcdic.tbl
       k_shift_code                 0x28       # ebcdic -> kanji
       a_shift_code                 0x29       # kanji -> ebcdic
       initial_state                ebcdic_mode
       kanji_except_proc            replace
       ebcdic_except_proc           replace
       padding_2byte_char           0x4040     # kanji mode
       padding_1byte_char           0x40       # ebcdic mode
       output_initial_shift_code    yes
       output_trailer_shift_code    yes
       last_state                   ebcdic_mode
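
       The profile format is simple enough to read with a few
       lines of code. The sketch below is only an illustration of
       the format as described in this section. Treating
       everything after a # as a comment, including trailing
       comments on entry lines, is an assumption based on the
       sample profile. The function name parse_profile is
       hypothetical.

```python
def parse_profile(text):
    """Parse 'entry_name string_value' lines into a dict."""
    entries = {}
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments (assumed to run to end of line)
        if not line:
            continue                          # blank or comment-only line
        fields = line.split(None, 1)          # spaces or tabs separate the two fields
        if len(fields) != 2:
            raise ValueError(f"line {number}: expected entry_name and string_value")
        name, value = fields
        entries[name] = value.strip()
    return entries


SAMPLE = """\
#   sample profile for eucJP_JEF
k_shift_code        0x28       # ebcdic -> kanji
initial_state\tebcdic_mode

kanji_except_proc   replace
"""
print(parse_profile(SAMPLE))
```

       Values are kept as strings because the page states that
       all control item values are case-sensitive keywords or
       hexadecimal literals.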

       The default file names for the profile are as follows:

       ------------------------------------------------
       Code Conversion          Default Profile Name
       ------------------------------------------------
       JEF to DEC Kanji         .jef_deckanji_profile
       JEF to Super DEC Kanji   .jef_sdeckanji_profile
       JEF to Shift JIS         .jef_sjis_profile
       JEF to Japanese EUC      .jef_eucjp_profile
       DEC Kanji to JEF         .deckanji_jef_profile
       Super DEC Kanji to JEF   .sdeckanji_jef_profile
       Shift JIS to JEF         .sjis_jef_profile
       Japanese EUC to JEF      .eucjp_jef_profile
       ------------------------------------------------

       By  default, the iconv utility checks the directory search
       path mentioned in the "Directory Search Path" section  and
       uses  the  first  profile  it finds. However, you can also
       specify an arbitrary file path for your profile instead of
       the  default  names  by defining the following environment
       variables:

       -----------------------------------------------------------
       Code Conversion          Profile Path Environment Variable
       -----------------------------------------------------------
       JEF to DEC Kanji         JEF_DECKANJI_PROFILE
       JEF to Super DEC Kanji   JEF_SDECKANJI_PROFILE
       JEF to Shift JIS         JEF_SJIS_PROFILE
       JEF to Japanese EUC      JEF_EUCJP_PROFILE
       DEC Kanji to JEF         DECKANJI_JEF_PROFILE
       Super DEC Kanji to JEF   SDECKANJI_JEF_PROFILE
       Shift JIS to JEF         SJIS_JEF_PROFILE
       Japanese EUC to JEF      EUCJP_JEF_PROFILE
       -----------------------------------------------------------
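
       The default profile names and the profile path variables
       in the two tables follow one pattern, which the sketch
       below reproduces. The function names are hypothetical, and
       the environment is passed in as a plain dictionary so the
       lookup is easy to try out.

```python
def profile_names(from_code, to_code):
    """Return (default file name, environment variable name) for a conversion."""
    stem = f"{from_code}_{to_code}"
    return f".{stem.lower()}_profile", f"{stem.upper()}_PROFILE"


def profile_path(from_code, to_code, environ):
    """Prefer the path from the environment variable, else the default name."""
    default_name, variable = profile_names(from_code, to_code)
    return environ.get(variable, default_name)


print(profile_names("JEF", "eucJP"))  # ('.jef_eucjp_profile', 'JEF_EUCJP_PROFILE')
print(profile_path("eucJP", "JEF", {"EUCJP_JEF_PROFILE": "/tmp/my_profile"}))
```

       A default name returned by profile_path would still have
       to be located through the directory search path.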


   UDC Mapping Table
       Entries in a UDC mapping table  adhere  to  the  following
       format:

       fromcode      tocode

       Each  of these values is a two-byte hexadecimal number. In
       the case of Super DEC Kanji and Japanese  EUC,  three-byte
       hexadecimal  values  that  begin  with SS3 (0x8f), such as
       0x8fxxxx, are also valid.

       You can specify ranges of UDC from and to  values  in  the
       same  file  entry  by using a hyphen to separate the codes
       that start and end each range:

       start_fromcode-end_fromcode   start_tocode-end_tocode

       When specifying entries that include ranges of values, the
       number  of  codes  in the from range must always equal the
       number of codes in the to range. A UDC mapping  table  can
       also  include  blank  lines and comment lines, which begin
       with the # character. Following is an  example  of  a  UDC
       mapping table:

       # JEF                   eucJP

       0x80a1-0x89fe           0xf5a1-0xfefe           # udc
       0x8aa1-0x93fe           0x8ff5a1-0x8ffefe       # udc
       0x94a1-0x99fe           0x8feea1-0x8ff3fe       # udc
       0x9aa1-0x9afe           0x8ff4a1-0x8ff4fe       # udc

       The first entry in this file specifies a range of JEF
       values from 0x80a1 to 0x89fe that are mapped to Japanese
       EUC code values in the range 0xf5a1 to 0xfefe. You can find
       additional sample UDC mapping table files in the
       /usr/i18n/examples/iconv/data directory.
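
       A table in this format can be sanity-checked before use.
       The sketch below parses entries and rejects lines whose
       from and to ranges differ in size. It compares ranges by
       numeric span, which is an assumption: the converter may
       count codes differently (for example, by row and cell).
       The function names are hypothetical.

```python
def parse_range(token):
    """Return (first, last) for 'a-b', or (a, a) for a single code."""
    first, _, last = token.partition("-")
    lo = int(first, 16)                 # base 16 accepts the 0x prefix
    hi = int(last, 16) if last else lo
    if hi < lo:
        raise ValueError(f"descending range: {token}")
    return lo, hi


def parse_udc_table(text):
    """Return a list of ((from_lo, from_hi), (to_lo, to_hi)) entries."""
    entries = []
    for number, raw in enumerate(text.splitlines(), start=1):
        fields = raw.split("#", 1)[0].split()   # strip comments, split on blanks
        if not fields:
            continue
        if len(fields) != 2:
            raise ValueError(f"line {number}: expected fromcode and tocode")
        src, dst = parse_range(fields[0]), parse_range(fields[1])
        if src[1] - src[0] != dst[1] - dst[0]:  # numeric span, see assumption above
            raise ValueError(f"line {number}: range sizes differ")
        entries.append((src, dst))
    return entries


SAMPLE = """\
# JEF                   eucJP
0x80a1-0x89fe           0xf5a1-0xfefe           # udc
0x8aa1-0x93fe           0x8ff5a1-0x8ffefe       # udc
"""
for src, dst in parse_udc_table(SAMPLE):
    print(f"{src[0]:#x}-{src[1]:#x} -> {dst[0]:#x}-{dst[1]:#x}")
```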

   EBCDIC-ISO Mapping Table
       Entries in an EBCDIC-ISO mapping table adhere to the
       following format:

       fromcode       tocode

       Each code is a one-byte hexadecimal number. You can
       specify a range of character codes as follows:

       start_fromcode-end_fromcode     start_tocode-end_tocode

       When using the range format, the number of hex values in
       the from range must be the same as the number of hex
       values in the to range.

       The EBCDIC-ISO mapping table can also include blank lines
       and comment entries, which begin with the # character.

       Following is an example of an EBCDIC-ISO code mapping
       table:

       # EBCDIC                Kana

       0x40                    0x20            # space
       0x4f                    0x21            # '!'
       0x7f                    0x22            # '"'
         .                       .
         .                       .
         .                       .
       0xc1-0xc9               0x41-0x49       # 'A' - 'I'
       0xd1-0xd9               0x4a-0x52       # 'J' - 'R'
       0xe2-0xe9               0x53-0x5a       # 'S' - 'Z'
         .                       .
         .                       .
         .                       .

       In this example, the first column contains the from codes
       and the second column contains the to codes. The first
       three entry lines specify mappings for single characters,
       whereas the last three entry lines specify mappings for
       ranges of characters. You can find additional sample
       EBCDIC-ISO mapping tables in the
       /usr/i18n/lib/nls/loc/iconv/data directory.
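
       Because every code here is a single byte, a table in this
       format expands naturally into a byte-to-byte lookup. The
       sketch below assumes that a range maps its codes pairwise
       in ascending order, which matches the 'A' - 'I' comment in
       the example but is not stated outright on this page. The
       function names are hypothetical.

```python
def expand(token):
    """Expand '0xc1-0xc9' (or a single code) into a range of byte values."""
    first, _, last = token.partition("-")
    lo = int(first, 16)
    hi = int(last, 16) if last else lo
    if not 0 <= lo <= hi <= 0xFF:
        raise ValueError(f"not a one-byte code or ascending range: {token}")
    return range(lo, hi + 1)


def load_byte_map(text):
    """Build a {from_byte: to_byte} dict from mapping table text."""
    mapping = {}
    for number, raw in enumerate(text.splitlines(), start=1):
        fields = raw.split("#", 1)[0].split()   # strip comments, split on blanks
        if not fields:
            continue
        if len(fields) != 2:
            raise ValueError(f"line {number}: expected fromcode and tocode")
        src, dst = expand(fields[0]), expand(fields[1])
        if len(src) != len(dst):
            raise ValueError(f"line {number}: range sizes differ")
        mapping.update(zip(src, dst))           # assumed pairwise, ascending
    return mapping


TABLE = """\
# EBCDIC                Kana
0x40                    0x20            # space
0xc1-0xc9               0x41-0x49       # 'A' - 'I'
"""
byte_map = load_byte_map(TABLE)
print(bytes(byte_map[b] for b in b"\xc1\xc2\x40\xc9"))  # b'AB I'
```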

NOTES

       This reference page contains code conversion
       specifications that apply only to conversion between
       Fujitsu JEF code and the DEC Kanji, Super DEC Kanji,
       Japanese EUC, and Shift JIS codesets. Refer to
       iconv_ibmkanji(5) for code conversion specifications
       between IBM Kanji System characters and the DEC Kanji,
       Super DEC Kanji, Japanese EUC, and Shift JIS codesets.
       Refer to iconv_KEIS(5) for code conversion specifications
       between Hitachi KEIS characters and the DEC Kanji, Super
       DEC Kanji, Japanese EUC, and Shift JIS codesets. Refer to
       iconv_intro(5) for information about conversion between
       DEC Kanji, Super DEC Kanji, Japanese EUC, Shift JIS, and
       other Tru64 UNIX codesets.
