tr(1) tr(1)
NAME
tr - translate characters
SYNOPSIS
tr [-Acs] string1 string2
tr -s [-Ac] string1
tr -d [-Ac] string1
tr -ds [-Ac] string1 string2
DESCRIPTION
tr copies the standard input to the standard output with substitution
or deletion of selected characters. Input characters from string1 are
replaced with the corresponding characters in string2. If necessary,
string1 and string2 can be quoted to avoid pattern matching by the
shell.
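The substitution described above can be sketched as follows. This is a
minimal illustration, assuming a POSIX shell with printf; the input
text and the variable name result are hypothetical and chosen only for
the example.

```shell
# Each character of string1 ('el') is replaced by the character at the
# same position in string2 ('ip'): e -> i, l -> p.
# Characters not listed in string1 pass through unchanged.
result=$(printf 'hello\n' | tr 'el' 'ip')
printf '%s\n' "$result"    # prints "hippo"
```

Both strings are single-quoted so that the shell passes them to tr
unmodified.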
tr recognizes the following command line options:
-A Translates on a byte-by-byte basis. When this flag
is specified, tr does not support extended
characters.
-c Complements the set of characters in string1. The
complement is the set of all characters in the current
character set, as defined by the current setting
of LC_CTYPE, except for those actually specified
in the string1 argument. These characters are
placed in the array in ascending collation
sequence, as defined by the current setting of
LC_COLLATE.
-d Deletes all occurrences of input characters or
collating elements found in the array specified in
string1.
If -c and -d are both specified, all characters
except those specified by string1 are deleted. The
contents of string2 are ignored, unless -s is also
specified. Note, however, that the same string
cannot be used for both the -d and the -s flags;
when both flags are specified, both string1 (used
for deletion) and string2 (used for squeezing) are
required.
If -d is not specified, each input character or
collating element found in the array specified by
string1 is replaced by the character or collating
element in the same relative position specified by
string2.
Hewlett-Packard Company - 1 - HP-UX 11i Version 2: August 2003
-s Replaces any character specified in string1 that
occurs as a string of two or more repeating
characters with a single instance of the character
in string2.
If string2 contains a character class, the
argument's array contains all of the characters in
that character class. For example:
tr -s '[:space:]'
In a case conversion, however, the string2 array
contains only those characters defined as the
second characters in each of the toupper or
tolower character pairs, as appropriate. For
example:
tr -s '[:upper:]' '[:lower:]'
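The -d, -c, and -s options described above can be sketched as follows.
This is an illustrative example only, assuming a POSIX shell with
printf and the ASCII character set; the input strings and variable
names are hypothetical.

```shell
# -d: delete every digit.
no_digits=$(printf 'a1b2c3\n' | tr -d '0-9')

# -c with -d: delete everything EXCEPT digits and new-line.
# \012 (new-line) is listed so that the line terminator survives.
only_digits=$(printf 'a1b2c3\n' | tr -cd '0-9\012')

# -s with one string: squeeze each run of spaces to a single space.
squeezed=$(printf 'a   b    c\n' | tr -s ' ')

# -d with -s: string1 lists the characters to delete,
# string2 lists the characters to squeeze.
both=$(printf 'a1  b22   c\n' | tr -ds '0-9' ' ')

printf '%s\n' "$no_digits" "$only_digits" "$squeezed" "$both"
```

The four lines printed are abc, 123, a b c, and a b c.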
The following abbreviation conventions can be used to introduce ranges
of characters, repeated characters or single-character collating
elements into the strings:
c1-c2 or [c1-c2] Stands for the range of collating elements c1
through c2, inclusive, as defined by the current
setting of the LC_COLLATE locale category.
[:class:] or [[:class:]] Stands for all the characters belonging to the
defined character class, as defined by the current
setting of the LC_CTYPE locale category. The following
character class names are accepted when
specified in string1: alnum, alpha, blank, cntrl,
digit, graph, lower, print, punct, space, upper,
and xdigit. Character classes are expanded in
collation order.
When the -d and -s flags are specified together,
any of the character class names are accepted in
string2; otherwise, only character class names
lower or upper are accepted in string2 and then
only if the corresponding character class (upper
and lower, respectively) is specified in the same
relative position in string1. Such a
specification is interpreted as a request for case
conversion.
When [:lower:] appears in string1 and [:upper:]
appears in string2, the arrays contain the
characters from the toupper mapping in the
LC_CTYPE category of the current locale. When
[:upper:] appears in string1 and [:lower:] appears
in string2, the arrays contain the characters from
the tolower mapping in the LC_CTYPE category of
the current locale.
[=c=] or [[=c=]] Stands for all the characters or collating
elements belonging to the same equivalence class
as c, as defined by the current setting of the
LC_COLLATE locale category. An equivalence class
expression is allowed only in string1, or in
string2 when it is being used by the combined -d
and -s options.
[a*n] Stands for n repetitions of a. If the first digit
of n is 0, n is considered octal; otherwise, n is
treated as a decimal value. A zero or missing n
is interpreted as large enough to extend the
string2-based sequence to the length of the
string1-based sequence.
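The range, character class, and repetition conventions above can be
sketched as follows. This is an illustration only, assuming a POSIX
shell with printf; LC_ALL=C is set on each tr invocation so that ranges
and classes cover only the ASCII characters. The input strings and
variable names are hypothetical.

```shell
# Ranges: rotate each ASCII letter 13 places (ROT13).
rot13=$(printf 'Hello\n' | LC_ALL=C tr 'A-Za-z' 'N-ZA-Mn-za-m')

# Character class in string1: delete all punctuation characters.
no_punct=$(printf 'Hi, there!\n' | LC_ALL=C tr -d '[:punct:]')

# Repetition: [_*5] expands to five underscores, one for each of the
# five vowels listed in string1.
masked=$(printf 'education\n' | LC_ALL=C tr 'aeiou' '[_*5]')

printf '%s\n' "$rot13" "$no_punct" "$masked"
```

The three lines printed are Uryyb, Hi there, and _d_c_t__n.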
The escape character \ can be used as in the shell to remove special
meaning from any character in a string. In addition, \ followed by 1,
2, or 3 octal digits represents the character whose ASCII code is
given by those digits.
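As a minimal sketch of an octal escape, assuming a POSIX shell with
printf and the ASCII character set (the input text and the variable
name spaced are hypothetical), the following replaces each horizontal
tab with a space:

```shell
# \011 is the octal ASCII code for horizontal tab.
# The single quotes keep the shell from interpreting the backslash,
# so tr itself receives the four characters \011.
spaced=$(printf 'a\tb\tc\n' | tr '\011' ' ')
printf '%s\n' "$spaced"    # prints "a b c"
```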
An ASCII NUL character in string1 or string2 can be represented only
as an escaped character, that is, as \000. It is otherwise treated
like any other character and translated correctly if so specified.
NUL characters in the input are not stripped out unless the option
-d "\000" is given.
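The NUL handling described above can be sketched as follows. This is an
illustration only, assuming a POSIX shell whose printf emits a NUL byte
for \000 in its format string; the input and variable names are
hypothetical.

```shell
# Translate each NUL byte in the input to a hyphen.
dashed=$(printf 'a\000b\n' | tr '\000' '-')

# Delete NUL bytes, then count what is left: a, b, and new-line.
# The arithmetic expansion strips any padding that wc may add.
count=$(( $(printf 'a\000b\n' | tr -d '\000' | wc -c) ))

printf '%s %s\n' "$dashed" "$count"    # prints "a-b 3"
```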
EXTERNAL INFLUENCES
Environment Variables
LANG provides a default value for the internationalization variables
that are unset or null. If LANG is unset or null, the default value of
"C" (see lang(5)) is used. If any of the internationalization
variables contains an invalid setting, tr will behave as if all
internationalization variables are set to "C". See environ(5).
LC_ALL If set to a non-empty string value, overrides the values of all
the other internationalization variables.
LC_CTYPE determines the interpretation of text as single and/or
multi-byte characters, the classification of characters as printable,
and the characters matched by character class expressions in regular
expressions.
LC_MESSAGES determines the locale that should be used to affect the
format and contents of diagnostic messages written to standard error
and informative messages written to standard output.
NLSPATH determines the location of message catalogues for the
processing of LC_MESSAGES.
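As a sketch of how these variables affect a single invocation, the
following sets LC_ALL for tr only, so that the character classes are
taken from the "C" locale regardless of the caller's environment. A
POSIX shell with printf is assumed; the input text and the variable
name c_lower are hypothetical.

```shell
# LC_ALL overrides LC_CTYPE and LC_COLLATE for this one command, so
# [:upper:] and [:lower:] cover only the ASCII letters.
c_lower=$(printf 'HELLO World\n' | LC_ALL=C tr '[:upper:]' '[:lower:]')
printf '%s\n' "$c_lower"    # prints "hello world"
```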
RETURN VALUE
tr exits with one of the following values:
0 All input was processed successfully.
>0 An error occurred.
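A script can test these values as sketched below. This is an
illustration only, assuming a POSIX shell; the variable names are
hypothetical, and the failing case relies on tr rejecting an invocation
that has no string operands.

```shell
# Successful run: the exit status of the pipeline is that of tr.
printf 'abc\n' | tr 'a' 'b' >/dev/null
ok_status=$?

# Failing run: no operands are given, so tr reports a usage error.
if tr </dev/null >/dev/null 2>&1; then
    err_status=0
else
    err_status=$?
fi

printf 'ok=%s err=%s\n' "$ok_status" "$err_status"
```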
EXAMPLES
For the ASCII character set and default collation sequence, create a
list of all the words in file1, one per line in file2, where a word is
taken to be a maximal string of alphabetics. Quote the strings to
protect the special characters from interpretation by the shell (012
is the ASCII code for a new-line (line feed) character):
tr -cs "[A-Z][a-z]" "[\012*]" <file1 >file2
Same as above, but for all character sets and collation sequences:
tr -cs "[:alpha:]" "[\012*]" <file1 >file2
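To see the effect of the command above without creating files, the same
options can be applied to a literal line of input. This is a sketch
only, assuming a POSIX shell with printf; the sample sentence and the
variable name words are hypothetical.

```shell
# -c selects every non-alphabetic character, each of which is
# translated to new-line; -s then squeezes each run of new-lines to one.
words=$(printf 'The quick, brown fox\n' | LC_ALL=C tr -cs '[:alpha:]' '[\012*]')
printf '%s\n' "$words"
```

The output is the four words The, quick, brown, and fox, one per line.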
Translate all lowercase characters in file1 to uppercase and write
the result to standard output:
tr "[:lower:]" "[:upper:]" <file1
Use an equivalence class to identify accented variants of the base
character e in file1, strip them of diacritical marks and write the
result to file2:
tr "[=e=]" "[e*]" <file1 >file2
Translate each digit in file1 to a # (number sign), and write the
result to file2:
tr "0-9" "[#*]" <file1 >file2
The * (asterisk) tells tr to repeat the # (number sign) enough times
to make the second string as long as the first one.
AUTHOR
tr was developed by OSF and HP.
SEE ALSO
ed(1), sh(1), ascii(5), environ(5), lang(5), regexp(5).
STANDARDS CONFORMANCE
tr: SVID2, SVID3, XPG2, XPG3, XPG4, POSIX.2