NAME
tr - Translates characters
SYNOPSIS
tr [-Acs] string1 string2
tr -ds [-Ac] string1 string2
tr -d [-Ac] string1
tr -s [-Ac] string1
The tr command copies characters from the standard input
to the standard output with substitution or deletion of
selected characters.
STANDARDS
Interfaces documented on this reference page conform to
industry standards as follows:
    tr: XCU5.0
Refer to the standards(5) reference page for more information
about industry standards and associated tags.
OPTIONS
-A  [Tru64 UNIX] Translates on a byte-by-byte basis. When you
    specify this option, tr does not support extended characters.
-c  Complements (inverts) the set of characters in string1. The
    complement is the set of all characters in the current
    character set, as defined by the current setting of LC_CTYPE,
    except for those actually specified in the string1 argument.
    These characters are placed in the array in ascending
    collation sequence, as defined by the current setting of
    LC_COLLATE.
-d  Deletes all occurrences of input characters or collating
    elements found in the array specified in string1.
-s  Replaces any character specified in string1 that occurs as a
    sequence of two or more repeated characters with a single
    instance of the character in string2.
OPERANDS
string1, string2
    Translation control strings, as explained in the DESCRIPTION
    section.
DESCRIPTION
Input characters from string1 are replaced with the corresponding
characters in string2. The tr command cannot
handle an ASCII NUL (\000) in string1 or string2; it
always deletes NUL from the input.
[Tru64 UNIX] The trbsd command is a BSD-compatible version
of tr.
The following constructs can be used to specify characters
or single-character collating elements. If any of these
constructs results in multicharacter collating elements, tr
excludes those elements from the resulting array without
issuing a diagnostic.
character
    Represents itself when not described by one of the other
    conventions in this list.
\octal
    Represents a character by using its octal value. An octal
    sequence consists of a backslash followed by the longest
    sequence of one, two, or three octal digits (01234567). The
    sequence causes the character whose encoding is represented
    by the one-, two-, or three-digit octal value to be placed
    in the string.
\character
    Represents a standard backslash-escape sequence. The Single
    UNIX Specification does not define the results of specifying
    any character after a backslash other than the ones listed
    here. In portable applications, a backslash should be
    followed only by an octal sequence, another backslash, or
    the lowercase letter a, b, f, n, r, t, or v.
    [Tru64 UNIX] On UNIX systems, you can enclose string
    operands in quotation marks or specify a backslash before
    some characters, such as * (an asterisk), to remove the
    special meaning of those characters to the shell.
c1-c2
    Represents a range of collating elements between the
    specified range endpoints, inclusive, as defined by the
    current locale setting of the LC_COLLATE category. The
    starting element, c1, must precede the ending element, c2,
    in the current collation order. The characters or collating
    elements in the range are placed in the associated string in
    ascending collation sequence.
    Note that the collation sequence for ASCII characters, such
    as letters in the English alphabet, may vary among locales.
    In the POSIX locale, for example, a-z produces a string with
    all English lowercase letters in English alphabetical order.
    However, when LC_COLLATE is set to a different locale,
    English lowercase letters may be subject to a different
    collation order. Therefore, a-z may produce a different
    result for locales other than the POSIX locale.
[c*number]
    Stands for number repetitions of the character c. The number
    is considered to be in decimal unless the first digit of
    number is 0; then it is considered to be in octal. This
    format is valid only in string2.
[=equiv=]
    Represents all characters or collating elements belonging to
    the equivalence class specified by equiv, as defined by the
    LC_COLLATE locale category. An equivalence class expression
    is allowed in string1; it is allowed in string2 only when
    the -d and -s options are specified together. (For more
    information, see the locale(4) reference page.)
[:class:]
    Represents all characters belonging to the defined character
    class, as defined by the current setting of the LC_CTYPE
    locale category. The following character class names are
    accepted when specified in string1:
        alnum    cntrl    lower    space
        alpha    digit    print    upper
        blank    graph    punct    xdigit
    If the current locale defines additional keywords (by
    including additional charclass definitions in the LC_CTYPE
    category), the tr command also recognizes those keywords as
    class values.
    When the -d and -s options are specified together, any of
    the character class names are accepted in string2.
    Otherwise, only the character class names lower or upper are
    accepted in string2, and then only if the counterpart class
    (upper or lower, respectively) is specified in the same
    relative position in string1. Such a specification is
    interpreted as a request for case conversion.
    When [:lower:] appears in string1 and [:upper:] appears in
    string2, the arrays contain the characters from the toupper
    mapping in the LC_CTYPE category of the current locale. When
    [:upper:] appears in string1 and [:lower:] appears in
    string2, the arrays contain the characters from the tolower
    mapping in the LC_CTYPE category of the current locale.
    The first character from each mapping pair is in the array
    for string1 and the second character from each mapping pair
    is in the array for string2 in the same relative position.
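As a hedged illustration of the string constructs described above, the following sketch applies a few of them to literal input. The sample text, the variable names, and the LC_ALL=C setting are assumptions made for this sketch; they are chosen so that the range and the class behave as in the POSIX locale, and results may differ in other locales.

```shell
# Pin the locale so that the range and the class behave as in the POSIX
# locale (an assumption of this sketch, not a requirement of tr).
LC_ALL=C
export LC_ALL

# \octal: \011 is the octal value of the ASCII tab character.
tab_to_space=$(printf 'a\tb\n' | tr '\011' ' ')

# c1-c2: a range in string1 mapped to a range of equal length in string2.
shifted=$(printf 'abc\n' | tr 'a-c' 'b-d')

# [c*number]: three copies of x, one for each character of string1.
masked=$(printf 'a1b2c3\n' | tr '123' '[x*3]')

# [:class:]: lower in string1 paired with upper in string2 requests
# case conversion.
upper=$(printf 'hello\n' | tr '[:lower:]' '[:upper:]')

printf '%s\n' "$tab_to_space" "$shifted" "$masked" "$upper"
```

The string operands are enclosed in single quotation marks so that the shell passes the backslash, asterisk, and bracket characters to tr unchanged.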
[Tru64 UNIX] When string2 is shorter than string1, a difference
results between historical System V and BSD systems.
A BSD system pads string2 with the last character
found in string2. Thus, it is possible to do the following:
tr 0123456789 d
[Tru64 UNIX] The preceding command translates all digits
to the letter d. A portable application cannot rely on
the BSD behavior; it would have to code the example in the
following way:
tr 0123456789 '[d*]'
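A minimal sketch of the portable form, run on a made-up input line (the input text and the variable name are hypothetical). Because string2 is written as '[d*]', the result should not depend on whether the implementation pads a short string2.

```shell
# '[d*]' repeats d often enough to pair with every character of string1,
# so each digit is translated to d without relying on BSD-style padding.
digits_replaced=$(printf 'room 101, floor 7\n' | tr 0123456789 '[d*]')
printf '%s\n' "$digits_replaced"
```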
[Tru64 UNIX] If a given character appears more than once
in string1, the character in string2 corresponding to its
last appearance in string1 will be used in the translation.
If the -c and -d options are both specified, all characters
except those specified by string1 are deleted. The
contents of string2 are ignored, unless -s is also specified.
Note, however, that the same string cannot be used
for both the -d and the -s options; when both options are
specified, both string1 (used for deletion) and string2
(used for squeezing) are required.
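The option combinations described above can be sketched as follows. The input strings and variable names are invented for illustration, and the expected behavior assumes a tr that follows the rules stated in the preceding paragraph.

```shell
# -c with -d: delete every character that is NOT a digit or a newline.
only_digits=$(printf 'tel: 555-0100\n' | tr -cd '[:digit:]\n')

# -d with -s: string1 (the underscore) is used for deletion, and string2
# (the space) is used for squeezing repeated characters.
cleaned=$(printf 'a_b   c__d\n' | tr -ds '_' ' ')

printf '%s\n' "$only_digits" "$cleaned"
```

Note that the second command needs both operands: the underscore is deleted first, and only then are the repeated spaces squeezed.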
If the -d option is not specified, each input character or
collating element found in the array specified by string1
is replaced by the character or collating element in the
same relative position in the array specified by string2.
When the -s option is specified, if string2 contains a
character class, the argument's array contains all of the
characters in that character class. For example:
tr -s '[:space:]'
In a case conversion, however, the string2 array contains
only those characters defined as the second characters in
each of the toupper or tolower character pairs, as appropriate.
For example:
tr -s '[:upper:]' '[:lower:]'
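To make the two -s forms concrete, here is a small sketch on invented input; the sample strings and variable names are assumptions. The first command only squeezes, while the second converts case and then squeezes the resulting lowercase letters.

```shell
# Squeeze runs of the same white-space character to a single instance.
# Three spaces become one space and two tabs become one tab.
squeezed=$(printf 'one   two\t\tthree\n' | tr -s '[:space:]')

# Case conversion with -s: uppercase letters are translated to lowercase,
# and repeated lowercase letters are then squeezed.
lowered=$(printf 'AAbbCC\n' | tr -s '[:upper:]' '[:lower:]')

printf '%s\n' "$squeezed" "$lowered"
```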
System V Compatibility
[Tru64 UNIX] The root of the directory tree that contains
the commands modified for SVID 2 compliance is specified
in the file /etc/svid2_path. You can use /etc/svid2_profile
as the basis for, or to include in, your own profile.
The /etc/svid2_profile file reads /etc/svid2_path and sets the
first entries in the PATH environment variable so that the
modified SVID 2 commands are found first.
[Tru64 UNIX] In the SVID 2 compliant version of the tr
command, only characters in the octal range of 1 to 377
are complemented when you specify the -c option. This
behavior occurs because the -A option is implicitly
forced on when you specify the -c option.
NOTES
[Tru64 UNIX] Specifying the -A option improves ASCII performance.

Despite similarities in appearance, the string arguments used by
tr are not regular expressions.

The tr command correctly processes NUL characters in its input
stream. NUL characters can be stripped using the following
command:
tr -d '\000'

If string1 or string2 is the empty string, results are undefined
and unpredictable.
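A hedged sketch of the NUL-stripping command on a three-character sample; the input bytes and variable names are assumptions. It relies on printf accepting the \000 octal escape in its format string, which is the usual behavior of POSIX shells.

```shell
# printf writes the bytes a, NUL, b, and a newline;
# tr -d '\000' removes the NUL byte from the stream.
stripped=$(printf 'a\000b\n' | tr -d '\000')

# Count the bytes that remain (a, b, and the newline). The trailing
# tr -d ' ' removes padding that some wc implementations print.
byte_count=$(printf 'a\000b\n' | tr -d '\000' | wc -c | tr -d ' ')

printf '%s %s\n' "$stripped" "$byte_count"
```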
EXIT STATUS
The following exit values are returned:
0   Successful completion.
>0  An error occurred.
EXAMPLES
To translate braces into parentheses, enter:
    tr '{}' '()' <textfile >newfile
This translates each { (left brace) to ( (left parenthesis) and
each } (right brace) to ) (right parenthesis). All other
characters remain unchanged.

In the POSIX locale, to translate lowercase ASCII characters to
uppercase, you can enter:
    tr 'a-z' 'A-Z' <textfile >newfile
This command assumes that English letters are collated in English
alphabetical order, which may not be true for locales other than
the POSIX locale. The following command is recommended for case
conversion for all locales:
    tr '[:lower:]' '[:upper:]' <textfile >newfile

The two strings can be of different lengths:
    tr '0-9' '#' <textfile >newfile
This translates each 0 into a # (number sign) but does not
translate the digits 1 to 9; if the two character strings are not
the same length, the extra characters in the longer one are
ignored.

To translate each digit to a # (number sign), enter:
    tr '0-9' '[#*]' <textfile >newfile
The * (asterisk) tells tr to repeat the # (number sign) enough
times to make the second string as long as the first one.

To translate each string of digits to a single # (number sign),
enter:
    tr -s '0-9' '[#*]' <textfile >newfile

In the POSIX locale, to translate all ASCII characters that are
not specified, enter:
    tr -c '[ -~]' '[A-_]' <textfile >newfile
This translates each nonprinting ASCII character to the next
corresponding control key letter (\001 translates to B, \002 to
C, and so on). ASCII DEL (\177), the character that follows ~
(tilde), translates to a ] (right bracket). This command assumes
that ASCII characters are collated in a certain order, which may
not be true for locales other than the POSIX locale.

To create a list of all words in file1, one per line in file2,
where a word is taken to be a maximal string of letters, enter:
    tr -cs '[:alpha:]' '[\n*]' < file1 > file2

To use an equivalence class to identify accented variants of the
base character e in file1, which are stripped of diacritical
marks and written to file2, enter:
    tr '[=e=]' '[e*]' < file1 > file2
Equivalence classes are locale dependent. Some locales may not
include equivalence classes to associate base letters and their
accented variants.
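Several of the commands above can be tried without creating textfile or file1 by piping a literal string into tr. This is only a sketch: the sample strings and variable names are assumptions, and the equivalence-class example is left out because its result depends on the locale definitions installed on the system.

```shell
# Braces to parentheses.
parens=$(printf 'f{x}\n' | tr '{}' '()')

# Each string of digits becomes a single number sign.
runs=$(printf 'a123b45\n' | tr -s '0-9' '[#*]')

# One word per line, where a word is a maximal string of letters.
words=$(printf 'Hello, world! 42\n' | tr -cs '[:alpha:]' '[\n*]')

printf '%s\n' "$parens" "$runs" "$words"
```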
ENVIRONMENT VARIABLES
The following environment variables affect the execution of tr:
LANG
    Provides a default value for the internationalization
    variables that are unset or null. If LANG is unset or null,
    the corresponding value from the default locale is used. If
    any of the internationalization variables contains an invalid
    setting, the utility behaves as if none of the variables had
    been defined.
LC_ALL
    If set to a non-empty string value, overrides the values of
    all the other internationalization variables.
LC_COLLATE
    Determines the locale for the behavior of range expressions
    and equivalence classes.
LC_CTYPE
    Determines the locale for the interpretation of sequences of
    bytes of text data as characters (for example, single-byte as
    opposed to multibyte characters in arguments) and the
    behavior of character classes.
LC_MESSAGES
    Determines the locale for the format and contents of
    diagnostic messages written to standard error.
NLSPATH
    Determines the location of message catalogues for the
    processing of LC_MESSAGES.
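Because LC_ALL overrides the other internationalization variables, it can be set for a single tr invocation to pin the behavior of a range expression. The following is a sketch under that assumption; C is the conventional name of the POSIX locale, and the sample word and variable names are invented.

```shell
# LC_ALL=C applies to this one tr command only, so the range a-z is
# interpreted as in the POSIX locale regardless of the caller's settings.
posix_upper=$(printf 'locale\n' | LC_ALL=C tr 'a-z' 'A-Z')

# The class-based form relies on LC_CTYPE instead of collation order.
class_upper=$(printf 'locale\n' | tr '[:lower:]' '[:upper:]')

printf '%s %s\n' "$posix_upper" "$class_upper"
```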
SEE ALSO
Commands: ed(1), ksh(1), sed(1), Bourne shell sh(1b),
POSIX shell sh(1p), trbsd(1)
Files: ascii(5)
Standards: standards(5)