join - Joins the lines of two files
Current syntax
join [-a file_number | -v file_number] [-e string]
[-o number.field,...] [-t character] [-1 field] [-2 field]
file1 file2
Obsolescent syntax
join [-a number] [-e string] [-j number | field | number
field] [-o number.field,...] [-t character] file1 file2
The join command reads file1 and file2 and joins lines in
the files that contain common fields, or otherwise according
to the options, and writes the results to standard
output.
Interfaces documented on this reference page conform to
industry standards as follows:
join: XCU5.0
Refer to the standards(5) reference page for more information
about industry standards and associated tags.
OPTIONS
-1 field
    Joins on the fieldth field of file1. Fields are decimal
    integers starting with 1.
-2 field
    Joins on the fieldth field of file2. Fields are decimal
    integers starting with 1.
-a file_number
    Produces an output line for each unpairable line found in
    file1 if file_number is 1, or in file2 if file_number is 2,
    in addition to the default output. Without -a, join
    produces output only for lines containing a common field.
    If both -a 1 and -a 2 are used, all unpairable lines are
    output.
-e string
    Replaces empty output fields with string.
-j [number] field
    Joins the two files on field of file number, where number
    is 1 for file1 or 2 for file2. If you do not specify
    number, join uses field in each file. Without -j, join
    uses the first field in each file. The default value for
    both number and field is 1. (Obsolescent)

    If you enter only a 1 or a 2 as an argument to -j, join
    interprets this argument as the file number (number);
    integers greater than 2 are interpreted as the field
    number (field). Therefore, if you want to specify a field
    number of 2, you must precede this specification with a
    number argument; otherwise, the join program interprets
    the 2 as the file number (number).
-o number.field,...
    Produces output lines consisting of the fields specified
    in one or more number.field arguments, where number is 1
    for file1 or 2 for file2, and field is a field number.
    Separate multiple -o arguments with commas.
-t character
    Uses character (a single character) as the field
    separator character in the input and the output. Every
    appearance of character in a line is significant. The
    default separator is a space. If you do not specify -t,
    join also recognizes the tab and newline characters as
    separators.

    With default field separation, the collating sequence is
    that of sort -b. If you specify -t, the sequence is that
    of a plain sort. To specify a tab character, enclose it
    in '' (single quotes).
-v file_number
    Produces an output line for each unpairable line in
    file_number (where file_number is 1 or 2), instead of the
    default output. If both -v 1 and -v 2 are specified,
    produces output lines for all unpairable lines.
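
The difference between -v and -a, and the effect of -e, can be
seen in a minimal sketch such as the following. The file names
and their contents are hypothetical, mktemp -d is assumed to be
available, and the 0 entry in the -o list (meaning the join
field) comes from the POSIX description of join rather than from
this page.

```shell
# Hypothetical sample data, already sorted on the first field.
tmpdir=$(mktemp -d)
printf '%s\n' 'a 1' 'b 2' 'c 3' > "$tmpdir/left"
printf '%s\n' 'b x' 'c y' 'd z' > "$tmpdir/right"

# -v 1: print only the lines of file1 that have no partner in file2.
only_left=$(join -v 1 "$tmpdir/left" "$tmpdir/right")

# -a 1 -a 2 keeps unpairable lines from both files; -e fills the
# fields that are missing from such lines; -o fixes the output layout.
# Assumption: 0 in the -o list denotes the join field (POSIX).
outer=$(join -a 1 -a 2 -e NONE -o 0,1.2,2.2 "$tmpdir/left" "$tmpdir/right")

printf '%s\n' "$only_left" '--' "$outer"
rm -rf "$tmpdir"
```

Here -v 1 reports only the line for a, while the second command
lists a through d and prints NONE wherever one file has no
matching line.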
OPERANDS
file1, file2
    The pathnames of the files to be used as input. If -
    (hyphen) is specified for either file, standard input is
    read.
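
As an illustration of the - operand, the following sketch sorts
the output of another command and feeds it to join as file1. The
data and the file name colors are hypothetical, and mktemp -d is
assumed to be available.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'apple red' 'kiwi green' > "$tmpdir/colors"

# The producer's lines are unsorted, so they go through sort first;
# the lone "-" tells join to read that stream as file1.
piped=$(printf '%s\n' 'kiwi 3' 'apple 5' | sort | join - "$tmpdir/colors")

printf '%s\n' "$piped"
rm -rf "$tmpdir"
```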
DESCRIPTION
The join field is the field in the input files that join
compares to determine what is included in the output. One
output line is produced for each pair of lines in file1 and
file2 that have identical join fields. The output line
consists of the join field, the rest of the line from
file1, then the rest of the line from file2.
Both input files must be sorted according to the collating
sequence specified by the LC_COLLATE environment variable,
if set, for the fields where they are to be joined (usually
the first field in each line).
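
One way to satisfy the sorting requirement is to run sort and
join under the same locale. The sketch below assumes the C
locale, in which uppercase letters collate before lowercase
ones; the file names and data are hypothetical, and mktemp -d is
assumed to be available.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'b 2' 'B 4' 'a 1' > "$tmpdir/left.raw"
printf '%s\n' 'a x' 'B y' 'b z' > "$tmpdir/right.raw"

sorted_join=$(
    # The export stays inside this subshell, so sort and join
    # agree on the collating sequence without changing the caller.
    export LC_ALL=C
    sort -b -k 1,1 "$tmpdir/left.raw"  > "$tmpdir/left"
    sort -b -k 1,1 "$tmpdir/right.raw" > "$tmpdir/right"
    join "$tmpdir/left" "$tmpdir/right"
)

printf '%s\n' "$sorted_join"
rm -rf "$tmpdir"
```

Sorting the files under one locale and joining them under
another can cause matching lines to be missed, which is why both
steps share a single setting here.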
Fields are normally separated by a space, a tab character,
or a newline character. In this case, join treats consecutive
separators as one, and discards leading separators.
Use the -t option to specify another field separator.
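
The two separator modes can be compared with a short sketch. The
file names and contents are hypothetical, and mktemp -d is
assumed to be available.

```shell
tmpdir=$(mktemp -d)

# Default separation: leading blanks are dropped and a run of
# blanks counts as a single separator.
printf '%s\n' '  alice   100' > "$tmpdir/ids"
printf '%s\n' 'alice staff'   > "$tmpdir/groups"
blank_sep=$(join "$tmpdir/ids" "$tmpdir/groups")

# With -t, every separator is significant, so the empty second
# field of the alice line is preserved in the output.
printf '%s\n' 'alice::staff' 'bob:x:admin'   > "$tmpdir/users"
printf '%s\n' 'alice:/bin/sh' 'bob:/bin/ksh' > "$tmpdir/shells"
colon_sep=$(join -t : "$tmpdir/users" "$tmpdir/shells")

printf '%s\n' "$blank_sep" "$colon_sep"
rm -rf "$tmpdir"
```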
EXIT STATUS
The following exit values are returned:
0    Successful completion.
>0   An error occurred.
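
A script can act on these exit values as in the following
sketch. The file names are hypothetical, mktemp -d is assumed to
be available, and the second call relies on join reporting an
error when an input file cannot be opened.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'a 1' > "$tmpdir/left"
printf '%s\n' 'a x' > "$tmpdir/right"

# Both files exist and are readable.
if join "$tmpdir/left" "$tmpdir/right" > "$tmpdir/out"; then
    ok_status=0
else
    ok_status=$?
fi

# The second operand names a file that does not exist.
if join "$tmpdir/left" "$tmpdir/no-such-file" > /dev/null 2>&1; then
    missing_status=0
else
    missing_status=$?
fi

printf 'ok=%s missing=%s\n' "$ok_status" "$missing_status"
rm -rf "$tmpdir"
```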
EXAMPLES
Note that the vertical alignment shown in these examples
may not be consistent with your output.

To perform a simple join operation on two files, phonedir
and names, whose first fields are the same, enter:
    join phonedir names
If phonedir contains the following telephone directory:
    Binst      555-6235
    Dickerson  555-1842
    Eisner     555-1234
    Green      555-2240
    Hrarii     555-0256
    Janatha    555-7358
    Lewis      555-3237
    Takata     555-5341
    Wozni      555-1234
and names is this listing of names and department
numbers:
    Eisner  Dept. 389
    Frost   Dept. 217
    Green   Dept. 311
    Takata  Dept. 454
    Wozni   Dept. 520
then join phonedir names displays:
    Eisner  555-1234  Dept. 389
    Green   555-2240  Dept. 311
    Takata  555-5341  Dept. 454
    Wozni   555-1234  Dept. 520
Each line consists of the join field (the last
name), followed by the rest of the line found in
phonedir and the rest of the line in names.

To display unmatched lines as well as matched lines,
enter:
    join -a 2 phonedir names
If phonedir contains:
    Binst      555-6235
    Dickerson  555-1842
    Eisner     555-1234
    Green      555-2240
    Hrarii     555-0256
    Janatha    555-7358
    Lewis      555-3237
    Takata     555-5341
    Wozni      555-1234
and names contains:
    Eisner  Dept. 389
    Frost   Dept. 217
    Green   Dept. 311
    Takata  Dept. 454
    Wozni   Dept. 520
then join -a 2 phonedir names displays:
    Eisner  555-1234  Dept. 389
    Frost   Dept. 217
    Green   555-2240  Dept. 311
    Takata  555-5341  Dept. 454
    Wozni   555-1234  Dept. 520
This performs the same join operation as in the
first example, and also lists the lines of names
that have no match in phonedir. It includes Frost's
name and department number in the listing, although
there is no entry for Frost in phonedir.

To display selected fields, enter:
    join -o 2.3,2.1,1.2 phonedir names
This displays the following fields:
Field 3 of names (Department Number)
Field 1 of names (Last Name)
Field 2 of phonedir (Telephone Number)
If phonedir contains:
    Binst      555-6235
    Dickerson  555-1842
    Eisner     555-1234
    Green      555-2240
    Hrarii     555-0256
    Janatha    555-7358
    Lewis      555-3237
    Takata     555-5341
    Wozni      555-1234
and names contains:
    Eisner  Dept. 389
    Frost   Dept. 217
    Green   Dept. 311
    Takata  Dept. 454
    Wozni   Dept. 520
then join -o 2.3,2.1,1.2 phonedir names displays:
    389  Eisner  555-1234
    311  Green   555-2240
    454  Takata  555-5341
    520  Wozni   555-1234

To perform the join operation on a field other than
the first, enter:
    sort -b -k 2,3 phonedir | join -1 2 - numbers
This combines the lines in phonedir and numbers, comparing
the second field of phonedir to the first field of
numbers.
First, this sorts phonedir by the second field
because both files must be sorted by their join
fields. The output of sort is then piped to join.
The - (dash) by itself causes the join command to
use this output as its first file. The -1 2 defines
the second field of the sorted phonedir as the join
field. This is compared to the first field of numbers
because its join field is not specified with a
-2 option.
If phonedir contains:
    Binst      555-6235
    Dickerson  555-1842
    Eisner     555-1234
    Green      555-2240
    Hrarii     555-0256
    Janatha    555-7358
    Lewis      555-3237
    Takata     555-5341
    Wozni      555-1234
and numbers contains:
    555-0256
    555-1234
    555-5555
    555-7358
then sort ... | join ... displays:
    555-0256  Hrarii
    555-1234  Eisner
    555-1234  Wozni
    555-7358  Janatha
Each number in numbers is listed with the name
listed in phonedir for that number. Note that join
lists all the matches for a given field. In this
case, join lists both Eisner and Wozni as having
the telephone number 555-1234. The number 555-5555
is not listed because it does not appear in
phonedir.
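
The same rule applies when a join field value is repeated in
both files: every pairing of matching lines is listed. The
following sketch uses hypothetical files and data and assumes
that mktemp -d is available.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'k a1' 'k a2' > "$tmpdir/left"
printf '%s\n' 'k b1' 'k b2' > "$tmpdir/right"

# Two lines with key k in each file give 2 x 2 = 4 output lines.
pairs=$(join "$tmpdir/left" "$tmpdir/right")

printf '%s\n' "$pairs"
rm -rf "$tmpdir"
```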
ENVIRONMENT VARIABLES
The following environment variables affect the execution
of join:
LANG
    Provides a default value for the internationalization
    variables that are unset or null. If LANG is unset or
    null, the corresponding value from the default locale is
    used. If any of the internationalization variables
    contain an invalid setting, the utility behaves as if
    none of the variables had been defined.
LC_ALL
    If set to a non-empty string value, overrides the values
    of all the other internationalization variables.
LC_CTYPE
    Determines the locale for the interpretation of sequences
    of bytes of text data as characters (for example,
    single-byte as opposed to multi-byte characters in
    arguments and input files).
LC_MESSAGES
    Determines the locale for the format and contents of
    diagnostic messages written to standard error.
NLSPATH
    Determines the location of message catalogues for the
    processing of LC_MESSAGES.
SEE ALSO
Commands: awk(1), cmp(1), comm(1), cut(1), diff(1),
grep(1), paste(1), sdiff(1), sed(1), sort(1), uniq(1)
Standards: standards(5)
join(1)