Use comm to Compare Files
Today I learned about comm, which is used to select the common lines in two files. It's pretty neat, but has a strange output format.
Say we have two text files:
# first.txt
one
two
three
# second.txt
one
three
four
We can run the below command to find the common lines. In the output, the first column is what's only in first.txt, the second column is what's in second.txt, and the third column is what's common.
$ comm first.txt second.txt
one
three
four
two
three
Hmm, that doesn't look right - one and three are common, not just one. The caveat with comm is that the files need to be sorted lexically. You can sort easily in bash with bird beak notation for process substitution.
$ comm <(sort first.txt) <(sort second.txt)
four
one
three
two
If we only want the common lines, we can apply the flags -12 to hide the first and second columns:
$ comm -12 <(sort first.txt) <(sort second.txt)
one
three
Tweet