Comparing two files on UNIX

There is the well-known diff command, but there are at least two more ways to compare the contents of files.

First, let’s create three sample files to compare:

$ cat file1

1

2

3

4

5

6

$ cat file2

1

two

three

4

5

6

$ cat file3

1

2

3

4

5

6

I created file1 and file3 with the same contents, while file2 differs from them. Let’s test the comparison commands:

$ diff file1 file2

2,3c2,3

< 2

< 3

---

> two

> three

$ diff file1 file3

$

Our first case is the diff command. When the files are identical it prints nothing; otherwise it prints only the parts that differ. The output may look cryptic, but it is modeled on commands for the good old ed editor: 2,3c2,3 means "change lines 2 to 3 of the first file into lines 2 to 3 of the second", lines starting with < come from file1, and lines starting with > come from file2. With the -e parameter, diff generates an actual ed script that turns file1 into file2; see man 1 diff. If you don’t need the output and just want to check whether the files differ, redirect standard output to /dev/null and check the exit status: 0 means the files are identical, 1 means they differ, and a larger value means diff ran into an error.
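The exit-status check described above can be sketched as a small script. The temporary directory and the variable names (workdir, status, result) are made up for this illustration; the sample files are recreated so that the snippet runs on its own.

```shell
# Throwaway copies of the article's sample files (hypothetical location).
workdir=$(mktemp -d)
printf '%s\n' 1 2 3 4 5 6 > "$workdir/file1"
printf '%s\n' 1 two three 4 5 6 > "$workdir/file2"

# Discard the listing; only the exit status is of interest.
status=0
diff "$workdir/file1" "$workdir/file2" > /dev/null 2>&1 || status=$?

case $status in
    0) result="identical" ;;
    1) result="different" ;;
    *) result="error" ;;    # e.g. a missing or unreadable file
esac
echo "$result"

rm -rf "$workdir"
```

Treating every non-zero status as "different" would also report a missing file as a difference, which is why the sketch keeps the error case separate.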

There is another very useful command for comparing two files, with more user-friendly output: sdiff, a side-by-side difference program.

$ sdiff file1 file2

1                                                                  1

2                                                               |  two

3                                                               |  three

4                                                                  4

5                                                                  5

6                                                                  6
$ sdiff file1 file3

1                                                                  1

2                                                                  2

3                                                                  3

4                                                                  4

5                                                                  5

6                                                                  6
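With sdiff everything is displayed, the matching parts too: the first file on the left and the second file on the right side of your terminal. A line that differs is marked with a vertical bar between the two columns. If you want to display only the changed lines, pipe the output through grep, but beware: the pipe character must be quoted so that the shell does not treat it as a pipeline. Inside a basic regular expression a plain | is already literal, so do not put a backslash in front of it (GNU grep reads \| as alternation, which here would match every line).

$ sdiff file1 file2 | grep ' |'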

2                                                               |  two

3                                                               |  three
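As an alternative to filtering with grep, many sdiff implementations (GNU diffutils among them) accept an -s option that suppresses the lines common to both files. The sketch below assumes your sdiff supports -s, so check the local man page first; the temporary directory and variable names are invented for the example.

```shell
# Throwaway copies of the article's sample files (hypothetical location).
workdir=$(mktemp -d)
printf '%s\n' 1 2 3 4 5 6 > "$workdir/file1"
printf '%s\n' 1 two three 4 5 6 > "$workdir/file2"

# -s asks sdiff to leave out lines that are the same in both files.
# sdiff exits with 1 when the files differ, hence the "|| true".
changed=$(sdiff -s "$workdir/file1" "$workdir/file2") || true
echo "$changed"

# Count the non-empty lines that were reported as differing.
count=$(printf '%s\n' "$changed" | grep -c . || true)
echo "$count differing line(s)"

rm -rf "$workdir"
```

Unlike the grep filter, -s also keeps lines that exist in only one of the files, which sdiff marks with < or > instead of a vertical bar.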

There is a third tool called cmp, which is particularly handy in scripts. Its strength is speed: it only checks whether the two files differ, and it exits at the first difference. That makes it very useful when you compare large files and only want to know whether they differ, not what the differences are.

$ cmp file1 file2

file1 file2 differ: char 3, line 2

$ cmp file1 file3
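In a script, cmp is usually run with -s, which silences all output so that only the exit status is used. A minimal sketch, with a made-up temporary directory and variable names:

```shell
# Throwaway copies of the article's sample files (hypothetical location).
workdir=$(mktemp -d)
printf '%s\n' 1 2 3 4 5 6 > "$workdir/file1"
printf '%s\n' 1 two three 4 5 6 > "$workdir/file2"
cp "$workdir/file1" "$workdir/file3"

# -s silences cmp; the exit status alone says whether the files match.
if cmp -s "$workdir/file1" "$workdir/file2"; then
    same12=yes
else
    same12=no
fi

if cmp -s "$workdir/file1" "$workdir/file3"; then
    same13=yes
else
    same13=no
fi

echo "file1 vs file2 identical: $same12"
echo "file1 vs file3 identical: $same13"

rm -rf "$workdir"
```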

Note: Some versions of diff have a -h switch for fast checking, though it still processes the whole file. According to the man page: “Performs an alternate comparison that may be faster if the changed sections are short and well separated. The -h flag works on files of any length.” For more information on the commands mentioned above, see their man pages.

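Lastly, how do we compare a local file with a remote file? Combine ssh with diff or sdiff: cat the remote file over ssh and pass - as the second file name, so that the comparison reads it from standard input.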

$ ssh -l $username $Remote_HOST cat /REMOTE/FILE | sdiff /LOCAL/FILE -

1                                                                  1

2                                                               |  two

3                                                               |  three

4                                                                  4

5                                                                  5

6                                                                  6
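For scripts, the same pipeline can be combined with cmp -s when only a yes/no answer is wanted. The function name remote_file_matches and its argument order are hypothetical, and the sketch assumes that ssh can log in without prompting and that the remote path contains no spaces or shell metacharacters.

```shell
# remote_file_matches USER HOST REMOTE_PATH LOCAL_PATH
# Hypothetical helper: succeeds (exit status 0) when the remote file has
# the same contents as the local one, and fails otherwise.
remote_file_matches() {
    # cmp reads the remote contents from standard input via "-".
    # The pipeline's status is cmp's status, so a failed ssh login
    # normally shows up as "not matching" rather than as its own error.
    ssh -l "$1" "$2" cat "$3" | cmp -s "$4" -
}

# Usage (placeholders as in the command above):
#   if remote_file_matches "$username" "$Remote_HOST" /REMOTE/FILE /LOCAL/FILE; then
#       echo "in sync"
#   else
#       echo "files differ (or the remote file could not be read)"
#   fi
```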