Unix Power Tools
by Dale Dougherty
Checking Differences with diff
The diff command compares two files and displays the lines that differ between them. (There's also a GNU version on the CD-ROM.) It prints a message that uses ed-like notation (a for append, c for change, and d for delete) to describe how a set of lines has changed. This is followed by the lines themselves. The < character precedes lines from the first file and > precedes lines from the second file.
Let's create an example to explain the output produced by diff. Look at the contents of three sample files:
When you run diff on test1 and test2, the following output is produced:
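As a minimal sketch of this comparison, assume test1 holds apples, oranges, and walnuts, and test2 holds apples, oranges, and grapes, one word per line. The third lines (walnuts and grapes) and the first line of test1 (apples) come from the discussion in this article; the middle line, oranges, and the scratch directory are assumptions made only for illustration.

```shell
# Work in a scratch directory so no existing files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Assumed sample contents: only "apples", "walnuts", and "grapes" are
# taken from the text; "oranges" is a placeholder.
printf '%s\n' apples oranges walnuts > test1
printf '%s\n' apples oranges grapes  > test2

# diff exits with status 1 when the files differ; "|| true" keeps that
# from stopping a script that runs under "set -e".
report=$(diff test1 test2 || true)
printf '%s\n' "$report"
# Prints:
# 3c3
# < walnuts
# ---
# > grapes
```

The 3c3 line reads as "change line 3 of the first file into line 3 of the second file."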
The diff command displays the only line that differs between the two files. To understand the report, remember that diff is prescriptive, describing what changes need to be made to the first file to make it the same as the second file. This report specifies that only the third line is affected, replacing walnuts with grapes. This is more apparent if you use the -e option, which produces an editing script that can be submitted to ed, the UNIX line editor. (You must redirect standard output to capture this script in a file.)
This script, if run on test1, will bring test1 into agreement with test2.
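The following sketch shows one way to capture and apply such a script. It assumes the same illustrative file contents (apples, oranges, walnuts in test1; apples, oranges, grapes in test2, where oranges is a placeholder). The names to_test2.ed and test1.new are hypothetical. The script is applied to a copy rather than to test1 itself, and only if ed is installed.

```shell
# Work in a scratch directory so no existing files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Assumed sample contents ("oranges" is a placeholder line).
printf '%s\n' apples oranges walnuts > test1
printf '%s\n' apples oranges grapes  > test2

# Redirect standard output to capture the ed script in a file.
# diff exits with status 1 when the files differ, hence "|| true".
diff -e test1 test2 > to_test2.ed || true
cat to_test2.ed
# Prints:
# 3c
# grapes
# .

# Apply the script to a copy of test1. diff -e does not emit a write
# command, so "w" is appended before the script is handed to ed.
cp test1 test1.new
if command -v ed >/dev/null 2>&1; then
    (cat to_test2.ed; echo w) | ed -s test1.new
fi
```

After ed runs, cmp test1.new test2 should report no differences.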
If you compare the first and third files, you find more differences:
To make test1 the same as test3, you'd have to delete the first line (apples) and append the third line from test3 after the third line in test1. Again, this can be seen more clearly in the editing script produced by the -e option. Notice that the script specifies editing lines in reverse order; otherwise, changing the first line would alter all succeeding line numbers.
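A sketch of this second comparison follows, under assumed contents: test1 holds apples, oranges, walnuts, and test3 holds oranges, walnuts, chestnuts. Only apples as the first line of test1 is taken from the text; the other words, including chestnuts as the appended line, are placeholders chosen so that the result matches the description above.

```shell
# Work in a scratch directory so no existing files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Assumed sample contents; only "apples" is confirmed by the text.
printf '%s\n' apples oranges walnuts    > test1
printf '%s\n' oranges walnuts chestnuts > test3

# Default report: delete line 1, then append after line 3.
# diff exits with status 1 when the files differ, hence "|| true".
report=$(diff test1 test3 || true)
printf '%s\n' "$report"
# Prints:
# 1d0
# < apples
# 3a3
# > chestnuts

# ed script: the same edits, listed from the bottom of the file up so
# that the line numbers of pending edits stay valid.
script=$(diff -e test1 test3 || true)
printf '%s\n' "$script"
# Prints:
# 3a
# chestnuts
# .
# 1d
```

If 1d ran first, walnuts would become line 2 and the 3a command would no longer point at the intended line.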
So what's this good for? Here's one example.
When working on a document, it is common practice to make a copy of a file and edit the copy rather than the original. This might be done, for example, if someone other than the writer is inputting edits from a written copy. The diff command can be used to compare the two versions of a document. A writer could use it to proof an edited copy against the original.
Using diff in this manner is a simple way for a writer to examine changes without reading the entire document. By redirecting diff output to a file, you can keep a record of changes made to any document. In fact, just that technique is used by SCCS and RCS to manage multiple revisions of source code and documents.
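As a rough sketch of keeping such a record, the example below compares an original with an edited copy and saves the report. The file names chapter, chapter.edit, and chapter.changes, and the two lines of text, are all hypothetical.

```shell
# Work in a scratch directory so no existing files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A hypothetical original and an edited copy of it.
printf '%s\n' 'The quick brown fox' 'jumps over the lazy dog.' > chapter
printf '%s\n' 'The quick red fox'   'jumps over the lazy dog.' > chapter.edit

# Save the report as a record of what changed.
# diff exits with status 1 when the files differ, hence "|| true".
diff chapter chapter.edit > chapter.changes || true

cat chapter.changes
# Prints:
# 1c1
# < The quick brown fox
# ---
# > The quick red fox
```

The saved report shows only the changed line, so the writer can review the edit without rereading the unchanged text.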