How to Find Differences Between Text Files With “diff” in Linux
Writers and programmers often need to find the differences between two files, or between two versions of the same file, which is tedious and error-prone when done by hand. The Linux command-line utility diff solves this problem by comparing files line by line.
Several file comparison tools are available in Linux, but diff is among the simplest and most widely available command-line options. This article therefore covers the output formats of the diff utility and two of its wrappers.
Diff Command
diff is a command-line tool that compares two files line by line and prints the differences to the terminal. It is easy to use and available on most Unix-like operating systems.
The diff command can display its output in the normal, context, or unified format. In each case, the output describes the changes needed to make the first file match the second. Each format is discussed below:
Normal Format
The following diff command is the simplest form: it uses no options, so it displays the differences in the normal format:
ubuntu@ubuntu:~$ diff file1.txt file2.txt
Make sure to give the correct file names, including their extensions. In this example, the first and second text files contain the following content.
File 1:
File 2:
Diff command and output:
The above output uses letters and numbers to describe the differences; entries such as 1c1 and 2a3 are change commands. The letters in a change command have the following meanings:
a: add the line
c: change the line
d: delete the line
The numbers in the output are line numbers in the first and second files, respectively, and the lines preceded by < and > are lines from the first and second files. Looking at the above output, 1c1 means that line 1 of File 1 needs to be changed to match line 1 of File 2. Similarly, 2a3 means that line 3 of File 2 needs to be added after line 2 of File 1 to make the files identical.
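To reproduce this kind of output without the sample files, the following sketch builds two small files and compares them. The file names a.txt and b.txt and their contents are made up for illustration; they are not the files used in this article.

```shell
# Hypothetical sample files; names and contents are illustrative only.
tmp=$(mktemp -d)
printf 'apple\nbanana\n' > "$tmp/a.txt"
printf 'apricot\nbanana\ncherry\n' > "$tmp/b.txt"

# diff exits with status 1 when the files differ, so tolerate that status.
normal_out=$(diff "$tmp/a.txt" "$tmp/b.txt" || true)
printf '%s\n' "$normal_out"

# Remove the temporary directory.
rm -rf "$tmp"
```

With these assumed inputs, diff should report 1c1 (apple becomes apricot) followed by 2a3 (cherry is added after line 2), the same pair of change commands described above.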
Let’s take a look at another example:
File 1:
File 2:
The output denotes that line 4 of File 1 needs to be deleted to match the contents of File 2.
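A deletion like this can be sketched with two made-up files in which the second simply lacks the last line of the first. The file names and contents are assumptions chosen for illustration.

```shell
# Hypothetical sample files: b.txt is a.txt without its fourth line.
tmp=$(mktemp -d)
printf 'one\ntwo\nthree\nfour\n' > "$tmp/a.txt"
printf 'one\ntwo\nthree\n' > "$tmp/b.txt"

# diff exits with status 1 when the files differ, so tolerate that status.
delete_out=$(diff "$tmp/a.txt" "$tmp/b.txt" || true)
printf '%s\n' "$delete_out"

# Remove the temporary directory.
rm -rf "$tmp"
```

Here diff should print the change command 4d3 followed by the removed line, meaning that line 4 of the first file must be deleted and that the files are back in step after line 3 of the second.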
Context Format
The context format, selected with the -c option, shows each difference together with a few surrounding unchanged lines (three by default) for context.
The context-format output begins with the names and modification timestamps of the two files: the line starting with *** describes the 'from' file (the first one), and the line starting with --- describes the 'to' file (the second one). Within each section of differences, *** 1,4 **** and --- 1,4 ---- give the range of lines shown from the first and second file, respectively.
The differences between the lines are marked with the following characters:
- !: a line that is changed between the two files
- +: a line that is present in the second file and must be added to file1.txt
- -: a line that is present in the first file but missing from file2.txt, and must be deleted to match the second file
Lastly, lines that begin with two spaces instead of a marker character are identical in both files.
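The three markers can be seen together in a small sketch. The files below are made up for illustration (one changed line, one deleted line, one added line), and the command runs from inside a temporary directory so that the header shows short file names.

```shell
# Hypothetical sample files; names and contents are illustrative only.
tmp=$(mktemp -d)
printf 'alpha\nbravo\ncharlie\ndelta\necho\n' > "$tmp/a.txt"
printf 'ALPHA\nbravo\ndelta\necho\nfoxtrot\n' > "$tmp/b.txt"

# -c selects the context format; tolerate exit status 1 (files differ).
context_out=$(cd "$tmp" && diff -c a.txt b.txt || true)
printf '%s\n' "$context_out"

# Remove the temporary directory.
rm -rf "$tmp"
```

With these assumed inputs, the section for a.txt should mark alpha with ! and charlie with -, while the section for b.txt should mark ALPHA with ! and foxtrot with +. The two header lines carry timestamps, so they differ from run to run.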
Unified Format (-u)
The unified format, selected with the -u option, condenses the context format by showing each unchanged context line only once instead of repeating it for both files. The output for the files in the above example in unified format is as follows:
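As a rough sketch of what unified output looks like, the following commands compare two made-up files that differ by one changed, one deleted, and one added line. The file names and contents are assumptions for illustration, not the files from this article.

```shell
# Hypothetical sample files; names and contents are illustrative only.
tmp=$(mktemp -d)
printf 'alpha\nbravo\ncharlie\ndelta\necho\n' > "$tmp/a.txt"
printf 'ALPHA\nbravo\ndelta\necho\nfoxtrot\n' > "$tmp/b.txt"

# -u selects the unified format; tolerate exit status 1 (files differ).
unified_out=$(cd "$tmp" && diff -u a.txt b.txt || true)
printf '%s\n' "$unified_out"

# Remove the temporary directory.
rm -rf "$tmp"
```

In unified output, the range of lines from both files appears on a single line of the form @@ -1,5 +1,5 @@, lines removed from the first file start with -, lines added from the second file start with +, and unchanged lines start with a single space.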
Colordiff:
It is a wrapper for diff whose only difference is that it colors the output, which makes it easier to read.
It is packaged for most Linux distributions and supports customizable color schemes, but it is not widely available on systems other than Linux and OpenBSD. First, install the utility, then use the following command to view the differences between the files:
ubuntu@ubuntu:~$ sudo apt-get install colordiff
ubuntu@ubuntu:~$ colordiff file1.txt file2.txt
The above colordiff output has the same layout as the normal format of the diff command; the only difference lies in the coloring. The change commands are displayed in blue, lines from the first file in red, and lines from the second file in green.
Wdiff:
Wdiff is another wrapper for diff that compares files word by word. It extracts each word from the input files and writes the words one per line into two new temporary files. It then runs diff on these temporary files and produces its output from the result. It is free software released under the GNU General Public License and is available in many languages. Install the utility and use the following command to compare two files:
ubuntu@ubuntu:~$ sudo apt-get install wdiff
ubuntu@ubuntu:~$ wdiff file1.txt file2.txt
The above output shows, for instance, that the price of the Levis shirt in the first file (15.99) needs to be replaced with its price in the second file (25.44). Similarly, for the American Eagle T-shirt, the first file's price, i.e., 30.00, needs to be replaced with 40.00 to make the two files identical.
Conclusion
A good grasp of command-line file comparison can save beginner Linux users a lot of time. This article demonstrates the diff command and two of its wrappers, colordiff and wdiff, to help users with file comparison tasks. It also explains the output formats of the diff command and how they differ.