comm

less than 1 minute read

Updated:

comm is a unix tool that is useful for doing line diffs between two sorted files. It let’s you view lines that are unique to file1, lines unique to file2, or lines that are in common (set intersection).

It has an invariant check that will warn when the files are not in sorted order. You’re going to want to use this to compare sets, without whipping out a python or node repl. With python:

  1. Read each file
  2. Produce two sets file.readlines()
  3. Perform set operations
  • set1 - set2 to get lines unique to set1
  • set2 - set1 to get lines unique to set2
  • set1 & set2 to get lines that exist in both
  • set1 ^ set2 to get lines are not in common

Usage

# Only lines in common
comm -12 1.txt 2.txt

# Lines only found in 1.txt
comm -23 1.txt 2.txt

# Lines only found in 2.txt
comm -13 1.txt 2.txt