paulgorman.org/technical

Unix Find

(July 2017)

The find utility has a somewhat arcane syntax, but it’s very useful.

The syntax and the set of supported primaries vary slightly between implementations, such as GNU find and the BSD finds.

Let’s break down the parts of a find command, ignoring for the moment the option flags:

$ find    path(s)              expression
$ find    ~/                   -iname '*cthulhu*'
$ find    ~/                   -type d -iname '*warhammer*'
$ find    ~/ /usr/share/doc    -type f -iname '*virt*'    2>&1 | grep -v "Permission denied"

GNU find defines expressions as composed of tests, actions, options, and operators.

OpenBSD defines an expression as composed of “primaries” and “operators”. If the user provides no expression, the default is the primary -print. find also adds -print implicitly if the expression does not include one of these alternative primaries: -exec, -ls, -ok, -print0.
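A minimal sketch of the implicit -print, assuming a scratch directory from mktemp and a made-up file name. It compares an expression with no action, the same expression with an explicit -print, and one where -exec takes the place of the default:

```shell
# Scratch directory with one hypothetical file.
tmp=$(mktemp -d)
touch "$tmp/a.txt"

# No action given: find adds -print for us.
implicit=$(find "$tmp" -name 'a.txt')

# Same expression with the action spelled out.
explicit=$(find "$tmp" -name 'a.txt' -print)

# -exec is an action, so no default -print is added;
# only the output of the command appears.
with_exec=$(find "$tmp" -name 'a.txt' -exec echo found \;)

printf 'implicit:  %s\nexplicit:  %s\nwith_exec: %s\n' "$implicit" "$explicit" "$with_exec"
rm -rf "$tmp"
```

The first two results should be identical; the third should contain only the echo output, not the path.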

Use -execdir instead of -exec where possible. -execdir runs the command from inside the directory of the matched file, which is safer because it avoids race conditions while resolving the path to that file.
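To see the difference in working directory, here is a small sketch that runs pwd through both primaries. The directory layout and file name are hypothetical, and it assumes an external pwd that accepts -P:

```shell
# Resolve the scratch directory to its physical path so the comparison is stable.
tmp=$(cd "$(mktemp -d)" && pwd -P)
mkdir -p "$tmp/a/b"
touch "$tmp/a/b/note.txt"

# -exec runs the command from the directory where find was started.
exec_dir=$(cd "$tmp" && find . -name note.txt -exec pwd -P \;)

# -execdir runs the command from the directory that holds the match.
execdir_dir=$(cd "$tmp" && find . -name note.txt -execdir pwd -P \;)

printf '%s\n%s\n' "-exec:    $exec_dir" "-execdir: $execdir_dir"
rm -rf "$tmp"
```

Note that GNU find refuses to run -execdir when PATH contains a relative or empty entry, so this sketch assumes a PATH made only of absolute directories.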

The -ok primary acts like -exec, except that it prompts for a y/n answer before running the command:

$ find ~/ -type f -iname '*virt*' -ok ls -l {} \;
< ls ... /home/paulgorman/Documents/2015/april/virt-install.txt > ? y
-rw-r--r-- 1 paulgorman paulgorman 423 Apr 14  2015 /home/paulgorman/Documents/2015/april/virt-install.txt

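Operators combine or modify primaries. In order of descending precedence:

  1. ( expression ): grouping
  2. ! expression: negation
  3. expression -and expression (or simply expression expression): true if both are true
  4. expression -or expression: true if either is true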

Note that any two expressions together evaluate with an implicit -and. The second expression is not evaluated if the first is false.

With -or, the second expression is not evaluated if the first is true.
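The short-circuit behavior of -or can be sketched like this, assuming two made-up files in a scratch directory. The -print on the right of -or only runs for files where the -name test on the left is false:

```shell
tmp=$(mktemp -d)
touch "$tmp/keep.txt" "$tmp/skip.log"

# For skip.log the -name test is true, so the right side of -o (-print) is never evaluated.
# For keep.txt the -name test is false, so find goes on to evaluate -print.
listed=$(find "$tmp" -type f \( -name '*.log' -o -print \) | sort)

printf '%s\n' "$listed"
rm -rf "$tmp"
```

Only keep.txt should be listed.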

Remember to escape operator parens in the shell:

$ find . \( -name \*.jpg -o -name \*.gif \) -exec rm {} \;
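A sketch of why the parens matter, using -print in place of rm so nothing is deleted. The file names are hypothetical. Because the implicit -and binds tighter than -o, an action written without parens attaches only to the last test:

```shell
tmp=$(mktemp -d)
touch "$tmp/a.jpg" "$tmp/b.gif"

# Parsed as: -name '*.jpg'  -o  ( -name '*.gif' -and -print )
without_parens=$(find "$tmp" -name '*.jpg' -o -name '*.gif' -print | sort)

# Parsed as: ( -name '*.jpg' -o -name '*.gif' ) -and -print
with_parens=$(find "$tmp" \( -name '*.jpg' -o -name '*.gif' \) -print | sort)

printf 'without parens:\n%s\nwith parens:\n%s\n' "$without_parens" "$with_parens"
rm -rf "$tmp"
```

Without parens only the gif is printed; with them, both files are.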

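For pattern matching, see GLOB(7). Patterns use * (any string), ? (any single character), and [...] (a character class), and should be quoted so the shell does not expand them before find sees them.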
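A short sketch of glob patterns with -name, assuming a handful of made-up file names. The patterns are quoted so that find, not the shell, does the matching:

```shell
tmp=$(mktemp -d)
touch "$tmp/ch1.txt" "$tmp/ch2.txt" "$tmp/ch10.txt" "$tmp/notes.txt"

# ? matches exactly one character: ch1.txt and ch2.txt, but not ch10.txt.
one_char=$(find "$tmp" -name 'ch?.txt' | sort)

# [0-9] matches one digit and * matches anything after it: all three chapter files.
digits=$(find "$tmp" -name 'ch[0-9]*.txt' | sort)

printf 'ch?.txt:\n%s\nch[0-9]*.txt:\n%s\n' "$one_char" "$digits"
rm -rf "$tmp"
```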

GNU find also supports several regular expression dialects for -regex and -iregex, selected with -regextype. The available types can be listed with:

$ find -regextype help
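As a sketch of regex matching, assuming GNU find (the -regextype option is GNU-specific) and some made-up file names. Unlike -name, -regex matches against the whole path, so the pattern starts with .* here:

```shell
tmp=$(mktemp -d)
touch "$tmp/a.jpg" "$tmp/b.gif" "$tmp/c.png"

# -regextype must come before the -regex it applies to.
# posix-extended allows unescaped ( | ) for alternation.
images=$(find "$tmp" -regextype posix-extended -regex '.*\.(jpg|gif)' | sort)

printf '%s\n' "$images"
rm -rf "$tmp"
```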

Most of the metadata find gets comes from stat’ing file inodes. See INODE(7) for descriptions of the inode metadata.

For the sake of efficiency, in cases where it doesn’t affect the expression logic, test file names before testing any inode information, to save find from having to stat non-matching files.
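A sketch of the two orderings, assuming a small made-up tree. Both return the same files; the difference is only in how much work find may have to do, since the name test can reject a file without a stat. Some implementations may reorder tests on their own, so the gain is not guaranteed:

```shell
tmp=$(mktemp -d)
mkdir "$tmp/sub"
touch "$tmp/a.conf" "$tmp/sub/b.conf" "$tmp/sub/c.txt"

# Name test first: non-matching names are rejected before any inode test.
name_first=$(find "$tmp" -name '*.conf' -type f | sort)

# Type test first: same result, but every entry is checked for its type.
type_first=$(find "$tmp" -type f -name '*.conf' | sort)

printf '%s\n' "$name_first"
rm -rf "$tmp"
```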

Examples

Find files with contents modified in the last five minutes:

$ find ~/ -type f -mmin -5

Find files owned by the “adm” group:

$ find /var/log -type f -group adm

Print a long listing like ls -li for found files:

$ find ~/ -iname '*rc' -ls

Find files whose names do not end in .pdf or .html:

$ find ~/Books -type f ! -name '*.pdf' ! -name '*.html'

Find empty directories:

$ find /etc -type d -empty

Change file permissions:

$ find ~/etc -type f -name '*.conf' -ls -exec chmod 0600 {} \;

Print the first line of txt files:

$ find ~/Books -type f -name '*.txt' -exec head -1 {} \;

Find text files that contain “192.168.1” (grep -l prints the names of matching files rather than the matching lines):

$ find ~/Books -type f -name '*.txt' -exec grep -l '192\.168\.1' {} \;
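When the command accepts many file arguments, ending -exec with + instead of \; passes the matches in batches, so grep starts once rather than once per file. A minimal sketch, assuming two made-up text files:

```shell
tmp=$(mktemp -d)
printf 'host 192.168.1.10\n' > "$tmp/hosts.txt"
printf 'nothing here\n' > "$tmp/other.txt"

# {} + appends as many matched paths as fit onto one grep command line.
# The {} must be the last argument before the +.
matches=$(find "$tmp" -type f -name '*.txt' -exec grep -l '192\.168\.1' {} + | sort)

printf '%s\n' "$matches"
rm -rf "$tmp"
```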

Copy txt files:

$ find ~/Books -type f -name '*.txt' -exec cp {} /tmp/ \;

Edit matching files in place with sed (GNU sed syntax; FreeBSD and macOS sed need -i '' as a separate argument):

$ find /tmp -type f -name '*.txt' -exec sed -i -e 's/192\.168/10.0/g' {} \;

Find the percentage of files that occupy no more than one 512-byte block:

$ echo "$(find ~/ -type f -size -2 | wc -l) / $(find ~/ -type f | wc -l) * 100.0" | bc -l
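A sketch of how -size counts when no unit is given: sizes are in 512-byte blocks, rounded up. The three files below are made up to land on 0, 1, and 2 blocks:

```shell
tmp=$(mktemp -d)
: > "$tmp/empty"                                            # 0 bytes   -> 0 blocks
printf 'x' > "$tmp/tiny"                                    # 1 byte    -> rounds up to 1 block
dd if=/dev/zero of="$tmp/two" bs=600 count=1 2>/dev/null    # 600 bytes -> rounds up to 2 blocks

# -size 1 means exactly one block, so the empty file is not counted.
exactly_one=$(find "$tmp" -type f -size 1 | sort)

# -size -2 means fewer than two blocks: zero or one.
under_two=$(find "$tmp" -type f -size -2 | sort)

printf 'exactly one block:\n%s\nfewer than two blocks:\n%s\n' "$exactly_one" "$under_two"
rm -rf "$tmp"
```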

Find the percentage of files larger than 500M:

$ echo "$(find ~/ -type f -size +500M | wc -l) / $(find ~/ -type f | wc -l) * 100.0" | bc -l

Find files but DON’T include ~/backup.tar.xz and ~/Books (and see notes below):

$ find /etc /root /home -path ~/backup.tar.xz -prune -or -path ~/Books -prune -or -print

A few things about the above example:

  1. Remember that expressions are evaluated left to right, and that find implicitly connects expressions with -and’s.
  2. -prune really means “don’t descend into this”, and always evaluates as “true”.
  3. Therefore, unless we follow it with an alternative -or -print, the default -print applies to the whole expression and find prints only the names of the pruned files.
  4. Practically, something like an explicit -or -print expression almost always follows -prune.
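The points above can be sketched with a small made-up tree containing one directory to keep and one to prune:

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/keep" "$tmp/skip"
touch "$tmp/keep/a.txt" "$tmp/skip/b.txt"

# No explicit action: the default -print wraps the whole expression,
# so only the pruned directory itself is printed.
prune_only=$(find "$tmp" -path "$tmp/skip" -prune | sort)

# With -o -print: the pruned directory is true on the left of -o, so -print
# is skipped for it, and everything else falls through to -print.
prune_or_print=$(find "$tmp" -path "$tmp/skip" -prune -o -print | sort)

printf 'prune only:\n%s\nprune -o -print:\n%s\n' "$prune_only" "$prune_or_print"
rm -rf "$tmp"
```

The first command prints just the skip directory; the second prints the scratch directory, keep, and keep/a.txt.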

What if we only wanted files, not directories too?

$ find /etc /root /home -path ~/backup.tar.xz -prune -or -path ~/Books -prune -or -type f -print

The -type f and -print are joined by an implicit -and operator. Therefore, because find evaluates the expression left-to-right, the -type f must precede the -print. If written -or -print -type f, the -print would run first and print every remaining path, directories included; the -type test would still be evaluated afterwards, but too late to filter anything.
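A quick sketch of the ordering effect, assuming a made-up directory holding one file:

```shell
tmp=$(mktemp -d)
mkdir "$tmp/dir"
touch "$tmp/dir/file.txt"

# Test before action: only regular files reach -print.
test_first=$(find "$tmp" -type f -print | sort)

# Action before test: -print has already run by the time -type f is checked.
action_first=$(find "$tmp" -print -type f | sort)

printf 'test first:\n%s\naction first:\n%s\n' "$test_first" "$action_first"
rm -rf "$tmp"
```

The first form lists only file.txt; the second lists the scratch directory and dir as well.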

Clean out unwanted files matching multiple name patterns (here $days is a shell variable holding the minimum age in days):

$ find /var/virusmails/ \( -name 'spam-*.gz' -o -name 'badh-*' -o -name 'banned-*' \) -type f -mtime +"$days" -execdir rm {} \;
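Before running a destructive command like this, it can help to do a dry run with -print in place of the rm. A sketch under assumed names: the scratch directory, the file names, and the days value are all made up, and touch -t is used to back-date two of the files:

```shell
tmp=$(mktemp -d)
days=7
touch "$tmp/spam-new.gz"                                     # recent, should survive
touch -t 200001010000 "$tmp/spam-old.gz" "$tmp/other-old"    # old; only the first matches a pattern

# Dry run: same tests, but print instead of delete.
would_delete=$(find "$tmp" \( -name 'spam-*.gz' -o -name 'badh-*' \) -type f -mtime +"$days" -print)
printf 'would delete: %s\n' "$would_delete"

# Real run, once the list looks right.
find "$tmp" \( -name 'spam-*.gz' -o -name 'badh-*' \) -type f -mtime +"$days" -execdir rm {} \;

remaining=$(ls "$tmp" | sort)
printf 'remaining:\n%s\n' "$remaining"
rm -rf "$tmp"
```

Only the old file that matches a name pattern is removed; the recent match and the old non-match are left alone.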