Intro
This topic has been covered more than enough on the internet, so this goes a bit against my mission statement of ‘original articles.’ However, I was recently put on the spot to find some files on the Linux file system using various parameters, without my beloved man pages or a console in front of me, and I realized that I need a review. I have decided it won’t hurt to share my review with others.
Update: I have written pyGnomeFind, a GUI in Python that generates find commands, which one might find helpful.
Generate a Testing Directory with some random files
Some of these examples will actually make changes to files, so I wanted a bunch of differently named files that also had commonalities. So I came up with:
for i in $(grep allow /etc/dictionaries-common/words | grep -v "'"); do touch "$i"; done
What this does is search my dictionary (just a list of words, one per line) for any line that contains the string “allow”, but not an apostrophe. The command then creates an empty file for each match, named after the matched word. On my system this creates 50 files. The techniques used are as follows: For Loop, Command Substitution (See previous Post, this time I used $( ) instead of back ticks), and Piping. The two commands used are grep, which searches text for a pattern, and touch, which creates an empty file with the given name (i.e. touch test creates a file called test).
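A slightly more defensive variant of the same idea, as a sketch: it works inside a throwaway directory and reads the matches one line at a time instead of relying on word splitting. The dictionary path is the one from my system and may differ on yours; the small fallback word list is made up purely so the snippet still runs where that file is missing.

```shell
# Work in a throwaway directory so the test files do not clutter anything else.
testdir=$(mktemp -d)
cd "$testdir" || exit 1

# Assumption: the word list lives here (Debian/Ubuntu style); adjust for your system.
words=/etc/dictionaries-common/words

# Fallback for illustration only: a tiny made-up word list if the file is missing.
if [ ! -r "$words" ]; then
    words=$testdir/.words
    printf '%s\n' allow allowed "allowance's" shallow swallow > "$words"
fi

# Read one word per line; skip anything containing an apostrophe.
grep allow "$words" | grep -v "'" | while IFS= read -r word; do
    touch -- "$word"
done

# Count the files that were created (the hidden fallback list is not counted).
created=$(find . -type f ! -name '.words' | wc -l)
echo "created $created files in $testdir"
```

The number printed depends entirely on the word list, so expect something different from my 50.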
The locate Command
This is the command I used most frequently in the past, and it is very simple to use. It is database driven (meaning it searches a database of files, not the file system itself), which makes it fast, but also means it only knows about files that existed when the database was last updated. The basic syntax is: locate filename This will find all files on the system whose path is either “filename” or contains the string “filename”. I don’t recommend using this command until you can use the find command in your sleep, or else you might limit yourself, and not remember the find syntax when you need to.
The find Command
The find command is far more powerful, and can find files based on such things as: filename, modified time, accessed time, size, permissions, and file type. It does not search inside files by itself, but combined with grep it can also find files by the text they contain. You can then perform all sorts of operations on the files found. Combined with regular expressions, this flexibility is a fine example of the power of the command line interface.
Abstract Usage (and some Vocabulary): find [path...] [expression]
Path is synonymous with directory (the path you take to the file). With GNU find the default is the current directory, but I usually put a period, which represents the current directory. The expression will consist of options, tests, and actions. Tests are essentially criteria, for example, time modified. The default action is to print what find found, but commands can be applied to what is found as well. The expression can also have operators. Operators are things such as AND (-a), OR (-o), and NOT (!). If no operator is written between two tests, the default is AND.
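To make the operators concrete, here is a small sketch that runs in a scratch directory. The file names are made up for the demonstration; -a, -o, ! and the escaped parentheses are standard find syntax.

```shell
# Scratch directory with a few hypothetical file names.
demo=$(mktemp -d)
touch "$demo/a.jpg" "$demo/b.JPG" "$demo/c.txt" "$demo/notes.txt"

# Two tests side by side are joined by an implicit AND:
# regular files whose name ends in .txt.
and_hits=$(find "$demo" -type f -name '*.txt' | sort)

# -o is OR; the escaped parentheses group the alternatives.
or_hits=$(find "$demo" -type f \( -name '*.jpg' -o -name 'c.*' \) | sort)

# ! negates the test that follows it: regular files that are not .txt.
not_hits=$(find "$demo" -type f ! -name '*.txt' | sort)

printf 'AND:\n%s\nOR:\n%s\nNOT:\n%s\n' "$and_hits" "$or_hits" "$not_hits"
rm -r "$demo"
```

The parentheses matter: AND binds tighter than OR, so without them -type f would apply only to the first alternative.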
Usage by Examples:
- Filename: find / -name "*.jpg"
- The asterisk is a filename matching character which means zero or more of any character(s). This command will find all files with the .jpg extension (but not .JPG) in the root path and all its subdirectories. This is different from the asterisk metacharacter in regular expressions, where it means zero or more of the preceding element. This is worth explaining: there is shell globbing and there are regular expressions. Some programs in GNU/Linux use regular expressions and some use shell globbing (such as the find utility with -name). The BASH interpreter itself uses globbing, so if you don’t quote an asterisk, the interpreter will expand the pattern to all the matching files in the current directory and pass those names to the program, not the *.jpg string itself (unless there are no .jpg files in the current directory, in which case BASH passes the string itself). Quoting the pattern makes sure find sees *.jpg, not whatever .jpg files happen to be in the current dir. It is easy to make mistakes when writing scripts, since utilities such as grep and sed use regular expressions.
- Case insensitive filename: find / -iname "*.jpg"
- Modified Time: find / -mtime 0
- Find files modified in the past 24 hours. The time switch (aka flag or option) works in 24 hour blocks moving backwards (since the computer doesn’t know yet whether files will be modified in the future), so 0 means the last 24 hours, 1 means 24-48 hours ago, and 2 means 48-72 hours ago. You can use + to mean more than the block, or - to mean less than the block of time. More than starts at the upper limit of the block, and less than starts at the lower limit of the block, so +1 means more than 48 hours ago and -1 means less than 24 hours ago. That makes 0 and -1 mean the same thing. Use -mmin instead to count in minutes.
- Access Time: find / -atime 0
- Same as above but access time, read above for explanation.
- Size: find / -size +1024k
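- Find all files larger than 1 MiB (1024 KiB). As with time, + means more than and - means less than.
- Permissions: find / -perm -o=x
- Find all files that are executable by others (the “o” in o=x, meaning everyone who is neither the owner nor in the group). The older +mode form is no longer supported by current GNU find; use -mode to require all of the listed permission bits, or /mode to match any of them. You can use symbolic or octal modes for permissions; see the chmod documentation for a further explanation of this.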
- Using operators: find / -iname "*.jpg" ! -type l
- Find all .jpg files that are not symbolic links. ! is the NOT operator. Type l (a lowercase L, not the digit one) means symbolic links.
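The tests from the list above can be tried safely in a scratch directory rather than on /. This is only a sketch: the file names are made up, and it assumes GNU find and GNU touch (the -d '3 days ago' form used to backdate a file is a GNU extension).

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1

touch photo.jpg SCAN.JPG old.jpg
ln -s photo.jpg link.jpg                  # a symbolic link named like a picture
touch -d '3 days ago' old.jpg             # GNU touch: backdate the modification time
head -c 2097152 /dev/zero > big.bin       # a 2 MiB file
chmod 755 photo.jpg                       # executable by owner, group and others

# Quoted pattern: find does the matching itself, case sensitively.
# Unquoted, the shell would expand *.jpg to the three matching names first.
name_hits=$(find . -name '*.jpg' | sort)

# Case-insensitive match, excluding symbolic links (-type l, lowercase L).
iname_hits=$(find . -iname '*.jpg' ! -type l | sort)

# Modified within the last 24 hours versus more than 48 hours ago.
recent=$(find . -type f -mtime 0 | sort)
older=$(find . -type f -mtime +1 | sort)

# Larger than 1024 KiB.
large=$(find . -type f -size +1024k)

# Regular files with the execute bit set for others.
exec_hits=$(find . -type f -perm -o=x)

printf '%s\n' "-name:" "$name_hits" "-iname, no links:" "$iname_hits" \
    "recent:" "$recent" "older:" "$older" "large:" "$large" "o=x:" "$exec_hits"

cd / && rm -r "$demo"
```

Note that -name '*.jpg' happily matches link.jpg: find tests the link itself, not the file it points to, which is why the ! -type l test is needed to leave links out.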
Some Common Options for find:
-xdev: Don’t go to other filesystems
-maxdepth #: How many directory levels to descend, 1 = the entries of the starting directory only.
-prune: Don’t descend into a directory that matches, as in find / -path '/dev' -prune -o -print
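Here is a sketch of how -maxdepth and -prune change what find visits, again using a scratch directory with made-up names. It assumes GNU find (BSD find accepts the same options).

```shell
demo=$(mktemp -d)
mkdir -p "$demo/keep/deep" "$demo/skip"
touch "$demo/top.txt" "$demo/keep/mid.txt" "$demo/keep/deep/low.txt" "$demo/skip/hidden.txt"

# -maxdepth 1: the starting directory and its immediate entries only.
shallow=$(find "$demo" -maxdepth 1 -type f)

# -maxdepth 2 also descends one level into subdirectories.
two_levels=$(find "$demo" -maxdepth 2 -type f | sort)

# -prune: when the path matches, do not descend into it;
# the -o branch then prints the regular files found everywhere else.
pruned=$(find "$demo" -path "$demo/skip" -prune -o -type f -print | sort)

# -xdev changes nothing here because the whole tree sits on one filesystem.
same_fs=$(find "$demo" -xdev -type f | wc -l)

printf '%s\n' "maxdepth 1:" "$shallow" "maxdepth 2:" "$two_levels" "pruned:" "$pruned"
rm -r "$demo"
```

The explicit -print after -o is what keeps the pruned directory itself out of the output; with no action at all, find would print everything for which the whole expression is true, including the pruned path.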
The xargs Command: Combine Arguments
This is used in combination with find for efficiency. It performs actions on the files that find returns, but instead of running the command once for each file, it passes as many files as will fit to the command at once, so the command runs as few times as possible (usually just once). The common method is to pipe the results of find to xargs. Be careful: not all commands handle the arguments from xargs properly by default, and xargs itself splits its input on whitespace, so file names containing spaces get broken apart unless you take extra steps.
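To see the batching for yourself, here is a harmless sketch that uses echo in place of a real command. The three file names are made up and simply stand in for whatever find would print.

```shell
# Three hypothetical file names, one per line, standing in for find output.
names=$(printf '%s\n' one.txt two.txt three.txt)

# Default: xargs appends as many arguments as fit, so echo runs once here.
batched=$(printf '%s\n' "$names" | xargs echo rm)

# -n 1 forces one invocation per argument, which is what a per-file loop would do.
one_by_one=$(printf '%s\n' "$names" | xargs -n 1 echo rm)

printf 'batched:\n%s\none at a time:\n%s\n' "$batched" "$one_by_one"
```

Putting echo in front of the command like this is also a handy dry run before letting xargs loose with something destructive.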
Example 1: find ~/ -maxdepth 1 -name "*" | xargs grep -i "test"
This will search all files in your home directory and no subdirectories (maxdepth) for the case insensitive string “test”. Note: grep uses regular expressions so an asterisk here will have different behavior.
Example 2: find ~/ -maxdepth 3 -name "*.back" | xargs rm
This will find all files in the home directory out to 2 sub directories, with the extension of .back, and delete them.
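File names with spaces are where a plain find | xargs pipeline goes wrong, so here is a sketch of the same deletion done with NUL-separated names. It runs against a scratch directory with made-up file names instead of the home directory, and assumes a find and xargs that support -print0 and -0 (GNU and BSD both do).

```shell
demo=$(mktemp -d)
touch "$demo/plain.back" "$demo/with space.back" "$demo/keep.txt"

# -print0 ends each name with a NUL byte and xargs -0 splits on NUL,
# so "with space.back" reaches rm as a single argument.
# -type f limits the match to regular files; -- stops rm from
# treating a name that starts with a dash as an option.
find "$demo" -maxdepth 3 -type f -name '*.back' -print0 | xargs -0 rm --

# Only the file that did not match should be left.
remaining=$(find "$demo" -type f)
echo "$remaining"
```

find can also do the batching itself with -exec rm -- {} +, which avoids the pipe altogether.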
That’s my thousand words on finding files.