How to Selectively Copy Files with Rsync
Posted: 2021-10-03
The utility rsync is fantastic for efficient, incremental backups of files. It can be used to copy files locally or to back files up to a remote machine over SSH.
One aspect of rsync that I always struggle with, however, is the syntax for selectively syncing a subset of files. It does not use the normal wildcard expansion that other shell commands use, and it is particularly persnickety about how you order your includes and excludes.
Full backup
By default, rsync will copy the entire contents of the src directory to the dest location:
rsync -avz src/ dest/
This will include all files and subdirectories, including hidden ones like .git/, which you may not want to copy.
Excluding a subset of files
To exclude files or directories, use the --exclude flag. You can exclude specific files or directories, or any files and directories that match a particular pattern. The following command excludes the .git directory and any file that ends with .yml:
rsync -avz --exclude='.git/' --exclude='*.yml' src/ dest/
Including a subset of files
However, copying only the files that match a particular pattern is more complicated. As seen above, rsync includes all files and folders by default. To include only files of a certain type, we need to exclude all the other files without excluding the directories that contain them.
We need to tell rsync explicitly that we want to _include_ all directories because excluding * would exclude directories as well as files. The following command will copy all *.txt files to the destination as desired:
rsync -avz --include='*/' --include='*.txt' --exclude='*' src/ dest/
We need all of those pieces, with the includes first and the exclude second:
- Include all directories
- Include all *.txt files
- Exclude everything else
Things that seem like they should work, but don't
The first thing you might try is to just include the *.txt files, but this will copy all files because rsync includes all files by default:
rsync -avz --include='*.txt' src/ dest/
If you then exclude all other files but don't include all directories (as in the following example), you will copy only the *.txt files in the root of your src directory, not those in any subdirectories. This is because excluding * excludes all files and directories, so rsync never descends into the subdirectories.
rsync -avz --include='*.txt' --exclude='*' src/ dest/
And if you have all the proper includes but put the exclude first, you will not copy any files because the order matters: rsync applies the first rule that matches. Here, you are excluding all files and directories right off the bat, and no later includes will change that:
rsync -avz --exclude='*' --include='*/' --include='*.txt' src/ dest/