Today I’d like to show you the basic usage of rsync – a wonderful, old and reliable tool for incremental data transfers and synchronization of local directories or even data between different Unix systems.
rsync is quite a complicated command, so don’t expect this first post to explain everything and cover every possibility. Like I said, this is only the beginning.
rsync (short for remote synchronization) is an open source tool for transferring data between Unix systems.
In its simplest form, it’s just a Unix command you run locally to synchronize two directories. But the real power of rsync shows when you need to synchronize directories between remote systems. For remote transfers, modern versions of rsync use ssh as the transport by default, while earlier versions used rsh. More advanced deployments run an rsync server in addition to the plain command: this is basically the same program, running in daemon mode and waiting for connections.
rsync can easily be found or installed in any modern Unix-like OS, but it’s always best to check the official website for latest developments around this tool: rsync website.
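Before going further, it may be worth confirming that rsync is actually present on your system. A small sketch of such a check (the variable name rsync_status is just a placeholder for illustration):

```shell
# Check whether rsync is available on this system and, if so, print its version line.
if command -v rsync >/dev/null 2>&1; then
    rsync_status="installed"
    rsync --version | head -n 1
else
    rsync_status="missing"
    echo "rsync is not installed; use your system's package manager to add it" >&2
fi
```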
rsync synchronizes directories: it makes one directory contain the same files and subdirectories as another. It works by building a list of files in your source and destination directories, comparing them by the specified criteria (file size and modification time by default, or a checksum if requested), and then transferring only what is new or changed, so that the destination reflects the current state of the source. Note that, by default, rsync does not remove files that exist only in the destination; that requires the --delete option.
Just to show you how it works, I’m going to create two directories with a few files in them. /tmp/dir1 in my examples will be a source directory (original dataset), while /tmp/dir2 will be a destination directory – to be made the same as /tmp/dir1 as the result of running rsync.
So that’s how I set up directories and files:
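A minimal sketch of such a setup, assuming three small files named file1, file2 and file3 (the names and contents are placeholders chosen for illustration):

```shell
# Hypothetical setup: file names and contents are assumptions, not a fixed recipe.
mkdir -p /tmp/dir1 /tmp/dir2

# Populate the source directory with three small files.
for name in file1 file2 file3; do
    echo "contents of $name" > "/tmp/dir1/$name"
done

# Give the destination a copy of file1; -p keeps the modification time,
# so rsync will later treat this file as already up to date.
cp -p /tmp/dir1/file1 /tmp/dir2/
```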
That’s how our directories and files look now, so dir2 contains a copy of file1:
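One way to inspect both directories, assuming the /tmp/dir1 and /tmp/dir2 paths described above (the mkdir line is only there so the snippet runs on its own):

```shell
# No-op if the directories already exist.
mkdir -p /tmp/dir1 /tmp/dir2

# Long listing of both directories: names, sizes and modification times.
ls -l /tmp/dir1 /tmp/dir2
```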
Now it’s time to run your first ever rsync. There are two ways of specifying options for the command: a long option starting with -- and usually having a meaningful name, or a short option starting with a single - followed by a terse (usually one-letter) name.
The last two parameters in an rsync command line should be the source and the destination directories.
In the example below, we’re using the following options:
-avz – a for archive mode (recurse into subdirectories and preserve most attributes of each file and directory: ownership, permissions, timestamps, symlinks), v for verbose mode (report a list of files processed by rsync) and z for compressing data in transit, which speeds up transfers over slow network links.
--stats – this option shows a summary at the end of the rsync run to highlight the main stats of the job
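A sketch of what such an invocation can look like, assuming the /tmp/dir1 and /tmp/dir2 directories described earlier. The sample file names are placeholders, and the check for rsync is only there so the snippet fails gracefully:

```shell
# Recreate the example layout if it is missing, so this snippet runs on its own.
# File names are assumptions for illustration.
mkdir -p /tmp/dir1 /tmp/dir2
for name in file1 file2 file3; do
    [ -e "/tmp/dir1/$name" ] || echo "contents of $name" > "/tmp/dir1/$name"
done

# Source comes first, destination last.
# The trailing slash on the source means "the contents of dir1", not dir1 itself.
if command -v rsync >/dev/null 2>&1; then
    rsync -avz --stats /tmp/dir1/ /tmp/dir2/
else
    echo "rsync is not installed" >&2
fi
```

Without the trailing slash on the source, rsync would create /tmp/dir2/dir1 instead of copying the files directly into /tmp/dir2.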
The stats are largely self-explanatory. You can see that although rsync found 4 files in the source directory /tmp/dir1 (its file count includes directories as well as regular files), only 2 files were transferred into /tmp/dir2, because /tmp/dir2 already had an up-to-date copy of one of them.
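To see this incremental behaviour in isolation, here is a small sketch using throwaway directories. The variable names are placeholders, and the -i (--itemize-changes) option is not covered in this post; it prints one line per item rsync actually changes:

```shell
# Throwaway source and destination directories for the experiment.
src=$(mktemp -d)
dst=$(mktemp -d)
echo one > "$src/file1"
echo two > "$src/file2"

if command -v rsync >/dev/null 2>&1; then
    # First run: both files are new to the destination, so both are copied.
    rsync -a "$src/" "$dst/"

    # Second run: sizes and modification times match, so nothing is listed.
    second_run=$(rsync -ai "$src/" "$dst/")
    echo "changes on second run: ${second_run:-none}"

    # Change one file in the source; only that file is picked up on the next run.
    echo "more" >> "$src/file2"
    third_run=$(rsync -ai "$src/" "$dst/")
    echo "changes on third run: $third_run"
else
    echo "rsync is not installed" >&2
fi

# Clean up the temporary directories.
rm -rf "$src" "$dst"
```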
That’s all I have for you today. In the next post on rsync I’ll show you some more advanced uses of this command. For the time being, read man rsync or even rsync --help on your system to get an idea of how powerful this tool really is.
Until next time – good luck with your Unix experiments!