A backup strategy is important. It’s like insurance: you have it, but you hope you won’t need it. A backup helps not only when your hardware breaks, but also when you’ve accidentally deleted some important files. One simple step towards a backup is an external hard drive that you attach to your computer and copy the data you want to back up to. However, if you don’t take care, you end up either with only a single version of your backup, which means you can’t restore files that were deleted earlier, or with multiple full copies of the same files, which fill up the available space very quickly. And, of course, copying all the files again every time takes a lot of time.

This is where rsync comes in handy. It is a file synchronization tool that can efficiently copy files from one place to another, no matter whether source and target are local, remote, or a mixture of both. It uses an efficient transfer algorithm to reuse any data that already exists on the target and to copy only the differences. Another important feature is its support for hard links, which allows you to create multiple backup file trees without unchanged files taking additional space.
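
To illustrate the hard-link idea, here is a minimal sketch for a POSIX shell (for example on Linux). It does not use rsync itself; it only shows that a hard-linked file appears in two directory trees while its content is stored once. The folder and file names are made up for illustration and live in a temporary directory.

```shell
# Demonstration of hard links: two directory entries, one copy of the data.
# All names below are throwaway examples inside a temporary directory.
workdir=$(mktemp -d)
mkdir "$workdir/2016-05-04" "$workdir/2016-05-05"

# "Back up" a file into the older tree.
echo "important data" > "$workdir/2016-05-04/report.txt"

# Create a hard link in the newer tree instead of copying the file.
ln "$workdir/2016-05-04/report.txt" "$workdir/2016-05-05/report.txt"

# Both names refer to the same inode, so the content is stored only once.
inode_old=$(ls -i "$workdir/2016-05-04/report.txt" | awk '{print $1}')
inode_new=$(ls -i "$workdir/2016-05-05/report.txt" | awk '{print $1}')
echo "old inode: $inode_old"
echo "new inode: $inode_new"

# Deleting the older tree does not remove the data; the newer name still works.
rm -r "$workdir/2016-05-04"
content=$(cat "$workdir/2016-05-05/report.txt")
echo "still readable: $content"

# Clean up the temporary directory.
rm -r "$workdir"
```

This is why an old backup folder can later be deleted without damaging newer backups that share files with it.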

Rsync is easy to install on Linux, because the distribution usually already contains a pre-packaged version. On Windows, it’s a bit more complex. But there are solutions such as Cygwin, which provides a POSIX layer for Windows along with pre-compiled binaries. Installing Cygwin the default way usually means installing it manually on each Windows computer. And you would not be able to run the backup script on a computer where Cygwin is not installed.

In order to avoid this, my solution takes the minimal set of required files from Cygwin, including rsync, and stores them together with a simple batch file on the backup drive, so that everything required is available there.

Let’s start creating such a backup environment by first downloading the necessary Cygwin packages:

  1. Select a mirror from https://cygwin.com/mirrors.html. I’ve chosen ftp-stud.hs-esslingen.de.
  2. Go down to x86/release/rsync and download rsync-3.1.2-1.tar.xz (or any newer version that is available).
  3. Have a look at the file setup.hint. It says that rsync also requires the two packages libiconv2 and cygwin.
  4. So, download these packages, too: x86/release/libiconv/libiconv2/libiconv2-1.14-3.tar.xz, x86/release/cygwin/cygwin-2.5.1-1.tar.xz. Please note that cygwin requires base-cygwin: x86/release/base-cygwin/base-cygwin-3.8-1.tar.xz. It turns out that libiconv2 has a dependency on libintl8. This dependency is a bit difficult to find, because it’s part of gettext. So we also need x86/release/gettext/libintl8/libintl8-0.19.7-1.tar.xz. If you later try to run rsync.exe, you’ll notice that another DLL is missing: cyggcc_s-1.dll. This is part of libgcc1: x86/release/gcc/libgcc1/libgcc1-5.3.0-5.tar.xz. Additionally, cygpopt-0.dll is missing: x86/release/popt/popt-1.16-1.tar.xz.
  5. The backup script additionally uses the tool tee, which is part of the package coreutils: x86/release/coreutils/coreutils-8.25-3.tar.xz.
  6. Extract each file with tar xvJf package.tar.xz. Please note the J flag, which selects xz decompression.
  7. Now it’s time to “cherry-pick” the needed executables and libraries. The files are all in the subdirectory usr/bin. We need the following files:
    • rsync.exe
    • tee.exe
    • cygiconv-2.dll
    • cygwin1.dll
    • cygintl-8.dll
    • cyggcc_s-1.dll
    • cygpopt-0.dll

The steps above can be carried out with the following script. You’ll get the resulting files in a subdirectory named rsync-win:

BASE_MIRROR_URL=http://ftp-stud.hs-esslingen.de/pub/Mirrors/sources.redhat.com/cygwin
wget $BASE_MIRROR_URL/x86/release/rsync/rsync-3.1.2-1.tar.xz
wget $BASE_MIRROR_URL/x86/release/libiconv/libiconv2/libiconv2-1.14-3.tar.xz
wget $BASE_MIRROR_URL/x86/release/cygwin/cygwin-2.5.1-1.tar.xz
wget $BASE_MIRROR_URL/x86/release/base-cygwin/base-cygwin-3.8-1.tar.xz
wget $BASE_MIRROR_URL/x86/release/coreutils/coreutils-8.25-3.tar.xz
wget $BASE_MIRROR_URL/x86/release/gettext/libintl8/libintl8-0.19.7-1.tar.xz
wget $BASE_MIRROR_URL/x86/release/gcc/libgcc1/libgcc1-5.3.0-5.tar.xz
wget $BASE_MIRROR_URL/x86/release/popt/popt-1.16-1.tar.xz
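# Unpack every downloaded package (J selects xz decompression).
for i in *.tar.xz; do tar xfvJ "$i"; done
# Collect only the executables and DLLs that rsync and tee need.
mkdir -p rsync-win
cp -v usr/bin/rsync.exe usr/bin/tee.exe usr/bin/cygiconv-2.dll usr/bin/cygwin1.dll usr/bin/cygintl-8.dll usr/bin/cyggcc_s-1.dll usr/bin/cygpopt-0.dll rsync-win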
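
Before copying the result to the backup drive, it can be worth checking that nothing was missed. The following is a small sketch of such a check. The function name check_rsync_win is made up for illustration; the file list is the one from step 7 above.

```shell
# Hypothetical helper: report which of the required files are missing from a directory.
# Returns 0 if everything is present, otherwise the number of missing files.
check_rsync_win() {
    dir=$1
    missing=0
    for f in rsync.exe tee.exe cygiconv-2.dll cygwin1.dll \
             cygintl-8.dll cyggcc_s-1.dll cygpopt-0.dll; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f"
            missing=$((missing + 1))
        fi
    done
    if [ "$missing" -eq 0 ]; then
        echo "all files present in $dir"
    fi
    return "$missing"
}
```

After running the download script, calling check_rsync_win rsync-win should report that all files are present. If a newer package version has renamed a DLL, the check lists it as missing.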

Now, we need the last part, the batch file which serves as the backup script:

@echo off

REM --------------------------------------
SET OLD=2016-05-04
SET NEW=2016-05-05
SET BACKUP_DRIVE=f
SET BACKUP_DEST=JonPC
REM --------------------------------------

TITLE Backup
COLOR 2F

echo Backing up: OLD=%OLD% NEW=%NEW%
echo   BACKUP_DRIVE=%BACKUP_DRIVE%
echo   BACKUP_DEST=%BACKUP_DEST%
echo.
pause

SET RPATH=%BACKUP_DRIVE%:\rsync-win
SET START=%date% %time%

%RPATH%\rsync.exe -rltvhP --no-i-r ^
   --link-dest="../%OLD%/" ^
   --exclude "Cache" ^
   --exclude "cache2" ^
   --exclude "cache" ^
   --exclude "parent.lock" ^
   --exclude "Temp*" ^
   --exclude "thumbcache_*.db" ^
   ^
   "/cygdrive/c/Users/Jon" ^
   "/cygdrive/d/Data" ^
   ^
   "/cygdrive/%BACKUP_DRIVE%/%BACKUP_DEST%/%NEW%/" ^
   ^
   2> %BACKUP_DRIVE%:\%BACKUP_DEST%\logs\%NEW%.err | %RPATH%\tee.exe %BACKUP_DRIVE%:\%BACKUP_DEST%\logs\%NEW%.log

SET END=%date% %time%

echo Started: %START% | %RPATH%\tee.exe -a %BACKUP_DRIVE%:\%BACKUP_DEST%\logs\%NEW%.log
echo Ended: %END%     | %RPATH%\tee.exe -a %BACKUP_DRIVE%:\%BACKUP_DEST%\logs\%NEW%.log

pause

This script makes some assumptions:

  • The external hard drive has been assigned the drive letter f. This might change, so you may need to adjust it manually before you start the script.
  • The backups will be stored under the path JonPC on the backup drive. You can adjust this as needed.
  • The variables OLD and NEW are used as the names of the subfolders into which the files are actually copied. You’ll have to adjust these variables for every run! If an unchanged file with the same name already exists in the “OLD” folder, rsync won’t copy the file again, but just creates a hard link in NEW that points to the file in OLD. This is what the option --link-dest enables.
  • You can configure exclude patterns, like “Cache”.
  • This script backs up the folders /cygdrive/c/Users/Jon and /cygdrive/d/Data. You can add more folders.
  • It will write a log file to the “logs” subfolder. This folder must exist, so just create it manually once. It’s a subfolder under “BACKUP_DEST” - in the example, the full path is: F:\JonPC\logs.
  • The executables “rsync.exe” and “tee.exe” are expected to be in the path “RPATH” - in the example it’s F:\rsync-win. You’ll need to copy the files manually to the backup drive once.
  • For the very first run, there is no “old” backup folder to point to. Just use some date; rsync will still work, it just won’t create any hard links. For the second run, make sure you point the OLD variable to the already existing backup and use a new date for the NEW variable.
  • The backup drive must use the NTFS file system, as only this supports hard links. If you use FAT32, the script will still work, but you won’t benefit from the space-efficient hard-linking solution.
  • Depending on which folders you want to back up, it might be necessary to run the batch file with Administrator permissions.
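
Adjusting OLD and NEW by hand is easy to forget. As a rough sketch of how the two values could be derived automatically, the following POSIX shell snippet picks today’s date for NEW and the most recent existing dated folder for OLD. Note the assumptions: the batch file above does not do this, the minimal rsync-win folder contains no shell (so this needs a full Cygwin installation or a Linux machine), and the function name latest_backup as well as the path /cygdrive/f/JonPC are only examples.

```shell
# Hypothetical helper: print the name of the most recent dated backup folder.
# Assumes folders are named YYYY-MM-DD, so alphabetical order equals date order.
# An optional second argument names a folder to skip (e.g. today's folder).
latest_backup() {
    backup_root=$1
    skip=${2:-}
    for d in "$backup_root"/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]; do
        name=$(basename "$d")
        if [ -d "$d" ] && [ "$name" != "$skip" ]; then
            echo "$name"
        fi
    done | sort | tail -n 1
}

# Example values; the path mirrors F:\JonPC as seen from Cygwin.
BACKUP_ROOT=/cygdrive/f/JonPC
NEW=$(date +%Y-%m-%d)
OLD=$(latest_backup "$BACKUP_ROOT" "$NEW")
echo "OLD=$OLD NEW=$NEW"
```

Folders that don’t look like a date, such as logs, are ignored. On the very first run OLD stays empty, which matches the case described above where no hard links are created.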

Restoring files is simple. You don’t need any special tool: just plug in the backup drive and use Windows Explorer to find the file and copy it.

By the way, rsync is also used to simulate a tape library backup solution for Linux servers. This tool is called dirvish.