
Make an incremental backup

This post introduces how to back up your data with the rsync-time-backup script. It is a good idea to back up your valuable data. A backup should include snapshots, so that you can recover a file that you accidentally deleted, say, three weeks ago. It should also be incremental, so that an unchanged file does not occupy extra space in every snapshot.

How-to

Setting up the SSH key agent

If you are backing up from/to a remote machine, you will need to log in multiple times. It’s better to set up the SSH key agent before you start backing up.

Install the script

The backup program is a shell script. You should be able to install it in your user directory on any of the HPC servers. You can get it from the GitHub repo or just download the script file itself.

To install it, put it into a directory such as [shell]~/.local/bin/[/shell], make sure the file is executable, and make sure the directory is in your [raw]PATH[/raw] environment variable.
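The installation step can be sketched as a small shell function. This is only an illustration: the function name [raw]install_backup_script[/raw] and its arguments are made up for this post, and it assumes you have already downloaded the script file somewhere.

```shell
# Hypothetical helper: copy a downloaded script into a bin directory
# (default ~/.local/bin) and make it executable.
install_backup_script() {
    script=$1
    bin_dir=${2:-$HOME/.local/bin}

    mkdir -p "$bin_dir"
    cp "$script" "$bin_dir/rsync_tmbackup.sh"
    chmod +x "$bin_dir/rsync_tmbackup.sh"

    # Remind the user if the directory is not on PATH yet.
    case ":$PATH:" in
        *":$bin_dir:"*) ;;
        *) echo "Add this to ~/.bashrc: export PATH=\"$bin_dir:\$PATH\"" ;;
    esac
}

# Usage (the download location is an assumption):
#   install_backup_script ~/Downloads/rsync_tmbackup.sh
```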

Do the backup

Once you have installed the script, you can make backups like this.
[raw] rsync_tmbackup.sh backup_target backup_location
[/raw] The backup target and the backup location can each be a local directory like [raw]/home/yunqi/work[/raw] or a remote one like [raw]yunqi@brosnan:~/work_backup[/raw].

The first time you run the backup, you will get something like
[raw] Safety check failed - the destination does not appear to be a backup folder or drive (marker file not found).
[/raw] This is expected. Just run the command that the script prints, which creates the backup folder and its marker file, and then run the backup script again.
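For a local destination, the command the script asks you to run amounts to something like the sketch below. The function name [raw]init_backup_destination[/raw] is made up here, and the marker file name is taken from the directory listing later in this post. Always prefer the exact command that the script prints, especially for remote destinations.

```shell
# Hypothetical helper: prepare a local backup destination by creating
# the folder and an empty marker file inside it.
init_backup_destination() {
    dest=$1
    mkdir -p -- "$dest"
    touch "$dest/backup.marker"
}

# Usage (the path is an assumption):
#   init_backup_destination /home/yunqi/work_backup
```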

What you’ll get

If you look into your backup folder, you’ll probably see something like this:

[shell] total 8
drwxr-xr-x 16 yunqi teoroo 4096 Oct 20 10:16 2018-11-09-152638
drwxr-xr-x 16 yunqi teoroo 4096 Oct 20 10:16 2018-11-09-161817
-rw-r--r-- 1 yunqi teoroo 0 Nov 9 15:26 backup.marker
lrwxrwxrwx 1 yunqi teoroo 17 Nov 9 16:18 latest -> 2018-11-09-161817
[/shell]

Every time you back up, a sub-folder is created, which is a snapshot of your target. You can delete any snapshot; the other snapshots will not be affected.
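To find out which snapshots still contain a file you deleted by accident, you can loop over the snapshot folders. The following is a minimal sketch for a local backup folder with the layout shown above. The function name [raw]find_in_snapshots[/raw] and the example paths are made up for illustration.

```shell
# Hypothetical helper: print every snapshot folder that contains the
# given path (relative to the backed-up directory).
find_in_snapshots() {
    backup_dir=$1
    rel_path=$2

    for snap in "$backup_dir"/*/; do
        snap=${snap%/}
        # Skip the "latest" symlink so its target is not listed twice.
        if [ -L "$snap" ]; then
            continue
        fi
        if [ -e "$snap/$rel_path" ]; then
            printf '%s\n' "$snap"
        fi
    done
    return 0
}

# Usage (paths are assumptions):
#   find_in_snapshots ~/work_backup notes/todo.txt
#   cp -p ~/work_backup/2018-11-09-152638/notes/todo.txt ~/work/notes/
```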

Tip(s)

You may want to back up regularly, so it’s probably a good idea to add an alias like this.

[shell] # Backup my work directory to Teoroo2
alias back_work='rsync_tmbackup.sh ~/work/ yunqi@teoroo2.kemi.uu.se:/home/yunqi/backups/work_at_rackham'
[/shell]

The mechanism

Hard link

rsync-time-backup uses hard links to represent the same file in different snapshots. It is therefore safe to delete one snapshot; the others are not affected. A file stays on the disk until every snapshot that includes it has been deleted.

https://en.wikipedia.org/wiki/Hard_link
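The hard-link behaviour can be tried out in a throwaway directory. This small demo does not use the backup script at all. It only mimics two snapshots (the folder names [raw]snap1[/raw] and [raw]snap2[/raw] are made up) that share one unchanged file.

```shell
# Work in a temporary directory so nothing real is touched.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/snap1" "$demo_dir/snap2"
echo "important data" > "$demo_dir/snap1/file.txt"

# Give the same data a second name, as if the file were unchanged
# between two snapshots.
ln "$demo_dir/snap1/file.txt" "$demo_dir/snap2/file.txt"

# -ef is true when both names point to the same inode,
# i.e. the data is stored only once.
if [ "$demo_dir/snap1/file.txt" -ef "$demo_dir/snap2/file.txt" ]; then
    echo "same inode: stored once"
fi

# Deleting one "snapshot" leaves the other one intact.
rm -r "$demo_dir/snap1"
cat "$demo_dir/snap2/file.txt"
```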

Drawbacks

Large files

The backup script is currently not smart enough to notice that you have moved a file, so it creates another copy of the same file whenever you move files or just rename folders. Keep an eye on your disk usage if you are constantly moving your big files (MD trajectories, charge density files) around.
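One way to spot such extra copies is to look for files in a snapshot that are not hard-linked to anything else, since those take up space of their own. The sketch below assumes a local backup folder. The function name [raw]unshared_files[/raw] is made up, and a link count of 1 only tells you that no other snapshot shares the file, not why.

```shell
# Hypothetical helper: list regular files in a snapshot whose link
# count is 1, i.e. whose data is not shared with any other snapshot.
unshared_files() {
    find "$1" -type f -links 1
}

# Usage (the path is an assumption):
#   unshared_files ~/work_backup/2018-11-09-161817
```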

Remember to specify your username

The backup script is (again) currently not smart enough to know that you can set your user name in your SSH config file. For the script to understand a remote address, you cannot omit your user name.
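If you wrap the backup command in your own script, a quick check like the one below can catch a forgotten user name before the backup starts. It is only a sketch: the function name [raw]check_backup_spec[/raw] is made up, and it simply treats anything containing a colon as a remote address.

```shell
# Hypothetical helper: succeed for local paths and for remote addresses
# of the form user@host:path, fail for host:path without a user name.
check_backup_spec() {
    case "$1" in
        *@*:*) return 0 ;;    # user@host:path
        *:*)   echo "missing user name in remote address: $1" >&2
               return 1 ;;
        *)     return 0 ;;    # no colon, so a local path
    esac
}

# Usage:
#   check_backup_spec yunqi@brosnan:~/work_backup   # succeeds
#   check_backup_spec brosnan:~/work_backup         # fails with a message
```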

SSH key agent

Setting up an SSH key agent enables you to connect to the remote machines safely without typing your password over and over.

Create SSH key

An SSH key is a pair of files that proves your identity. It consists of a private key (think of it as the key) and a public key (think of it as the keyhole). You put the public key on the SSH server and access the server with your private key. It is recommended to protect your private key with a password; otherwise anyone who copies your private key will be able to access all your remote machines.

Generate key pair on Unix system

Type [shell]ssh-keygen[/shell] and follow the instructions. By default, the generated key pair is stored in [shell]~/.ssh/[/shell].

After that, you can add your public key to [shell]~/.ssh/authorized_keys[/shell] on the remote machine. There is also a shortcut for this: [shell]ssh-copy-id yunqi@teoroo.kemi.uu.se[/shell].
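If you add the key by hand, the steps on the remote machine look roughly like the sketch below. The function name [raw]add_public_key[/raw] is made up, and it assumes you have already copied the public key file to the remote machine. The permission changes are there because sshd is usually configured to ignore an [raw]authorized_keys[/raw] file that other users can write to.

```shell
# Hypothetical helper, to be run on the remote machine: append a public
# key to authorized_keys unless the same line is already there.
add_public_key() {
    pubkey_file=$1
    ssh_dir=${2:-$HOME/.ssh}
    auth_file=$ssh_dir/authorized_keys

    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    touch "$auth_file"
    chmod 600 "$auth_file"

    # -F: fixed strings, -x: match whole lines, -f: read patterns from file
    if ! grep -qxF -f "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
}

# Usage (the file name is an assumption):
#   add_public_key ~/id_rsa.pub
```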

Windows with PuTTY

PuTTY comes with a tool called puttygen.exe to generate the key pair. The private key is stored in a .ppk file, and the public key is shown in the interface.

You can then copy the public key to [shell]~/.ssh/authorized_keys[/shell] on the remote machine.

SSH Key agent

With a password-protected (encrypted) private key, you have to enter the password whenever you use the key. [raw]ssh-agent[/raw] eases this by decrypting your private key once and keeping it in memory until you log out.

Unix with ssh-agent

[shell]eval $(ssh-agent -s)[/shell] starts the key agent, and you can then [shell]ssh-add somekey[/shell] to add your keys. If you do not specify the key file to add, it adds the default key files in [raw]~/.ssh[/raw].

You can add the two lines to [raw]~/.bashrc[/raw], but to avoid entering the password every time you open a terminal, you can instead add ssh-agent to your desktop environment startup script, or use keychain.
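If you do go the [raw]~/.bashrc[/raw] route, a small guard at least avoids starting a second agent in shells that have already inherited one. The following is a sketch with a made-up function name, [raw]start_agent_if_needed[/raw]. It only checks whether [raw]SSH_AUTH_SOCK[/raw] is set, so it does not share one agent between unrelated terminals the way keychain does.

```shell
# Hypothetical helper: start ssh-agent and load the default key(s)
# only when the current shell has not inherited an agent.
start_agent_if_needed() {
    if [ -n "${SSH_AUTH_SOCK:-}" ]; then
        return 0    # an agent socket is already advertised, reuse it
    fi
    eval "$(ssh-agent -s)" > /dev/null
    ssh-add         # asks for the password of the default key(s)
}

# Usage: put the function into ~/.bashrc and call it there:
#   start_agent_if_needed
```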

Unix with keychain

Keychain looks for an existing ssh-agent session and uses it if one exists. It “allows you to easily have one long running ssh-agent process per system, rather than the norm of one ssh-agent per login session”.

To use it, install keychain on your computer, and add [raw]eval `keychain --eval --agents ssh id_rsa`[/raw] to your [raw]~/.bash_profile[/raw] or [raw]~/.bashrc[/raw].

Windows with pageant

The ssh-agent equivalent for PuTTY is Pageant. You can open your private key files with Pageant and start your SSH sessions from there.

Agent forwarding

Once you have your SSH key, chances are that you would like to jump from server to server or transfer files between servers. SSH has a convenient feature for this: it can forward your key agent to the server you log in to.

Unix

On Unix, ssh has a “-A” option to forward your ssh-agent when you travel across the servers. For example, [shell]ssh -A yunqi@teoroo.kemi.uu.se[/shell] forwards your key agent to teoroo, and you can use your key without re-entering your password during the ssh session.

You can also specify this in your SSH config file, [raw]~/.ssh/config[/raw]:

[shell] Host TEOROO
   ForwardAgent yes
   HostName teoroo.kemi.uu.se
   Port 22
   User yunqi
[/shell]

After that, you can just type [shell]ssh TEOROO[/shell] and get your key agent forwarded automatically. Note that you should only forward the agent to trusted servers: even though your key is not stored on the remote machine, the system admin there can still use your forwarded agent while you are logged in.

Windows with PuTTY

You can also enable agent forwarding in PuTTY under [raw]Connection->SSH->Auth->Authentication parameters[/raw].