Luis Pereira

Intro

This post is a guide to backing up and restoring a Linux system, which can be achieved by several methods. Three methods are presented here, all of them native to Linux or requiring only a simple package install.

Full disk/partition clone

The easiest way to perform a system backup is a full partition or disk clone, which can be done with the dd command. The Arch Wiki has some great examples of how to use this command; the main ones are presented below.

#!bash
dd if=/dev/in of=/dev/out bs=64k conv=noerror,sync status=progress

The example above copies all data blocks from the if argument to the of argument. Either argument can be a file, a partition (for example /dev/sda1), or a full disk (for example /dev/sda). The bs field defines the block size used for the copy; the default of 512 bytes is slow on modern drives, so a larger value such as 64k usually gives a better copy speed. The conv field tells dd to continue the copy if it encounters a read error (noerror) and to keep the block offsets aligned by padding any unreadable block with zeros (sync). Because a read error zeroes the whole block, a smaller block size limits how much data is lost when an error occurs. The status=progress argument tells dd to print periodic transfer statistics: the amount of data copied, the elapsed time, and the transfer rate.
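
To try these options without touching a real disk, the following minimal sketch runs the same kind of copy on scratch files. The file names and the 1 MiB size are assumptions chosen only for illustration.

```shell
#!/bin/sh
# Sketch: exercise the dd options above on scratch files instead of real devices.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Create a 1 MiB "source disk" filled with random data (16 blocks of 64 KiB).
head -c 1048576 /dev/urandom > "$workdir/in.img"

# Same options as above; status=progress is left out because the copy is tiny.
dd if="$workdir/in.img" of="$workdir/out.img" bs=64k conv=noerror,sync 2>/dev/null

# cmp -s exits with 0 only when both files are byte-for-byte identical.
if cmp -s "$workdir/in.img" "$workdir/out.img"; then
    clone_result="identical"
else
    clone_result="different"
fi
echo "clone is $clone_result"
```

The source size is a multiple of the block size on purpose: with conv=sync, dd pads a final partial block with zeros, so a source of any other size would produce a slightly larger copy.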

An example of a full disk clone (between two drives) is as follows:

#!bash
dd if=/dev/sda of=/dev/sdb bs=64k conv=noerror,sync status=progress

The copy will stop when the of device is full, so make sure the destination drive is at least as large as the source.
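
The size requirement can be checked before cloning. The sketch below is one possible way to do it; the helper names size_of and fits are hypothetical, and the demonstration uses scratch files standing in for the two drives.

```shell
#!/bin/sh
# Sketch: refuse to clone when the destination is smaller than the source.
set -eu

# size_of PATH: print the size in bytes of a block device or a regular file.
size_of() {
    if [ -b "$1" ]; then
        blockdev --getsize64 "$1"    # block device such as /dev/sda (needs root)
    else
        wc -c < "$1" | tr -d ' '     # regular file such as a disk image
    fi
}

# fits SRC DST: succeed only if DST can hold every byte of SRC.
fits() {
    [ "$(size_of "$2")" -ge "$(size_of "$1")" ]
}

# Demonstration on scratch files of 4 KiB and 8 KiB.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
head -c 4096 /dev/zero > "$workdir/small.img"
head -c 8192 /dev/zero > "$workdir/large.img"

if fits "$workdir/small.img" "$workdir/large.img"; then
    small_to_large="ok"
else
    small_to_large="refused"
fi
if fits "$workdir/large.img" "$workdir/small.img"; then
    large_to_small="ok"
else
    large_to_small="refused"
fi
echo "small -> large: $small_to_large"
echo "large -> small: $large_to_small"
```

On a real system the same check would be called as fits /dev/sda /dev/sdb before running dd.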

IMPORTANT NOTE: dd does not ask for confirmation before it starts writing. Double-check the input (if) and output (of) arguments, because swapping them overwrites the data you meant to back up.

Full disk/partition image file backup

The previous example shows how to clone a disk or partition to another drive, but the system may not have a second drive available for the clone. With dd it is also possible to clone to an image file, which can be saved in another location and restored when needed. For this type of backup, in addition to dd, the gzip tool is used to compress the image. The full command is:

#!bash
dd if=/dev/sda conv=sync,noerror bs=64K status=progress | gzip -c  > /path/to/backup.img.gz

This command reads blocks from the /dev/sda drive and pipes them to gzip, which writes the compressed .img.gz file.

#!bash
dd if=/dev/sda conv=sync,noerror bs=64K | gzip -c | split -a3 -b2G - /path/to/backup.img.gz

The above command is the same as the previous one, but it splits the compressed file into smaller pieces. This is useful, for example, with FAT32 file systems, which limit files to 4 GiB. The arguments are: -a3 sets the suffix length of the piece names to 3 characters; -b2G sets the maximum size of each piece to 2 GiB.
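
To see how the pieces are named and how they fit back together, here is a small sketch of the same pipeline on a scratch file. The 64 KiB "disk" and the 16 KiB piece size are assumptions that keep the example fast.

```shell
#!/bin/sh
# Sketch: split a compressed image into pieces and check they reassemble.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Scratch "disk": 64 KiB of random data, which gzip cannot shrink.
head -c 65536 /dev/urandom > "$workdir/disk.img"

# Same pipeline as above, with 16 KiB pieces instead of 2 GiB ones.
dd if="$workdir/disk.img" bs=64k conv=sync,noerror 2>/dev/null \
    | gzip -c \
    | split -a3 -b16384 - "$workdir/backup.img.gz"

# The pieces are named backup.img.gzaaa, backup.img.gzaab, and so on.
ls "$workdir"
piece_count=$(ls "$workdir"/backup.img.gz* | wc -l | tr -d ' ')

# The shell expands the glob in alphabetical order, so cat rebuilds the stream.
if cat "$workdir"/backup.img.gz* | gunzip -c | cmp -s - "$workdir/disk.img"; then
    split_result="identical"
else
    split_result="different"
fi
echo "$piece_count pieces, reassembled image is $split_result"
```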

To keep a record of the partition layout of the backed-up drive, save it to a file with the command:

#!bash
fdisk -l /dev/sda > /path/to/list_fdisk.info

Restore from image file backup

When the time comes to restore an image file backup created by the steps in the previous section, this can be achieved with the command:

#!bash
gunzip -c /path/to/backup.img.gz | dd of=/dev/sda

If the image file backup was created with the split files variation, then the command should be:

#!bash
cat /path/to/backup.img.gz* | gunzip -c | dd of=/dev/sda
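
Before relying on a backup, it is worth checking that it restores correctly. The following sketch runs a full backup and restore cycle on scratch files; the file names are hypothetical, and gzip -t is used as an extra integrity check on the compressed image.

```shell
#!/bin/sh
# Sketch: back up a scratch "disk" to a compressed image, restore it, and compare.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# 128 KiB of random data stands in for the drive.
head -c 131072 /dev/urandom > "$workdir/disk.img"

# Backup: the same pipeline used to create the image file.
dd if="$workdir/disk.img" bs=64k conv=sync,noerror 2>/dev/null \
    | gzip -c > "$workdir/backup.img.gz"

# gzip -t tests the compressed file without writing any output.
gzip -t "$workdir/backup.img.gz"

# Restore: decompress and write the blocks to a second scratch file.
gunzip -c "$workdir/backup.img.gz" | dd of="$workdir/restored.img" bs=64k 2>/dev/null

if cmp -s "$workdir/disk.img" "$workdir/restored.img"; then
    restore_result="identical"
else
    restore_result="different"
fi
echo "restored image is $restore_result"
```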

Dealing with partition UUID after clone

The dd commands above make a full clone of the data, which means the disks will be identical, UUIDs included. Linux usually mounts its filesystems by UUID. This is a problem if both drives stay connected to the system: every cloned partition has the same UUID as its original, and the system will mount whichever one it detects first during boot. If both disks must stay connected, it is better to reset the UUIDs of the cloned partitions.

Running the following command outputs the UUIDs of each partition on the system:

#!bash
blkid|grep UUID
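
With many partitions, duplicated UUIDs are easy to miss by eye. The sketch below filters blkid-style output for UUIDs that appear more than once. The function name find_duplicate_uuids and the sample lines are hypothetical; the sample only imitates the usual output format.

```shell
#!/bin/sh
# Sketch: list filesystem UUIDs that occur more than once in blkid-style output.
set -eu

# Reads lines on stdin, extracts the UUID="..." value, prints the repeated ones.
# The leading space in the pattern keeps PARTUUID="..." from matching.
find_duplicate_uuids() {
    sed -n 's/.* UUID="\([^"]*\)".*/\1/p' | sort | uniq -d
}

# Hypothetical sample standing in for real blkid output after a clone.
sample='/dev/sda1: UUID="1111aaaa-0000-4000-8000-000000000001" TYPE="ext4"
/dev/sda2: UUID="2222bbbb-0000-4000-8000-000000000002" TYPE="swap"
/dev/sdb1: UUID="1111aaaa-0000-4000-8000-000000000001" TYPE="ext4"'

duplicates=$(printf '%s\n' "$sample" | find_duplicate_uuids)
echo "duplicated UUIDs: $duplicates"
```

On a real system the function would be fed directly, as in blkid | find_duplicate_uuids.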

To change the UUID of a partition, it needs to be unmounted first, so run the following commands (tune2fs works on ext2, ext3, and ext4 filesystems):

#!bash
umount /dev/sdb1
tune2fs -U random /dev/sdb1

Then run blkid again to check that the UUID changed. The partition now has a new UUID, but the system will not boot from that drive, because its cloned /etc/fstab still points to the previous UUID. To fix this, open the /etc/fstab file on that system partition and change the value to the new one (or ones, if several partitions are in use).
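
The fstab change can also be scripted. This is a minimal sketch that works on a scratch copy of an fstab file; the helper name replace_uuid, the UUID values, and the file contents are all hypothetical.

```shell
#!/bin/sh
# Sketch: swap an old UUID for a new one in an fstab-style file.
set -eu

# replace_uuid FILE OLD NEW: rewrite every occurrence of OLD with NEW in FILE.
replace_uuid() {
    tmp=$(mktemp)
    sed "s/$2/$3/g" "$1" > "$tmp"
    cat "$tmp" > "$1"    # write back in place so the file keeps its owner and mode
    rm -f "$tmp"
}

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Hypothetical fstab of the cloned system.
cat > "$workdir/fstab" <<'EOF'
UUID=1111aaaa-0000-4000-8000-000000000001  /     ext4  defaults  0 1
UUID=2222bbbb-0000-4000-8000-000000000002  none  swap  sw        0 0
EOF

old_uuid="1111aaaa-0000-4000-8000-000000000001"
new_uuid="3333cccc-0000-4000-8000-000000000003"
replace_uuid "$workdir/fstab" "$old_uuid" "$new_uuid"

new_count=$(grep -c "$new_uuid" "$workdir/fstab" || true)
old_count=$(grep -c "$old_uuid" "$workdir/fstab" || true)
cat "$workdir/fstab"
```

On the real system the file would be the /etc/fstab of the cloned partition, and the new value would come from blkid. Keeping a copy of the original file first is a sensible precaution.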

This may not be enough: sometimes, after these changes, the system still does not boot because the bootloader is still pointing to the old UUID. This is a little more complex to solve, but it is possible with the chroot tool, using the steps below:
1. Boot into a Linux system. It can be the original one that was cloned or a live Linux distro.
2. Mount the filesystem whose UUID changed, for example:

#!bash
mkdir -p /mnt/system
mount /dev/sdb1 /mnt/system

Be aware that if the boot folder is on a separate partition (for example on UEFI systems), it should also be mounted:

#!bash
mkdir -p /mnt/system/boot
mount /dev/sdb2 /mnt/system/boot

Regenerating the bootloader configuration also needs the kernel's virtual filesystems (/dev, /proc, and /sys), which exist only on the running system and not on the mounted one. Bind-mount them into the mounted system and then chroot into it:

#!bash
mount --bind /dev /mnt/system/dev
mount --bind /proc /mnt/system/proc
mount --bind /sys /mnt/system/sys

#!bash
chroot /mnt/system/

Now that you are inside the mounted system, regenerate the bootloader configuration with the following:

#!bash
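# Without -o, grub-mkconfig only prints the configuration to stdout.
# The output path below is the common default and may differ per distro.
grub-mkconfig -o /boot/grub/grub.cfg
# On Debian/Ubuntu, update-grub is a wrapper that runs the command above.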

This should be enough to have the same system cloned to another drive and bootable from either of them. Exit the chroot, unmount all mounted drives, and reboot the system.

System folder backup with rsync

The dd method is a direct one that produces a perfect clone of the partitions, but it has some problems:
1. It requires a destination with at least as much free space as the size of the partition being cloned.
2. The UUID problem described above.
3. It works best when the source filesystem is not mounted.
4. It is time-consuming, depending on the size of the drive.
5. If errors occur during the clone process, the whole disk image could become impossible to restore.

As usual in Linux, there are multiple ways to solve a problem. Another backup utility is rsync, a tool that copies data from one location to another and has some very useful options:
1. It can copy files while keeping all file and folder permissions and attributes.
2. It only copies files that differ from those at the destination, which greatly decreases the amount of data to copy in frequent backups.
3. It allows excluding files and folders from the copy by pattern.

The basic command to copy from one location to another is:

#!bash
sudo rsync -aAXv /source/folder /destination/folder --progress

Or if using ssh to transfer files to a remote location:

#!bash
sudo rsync -aAXv /source/folder user@hostname:/destination/folder --progress

The flags mean:
- -a: archive mode. This does not create an archive file; it copies all folders and files recursively, keeping permissions, ownership, and modification times.
- -AX: two flags that preserve ACLs (-A) and extended attributes (-X).
- -v: verbose, prints what rsync is doing.
- --progress: prints the progress of each file transfer, most useful on remote transfers.

To use ssh with a different port, or any other custom configuration, when transferring files to a remote location:

#!bash
sudo rsync -aAXv -e "ssh -p 7227" /source/folder hostname:/destination/folder --progress

With the -e flag, one can set the complete ssh connection command with the desired flags.

A great resource explaining this method can be found in this video.

Excluding folders

One useful option of the rsync command is folder/file exclusion. This is especially useful when backing up the whole system, because some files and folders should not be copied, for example those created at system boot. The most common use of this command to create a full system backup is the following:

#!bash
# --dry-run only simulates the transfer; remove it to perform the real backup
rsync -aAXvp --delete --dry-run --exclude="/dev/*" --exclude="/proc/*" \
    --exclude="/sys/*" --exclude="/tmp/*" --exclude="/run/*" --exclude="/mnt/*" \
    --exclude="/media/*" --exclude="swapfile" --exclude="lost+found" \
    --exclude=".cache" /source /dest

Or using a file with the folders/file to exclude:

#!bash
rsync -aAXvp --delete --dry-run --exclude-from exclude_options \
    /source /dest

The contents of the exclude_options file are:

/dev/*
/proc/*
/sys/*
/tmp/*
/run/*
/mnt/*
/media/*
swapfile
lost+found

This option is useful because it allows a complete system backup to anywhere, including a remote computer. The big advantage of this method over dd is that it copies only the folders and files that are needed, so each backup takes less space. It also speeds up the backup: if the backup always goes to the same location, rsync copies only the files that changed since the previous run.

However, this command has a limitation. Both source and destination need to be accessed as root, or else file ownership will not be preserved. This means that if the backup is made to a remote location, both ends need to run as root, which could be a security issue. Another drawback is that the files and folders end up in a directory tree with the same structure, not in a single archive that can easily be copied from one place to another.

System folder backup with tar

Tar could very well be the oldest method of creating system backups. It creates tar archives, which were originally used at a time when tapes were the storage medium. It works, in some ways, similarly to the rsync method, and can be used with the following command:

#!bash
tar --exclude-from=exclude_file --acls --xattrs -czpvf backupfile.tar.gz /

A great resource to explain this method can be found on this video.

There is not much to know about the flags used. The --acls and --xattrs flags keep the ACLs and extended attributes of all files and folders, and the group -czpvf contains:
- c: create the archive.
- z: compress the archive file using the gzip algorithm.
- p: preserve the permissions of all files and folders.
- v: make the process verbose and list all the operations.
- f: specify the name of the archive file.
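
To check what actually ends up in an archive, the t flag lists its contents without extracting. The sketch below builds a small archive from a scratch directory and inspects it. The directory layout and file names are hypothetical, and --acls and --xattrs are omitted because a scratch filesystem may not support them.

```shell
#!/bin/sh
# Sketch: create a tar archive with an exclusion file and list what it contains.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

mkdir -p "$workdir/root/etc" "$workdir/root/tmp"
echo "config"  > "$workdir/root/etc/app.conf"
echo "scratch" > "$workdir/root/tmp/junk"
printf '%s\n' 'tmp/*' > "$workdir/exclude_file"

# -C switches to the scratch root first, so "." plays the role of "/" here.
# The archive is written outside that directory, so it cannot include itself.
tar --exclude-from="$workdir/exclude_file" -czpf "$workdir/backupfile.tar.gz" \
    -C "$workdir/root" .

# t lists the archive members instead of extracting them.
listing=$(tar -tzf "$workdir/backupfile.tar.gz")
echo "$listing"

case "$listing" in *etc/app.conf*) has_conf="yes" ;; *) has_conf="no" ;; esac
case "$listing" in *tmp/junk*)     has_junk="yes" ;; *) has_junk="no" ;; esac
```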

System restore from tar archive

The command to restore from a tar file is the same one used to create it, with the create flag (c) replaced by the extract flag (x). The exclude arguments are not needed, because this is an extraction.

#!bash
tar --acls --xattrs -xzpvf backupfile.tar.gz -C /

The -C flag tells the tar command to change directory before extraction.
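
A restore can be rehearsed by extracting into an empty directory instead of /. The following sketch does a create and extract cycle on scratch directories and compares the result; all names are hypothetical, and --acls and --xattrs are omitted for the same portability reason as before.

```shell
#!/bin/sh
# Sketch: extract an archive into a scratch directory with -C and compare the trees.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

mkdir -p "$workdir/root/etc" "$workdir/restored"
echo "config" > "$workdir/root/etc/app.conf"
chmod 640 "$workdir/root/etc/app.conf"

# Create the archive from the scratch root.
tar -czpf "$workdir/backupfile.tar.gz" -C "$workdir/root" .

# -C makes tar change into the target directory before extracting.
tar -xzpf "$workdir/backupfile.tar.gz" -C "$workdir/restored"

# diff -r compares the contents of both trees recursively.
if diff -r "$workdir/root" "$workdir/restored" >/dev/null; then
    restore_result="identical"
else
    restore_result="different"
fi
restored_mode=$(ls -l "$workdir/restored/etc/app.conf" | cut -c1-10)
echo "restored tree is $restore_result, mode $restored_mode"
```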

If the restore is made to another filesystem that was prepared beforehand, be aware that it will be necessary to update fstab and restore the bootloader. Refer to the steps in the dd section of this guide.

System Backup