Intro
This post is a guide to backing up and restoring a Linux system, which can be achieved by several methods. Three methods are presented here; all of them are native to Linux or need only a simple package install.
Full disk/partition clone
The easiest method to perform a system backup is a full partition or disk
clone, which can be done with the dd command. The Arch Wiki has some great
examples of how to use this command; the main ones are presented below.
#!bash
dd if=/dev/in of=/dev/out bs=64k conv=noerror,sync status=progress
The example above copies all data blocks from the if argument to the of
argument. Both arguments can be a file, a partition (for example /dev/sda1),
or a full disk (for example /dev/sda). The bs field defines the block size
used for the copy; matching it to the disk cache should give the best copy
speed. The conv field tells dd to continue the copy if it encounters a read
error (argument noerror) and to keep the block offsets aligned by padding any
short or unreadable block with zeros (argument sync). Because a read error
zero-fills the rest of the affected block, the block size should be kept small
enough that a single error does not discard a large amount of data. The
status=progress argument tells dd to print how much data has been copied so
far, together with the elapsed time and the transfer rate.
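Because if and of can be ordinary files, the flags can be tried out safely before pointing dd at a real device. The following is a minimal sketch under that assumption; the temporary file names are made up for the illustration and no disk is touched.

```bash
#!/usr/bin/env bash
set -eu

# Scratch directory; removed automatically when the script exits.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

src="$workdir/source.img"   # hypothetical stand-in for /dev/in
dst="$workdir/clone.img"    # hypothetical stand-in for /dev/out

# 1 MiB of random data plays the role of the source device.
head -c 1048576 /dev/urandom > "$src"

# Same flags as in the example above, applied to regular files.
# The source length is a multiple of bs, so conv=sync adds no padding.
dd if="$src" of="$dst" bs=64k conv=noerror,sync status=progress

# cmp exits with 0 only when both files are byte-for-byte identical.
if cmp -s "$src" "$dst"; then
    clone_ok=yes
else
    clone_ok=no
fi
echo "clone identical: $clone_ok"
```

If the source length were not a multiple of bs, conv=sync would pad the last block with zeros and the copy would come out slightly larger than the source.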
An example of a full disk clone (between two drives) is as follows:
#!bash
dd if=/dev/sda of=/dev/sdb bs=64k conv=noerror,sync status=progress
The copy stops when the of target is full, so make sure the destination drive is at least as large as the source.
IMPORTANT NOTE: There is no confirmation prompt when this command is run. Double-check the input (if) and output (of) arguments so that data is not overwritten by mistake.
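Since dd asks for no confirmation, a small pre-flight check can catch a destination that is too small. This is only a sketch: the helper names size_of and fits_on are invented for this post, and reading the size of a block device with blockdev usually requires root.

```bash
#!/usr/bin/env bash

# Print the size in bytes of a block device or of a regular file.
size_of() {
    if [ -b "$1" ]; then
        blockdev --getsize64 "$1"   # block device (usually needs root)
    else
        wc -c < "$1"                # regular file, for example an image
    fi
}

# Succeed only when the destination can hold everything in the source.
fits_on() {
    local src_size dst_size
    src_size=$(size_of "$1")
    dst_size=$(size_of "$2")
    [ "$dst_size" -ge "$src_size" ]
}
```

A possible use is `fits_on /dev/sda /dev/sdb && dd if=/dev/sda of=/dev/sdb bs=64k conv=noerror,sync status=progress`, so that the clone only starts when the size check passes.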
Full disk/partition image file backup
The previous example shows how to clone a disk or partition to another drive,
but the system may not have a second drive to receive the clone. With the dd
tool it is also possible to clone to an image file, which can be saved in
another location and restored when needed. For this type of backup, in
addition to the dd tool, the gzip tool is used to compress the image. The full
command is:
#!bash
dd if=/dev/sda conv=sync,noerror bs=64K status=progress | gzip -c > /path/to/backup.img.gz
This command reads the /dev/sda drive block by block and passes the data to
the gzip tool, which creates the .img.gz file.
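A compressed image is only useful if it is still intact when it is needed. The sketch below shows one way to check that, using gzip -t and sha256sum; it works on a small stand-in file in a temporary directory (an assumption made so it can run without touching a disk), and the same two checks apply to a real backup.img.gz.

```bash
#!/usr/bin/env bash
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

image="$workdir/backup.img.gz"   # hypothetical stand-in for /path/to/backup.img.gz

# Stand-in for "dd if=/dev/sda ... | gzip -c": compress 256 KiB of zeros.
head -c 262144 /dev/zero | gzip -c > "$image"

# gzip -t decompresses the stream and checks its CRC without writing any output.
gzip -t "$image"

# Record a checksum next to the image so that later copies can be verified.
(cd "$workdir" && sha256sum backup.img.gz > backup.img.gz.sha256)

# After moving both files elsewhere, re-check the image with:
(cd "$workdir" && sha256sum -c backup.img.gz.sha256)
```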
#!bash
dd if=/dev/sda conv=sync,noerror bs=64K | gzip -c | split -a3 -b2G - /path/to/backup.img.gz
The command above is the same as the previous one, but it splits the compressed file into smaller pieces. This is useful, for example, on FAT32 file systems, where a single file cannot exceed 4 GiB. The arguments are: -a3 sets the length of the suffix appended to each piece to 3 characters; -b2G sets the maximum size of each piece to 2 GiB.
To keep a full record of the drive geometry alongside the backup, save it to a file with the command:
#!bash
fdisk -l /dev/sda > /path/to/list_fdisk.info
Restore from image file backup
When the time comes to restore an image file backup created with the steps in the previous section, this can be achieved with the command:
#!bash
gunzip -c /path/to/backup.img.gz | dd of=/dev/sda
If the image file backup was created with the split files variation, then the command should be:
#!bash
cat /path/to/backup.img.gz* | gunzip -c | dd of=/dev/sda
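The split-and-restore round trip can be rehearsed on an ordinary file before relying on it for a real disk. The sketch below assumes stand-in files in a temporary directory and uses 1 MiB pieces instead of 2 GiB so that it finishes quickly.

```bash
#!/usr/bin/env bash
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

disk="$workdir/disk.img"          # hypothetical stand-in for /dev/sda
restored="$workdir/restored.img"  # hypothetical stand-in for the restore target
prefix="$workdir/backup.img.gz"

# 3 MiB of random data compresses poorly, so several pieces are produced.
head -c 3145728 /dev/urandom > "$disk"

# Backup: the same pipeline as in the previous section, with smaller pieces.
dd if="$disk" bs=64K conv=sync,noerror status=none | gzip -c | split -a3 -b1M - "$prefix"

# split appends aaa, aab, aac, ... so the shell glob lists the pieces in order.
pieces=$(ls "$prefix"* | wc -l)

# Restore: concatenate the pieces, decompress, and write the result back.
cat "$prefix"* | gunzip -c | dd of="$restored" bs=64K status=none

cmp "$disk" "$restored" && echo "restored image matches ($pieces pieces)"
```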
Dealing with partition UUID after clone
The dd command makes a full clone of the data, which means the disks end up
identical. A Linux system usually mounts its filesystems by the UUID of their
partitions. This becomes a problem if both drives stay connected to the
system: all the cloned partitions have the same UUIDs as the originals, and
the system will mount whichever it detects first during the boot sequence. If
both disks must stay connected, it is better to reset the UUIDs of the cloned
partitions.
Running the following command will output the UUID of each partition on the system:
#!bash
blkid|grep UUID
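With many partitions it is easy to overlook a duplicate in the blkid output. The helper below is a small sketch for spotting them; the function name duplicate_uuids is made up for this post, and it simply reads one UUID per line from standard input.

```bash
#!/usr/bin/env bash

# Read one UUID per line on stdin and print those that appear more than once.
# Empty lines (devices without a filesystem UUID) are ignored.
duplicate_uuids() {
    grep -v '^[[:space:]]*$' | sort | uniq -d
}
```

One way to feed it is `lsblk -rno UUID | duplicate_uuids`, which prints nothing when every UUID on the system is unique.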
To change the UUID of a partition, it needs to be unmounted first. For ext2/ext3/ext4 filesystems, which are the ones tune2fs handles, run the commands:
#!bash
umount /dev/sdb1
tune2fs -U random /dev/sdb1
The blkid command can then be run again to check that the UUID changed. The
partition now has a new UUID, but the system will no longer boot from that
drive, because the cloned /etc/fstab still points to the previous UUID. This
is fixed by editing the /etc/fstab file on that system partition: open the
file and replace the old value with the new one (or ones, if there are
multiple partitions in use).
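Editing fstab by hand works, but the replacement can also be scripted. The following sketch uses sed and keeps a backup copy of the file; the function name replace_fstab_uuid is invented here, and it assumes the entries use the usual UUID=... form.

```bash
#!/usr/bin/env bash

# Replace every occurrence of UUID=<old> with UUID=<new> in the given fstab file.
# The original file is kept next to it with a .bak suffix.
replace_fstab_uuid() {
    local fstab_file=$1 old_uuid=$2 new_uuid=$3
    sed -i.bak "s/UUID=${old_uuid}/UUID=${new_uuid}/g" "$fstab_file"
}
```

For example, with the clone mounted at /mnt/system, `replace_fstab_uuid /mnt/system/etc/fstab "$old_uuid" "$(blkid -s UUID -o value /dev/sdb1)"` would point the entry at the new UUID, where old_uuid holds the value noted before the change.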
This may not be enough. Sometimes, after these changes, the system still
cannot boot because the bootloader is still pointing to the old UUID. This is
a little more complex to solve, but it is still possible using the chroot
tool, with the steps below:
1. Boot into a Linux system. It can be the original one that was cloned or a
live Linux distro.
2. Mount the filesystem whose UUID changed, for example:
#!bash
mkdir -p /mnt/system
mount /dev/sdb1 /mnt/system
- Be aware that if the boot folder is on a separate partition (for example on UEFI systems), it must also be mounted:
#!bash
mkdir -p /mnt/system/boot
mount /dev/sdb2 /mnt/system/boot
- Generating the bootloader configuration needs the virtual filesystems that the running system sets up at boot, which are not present in the mounted system. To make them available, bind-mount them:
#!bash
mount --bind /dev /mnt/system/dev
mount --bind /proc /mnt/system/proc
mount --bind /sys /mnt/system/sys
- Enter the mounted system with the following command:
#!bash
chroot /mnt/system/
- Now that you are inside the mounted system, regenerate the bootloader configuration with the following command:
#!bash
# On Debian-based systems, update-grub is a wrapper that runs this same command.
grub-mkconfig -o /boot/grub/grub.cfg
This should be enough to have the same system cloned to another drive and bootable from either of them. Leave the chroot, unmount all mounted drives, and reboot the system.
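The steps above can be collected into one script. The version below is a sketch with several assumptions: the clone's root partition is /dev/sdb1, there is no separate boot partition, and GRUB keeps its configuration at /boot/grub/grub.cfg. By default it only prints the commands (DRY_RUN=1), so it can be reviewed before anything is mounted.

```bash
#!/usr/bin/env bash
set -eu

# With DRY_RUN=1 (the default here) commands are printed instead of executed.
DRY_RUN=${DRY_RUN:-1}
ROOT_PART=${ROOT_PART:-/dev/sdb1}   # assumption: root partition of the clone
TARGET=${TARGET:-/mnt/system}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

repair_bootloader() {
    local fs
    run mkdir -p "$TARGET"
    run mount "$ROOT_PART" "$TARGET"
    for fs in dev proc sys; do
        run mount --bind "/$fs" "$TARGET/$fs"
    done
    # Assumption: GRUB with its configuration at /boot/grub/grub.cfg.
    run chroot "$TARGET" grub-mkconfig -o /boot/grub/grub.cfg
    # Unmount in reverse order so that nothing is left busy.
    for fs in sys proc dev; do
        run umount "$TARGET/$fs"
    done
    run umount "$TARGET"
}

repair_bootloader
```

Running it as root with DRY_RUN=0 would execute the commands for real; the path passed to grub-mkconfig may differ between distributions.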
System folder backup with rsync
The dd method is a direct one that produces a perfect clone of the partitions,
but it has some problems:
1. It requires a destination with at least as much free space as the size of
the partition being cloned.
2. The UUID problem described above.
3. It works better when the source filesystem is not mounted.
4. The operation is time-consuming, depending on the size of the drive.
5. If errors occur during the clone process, the whole image could become
impossible to restore.
As usual in Linux, there are multiple ways to solve a problem. Another backup
utility is rsync, a tool that copies data from one location to another and has
some very useful options:
1. It can copy files while keeping all file/folder permissions and attributes.
2. It only copies files that differ between source and destination, which
greatly decreases the amount of data to copy in frequent backups.
3. It allows excluding files/folders from the copy by pattern.
The basic command to copy from one location to another is:
#!bash
sudo rsync -aAXv /source/folder /destination/folder --progress
Or if using ssh to transfer files to a remote location:
#!bash
sudo rsync -aAXv /source/folder user@hostname:/destination/folder --progress
The flags mean:
- -a: archive mode. This does not create an archive file; it copies all folders and files recursively, keeping permissions, ownership, and modification times.
- -A: preserves ACLs.
- -X: preserves extended attributes.
- -v: verbose, to output what the command is doing.
- --progress: prints the progress of each file transfer, which is most useful on remote transfers.
To use ssh to transfer files to a remote location on a different port, or with any other custom configuration:
#!bash
sudo rsync -aAXv -e "ssh -p 7227" /source/folder hostname:/destination/folder --progress
With the -e flag, one can set the complete ssh connection command with the desired flags.
A great resource to explain this method can be found on this video.
Excluding folders
One useful option of the rsync command is file/folder exclusion. This is
especially useful when backing up the whole system, because some files/folders
should not be copied, for example those created at system boot. The most
common use of this command to create a full system backup is shown below. The
--dry-run flag makes rsync only report what it would do; remove it to perform
the actual copy.
#!bash
# The trailing slash on the source anchors the /dev/*, /proc/*, ... patterns to it.
rsync -aAXvp --delete --dry-run --exclude='/dev/*' --exclude='/proc/*' \
  --exclude='/sys/*' --exclude='/tmp/*' --exclude='/run/*' --exclude='/mnt/*' \
  --exclude='/media/*' --exclude="swapfile" --exclude="lost+found" \
  --exclude=".cache" /source/ /dest
Or using a file that lists the files/folders to exclude:
#!bash
rsync -aAXvp --delete --dry-run --exclude-from exclude_options \
  /source /dest
where the exclude_options file contains:
dev/*
proc/*
sys/*
tmp/*
run/*
mnt/*
media/*
swapfile
lost+found
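The exclude file can also be generated from a script, which keeps the list under version control together with the backup command. This is a small sketch; the function name write_exclude_file is made up for this post, and the patterns are the ones listed above.

```bash
#!/usr/bin/env bash

# Write the exclude patterns listed above to the file given as first argument.
write_exclude_file() {
    # The quoted 'EOF' keeps the shell from expanding anything inside the list.
    cat > "$1" <<'EOF'
dev/*
proc/*
sys/*
tmp/*
run/*
mnt/*
media/*
swapfile
lost+found
EOF
}
```

Calling `write_exclude_file exclude_options` creates the file used by the --exclude-from example in the current directory.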
This option is useful because it allows a complete system backup to any
destination, including a remote computer. The big advantage of this method
compared with dd is that it only copies the files/folders that are needed,
so each backup takes little space. It also speeds up the backup: if it is
always made to the same location, rsync only copies the files that changed
since the previous run.
However, this command has a limitation. Both source and destination must be accessed as root, otherwise file ownership is not preserved. For a backup to a remote location, this means logging in as root on both machines, which can be a security issue. Another drawback is that the backup is a plain folder tree with the same structure as the source, not a single archive that can easily be copied from one place to another.
System folder backup with tar
Tar could very well be the oldest method of creating system backups. It
creates tar archives, a format that dates from the time when tapes were used
as storage. In some ways it works like the rsync method, and it can be used
with the following command:
#!bash
tar --exclude-from=exclude_file --acls --xattrs -czpvf backupfile.tar.gz /
A great resource to explain this method can be found on this video.
There is not much to know about the flags used. The --acls and --xattrs flags
keep the ACLs and extended attributes of all files and folders, and the -czpvf
group contains:
- c: to create the archive.
- z: to compress the archive file using the gzip algorithm.
- p: to preserve permissions of all files and folders.
- v: to make the process verbose and list all the operations.
- f: to specify the name of the archive file to create.
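To see how the exclude file and these flags interact, the command can be tried on a throwaway directory tree first. The sketch below builds a hypothetical miniature system in a temporary directory; --acls and --xattrs are left out so that it runs with any tar build, and -v is dropped to keep the output short.

```bash
#!/usr/bin/env bash
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Hypothetical miniature "system": one file to keep, one to leave out.
mkdir -p "$workdir/root/etc" "$workdir/root/tmp"
echo "hostname=demo" > "$workdir/root/etc/settings.conf"
echo "scratch data"  > "$workdir/root/tmp/junk"

# Exclude patterns, one per line, in the same style as the exclude file.
printf '%s\n' 'tmp/*' > "$workdir/exclude_file"

# -C switches into the tree first, so the archive stores relative paths.
tar --exclude-from="$workdir/exclude_file" -czpf "$workdir/backupfile.tar.gz" \
    -C "$workdir/root" .

# -t lists the archive contents without extracting anything.
listing=$(tar -tzf "$workdir/backupfile.tar.gz")
echo "$listing"
```

The listing should contain etc/settings.conf but not tmp/junk, which confirms that the exclude patterns took effect before a full backup is attempted.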
System restore from tar archive
The command to restore from a tar file is the same one used to create it, with the create flag (c) replaced by the extract flag (x). Note that the exclude arguments are not needed here, because the excluded files/folders were never added to the archive.
#!bash
tar --acls --xattrs -xzpvf backupfile.tar.gz -C /
The -C flag tells the tar command to change to the given directory before extraction.
If the restore is made to another filesystem that was prepared beforehand, be aware that it will be necessary to update fstab and restore the bootloader. Refer to the steps in the dd section of this guide.
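Before extracting over /, it can be worth rehearsing the restore into a staging directory. The following sketch assumes a small made-up tree in a temporary directory and compares the extracted result with the original; --acls and --xattrs are omitted so that it runs with any tar build.

```bash
#!/usr/bin/env bash
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Hypothetical tree to back up, plus an empty staging directory for the restore.
mkdir -p "$workdir/root/etc" "$workdir/staging"
echo "hostname=demo" > "$workdir/root/etc/settings.conf"

# Create the archive with paths relative to the tree.
tar -czpf "$workdir/backupfile.tar.gz" -C "$workdir/root" .

# Extract into the staging directory instead of /.
tar -xzpf "$workdir/backupfile.tar.gz" -C "$workdir/staging"

# diff -r exits with 0 when both trees have identical contents.
diff -r "$workdir/root" "$workdir/staging" && echo "restore matches the original tree"
```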