Setting Up RAID using mdadm on Existing Drive

After experiencing a hard-disk failure (luckily no important stuff loss, just some backups), I’ve decided to setup a RAID1 array on my existing Ubuntu 12.04 installation. The important thing was to migrate my existing data to the new RAID array while retaining all the data. The easy solution would have been to setup the array on two new drives and then copy my data over. However, I did not have a spare drive (apart from the new one) to copy my data over while creating the RAID array, so I had to take the trickier way.

I mainly followed François Marier’s excellent tutorial. As I went through it I realized I had to adjust a few things either to make it work on Ubuntu 12.04 or because I preferred another way to do stuff.

I’ve check the steps below using Ubuntu 12.04 on both a physical and a virtual machine (albeit in the dumb order – first I risked my data and then decided to prefect the process on a VM :-)). I think the same steps should apply to other Debian derivatives and more recent Ubuntu versions as well.

Outline

Before diving into action, I want to outline the whole process. In the first step we will create a degraded RAID1 array, which means a RAID1 array with one of the drives missing, using only the new drive. Next we will config the system to be able to boot from the new degraded RAID1 array and copy the data from the old drive to the RAID1 array on the new drive. Afterwards, we will reboot the system using the degraded array and add the old drive to the array, thus making it no longer degraded. At this point, we will update again some configurations to make things permanent and finally we will test the setup.

Make sure you got backups of your important stuff before proceeding. Most likely you won’t need them, like I didn’t, but just in case.

Partitioning the Drive

For the rest of the tutorial, I’ll assume the old disk, the one with existing data, is /dev/sda and the new one is /dev/sdb/. I’ll also assume /dev/sda1 is the root partition and /dev/sda2 is the swap partition. If you have more partitions or your layout is different, just make sure you adjust the instructions accordingly.

The first step is to create partitions on the new disk that match the size of the partitions we would like to mirror on the old disk. This can be done using fdisk, parted or using GUI tools such as Ubuntu’s Disk utility or gparted.

If both disks are the same size and you want to mirror all the partitions, the easiest way to do so is to copy the partition table using sfdisk:

# sfdisk -d /dev/sda > partition_table
# sfdisk /dev/sdb < partition_table

This will only work if your partition table is MBR (as sfdisk doesn’t understand GPT). Before running the second command take a look at partition_table to make sure everything seems normal. If your using GPT drives with more than 2TB, see Asif’s comment regarding sgdisk.

You don’t need to bother setting the “raid” flag on your partitions like some people suggest. mdadm will scan all of your partitions regardless of that flag. Likewise, the “boot” flag isn’t needed on any of the partitions.

Creating the RAID Array

If you haven’t installed mdadm so far, do it:

# apt-get install mdadm

We create a degraded RAID1 array with the new drive. Usually a degraded RAID array is a result of malfunction, but we do it intentionally. We do so, because it allows us to have an operational RAID array which we can copy our data into and then add the old drive to the array and sync it.

# mdadm --create root --level=1 --raid-devices=2 missing /dev/sdb1  
# mdadm --create swap --level=1 --raid-devices=2 missing /dev/sdb2

These commands instructs mdadm to create a RAID1 array with two drives where one of the drives is missing. A separate array is created for the root and swap partitions. As you can see, I decided to put have my swap on RAID as well. There are different opinions on the matter. The main advantage is that your system will be able to survive one of the disk failing while the system is running. The disadvantage is that it wastes space. Performance wise, RAID isn’t better as might be expected, as Linux supports stripping (like RAID0) if it has swap partitions on two disks. In my case, I have plenty of RAM available and swap space is mainly unused, so I guessed I’m better of using RAID1 for the swap as well.

You may encounter the following warning when creating the arrays:

mdadm: Note: this array has metadata at the start and
    may not be suitable as a boot device.  If you plan to
    store '/boot' on this device please ensure that
    your boot-loader understands md/v1.x metadata, or use
    --metadata=0.90
Continue creating array?

Grub 1.99, which is the default bootloader in recent Ubuntu distributions supports booting from partitions with the 1.2 format metadata, so it’s safe to type “y” here.

Next, we need to create a filesystems on the newly created RAID arrays:

# mkfs.ext4 /dev/md/root
# mkswap /dev/md/swap

The following will record your newly created MD arrays in mdadm.conf:

# /usr/share/mdadm/mkconf > /etc/mdadm/mdadm.conf

Preparing to Boot the Array

In this step we shall prepare the system to boot the newly created boot array. Of course we won’t actully do that before copying our data into it.

Start by editing /etc/grub.d/40_custom and adding a new entry to boot the raid array. The easiest way is to copy the latest boot stanza from /boot/grub/grub.cfg and modify it. The boot stanza looks something like this:

menuentry 'Ubuntu, with Linux 3.2.0-56-generic' --class ubuntu --class gnu-linux --class gnu --class os {
        recordfail
        gfxmode $linux_gfx_mode
        insmod gzio
        insmod part_msdos
        insmod ext2
        set root='(hd0,msdos1)'
        search --no-floppy --fs-uuid --set=root 19939b0e-4272-40e0-846b-8bbe49e4a02c
        linux   /boot/vmlinuz-3.2.0-56-generic root=UUID=19939b0e-4272-40e0-846b-8bbe49e4a02c ro   quiet splash $vt_handoff
        initrd  /boot/initrd.img-3.2.0-56-generic
}

First we need to add

insmod raid
insmod mdraid1x

just after the rest of the insmod lines. This will load the necessary GRUB modules to detect your raid array during the bootprocess. If you decided to go for 0.9 metadata earlier (despite my recommendation…) you will need to load mdraid09 instead of mdraid1x. Next we need to modify the root partition. This is done my modifying the UUID (those random looking hex-and-hyphens strings) arguments to the lines starting with search and linux. To find out the UUID for your root partition run

# blkid /dev/md/root

Which will give something like

/dev/md/root: UUID="49b6f295-2fe3-48bb-bfb5-27171e015497" TYPE="ext4"

The set root line can be removed as the search line overrides it.

Last but not least add bootdegraded=true to the kernel parameters, which will allow you to boot the degraded array without any hassles. The result should look something like this:

menuentry 'Ubuntu, with Linux 3.2.0-56-generic (Raid)' --class ubuntu --class gnu-linux --class gnu --class os {
        recordfail
        gfxmode $linux_gfx_mode
        insmod gzio
        insmod part_msdos
        insmod ext2
    insmod raid
    insmod mdraid1x
        search --no-floppy --fs-uuid --set=root e9a36848-756c-414c-a20f-2053a17aba0f
        linux   /boot/vmlinuz-3.2.0-56-generic root=UUID=e9a36848-756c-414c-a20f-2053a17aba0f ro   quiet splash bootdegraded=true $vt_handoff
        initrd  /boot/initrd.img-3.2.0-56-generic
}

Now run update-grub as root so it actually updates the /boot/grub/grub.cfg file. Afterwards, run

# update-initramfs -u -k all

This will make sure that the updated mdadm.conf is put into the initramfs. If you don’t do so the names of your new RAID arrays will be a mess after reboot.

Copying the Data

Before booting the new (degraded) array, we need to copy our data into it. First mount /dev/md/root somewhere, say /mnt/root, and then copy the old data into it.

# rsync -auxHAX --exclude=/proc/* --exclude=/sys/* --exclude=/tmp/* / /mnt/root

Next you need to update /mnt/root/etc/fstab with the UUIDs of the new partition (which you can get using blkid). If you have encrypted swap, you should also update /mnt/root/etc/crypttab.

Last this before the reboot is to re-install the bootloader on both drives:

# grub-install /dev/sda
# grub-install /dev/sdb

Reboot the computer. Hold the “Shift” key while booting to force the Grub menu to appear. Select the new Grub menu-entry you have just added (should be last on the list). After the system finished booting up, verify that you’re indeed running from the RAID device by running mount, which should show a line like this:

/dev/md127 on / type ext4 (rw,errors=remount-ro)

The number after /dev/md doesn’t matter, as long as it’s /dev/md and not /dev/sda or other real disk device.

Completing the RAID Array

If you have made it that far, you have a running system with all your data on a degraded RAID array which consists of your new drive. The next step will be to add the old disk to the RAID array. This will delete any existing data on it. So take a few minutes to make sure that you’re not missing any files (this should be fine as we rsync‘ed the data). Adding the old disk back to the RAID array is done by:

# mdadm /dev/md/root -a /dev/sda1
# mdadm /dev/md/swap -a /dev/sda2

Make sure you are adding the right partitions to the right arrays. These commands instruct mdadm to add the old disk to the new arrays. It might take some time to complete syncing the drives. You can track the progress of building the RAID array using:

$ watch cat /proc/mdstat

When it’s done, it means that your RAID arrays are up and running and are no longer degraded.

Remove the boot stanza we’ve added to /etc/grub.d/40_custom and edit /etc/default/grub to add bootdegraded=true to the GRUB_CMDLINE_LINUX_DEFAULT configuration variable. This will cause your system to boot up even if the RAID array gets degraded, which prevent the bug outlined in Ubuntu Freezes When Booting with Degraded Raid.

Finally update Grub and re-install it:

# update-grub
# grub-install /dev/sda
# grub-install /dev/sdb

We are done! Your RAID array should be up and running.

Testing the Setup

Just getting the RAID array to work is good but not enough. As you probably wanted the RAID array as contingency plan, you probably want to test it to make sure it works as intended.

We make sure that the system is able to work in case on of the drives fails. Shut down the system and disconnect one of the drives, say sda. The system should boot fine due to the RAID array, but cat /proc/mdstat should show one of the drives missing.

To restore normal operation, shutdown the system and reconnect the drive before booting it back up. Now re-add the drive to the RAID arrays.

mdadm /dev/md/root -a /dev/sda1
mdadm /dev/md/swap -a /dev/sda2

Again this might take some time. You can view the progress using watch cat /proc/mdstat.

Ubuntu Freezes When Booting with Degraded Raid

I tried testing my software raid (mdadm) setup by removing one of the disks. When I tried to boot the degraded system, the system hanged displaying a purple screen. If I try booting the system in recovery mode, I get the following error:

** WARNING: There appears to be one or more degraded RAID devices ** The system my have suffered a hardware fault, such as a disk drive failure. The root device may depend on the RAID devices being online. Do you wish to start the degraded RAID? [y/N]:
** WARNING: There appears to be one or more degraded RAID devices **
The system my have suffered a hardware fault, such as a disk drive failure. The root device may depend on the RAID devices being online.
Do you wish to start the degraded RAID? [y/N]:
Continue reading Ubuntu Freezes When Booting with Degraded Raid