Friday, September 25, 2009

How To Set Up Software RAID1 On A Running System (Incl. GRUB Configuration) (CentOS 5.3)

This guide explains how to set up software RAID1 on an already running CentOS 5.3 system. The GRUB bootloader will be configured in such a way that the system will still be able to boot if one of the hard drives fails (no matter which one).

I do not guarantee that this will work for you!

1 Preliminary Note
In this tutorial I'm using a CentOS 5.3 system with two hard drives, /dev/sda and /dev/sdb which are identical in size. /dev/sdb is currently unused, and /dev/sda has the following partitions:
  • /dev/sda1: /boot partition, ext3;
  • /dev/sda2: swap;
  • /dev/sda3: / partition, ext3
In the end I want to have the following situation:
  • /dev/md0 (made up of /dev/sda1 and /dev/sdb1): /boot partition, ext3;
  • /dev/md1 (made up of /dev/sda2 and /dev/sdb2): swap;
  • /dev/md2 (made up of /dev/sda3 and /dev/sdb3): / partition, ext3
This is the current situation:

[root@server1 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3             9.1G  1.1G  7.6G  12% /
/dev/sda1             190M   12M  169M   7% /boot
tmpfs                 252M     0  252M   0% /dev/shm


[root@server1 ~]# fdisk -l

Disk /dev/sda: 10.7 GB, 10737418240 bytes
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          25      200781   83  Linux
/dev/sda2              26          90      522112+  82  Linux swap / Solaris
/dev/sda3              91        1305     9759487+  83  Linux

Disk /dev/sdb: 10.7 GB, 10737418240 bytes
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdb doesn't contain a valid partition table
[root@server1 ~]#

2 Installing mdadm

The most important tool for setting up RAID is mdadm. Let's install it like this:

# yum install mkinitrd mdadm

Afterwards, we load a few kernel modules (to avoid a reboot):

# modprobe linear
# modprobe multipath
# modprobe raid0
# modprobe raid1
# modprobe raid5
# modprobe raid6
# modprobe raid10

Now run
# cat /proc/mdstat

The output should look as follows:
[root@server1 ~]# cat /proc/mdstat


Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
unused devices: <none>


[root@server1 ~]#
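
If you prefer to script this check instead of reading the output by eye, here is a minimal sketch. The helper name has_personality is made up for this example and is not part of mdadm; it only looks for a bracketed name such as [raid1] on the Personalities line and reads /proc/mdstat unless you pass another file.

```shell
#!/bin/sh
# has_personality NAME [FILE]
# Hypothetical helper (not part of mdadm): succeeds if the "Personalities"
# line of FILE (default: /proc/mdstat) lists NAME in brackets, e.g. [raid1].
has_personality() {
    name=$1
    file=${2:-/proc/mdstat}
    # -F treats "[raid1]" as a fixed string, so [raid1] never matches [raid10].
    grep '^Personalities' "$file" | grep -q -F "[$name]"
}

# Example use: load the module only when the personality is missing.
# has_personality raid1 || modprobe raid1
```

The function only reads a file, so it is safe to try on any system.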

3 Preparing /dev/sdb

To create a RAID1 array on our already running system, we must prepare the /dev/sdb hard drive for RAID1, then copy the contents of our /dev/sda hard drive to it, and finally add /dev/sda to the RAID1 array.

First, we copy the partition table from /dev/sda to /dev/sdb so that both disks have exactly the same layout:
sfdisk -d /dev/sda | sfdisk /dev/sdb

The output should be as follows:

[root@server1 ~]# sfdisk -d /dev/sda | sfdisk /dev/sdb


Checking that no-one is using this disk right now ...
OK

Disk /dev/sdb: 1305 cylinders, 255 heads, 63 sectors/track

sfdisk: ERROR: sector 0 does not have an msdos signature
 /dev/sdb: unrecognized partition table type


Old situation:
No partitions found


New situation:
Units = sectors of 512 bytes, counting from 0

   Device Boot    Start       End   #sectors  Id  System
/dev/sdb1   *        63    401624     401562  83  Linux
/dev/sdb2        401625   1445849    1044225  82  Linux swap / Solaris
/dev/sdb3       1445850  20964824   19518975  83  Linux
/dev/sdb4             0         -          0   0  Empty


Successfully wrote the new partition table

Re-reading the partition table ...

If you created or changed a DOS partition, /dev/foo7, say, then use dd(1)
to zero the first 512 bytes:  dd if=/dev/zero of=/dev/foo7 bs=512 count=1


(See fdisk(8).)


[root@server1 ~]#

The command fdisk -l should now show that both HDDs have the same layout:

[root@server1 ~]# fdisk -l

Disk /dev/sda: 10.7 GB, 10737418240 bytes
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          25      200781   83  Linux
/dev/sda2              26          90      522112+  82  Linux swap / Solaris
/dev/sda3              91        1305     9759487+  83  Linux

Disk /dev/sdb: 10.7 GB, 10737418240 bytes
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1          25      200781   83  Linux
/dev/sdb2              26          90      522112+  82  Linux swap / Solaris
/dev/sdb3              91        1305     9759487+  83  Linux


[root@server1 ~]#
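
Instead of comparing the two fdisk listings by eye, the layouts can also be compared from saved sfdisk -d dumps. The following is only a sketch: same_layout is a hypothetical helper name, and it assumes the dump format of the sfdisk version used here (comment lines starting with #, then one line per partition prefixed with the device name).

```shell
#!/bin/sh
# same_layout DISK_A DUMP_A DISK_B DUMP_B
# Hypothetical helper: DUMP_A and DUMP_B are files holding the output of
# "sfdisk -d DISK_A" and "sfdisk -d DISK_B". Comment lines are dropped and
# the disk name is replaced by a placeholder before the dumps are compared.
same_layout() {
    a=$(grep -v '^#' "$2" | sed "s|$1|DISK|g")
    b=$(grep -v '^#' "$4" | sed "s|$3|DISK|g")
    [ "$a" = "$b" ]
}

# Example use:
# sfdisk -d /dev/sda > /tmp/sda.dump
# sfdisk -d /dev/sdb > /tmp/sdb.dump
# same_layout /dev/sda /tmp/sda.dump /dev/sdb /tmp/sdb.dump && echo "layouts match"
```

Note that the comparison includes the partition type, so it will report a difference once the types on only one disk have been changed to fd.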

Next we must change the partition type of our three partitions on /dev/sdb to Linux raid autodetect:

[root@server1 ~]# fdisk /dev/sdb

The number of cylinders for this disk is set to 1305.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:


1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs (e.g., DOS FDISK, OS/2 FDISK)


Command (m for help): <-- m
Command action
   a   toggle a bootable flag
   b   edit bsd disklabel
   c   toggle the dos compatibility flag
   d   delete a partition
   l   list known partition types
   m   print this menu
   n   add a new partition
   o   create a new empty DOS partition table
   p   print the partition table
   q   quit without saving changes
   s   create a new empty Sun disklabel
   t   change a partition's system id
   u   change display/entry units
   v   verify the partition table
   w   write table to disk and exit
   x   extra functionality (experts only)

Command (m for help): <-- t
Partition number (1-4): <-- 1
Hex code (type L to list codes): <-- L

 0  Empty           1e  Hidden W95 FAT1 80  Old Minix       bf  Solaris
 1  FAT12           24  NEC DOS         81  Minix / old Lin c1  DRDOS/sec (FAT-
 2  XENIX root      39  Plan 9          82  Linux swap / So c4  DRDOS/sec (FAT-
 3  XENIX usr       3c  PartitionMagic  83  Linux           c6  DRDOS/sec (FAT-
 4  FAT16 <32M      40  Venix 80286     84  OS/2 hidden C:  c7  Syrinx
 5  Extended        41  PPC PReP Boot   85  Linux extended  da  Non-FS data
 6  FAT16           42  SFS             86  NTFS volume set db  CP/M / CTOS / .
 7  HPFS/NTFS       4d  QNX4.x          87  NTFS volume set de  Dell Utility
 8  AIX             4e  QNX4.x 2nd part 88  Linux plaintext df  BootIt
 9  AIX bootable    4f  QNX4.x 3rd part 8e  Linux LVM       e1  DOS access
 a  OS/2 Boot Manag 50  OnTrack DM      93  Amoeba          e3  DOS R/O
 b  W95 FAT32       51  OnTrack DM6 Aux 94  Amoeba BBT      e4  SpeedStor
 c  W95 FAT32 (LBA) 52  CP/M            9f  BSD/OS          eb  BeOS fs
 e  W95 FAT16 (LBA) 53  OnTrack DM6 Aux a0  IBM Thinkpad hi ee  EFI GPT
 f  W95 Ext'd (LBA) 54  OnTrackDM6      a5  FreeBSD         ef  EFI (FAT-12/16/
10  OPUS            55  EZ-Drive        a6  OpenBSD         f0  Linux/PA-RISC b
11  Hidden FAT12    56  Golden Bow      a7  NeXTSTEP        f1  SpeedStor
12  Compaq diagnost 5c  Priam Edisk     a8  Darwin UFS      f4  SpeedStor
14  Hidden FAT16 <3 61  SpeedStor       a9  NetBSD          f2  DOS secondary
16  Hidden FAT16    63  GNU HURD or Sys ab  Darwin boot     fb  VMware VMFS
17  Hidden HPFS/NTF 64  Novell Netware  b7  BSDI fs         fc  VMware VMKCORE
18  AST SmartSleep  65  Novell Netware  b8  BSDI swap       fd  Linux raid auto
1b  Hidden W95 FAT3 70  DiskSecure Mult bb  Boot Wizard hid fe  LANstep
1c  Hidden W95 FAT3 75  PC/IX           be  Solaris boot    ff  BBT
Hex code (type L to list codes): <-- fd
Changed system type of partition 1 to fd (Linux raid autodetect)

Command (m for help): <-- t
Partition number (1-4): <-- 2
Hex code (type L to list codes): <-- fd
Changed system type of partition 2 to fd (Linux raid autodetect)

Command (m for help): <-- t
Partition number (1-4): <-- 3
Hex code (type L to list codes): <-- fd
Changed system type of partition 3 to fd (Linux raid autodetect)

Command (m for help): <-- w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.


[root@server1 ~]#
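
The same keystrokes can be generated by a small function and piped into fdisk, which reads its commands from standard input. This is a sketch under assumptions: fdisk_type_keys is a hypothetical helper name, and the key sequence matches the dialogue shown above, where fdisk asks for a partition number because the disk has more than one partition.

```shell
#!/bin/sh
# fdisk_type_keys CODE PART...
# Hypothetical helper: prints the fdisk keystrokes that change each listed
# partition to the given hex type code ("t", partition number, code),
# followed by "w" to write the table. It only prints; it changes nothing.
fdisk_type_keys() {
    code=$1
    shift
    for part in "$@"; do
        printf 't\n%s\n%s\n' "$part" "$code"
    done
    printf 'w\n'
}

# Dry run (just shows the keystrokes):
# fdisk_type_keys fd 1 2 3
# Real run, assuming /dev/sdb is the disk to change:
# fdisk_type_keys fd 1 2 3 | fdisk /dev/sdb
```

Look at the dry-run output first; a wrong partition number is written to disk without further questions.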


To make sure that there are no remains from previous RAID installations on /dev/sdb, we run the following commands:

# mdadm --zero-superblock /dev/sdb1
# mdadm --zero-superblock /dev/sdb2
# mdadm --zero-superblock /dev/sdb3

If there are no remains from previous RAID installations, each of the above commands will throw an error like this one (which is nothing to worry about):

[root@server1 ~]# mdadm --zero-superblock /dev/sdb1
mdadm: Unrecognised md component device - /dev/sdb1


[root@server1 ~]#
Otherwise the commands will not display anything at all.


4 Creating Our RAID Arrays

Now let's create our RAID arrays /dev/md0, /dev/md1, and /dev/md2. /dev/sdb1 will be added to /dev/md0, /dev/sdb2 to /dev/md1, and /dev/sdb3 to /dev/md2. /dev/sda1, /dev/sda2, and /dev/sda3 can't be added right now (because the system is currently running on them), therefore we use the placeholder missing in the following three commands:

# mdadm --create /dev/md0 --level=1 --raid-disks=2 missing /dev/sdb1
# mdadm --create /dev/md1 --level=1 --raid-disks=2 missing /dev/sdb2
# mdadm --create /dev/md2 --level=1 --raid-disks=2 missing /dev/sdb3

The command cat /proc/mdstat should now show that you have three degraded RAID arrays ([_U] or [U_] means that an array is degraded while [UU] means that the array is ok):

[root@server1 ~]# cat /proc/mdstat


Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md2 : active raid1 sdb3[1]
      9759360 blocks [2/1] [_U]

md1 : active raid1 sdb2[1]
      522048 blocks [2/1] [_U]

md0 : active raid1 sdb1[1]
      200704 blocks [2/1] [_U]

unused devices: <none>


[root@server1 ~]#
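
The [_U] / [UU] notation is easy to check in a script. The sketch below prints the names of all arrays whose member map contains an underscore; degraded_arrays is a hypothetical helper name, and it assumes the /proc/mdstat layout shown above (an "mdN :" line followed by a "blocks" line that ends in the member map).

```shell
#!/bin/sh
# degraded_arrays [FILE]
# Hypothetical helper: prints the name of every array in FILE
# (default: /proc/mdstat) whose member map, e.g. [_U], contains "_".
degraded_arrays() {
    awk '
        # Remember the array name from lines such as "md2 : active raid1 ...".
        /^md[0-9]+ :/ { name = $1 }
        # The status line ends in a map like [UU] or [_U]; "_" marks a missing member.
        /blocks/ && $NF ~ /^\[[U_]+\]$/ && $NF ~ /_/ { print name }
    ' "${1:-/proc/mdstat}"
}

# Example use:
# degraded_arrays
```

Right after creating the arrays with the placeholder missing, this should list md2, md1, and md0.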

Next we create filesystems on our RAID arrays (ext3 on /dev/md0 and /dev/md2 and swap on /dev/md1):

# mkfs.ext3 /dev/md0
# mkswap /dev/md1
# mkfs.ext3 /dev/md2

Next we create /etc/mdadm.conf as follows:

# mdadm --examine --scan > /etc/mdadm.conf

Display the contents of the file:

# cat /etc/mdadm.conf

In the file you should now see details about our three (degraded) RAID arrays:

ARRAY /dev/md0 level=raid1 num-devices=2 UUID=78d582f0:940fabb5:f1c1092a:04a55452
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=8db8f7e1:f2a64674:d22afece:4a539aa7
ARRAY /dev/md2 level=raid1 num-devices=2 UUID=1baf282d:17c58efd:a8de6947:b0af9792

5 Adjusting The System To RAID1

Now let's mount /dev/md0 and /dev/md2 (we don't need to mount the swap array /dev/md1):

# mkdir /mnt/md0
# mkdir /mnt/md2
# mount /dev/md0 /mnt/md0
# mount /dev/md2 /mnt/md2

You should now find both arrays in the output of mount:

[root@server1 ~]# mount


/dev/sda3 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
/dev/md0 on /mnt/md0 type ext3 (rw)
/dev/md2 on /mnt/md2 type ext3 (rw)


[root@server1 ~]#

Next we modify /etc/fstab. Replace LABEL=/boot with /dev/md0, LABEL=SWAP-sda2 with /dev/md1, and LABEL=/ with /dev/md2 so that the file looks as follows:

vi /etc/fstab
/dev/md2                 /                       ext3    defaults        1 1
/dev/md0             /boot                   ext3    defaults        1 2
tmpfs                   /dev/shm                tmpfs   defaults        0 0
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
sysfs                   /sys                    sysfs   defaults        0 0
proc                    /proc                   proc    defaults        0 0
/dev/md1         swap                    swap    defaults        0 0
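
If you would rather not edit the file by hand, the three replacements can be sketched as a sed filter. The function name fstab_to_md is made up for this example, and it assumes the labels named above (LABEL=/boot, LABEL=SWAP-sda2, LABEL=/) appear at the start of their lines. It reads from standard input and writes to standard output, so the real /etc/fstab is not touched.

```shell
#!/bin/sh
# fstab_to_md
# Hypothetical helper: rewrites the three LABEL= entries of an fstab read
# from stdin to the RAID devices used in this guide and prints the result.
fstab_to_md() {
    # The root rule requires whitespace right after "LABEL=/", so it cannot
    # match LABEL=/boot by accident.
    sed -e 's|^LABEL=/boot[[:space:]]|/dev/md0 |' \
        -e 's|^LABEL=SWAP-sda2[[:space:]]|/dev/md1 |' \
        -e 's|^LABEL=/[[:space:]]|/dev/md2 |'
}

# Example use: write a candidate file, review it, then copy it into place.
# fstab_to_md < /etc/fstab > /tmp/fstab.new
# diff /etc/fstab /tmp/fstab.new
```

Reviewing the diff before copying the new file over /etc/fstab is cheap insurance, since a broken fstab can prevent the system from booting.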
Next replace /dev/sda1 with /dev/md0 and /dev/sda3 with /dev/md2 in /etc/mtab:
vi /etc/mtab
/dev/md2 / ext3 rw 0 0
proc /proc proc rw 0 0
sysfs /sys sysfs rw 0 0
devpts /dev/pts devpts rw,gid=5,mode=620 0 0
/dev/md0 /boot ext3 rw 0 0
tmpfs /dev/shm tmpfs rw 0 0
none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0
Now on to the GRUB boot loader. Open /boot/grub/menu.lst and add fallback=1 right after default=0:
vi /boot/grub/menu.lst
[...]
default=0
fallback=1
[...]
This means that if the first kernel stanza (counting starts at 0, so the first stanza is 0) fails to boot, the second stanza (number 1) will be booted instead.
In the same file, go to the bottom where you should find some kernel stanzas. Copy the first of them and paste the stanza before the first existing stanza; replace root=LABEL=/ with root=/dev/md2 and root (hd0,0) with root (hd1,0):
[...]
title CentOS (2.6.18-128.el5)
        root (hd1,0)
        kernel /vmlinuz-2.6.18-128.el5 ro root=/dev/md2
        initrd /initrd-2.6.18-128.el5.img

title CentOS (2.6.18-128.el5)
        root (hd0,0)
        kernel /vmlinuz-2.6.18-128.el5 ro root=LABEL=/
        initrd /initrd-2.6.18-128.el5.img
The whole file should look something like this:
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /boot/, eg.
#          root (hd0,0)
#          kernel /vmlinuz-version ro root=/dev/sda3
#          initrd /initrd-version.img
#boot=/dev/sda
default=0
fallback=1
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title CentOS (2.6.18-128.el5)
        root (hd1,0)
        kernel /vmlinuz-2.6.18-128.el5 ro root=/dev/md2
        initrd /initrd-2.6.18-128.el5.img

title CentOS (2.6.18-128.el5)
        root (hd0,0)
        kernel /vmlinuz-2.6.18-128.el5 ro root=LABEL=/
        initrd /initrd-2.6.18-128.el5.img
root (hd1,0) refers to /dev/sdb, which is already part of our RAID arrays. We will reboot the system in a few moments; it will then try to boot from our (still degraded) RAID arrays, and if that fails, it will fall back to booting from /dev/sda (fallback=1).
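
Before rebooting, it can be worth checking that the fallback entry really points at an existing stanza. The following sketch does that with awk; check_menu_lst is a hypothetical helper name, and it only understands the simple default=, fallback=, and title lines shown above (commented-out stanzas are ignored).

```shell
#!/bin/sh
# check_menu_lst FILE
# Hypothetical helper: prints a one-line summary of FILE and fails if
# fallback= is missing or points past the last "title" stanza.
check_menu_lst() {
    awk -F= '
        /^default=/  { def = $2 }
        /^fallback=/ { fb = $2; have_fb = 1 }
        /^title /    { titles++ }
        END {
            if (!have_fb)         { print "no fallback entry"; exit 1 }
            # Stanzas are numbered from 0, so fallback must be below the count.
            if (fb + 0 >= titles) { print "fallback points past last stanza"; exit 1 }
            print "default=" def " fallback=" fb " stanzas=" titles
        }
    ' "$1"
}

# Example use:
# check_menu_lst /boot/grub/menu.lst
```

For the file shown above, this should report two stanzas with default 0 and fallback 1.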
Next we adjust our ramdisk to the new situation:
mv /boot/initrd-`uname -r`.img /boot/initrd-`uname -r`.img_orig
mkinitrd /boot/initrd-`uname -r`.img `uname -r`
Now we copy the contents of /dev/sda1 and /dev/sda3 to /dev/md0 and /dev/md2 (which are mounted on /mnt/md0 and /mnt/md2):
cp -dpRx / /mnt/md2
cd /boot
cp -dpRx . /mnt/md0

6 Preparing GRUB (Part 1)

Afterwards we must install the GRUB bootloader on the second hard drive /dev/sdb (the commands below also reinstall it on /dev/sda):
grub
On the GRUB shell, type in the following commands:
root (hd0,0)
grub> root (hd0,0)
 Filesystem type is ext2fs, partition type 0x83

grub>
setup (hd0)
grub> setup (hd0)
 Checking if "/boot/grub/stage1" exists... no
 Checking if "/grub/stage1" exists... yes
 Checking if "/grub/stage2" exists... yes
 Checking if "/grub/e2fs_stage1_5" exists... yes
 Running "embed /grub/e2fs_stage1_5 (hd0)"...  15 sectors are embedded.
succeeded
 Running "install /grub/stage1 (hd0) (hd0)1+15 p (hd0,0)/grub/stage2 /grub/grub.conf"... succeeded
Done.

grub>
root (hd1,0)
grub> root (hd1,0)
 Filesystem type is ext2fs, partition type 0xfd

grub>
setup (hd1)
grub> setup (hd1)
 Checking if "/boot/grub/stage1" exists... no
 Checking if "/grub/stage1" exists... yes
 Checking if "/grub/stage2" exists... yes
 Checking if "/grub/e2fs_stage1_5" exists... yes
 Running "embed /grub/e2fs_stage1_5 (hd1)"...  15 sectors are embedded.
succeeded
 Running "install /grub/stage1 (hd1) (hd1)1+15 p (hd1,0)/grub/stage2 /grub/grub.conf"... succeeded
Done.

grub>
quit
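
The GRUB shell commands above can also be generated and fed to grub non-interactively. This is a sketch, not a tested replacement for the interactive session: grub_setup_cmds is a hypothetical helper name, and the --batch option is the non-interactive mode of the GRUB Legacy shell as far as I know (check grub --help on your system). The function itself only prints text.

```shell
#!/bin/sh
# grub_setup_cmds DISKNUM...
# Hypothetical helper: prints "root (hdN,0)" and "setup (hdN)" for every
# given GRUB disk number, followed by "quit". It assumes /boot is the first
# partition on each disk, as in this guide.
grub_setup_cmds() {
    for n in "$@"; do
        printf 'root (hd%s,0)\nsetup (hd%s)\n' "$n" "$n"
    done
    printf 'quit\n'
}

# Dry run (just shows the commands):
# grub_setup_cmds 0 1
# Real run on both disks:
# grub_setup_cmds 0 1 | grub --batch
```

Check the output of the real run for the same "succeeded" lines as in the interactive session.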
Now, back on the normal shell, we reboot the system and hope that it boots ok from our RAID arrays:
reboot


7 Preparing /dev/sda

If all goes well, you should now find /dev/md0 and /dev/md2 in the output of
df -h
[root@server1 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/md2              9.2G  1.1G  7.7G  12% /
/dev/md0              190M   14M  167M   8% /boot
tmpfs                 252M     0  252M   0% /dev/shm
[root@server1 ~]#
The output of
cat /proc/mdstat
should be as follows:
[root@server1 ~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb1[1]
      200704 blocks [2/1] [_U]

md1 : active raid1 sdb2[1]
      522048 blocks [2/1] [_U]

md2 : active raid1 sdb3[1]
      9759360 blocks [2/1] [_U]

unused devices: <none>
[root@server1 ~]#

Now we must change the partition types of our three partitions on /dev/sda to Linux raid autodetect as well:
fdisk /dev/sda
[root@server1 ~]# fdisk /dev/sda

The number of cylinders for this disk is set to 1305.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): <-- t
Partition number (1-4): <-- 1
Hex code (type L to list codes): <-- fd
Changed system type of partition 1 to fd (Linux raid autodetect)

Command (m for help): <-- t
Partition number (1-4): <-- 2
Hex code (type L to list codes): <-- fd
Changed system type of partition 2 to fd (Linux raid autodetect)

Command (m for help): <-- t
Partition number (1-4): <-- 3
Hex code (type L to list codes): <-- fd
Changed system type of partition 3 to fd (Linux raid autodetect)

Command (m for help): <-- w
The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table.
The new table will be used at the next reboot.
Syncing disks.
[root@server1 ~]#

Now we can add /dev/sda1, /dev/sda2, and /dev/sda3 to the respective RAID arrays:
mdadm --add /dev/md0 /dev/sda1
mdadm --add /dev/md1 /dev/sda2
mdadm --add /dev/md2 /dev/sda3
Now take a look at
cat /proc/mdstat
... and you should see that the RAID arrays are being synchronized:
[root@server1 ~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda1[0] sdb1[1]
      200704 blocks [2/2] [UU]

md1 : active raid1 sda2[0] sdb2[1]
      522048 blocks [2/2] [UU]

md2 : active raid1 sda3[2] sdb3[1]
      9759360 blocks [2/1] [_U]
      [====>................]  recovery = 22.8% (2232576/9759360) finish=2.4min speed=50816K/sec

unused devices: <none>
[root@server1 ~]#

(You can run
watch cat /proc/mdstat
to get an ongoing output of the process. To leave watch, press CTRL+C.)
Wait until the synchronization has finished; the output should then look like this:
[root@server1 ~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda1[0] sdb1[1]
      200704 blocks [2/2] [UU]

md1 : active raid1 sda2[0] sdb2[1]
      522048 blocks [2/2] [UU]

md2 : active raid1 sda3[0] sdb3[1]
      9759360 blocks [2/2] [UU]

unused devices: <none>
[root@server1 ~]#

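If you want a script to wait for the synchronization rather than watching it, a polling loop is enough. The sketch below assumes that a running rebuild shows up in /proc/mdstat as a line containing "recovery =" or "resync =", as in the output above; sync_running, wait_for_sync, and the SYNC_POLL variable are names invented for this example.

```shell
#!/bin/sh
# sync_running [FILE]
# Hypothetical helper: succeeds while FILE (default: /proc/mdstat) contains
# a recovery or resync progress line.
sync_running() {
    grep -q -E '(recovery|resync) *=' "${1:-/proc/mdstat}"
}

# wait_for_sync [FILE]
# Polls every SYNC_POLL seconds (default 10) until no rebuild is reported.
wait_for_sync() {
    while sync_running "$@"; do
        sleep "${SYNC_POLL:-10}"
    done
}

# Example use:
# wait_for_sync && echo "all arrays are in sync"
```
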
Then adjust /etc/mdadm.conf to the new situation:
mdadm --examine --scan > /etc/mdadm.conf
/etc/mdadm.conf should now look something like this:
cat /etc/mdadm.conf
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=78d582f0:940fabb5:f1c1092a:04a55452
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=8db8f7e1:f2a64674:d22afece:4a539aa7
ARRAY /dev/md2 level=raid1 num-devices=2 UUID=1baf282d:17c58efd:a8de6947:b0af9792

8 Preparing GRUB (Part 2)

We are almost done. We must now modify /boot/grub/menu.lst again. Right now it is configured to boot from /dev/sdb (hd1,0). Of course, we still want the system to be able to boot if /dev/sdb fails. Therefore we copy the first kernel stanza (which contains hd1), paste it below, and replace hd1 with hd0. Furthermore, we comment out all other kernel stanzas so that the file looks as follows:
vi /boot/grub/menu.lst
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /boot/, eg.
#          root (hd0,0)
#          kernel /vmlinuz-version ro root=/dev/sda3
#          initrd /initrd-version.img
#boot=/dev/sda
default=0
fallback=1
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title CentOS (2.6.18-128.el5)
        root (hd1,0)
        kernel /vmlinuz-2.6.18-128.el5 ro root=/dev/md2
        initrd /initrd-2.6.18-128.el5.img

title CentOS (2.6.18-128.el5)
        root (hd0,0)
        kernel /vmlinuz-2.6.18-128.el5 ro root=/dev/md2
        initrd /initrd-2.6.18-128.el5.img

#title CentOS (2.6.18-128.el5)
#       root (hd0,0)
#       kernel /vmlinuz-2.6.18-128.el5 ro root=LABEL=/
#       initrd /initrd-2.6.18-128.el5.img
Afterwards, update your ramdisk:
mv /boot/initrd-`uname -r`.img /boot/initrd-`uname -r`.img_orig2
mkinitrd /boot/initrd-`uname -r`.img `uname -r`
... and reboot the system:
reboot
It should boot without problems.
That's it - you've successfully set up software RAID1 on your running CentOS 5.3 system!


9 Testing

Now let's simulate a hard drive failure. It doesn't matter if you select /dev/sda or /dev/sdb here. In this example I assume that /dev/sdb has failed.
To simulate the hard drive failure, you can either shut down the system and remove /dev/sdb from the system, or you (soft-)remove it like this:
mdadm --manage /dev/md0 --fail /dev/sdb1
mdadm --manage /dev/md1 --fail /dev/sdb2
mdadm --manage /dev/md2 --fail /dev/sdb3
mdadm --manage /dev/md0 --remove /dev/sdb1
mdadm --manage /dev/md1 --remove /dev/sdb2
mdadm --manage /dev/md2 --remove /dev/sdb3
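
Since the six commands follow one pattern, they can be generated for either disk. The function below is only a sketch: fail_disk_cmds is a hypothetical helper name, and the array:partition pairs are the layout used in this guide. It prints the commands and runs nothing, so you can review them first.

```shell
#!/bin/sh
# fail_disk_cmds DISK ARRAY:PARTNUM...
# Hypothetical helper: prints the mdadm commands that mark every listed
# partition of DISK as failed and then remove it from its array.
fail_disk_cmds() {
    disk=$1
    shift
    # First mark all members as failed ...
    for pair in "$@"; do
        echo "mdadm --manage /dev/${pair%%:*} --fail /dev/$disk${pair#*:}"
    done
    # ... then remove them, in the same order as in the commands above.
    for pair in "$@"; do
        echo "mdadm --manage /dev/${pair%%:*} --remove /dev/$disk${pair#*:}"
    done
}

# Dry run for /dev/sdb:
# fail_disk_cmds sdb md0:1 md1:2 md2:3
# To execute the printed commands, pipe them into a shell:
# fail_disk_cmds sdb md0:1 md1:2 md2:3 | sh
```

To simulate a failure of /dev/sda instead, pass sda as the first argument.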
Shut down the system:
shutdown -h now
Then put in a new /dev/sdb drive (if you simulate a failure of /dev/sda, you should now put /dev/sdb in /dev/sda's place and connect the new HDD as /dev/sdb!) and boot the system. It should still start without problems.
Now run
cat /proc/mdstat
and you should see that we have a degraded array:
[root@server1 ~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda1[0]
      200704 blocks [2/1] [U_]

md1 : active raid1 sda2[0]
      522048 blocks [2/1] [U_]

md2 : active raid1 sda3[0]
      9759360 blocks [2/1] [U_]

unused devices: <none>
[root@server1 ~]#

The output of
fdisk -l
should look as follows:
[root@server1 ~]# fdisk -l

Disk /dev/sda: 10.7 GB, 10737418240 bytes
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          25      200781   fd  Linux raid autodetect
/dev/sda2              26          90      522112+  fd  Linux raid autodetect
/dev/sda3              91        1305     9759487+  fd  Linux raid autodetect

Disk /dev/sdb: 10.7 GB, 10737418240 bytes
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdb doesn't contain a valid partition table

Disk /dev/md2: 9993 MB, 9993584640 bytes
2 heads, 4 sectors/track, 2439840 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md2 doesn't contain a valid partition table

Disk /dev/md1: 534 MB, 534577152 bytes
2 heads, 4 sectors/track, 130512 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md1 doesn't contain a valid partition table

Disk /dev/md0: 205 MB, 205520896 bytes
2 heads, 4 sectors/track, 50176 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md0 doesn't contain a valid partition table
[root@server1 ~]#

Now we copy the partition table of /dev/sda to /dev/sdb:

# sfdisk -d /dev/sda | sfdisk /dev/sdb

(If you get an error, you can try the --force option:)

# sfdisk -d /dev/sda | sfdisk --force /dev/sdb

[root@server1 ~]# sfdisk -d /dev/sda | sfdisk /dev/sdb


Checking that no-one is using this disk right now ...
OK

Disk /dev/sdb: 1305 cylinders, 255 heads, 63 sectors/track

sfdisk: ERROR: sector 0 does not have an msdos signature
 /dev/sdb: unrecognized partition table type


Old situation:
No partitions found


New situation:
Units = sectors of 512 bytes, counting from 0

   Device Boot    Start       End   #sectors  Id  System
/dev/sdb1   *        63    401624     401562  fd  Linux raid autodetect
/dev/sdb2        401625   1445849    1044225  fd  Linux raid autodetect
/dev/sdb3       1445850  20964824   19518975  fd  Linux raid autodetect
/dev/sdb4             0         -          0   0  Empty
Successfully wrote the new partition table

Re-reading the partition table ...

If you created or changed a DOS partition, /dev/foo7, say, then use dd(1)
to zero the first 512 bytes:  dd if=/dev/zero of=/dev/foo7 bs=512 count=1
(See fdisk(8).)


[root@server1 ~]#

Afterwards we remove any remains of a previous RAID array from /dev/sdb...

# mdadm --zero-superblock /dev/sdb1
# mdadm --zero-superblock /dev/sdb2
# mdadm --zero-superblock /dev/sdb3
... and add /dev/sdb to the RAID array:
# mdadm -a /dev/md0 /dev/sdb1
# mdadm -a /dev/md1 /dev/sdb2
# mdadm -a /dev/md2 /dev/sdb3

Now take a look at
cat /proc/mdstat
[root@server1 ~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      200704 blocks [2/2] [UU]

md1 : active raid1 sdb2[1] sda2[0]
      522048 blocks [2/2] [UU]

md2 : active raid1 sdb3[2] sda3[0]
      9759360 blocks [2/1] [U_]
      [=======>.............]  recovery = 39.4% (3846400/9759360) finish=1.7min speed=55890K/sec

unused devices: <none>


[root@server1 ~]#

Wait until the synchronization has finished:

[root@server1 ~]# cat /proc/mdstat


Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      200704 blocks [2/2] [UU]

md1 : active raid1 sdb2[1] sda2[0]
      522048 blocks [2/2] [UU]

md2 : active raid1 sdb3[1] sda3[0]
      9759360 blocks [2/2] [UU]

unused devices: <none>
[root@server1 ~]#


Then run

# grub

and install the bootloader on both HDDs:

root (hd0,0)
setup (hd0)
root (hd1,0)
setup (hd1)
quit

That's it. You've just replaced a failed hard drive in your RAID1 array.
