nice software raid recovery howto (Linux)


Linux Raid1 replace drive howto: (Nicely done falko!)

First off, don't panic! If you follow this and still lose data, I am not responsible for your lazy nature. Bad things can and will happen, so back up (and verify the backup) before attempting anything mentioned here.
Next, just don't do anything you will regret later. Thinking is required. If you don't think before taking action, you will lose data!
Also, don't try any of this on GPT-partitioned disks (i.e. disks without a DOS MBR partition table). You need parted and gdisk to work on them; see sysrescuecd for help.

cat /proc/mdstat

(If nothing is there, try mdadm --examine /dev/sdb1, or whatever drive you know is part of the array and still functioning.)

If you find that the array is not running try this: mdadm --assemble --scan
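If you would rather not eyeball /proc/mdstat, here is a rough sketch that lists arrays with a missing member. The function name degraded_arrays is made up for this example, and it assumes the usual mdstat layout where the status line ends in a map like [UU] or [_U].

```shell
# List md arrays whose member map (e.g. [U_]) shows a missing device.
# Reads /proc/mdstat by default; pass a saved copy as $1 to check that instead.
degraded_arrays() {
    awk '
        /^md[^ ]* :/ { name = $1 }                       # remember the current array
        $NF ~ /^\[[U_]+\]$/ && $NF ~ /_/ { print name }  # "_" marks a missing member
    ' "${1:-/proc/mdstat}"
}
```

Run it as plain degraded_arrays on the box in question. No output means no array is reporting a missing member.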

If you find that you have a bad disk and can't start any other way, try starting the array without the bad one:
mdadm --manage /dev/md0 --stop
(mdadm stop failed? Try the section on LVM below if that is in use.)

mdadm --assemble -f /dev/md0 /dev/sdaX /dev/sdbX /dev/sdcX (and skip /dev/sddX if that is the bad egg)

cat /proc/mdstat
and see if you get the raid level and rebuild you are expecting.

After it finishes, you can reboot and mount, then partition the replacement drive and add it back in.

So if sdc1 is failing:

mdadm --manage /dev/md0 --fail /dev/sdc1
mdadm --manage /dev/md0 --remove /dev/sdc1

the hwinfo command is good to capture serial numbers for identification of physical drives! (should be dumped after you initially do the raid though)...

You did do that by reading my other howto so you are covered. Just pull a copy from your version control and you can see which physical drive to replace! Just search for the "Serial ID:" line for the drive you want to replace.

Alternatively, ls -al /dev/disk/by-id can be of use (hint: it shows disks, serial numbers, and drive letter and partition assignments!)
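The by-id listing is just a directory of symlinks, so it can be boiled down to "kernel name, id name" pairs. A minimal sketch follows; disk_ids is a hypothetical helper name, and it assumes readlink is installed (it is on most distros).

```shell
# Print "<kernel name> <by-id name>" for every symlink in a by-id directory.
# Defaults to /dev/disk/by-id; another directory can be passed as $1.
disk_ids() {
    dir=${1:-/dev/disk/by-id}
    for link in "$dir"/*; do
        [ -L "$link" ] || continue               # skip anything that is not a symlink
        target=$(readlink "$link")               # e.g. ../../sdc1
        printf '%s %s\n' "$(basename "$target")" "$(basename "$link")"
    done
}
```

Something like disk_ids | grep '^sdc ' then shows the id strings (which normally include the serial number) for the drive you are about to pull.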

(shut down and add the replacement hardware - or just add it if you have hot swap devices and hardware)

If hot swap and the rescan does not find the new device, try this:

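for h in /sys/class/scsi_host/host*/scan; do echo "- - -" > "$h"; done
(a loop is needed: a redirect to the bare */scan glob fails once more than one host matches)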

- Or-

echo "- - -" > /sys/class/scsi_host/host0/scan

dmesg (check the end for possible /dev/sdX changes to a different letter i.e. higher)

- Here is a trick to find out what drive lives where:

This should make the drive sda light up constant (assuming it is working).  Don't swap the if= and of= arguments unless you want to delete all the data on the drive!
dd if=/dev/sda of=/dev/null
Rinse and repeat for sdb, sdc, sdd etc to find where the drives are in the bays.
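If the if=/of= warning above makes you nervous, one option is to wrap the read in a function that hard-codes the output side. This is only a sketch: blink_drive is a made-up name, and the 1024 MiB default read size is an arbitrary choice, not a recommendation.

```shell
# Read the first N MiB of a drive so its activity LED lights up.
# of=/dev/null is fixed inside the function, so if= and of= cannot be swapped.
blink_drive() {
    dev=$1
    mib=${2:-1024}                               # amount to read, in MiB
    [ -r "$dev" ] || { echo "cannot read $dev" >&2; return 1; }
    dd if="$dev" of=/dev/null bs=1048576 count="$mib" 2>/dev/null
}
```

For example, blink_drive /dev/sda 4096 reads about 4 GiB and stops, instead of chewing through the whole disk.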

lsscsi can show the physical layout of the drives and their ports, but the cable routing might not be quite what you expect (i.e. unless you wrote down the cable paths before install).

After startup, verify that the order of the drives did not change!!!

(If you start with a semi-used disk that has GPT partition and you want to use MBR, here is the trick):
sgdisk --zap-all /dev/sdX (where X is the disk you are planning to use as the new disk)

    This will zero out the disk id as well, so you might need to go in to expert mode of fdisk
    and create it (x -> i) - there is a bug in older versions of fdisk that won't save unless you make
    a partition change, so you might just change a partition type and change it back to write the disk
    id change - probably not necessary though if you are just using a real OS like Linux.
    If using Windows on part of the disk, I have heard of license checks using the disk id...

You might also check for partial raid info and get rid of that as well:
cat /proc/mdstat (see if remaining partitions exist with raid)
- stop the array
(if you have lvm active, and need to nuke that):
vgchange -a n volumegroup
vgremove volumegroup
pvremove /dev/whatever from above
Now get rid of that pesky left over raid:
mdadm --stop /dev/md0
mdadm: stopped /dev/md0
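(check to see what part of that raid you need to zero; --detail only works on a running array, so once it is stopped use --examine on the member partition:)
mdadm --examine /dev/sda1
mdadm --zero-superblock /dev/sda1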

--------------NOTE: Do not do this on GPT partitioned disks-------------------------------------
Use sfdisk to capture the partition table of a good drive (/dev/sdb here) so the new drive (/dev/sdc) can be made to look the same as the other drives on md0 :-)
--------------NOTE: Do not do this on GPT partitioned disks-------------------------------------
sfdisk -d /dev/sdb > sdb.out

--------------NOTE: This is for use on GPT partitioned disks-------------------------------------

---------------not another man page change------------------
here is the latest: (this will write the partitioning scheme from sda to sdb). again don't get it wrong
and check man pages first (and second), before performing this bit:
sgdisk -R=/dev/sdb /dev/sda
sgdisk -b sdb.out /dev/sdb (check the man page before doing this as I found -b -d differences)!!!!
sgdisk --backup=./sda.out /dev/sda (works on some versions)...
(check the output to see that it is what you expected!)
--------------NOTE: Do not do this on GPT partitioned disks-------------------------------------
sfdisk /dev/sdc < sdb.out
--------------NOTE: This is for use on GPT partitioned disks-------------------------------------

Make sure you zero out the old mbr boot block before doing this or you can have hitch hikers!
-------------Actually, this copies the GUID, so don't do it with anything but the dump from the drive to be replaced!-----------
----------so, you will probably want to create a new gpt partition table to get a new GUID and do the partitions by hand!-------
sgdisk -g -l sdb.out /dev/sdc (again, don't do this without checking your man pages!)  Might be wrong!
sgdisk -g --load-backup=./sda.out /dev/sdc (might get you farther, but also might break things!)

- Ok seems newer sgdisk has this convenience:
sgdisk -R /dev/sd[newdrive] /dev/sd[copy-partition-from-drive]
-- newer sgdisk has a way to redo the guid:
sgdisk -G /dev/sdX (should get a new guid), check with gdisk - but read the man page first!

(and again, check to see that you got what you wanted!)

--------------NOTE: Do not do this on GPT partitioned disks-------------------------------------
Or, if you are daring and want to use a pipe (careful about syntax!):
sfdisk -d /dev/sdb | sfdisk /dev/sdc
Make sure you know which drive is the input (here sdb is read and sdc is overwritten).

--------------NOTE: this won't get you info on GPT partitioned disks----------------------------
fdisk -l (to compare what just happened)...
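Comparing by eye works, but two sfdisk -d dumps can also be compared mechanically once the per-disk bits are stripped out. A rough sketch, assuming MBR dumps and sdX-style device names (so not nvme0n1p1 and friends); normalize_dump and same_layout are names invented here.

```shell
# Reduce an sfdisk -d dump to its geometry: drop comments, the device line and
# the disk id, and rewrite /dev/sdXN as partN so two drives can be compared.
normalize_dump() {
    sed -e '/^#/d' -e '/^device:/d' -e '/^label-id:/d' \
        -e 's|^/dev/[a-z]*\([0-9][0-9]*\) |part\1 |' "$1"
}

# Succeeds (exit 0) when both dumps describe the same partition layout.
same_layout() {
    [ "$(normalize_dump "$1")" = "$(normalize_dump "$2")" ]
}
```

After sfdisk -d /dev/sdb > sdb.out and sfdisk -d /dev/sdc > sdc.out, running same_layout sdb.out sdc.out && echo "layouts match" gives a quick yes or no.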

(remember, don't make the whole drive the array part, make a partition part of an array!)

(now add the new drive back to the mix)

mdadm --manage /dev/md0 --add /dev/sdc1

If you don't see it rebuilding, i.e. cat /proc/mdstat does not give you something like:
[>....................] recovery = 0.3% (784256/234372160) finish=54.5min speed=71296K/sec
when you add the new drive back in, just remove it again
and re-add it. Not sure why drives get added as spares sometimes...
I suppose it could be like this bit from the mdadm man page about spares vs resync, but I have
never waited it out :-)

      When  creating a RAID5 array, mdadm will automatically create a degraded array with an extra spare drive.
       This is because building the spare into a degraded array is in general faster than resyncing  the  parity
       on a non-degraded, but not clean, array.  This feature can be overridden with the --force option.
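To check for that recovery line without squinting, something along these lines pulls the percentage and finish estimate out of /proc/mdstat. It is a sketch only: rebuild_progress is a hypothetical name, and it only looks for "recovery =" and "resync =" lines in the usual mdstat layout.

```shell
# Print "<array> <percent> <finish estimate>" for each array that is rebuilding.
# Prints nothing when no recovery or resync progress line is present.
rebuild_progress() {
    awk '
        /^md[^ ]* :/ { name = $1 }               # remember the current array
        /(recovery|resync) *=/ {
            pct = ""; eta = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /%$/)       pct = $i    # e.g. 0.3%
                if ($i ~ /^finish=/) eta = $i    # e.g. finish=54.5min
            }
            if (pct != "") print name, pct, eta
        }
    ' "${1:-/proc/mdstat}"
}
```

If rebuild_progress prints nothing right after the --add, that is the cue to remove and re-add the drive as described above.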

Getting device busy? I.E: Device or resource busy

Try this:
(kick the drive out of the raid array(s) -- all the raid arrays that use it -- with:)
 mdadm --manage /dev/mdX --fail /dev/sdc1
 mdadm --manage /dev/mdX --remove /dev/sdc1

Now zero out existing raid info:
 mdadm --zero-superblock /dev/sdc1 (and other partitions if used in other raid array(s))

Re-try the addition
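Kicking the drive out of "all the raid arrays" means knowing which arrays still list it. Here is a dry-run sketch that reads /proc/mdstat and only prints the commands; arrays_using and kick_plan are names made up for this example, and nothing is run against mdadm.

```shell
# Print every md array in mdstat that lists the given partition as a member.
# Usage: arrays_using sdc1 [mdstat-file]
arrays_using() {
    awk -v part="$1" '
        /^md[^ ]* :/ {
            for (i = 3; i <= NF; i++) {
                dev = $i
                sub(/\[.*/, "", dev)             # sdc1[2](F) -> sdc1
                if (dev == part) print $1
            }
        }
    ' "${2:-/proc/mdstat}"
}

# Dry run: print the fail/remove commands for every array holding the partition.
kick_plan() {
    for md in $(arrays_using "$1" "${2:-/proc/mdstat}"); do
        echo "mdadm --manage /dev/$md --fail /dev/$1"
        echo "mdadm --manage /dev/$md --remove /dev/$1"
    done
}
```

Read the output of kick_plan sdc1 before running any of it by hand. Each partition of the drive (sdc1, sdc2, ...) needs its own pass.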

If still a no-go, give this a shot:
Summary: Make sure fake raid (fraid) and the dmraid module is not grabbing your drive...
Note: the dmraid might be built in to the kernel, so lsmod won't show it.

You might also try:

1.  Stop all the md raid devices - i.e.:
    mdadm --stop /dev/md1, etc...

2.  check for dmraid snagging your devices via the dmraid driver:
    dmsetup table (check output)

3.  Remove them via (careful, and make sure you know you are not using dmraid!):
    dmsetup remove_all

If you have some LVM on top, you will want to turn that down to get the underlying raid to be freed up:

- first and foremost, unmount any lvm - i.e. boot from Sysrescue CD or the like

lvchange -an <lv path>
vgchange -an <vg path>

You will do the opposite below to put it back... (-ay) see bottom of this page.

Have ghost devices from a prior raid check?
Here is another mdadm trick to just remove failed devices (hot) if the output of /proc/mdstat is showing
an incorrect number of devices (i.e. you have 8 drive bays and mdadm is showing a 9th drive):

mdadm /dev/md0 -r failed
mdadm: hot removed 8:19
mdadm /dev/md0 -r detached
blah blah blah...

Oh, and if you are mounting an lvm from this pool, don't forget:

vgchange -ay blah
lvchange -ay /dev/mapper/blahvolumegroup

One more thing... If you want to survive a raid 5 double drive failure, don't stress the drives by re-mounting and hitting them before the rebuild finishes!

Last thing - I promise! Don't forget to reinstall grub if you use gpt formatted disks:

grub-install /dev/sda
(follow up with sdb, sdc, and sdd if they are part of your fail over!)
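So that no drive gets forgotten, the list of disks can be pulled from the array itself. This is a dry-run sketch that only prints grub-install lines. grub_plan is a hypothetical name, it assumes sdX-style member names (sda1 lives on sda), and it skips members flagged (F).

```shell
# Print a grub-install command for each whole disk behind an array.
# Assumes sdX-style names, where stripping the trailing digits of a
# partition (sda1) gives the disk (sda). Dry run only: nothing is installed.
grub_plan() {
    awk -v md="$1" '
        $1 == md && $2 == ":" {
            for (i = 3; i <= NF; i++) {
                dev = $i
                if (dev !~ /\[[0-9]+\]/) continue  # skip words such as active or raid1
                if (dev ~ /\(F\)/) continue        # skip members marked as failed
                sub(/\[.*/, "", dev)               # sda1[0] -> sda1
                sub(/[0-9]+$/, "", dev)            # sda1 -> sda
                print "grub-install /dev/" dev
            }
        }
    ' "${2:-/proc/mdstat}" | sort -u
}
```

Try grub_plan md0, check the disks it names against what you expect, and then run the lines you agree with.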

- Warning!!! - You might have to allow the rebuild to finish for this to work if you layer LVM2 on top of everything!
I forgot this on a drive replacement when going from spinning rust to ssd drives, and when I rebooted... Well, let us just say simple things can make for long days with road trips involved.

Speed is important - rebuild speed control:
# sysctl
# sysctl
--- speed em up:
# sysctl -w
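The sysctl lines above are cut off, so what follows is not a reconstruction of them. As a general note, the md driver exposes dev.raid.speed_limit_min and dev.raid.speed_limit_max (KiB/s per device). A small dry-run sketch that builds the commands for those two keys, with raid_speed_plan being a made-up name and the numbers in the usage line picked purely for illustration:

```shell
# Print the sysctl commands that would set the md rebuild speed limits.
# Values are in KiB/s per device. Dry run: nothing is changed by this function.
raid_speed_plan() {
    min=$1
    max=$2
    for v in "$min" "$max"; do
        case "$v" in
            ''|*[!0-9]*) echo "usage: raid_speed_plan MIN MAX (integers, KiB/s)" >&2; return 1 ;;
        esac
    done
    [ "$min" -le "$max" ] || { echo "MIN must not exceed MAX" >&2; return 1; }
    echo "sysctl -w dev.raid.speed_limit_min=$min"
    echo "sysctl -w dev.raid.speed_limit_max=$max"
}
```

For example, raid_speed_plan 50000 200000 prints two sysctl -w lines to review and run as root. Use sysctl dev.raid.speed_limit_min dev.raid.speed_limit_max to see the current values first.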