Saturday, June 17, 2006

RAIDframe Tricks

Life is funny sometimes. On Wednesday, a friend asked if I had experience with RAIDframe, the software RAID system in OpenBSD. He had a failed disk and wanted to talk about how to get RAIDframe to use the new drive as a main component rather than as a spare. Not two days later, I was planning to upgrade my last OpenBSD 3.8 server (at home) and noticed that my raid0 had a failed component.

My RAID sets are each 90 GB mirrors, spread across 3 disks. I know, I know: 3 disks? But one disk is a 250 GB drive and the other two are 90 GB drives. Each 90 GB drive is mirrored to a separate 90 GB partition on the 250 GB drive. The data destined for the mirror sets already lived on the 90 GB drives, and each 90 GB drive was over half full (so I couldn't copy ALL of the data onto one drive while setting up the RAID on the other). So, I used a method described in the man page to set up a broken RAID set and get all of my data onto my new mirrors.

So before I started to set up my mirrors, /, /usr, /tmp and /var all lived on partitions on wd0 (my 250 GB drive). /home was a regular 4.2BSD partition on wd1 that took most of the disk. /software was a regular 4.2BSD partition on wd2 that took most of the disk.

My first step was to create new 90 GB partitions on wd0 (wd0g & wd0h). I formatted wd0g as 4.2BSD and copied all of /software to it. Then I configured wd0h and wd2a as RAID partition types. I created my /etc/raid1.conf file:
START array
# numRow numCol numSpare
1 2 0

START disks
/dev/wd0h
/dev/wd2a

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level
128 1 1 1

START queue
fifo 100
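
For reference, bringing up a new set from a config file like this usually follows the first-time sequence in raidctl(8). The commands below are a sketch of that sequence, not a record of what was typed on this machine: the serial number is a made-up value, and the `run` and `create_mirror` helpers are hypothetical names added so the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry run by default: each command is printed, not executed.
# Set DRY_RUN=0 to run them for real (as root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# create_mirror CONF DEV SERIAL  (hypothetical helper)
create_mirror() {
    run raidctl -C "$1" "$2" &&   # force the initial configuration
    run raidctl -I "$3" "$2" &&   # write component labels; SERIAL is arbitrary
    run raidctl -iv "$2"          # initialize parity (for a mirror, sync both halves)
}

# 2006061701 is an invented serial number, used only for illustration.
create_mirror /etc/raid1.conf raid1 2006061701
```

After that, the new raid1 device still needs a disklabel and a filesystem before its partition can be mounted.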


After raid1 was created, I mounted raid1i as /software and copied the data from wd0g to it. But now I had a problem. Where do I put the data from /home while I create a mirror with wd0g and wd1a? I didn't have enough disk space. So, I created a raid0.conf that looked like this:
START array
# numRow numCol numSpare
1 2 0

START disks
/dev/wd0g
/dev/wd3a

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level
128 1 1 1

START queue
fifo 100


Then I changed the partition type of wd0g to RAID and created my raid0 with wd0g and a component I actually don't have in my system: wd3a. The RAID set configures properly, but in a "broken" (degraded) state. This method is described in the raidctl(8) manpage. Search for "Under certain circumstances".
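
As a rough sketch of what that looks like on the command line: the set is force-configured and labeled as usual, but, as I read raidctl(8), parity initialization has to wait until both real components are present, so it is left out here. The serial number and the `run` and `create_degraded_mirror` helpers are hypothetical; the script prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry run by default: each command is printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# create_degraded_mirror CONF DEV SERIAL  (hypothetical helper)
create_degraded_mirror() {
    run raidctl -C "$1" "$2" &&   # -C forces configuration despite the missing disk
    run raidctl -I "$3" "$2" &&   # label the component that does exist
    run raidctl -s "$2"           # status: the placeholder should show as failed
}

# 2006061700 is an invented serial number, used only for illustration.
create_degraded_mirror /etc/raid0.conf raid0 2006061700
```

The set is usable at this point, but it has no redundancy until a second real component is added and rebuilt.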

So, after raid0 was created, I copied /home (wd1a) to my new RAID partition and mounted raid0i as /home. This is where I messed up. I forgot to finish and add wd1a to the raidset. How could I have done that? I'm very lucky that I didn't have any problems with wd0 or I would have lost much of my data.

Yesterday, after upgrading to OpenBSD 3.9, I fixed raid0. I changed the partition type of wd1a to RAID. Then I added it as a hot spare (raidctl -a /dev/wd1a raid0). Since wd3a doesn't exist on my system, when I rebooted, it showed up as component1: failed. So, I failed the non-existent "failed" component and started regeneration to the hot spare with "raidctl -F component1 raid0". When I was done with that step, "raidctl -s raid0" showed:
Components:
/dev/wd0g: optimal
component1: spared
Spares:
/dev/wd1a: used_spare
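
A forgotten or failed component is easy to miss, so a check like this can be scripted. The filter below is a minimal sketch that assumes the "name: state" layout shown in the status listing above; `degraded_components` is a hypothetical helper name, and the real `raidctl -s` output contains more lines than the excerpt fed to it here.

```shell
#!/bin/sh
# degraded_components: read "raidctl -s" output on stdin and print every
# entry in the Components section whose state is not "optimal".
degraded_components() {
    awk '
        /Components:/ { in_comp = 1; next }   # start of the component list
        /Spares/      { in_comp = 0 }         # "Spares:" or "No Spares." ends it
        in_comp && NF >= 2 && $NF != "optimal" {
            sub(/:$/, "", $1)                 # drop the trailing colon from the name
            print $1, $NF
        }
    '
}

# Demo on the status excerpt from this post.
degraded_components <<'EOF'
Components:
/dev/wd0g: optimal
component1: spared
Spares:
/dev/wd1a: used_spare
EOF
```

On a live system the same filter could be fed with `raidctl -s raid0 | degraded_components`, for example from a nightly cron job, with any output treated as a warning.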


The last step here was to reconfigure raid0 with the proper raid0.conf (I changed wd3a to wd1a in /etc/raid0.conf). I unmounted /home, ran "raidctl -u raid0", ran "raidctl -c /etc/raid0.conf raid0" and remounted /home.
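
Put together, the whole repair can be sketched as the script below. The raidctl commands are the ones described in this post, except for `raidctl -S`, which is added to report reconstruction progress; note that `raidctl -c` takes the device name as its final argument. The `run`, `repair_mirror` and `reconfigure` helpers are hypothetical, `mount /home` assumes an /etc/fstab entry for /home, and the script prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry run by default: each command is printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# repair_mirror DEV NEW_COMPONENT FAILED_COMPONENT  (hypothetical helper)
repair_mirror() {
    run raidctl -a "$2" "$1" &&   # add the new partition as a hot spare
    run raidctl -F "$3" "$1" &&   # fail the dead component, rebuild onto the spare
    run raidctl -S "$1"           # report reconstruction progress
}

# reconfigure DEV CONF MOUNTPOINT  (hypothetical helper)
# Run this only after reconstruction has finished and CONF names the real disk.
reconfigure() {
    run umount "$3" &&
    run raidctl -u "$1" &&        # unconfigure the set
    run raidctl -c "$2" "$1" &&   # configure it again from the corrected file
    run mount "$3"                # assumes an fstab entry for MOUNTPOINT
}

repair_mirror raid0 /dev/wd1a component1
reconfigure raid0 /etc/raid0.conf /home
```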

Now my raidsets are working properly (what a relief).

2 comments:

  1. Mike,

    Not sure if you can help me or if this is appropriate, but I'm at my wits' end.

    Trying to get my OpenBSD 4.2 box RAID0 configuration setup. However, much to my chagrin, I can't seem to get the /dev/wd1b device to optimize.

    I have raidctl -a /dev/wd1b raid0
    I have raidctl -vF component0 raid0

    And below is the result afterwards:

    # raidctl -s raid0
    raid0 Components:
    component0: spared
    /dev/wd0b: optimal
    Spares:
    /dev/wd1b: used_spare
    Parity status: clean
    Reconstruction is 100% complete.
    Parity Re-write is 100% complete.
    Copyback is 100% complete.


    I reboot ... then get back to:

    component0 failed
    No Spares.

    Any ideas ..?

    # cat /root/raid0.conf
    ---Begin---

    START array
    1 2 0

    START disks
    /dev/wd0b
    /dev/wd1b

    START layout
    128 1 1 1

    START queue
    fifo 100

    ---End---
    #


    Thanks,
    John Dworske

  2. John,

    I'm going to take a shot in the dark and say that usually OpenBSD reserves the b slices for swap. Try using a different slice (like d) and see if that works for you.
