More Linux software RAID fun
OK, I know quite a few of you are probably looking at this thinking, "Hello, where has he been?" when I talk about playing around with software RAID under Ubuntu Linux, but in spite of all I've done supporting desktop systems and even small business servers, I'd never had an opportunity to set up a RAID array. First it was because "you don't want to do software raid" and I didn't have a card to support hardware RAID; then, when I first saw people raving about how good Linux software RAID is, I didn't have two free drives of equal size to work with. Well, the other day, working towards a storage system for a client, I got an Ubuntu system (Dapper 6.06.1) set up with software RAID and wanted to poke and prod and test some things out before it goes into a useful role.
After all, if a drive fails I want to have an inkling of how I'm going to replace it. I'm using the mdadm toolset for the software RAID and I've got to say it's impressive. The system (a Duron, ~1200MHz I think, with 512MB of memory) performs fairly well with the dual hard drive setup (two 250GB drives). I detailed the partitioning previously: 10MB /boot, 6GB swap, 40GB /, 100GB /home, 100GB /var. The choices there are debatably too big, but I want room for growth, especially in /, and /var may see a vmware image at some point. 6GB of swap is probably too much, but I've had systems that had too LITTLE swap and would rather not experience that again. (Manually creating a loop file to swap into….)
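For the curious, the emergency fix I'm alluding to looks roughly like this (the path and size here are just placeholders, not what I actually used):

    # create a 1GB file of zeros to use as emergency swap space
    dd if=/dev/zero of=/swapfile bs=1M count=1024
    chmod 600 /swapfile
    # format it as swap and turn it on
    mkswap /swapfile
    swapon /swapfile

Not something you want to be doing in a panic while the box is thrashing, hence the generous swap partition this time around.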
Yes, EVERYTHING is duplicated in a RAID1 setup across the two 250GB drives, so even /boot and swap are mirrored. This may be less than ideal for performance (RAID1 has to do two writes for every file save, but reads can get streamlined). The goal is to have a duplicate drive so that if one fails, the other can keep churning along until a replacement is in.
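As a sketch of what each mirrored partition looks like at the mdadm level (the device names here are illustrative, not my actual layout):

    # build a RAID1 device out of the matching partition on each drive
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/hda1 /dev/hdb1
    # check on the state of all arrays
    cat /proc/mdstat
    mdadm --detail /dev/md0

Each of the five partitions gets its own md device this way (md0, md1, and so on), and the filesystem or swap gets made on the md device, not on the raw partitions.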
For starters, the way I set this up you can make do with unequal size drives. In fact, right now I'm mirroring onto a 160GB drive (minus the /var partition). Each partition can have its own md (multi-disk) device, and mirroring can be carved out in interesting ways.

Among the things I wanted to test was a live failure of one drive, and it went remarkably well: I disconnected the data cable on the running system and the system didn't miss a beat. I will suggest that nobody try this at home with IDE. On replugging, the drive died. From what I've read, IDE technically doesn't support hotplugging, although I've seen it done (and done it myself in testbench situations). In theory, reconnecting the power rail and then the data cable should do no harm; in reality, in this case, the drive refused to spin back up. I spent the next hour shutting down and restarting, connecting the drive to various power sources to see if it would spin again. Let me repeat: DON'T TRY this unless you're prepared to lose a drive. For the purposes of the test, though, this went VERY well; mdadm noticed that the drive had "failed" and I rebooted several times without incident.
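If you want to run the same experiment without risking hardware, mdadm can fake the failure for you (device names again illustrative):

    # mark one half of the mirror as failed, then pull it from the array
    mdadm /dev/md0 --fail /dev/hdb1
    mdadm /dev/md0 --remove /dev/hdb1
    # later, add it (or a replacement) back and watch the resync
    mdadm /dev/md0 --add /dev/hdb1
    cat /proc/mdstat

The array runs degraded in between, just as it did when I yanked the cable, but your drive lives to tell about it.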
Then I started wondering: can you add disks to an existing array? (Turn a two-disk array into three, or more specifically, install a one-disk "array", which doesn't make much sense by itself, and then add a second…) The answer is yes. I set up a VMware environment for this with the Ubuntu alternate install disk (Dapper) and gave it just one drive initially, setting up software RAID with one active device. The installer warned me that I wouldn't be able to change the number of active devices, but post-install I was able to mdadm --grow /dev/md0 -n 2 and then activate the device I had by then created to take the second slot. So you could install with the intent of moving to RAID someday and then grow your array when the opportunity came (at least RAID1 and RAID5 can grow). A sketch of the sequence follows.
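Pieced together, what I did looked something like this (assuming the new drive shows up as /dev/hdb and has already been partitioned to match):

    # tell the array it should now have two active devices
    mdadm --grow /dev/md0 -n 2
    # add the new partition; it takes the empty slot and starts resyncing
    mdadm /dev/md0 --add /dev/hdb1
    # watch the rebuild progress
    watch cat /proc/mdstat

(-n is just shorthand for --raid-devices.) Between the grow and the add, the array is simply degraded, the same state it would be in after a drive failure.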
I have had other thoughts, though. For instance, there is hardware that supports hotplugging of drives for offsite backups of a spare drive… so what if you used a USB 2.0 drive in the array? You can do this as well; the software RAID layer doesn't care what KIND of devices you've added. Now, the machine I tested on (currently) has only a built-in USB 1.1 port, and the sync of the 40GB partition was estimated at 702 or so minutes, almost 1MB per second. USB 2.0 should be ~40x faster, which may come close to saturating the data rate of the hard drive. So I've resorted to doing the initial sync with the drive connected as a true IDE device, and will then test what happens when it's added back in via USB. (Do we go through the whole long process again, or will it quickly sync up just the differences?)
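One thing I've read about but haven't tried yet: mdadm's write-intent bitmap is supposed to make exactly this kind of re-add fast, since only the blocks that changed while the drive was out get resynced. If your mdadm and kernel support it (I haven't verified this on Dapper), enabling one looks like this; note that under Linux the USB drive would typically show up as a SCSI-style device such as /dev/sda:

    # add an internal write-intent bitmap to an existing array
    mdadm --grow /dev/md0 --bitmap=internal
    # with a bitmap in place, a returning member can be re-added
    # and only the out-of-date blocks get resynced
    mdadm /dev/md0 --re-add /dev/sda1

If that works as advertised, the drive-rotation-for-offsite-backup idea becomes a lot more practical.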
Yes, a lot of these questions I probably could have read up on, but experience is usually the best teacher. The only big frustration I've run into is getting the partitions the EXACT same size from one drive to the next. I've resorted to using fdisk's start and end cylinder information and had to start from the very beginning of the drives. Life will be much simpler if you have identical-size drives and partition them in exactly the same manner from the outset. Here are a few more links on the subject…
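If the drives really are identical, there's a shortcut I wish I'd reached for sooner: sfdisk can dump one drive's partition table and replay it onto the other. (Triple-check the device names before running this; getting them backwards would overwrite the good drive's partition table.)

    # copy the partition layout from the first drive to the second
    sfdisk -d /dev/hda | sfdisk /dev/hdb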
Someone who MOVED TO software RAID from hardware RAID. The Arch Linux wiki page on software RAID (and LVM).
Bottom line: software RAID is impressively good and easy to get started with (at least with the Ubuntu installer; I haven't yet tested the Mandriva installer or others). It's a shade more involved than a 3-5 click install, but the data redundancy should be worth it. If you've never given Linux software RAID a try, it might be worth a look. (You might even experiment in a VMware Server virtual machine first just to see how things behave.) The only caveat there is that I don't know how well VMware Server replicates the quirks of two IDE devices on the same chain.