RAID Array includes a missing member and cannot rebuild

My “old” Adaptec 2100s RAID controller had a SCSI drive fail over the weekend. I bought a new drive, but couldn’t get the array to rebuild. After searching Adaptec’s Knowledgebase, I found the solution:

In order to rebuild the array, the newly added drive must be assigned as a hot spare.

In the controller BIOS (SMOR), highlight the RAID controller listing beneath Configuration / Local and press Enter. Locate the new hard drive listed beneath the controller and choose “Action” > “Make Hotspare”. Select “file”, then “set system config” to initiate a rebuild to the newly added hot spare. Exit the utility and reboot the system. The new drive should take the place of the Missing Component and the array should rebuild.

If the array does not rebuild, make sure the newly added drive has a capacity equal to or larger than that of the other drives currently in the array.

Client | cygwin-rsyncd Reinstall

rsyncd quit running as a service, so I decided to reinstall it using the service.bat batch file. The install failed because rsyncd was already registered as a service. Delete the existing service first, then reboot:

  • c:\>sc delete rsyncd
  • Reboot the computer

After the reboot, run the service.bat file included in the cygwin-rsyncd.zip download.

Creating the Array Using mdadm

[root@backup ~]# mdadm -C -l 5 -n 4 /dev/md0 /dev/etherd/e1.[0-3]

This creates /dev/md0 as a RAID5 array consisting of /dev/etherd/e1.0, /dev/etherd/e1.1, /dev/etherd/e1.2, and /dev/etherd/e1.3.

Examining the options in this command (per man mdadm):

-C or --create, Create a new array.

-l 5 or --level=5, Set raid level 5.

When used with --create, options are: linear, raid0, 0, stripe, raid1, 1, mirror, raid4, 4, raid5, 5, raid6, 6, raid10, 10, multipath, mp, faulty. Obviously some of these are synonymous.

When used with --build, only linear, stripe, raid0, 0, raid1, multipath, mp, and faulty are valid.

Not yet supported with --grow.

-n 4 or --raid-devices=4, Sets the number of active devices in the array.

This, plus the number of spare devices (see below) must equal the number of component-devices (including “missing” devices) that are listed on the command line for --create. Setting a value of 1 is probably a mistake and so requires that --force be specified first. A value of 1 will then be allowed for linear, multipath, raid0 and raid1. It is never allowed for raid4 or raid5.
This number can only be changed using --grow for RAID1 arrays, and only on kernels which provide necessary support.
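
To make the short options easier to read, the same command can be spelled out with long options. This is only a sketch: it builds the command line as a string and prints it rather than running it, and the variable names (members, count, cmd) are mine, not anything mdadm defines. The --raid-devices count is derived from the device list so that the two cannot disagree.

```shell
#!/bin/sh
# Dry run only: assemble the long-option form of the mdadm command and print it.
# Nothing here touches /dev/md0 or the AoE devices.
members=""
for slot in 0 1 2 3; do
    members="$members /dev/etherd/e1.$slot"
done

# Count the devices so --raid-devices always matches the list.
count=$(set -- $members; echo $#)

cmd="mdadm --create /dev/md0 --level=5 --raid-devices=$count$members"
echo "$cmd"
```

Once the printed line looks right, it can be pasted at the root prompt in place of the short form.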

New Drive Setup in Coraid Rack

Drives need initialization before use. Use minicom to access the Coraid shelf over the serial port:

[root@backup ~]# minicom

Welcome to minicom 2.00.0

OPTIONS: History Buffer, F-key Macros, Search History Buffer, I18n
Compiled on Feb 21 2005, 19:32:30.

Press CTRL-A Z for help on special keys

Issue the Coraid show command in the minicom terminal window:

SATA shelf 1> show -l
1.0 400.088GB up
1.1 400.088GB up
1.2 400.088GB up
1.3 400.088GB up

Issue the jbod Coraid command in the minicom terminal window:

SATA shelf 1> jbod 1.0-3
making 0
making 1
making 2
making 3

Issue the list Coraid command to verify the jbod creation:

SATA shelf 1> list -l
0 400.088GB online
0.0 400.088GB raidl
0.0.0 normal 400.088GB 1.0
1 400.088GB online
1.0 400.088GB raidl
1.0.0 normal 400.088GB 1.1
2 400.088GB online
2.0 400.088GB raidl
2.0.0 normal 400.088GB 1.2
3 400.088GB online
3.0 400.088GB raidl
3.0.0 normal 400.088GB 1.3

Exit the Coraid shelf and minicom (CTRL-A Z X)

[root@backup ~]# modprobe aoe
[root@backup ~]# aoe-stat
e1.0 400.088GB eth0 up
e1.1 400.088GB eth0 up
e1.2 400.088GB eth0 up
e1.3 400.088GB eth0 up
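
Before creating the array it is worth confirming that every expected AoE device is listed and up. A minimal sketch, assuming aoe-stat prints one device per line with the state in the last column, as in the output above; the function name aoe_all_up is hypothetical, and the sample text is simply that output copied in so the check can be tried without a shelf attached.

```shell
#!/bin/sh
# Read aoe-stat style lines on stdin. Succeed only if exactly the expected
# number of devices is listed and every one reports "up" in the last column.
aoe_all_up() {
    expected=$1
    awk -v want="$expected" '
        NF > 0 { total++; if ($NF == "up") up++ }
        END    { exit !(total == want && up == want) }
    '
}

# Sample copied from the aoe-stat output shown above.
sample='e1.0 400.088GB eth0 up
e1.1 400.088GB eth0 up
e1.2 400.088GB eth0 up
e1.3 400.088GB eth0 up'

if printf '%s\n' "$sample" | aoe_all_up 4; then
    echo "all 4 devices up"
fi
```

On the backup server itself the check would be fed from the real command: aoe-stat | aoe_all_up 4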

Create the array

[root@backup ~]# mdadm -C -l 5 -n 4 /dev/md0 /dev/etherd/e1.[0-3]
mdadm: array /dev/md0 started.

Create the physical volume

[root@backup ~]# pvcreate /dev/md0
Physical volume "/dev/md0" successfully created


Create the pool

[root@backup ~]# vgcreate pool0 /dev/md0
Volume group "pool0" successfully created

Verify the pool

[root@backup ~]# vgdisplay pool0
--- Volume group ---
VG Name pool0
System ID
Format lvm2
Metadata Areas 1
Metadata Sequence No 1
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 0
Open LV 0
Max PV 0
Cur PV 1
Act PV 1
VG Size 1.09 TB
PE Size 4.00 MB
Total PE 286165
Alloc PE / Size 0 / 0
Free PE / Size 286165 / 1.09 TB
VG UUID f1emjX-S33L-Y3Av-7Xlr-YSgM-Dc4u-h4t0Xw
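
The VG Size figure follows directly from Total PE and PE Size. A small sketch of the arithmetic, using only the numbers from the vgdisplay output above and assuming LVM is reporting binary units (so "MB" and "TB" here are MiB and TiB); the variable names are mine.

```shell
#!/bin/sh
# Recompute the volume group size from the extent count.
total_pe=286165      # "Total PE" from vgdisplay
pe_size_mib=4        # "PE Size" of 4.00 MB, taken as MiB

size_mib=$((total_pe * pe_size_mib))

# 1 TiB = 1024 * 1024 MiB; awk handles the fractional part.
size_tib=$(LC_ALL=C awk -v m="$size_mib" 'BEGIN { printf "%.2f", m / 1048576 }')

echo "pool0: ${size_mib} MiB (${size_tib} TiB)"
# prints: pool0: 1144660 MiB (1.09 TiB)
```

This also squares with the hardware: a four-disk RAID5 array keeps three disks' worth of data, and 3 x 400.088 GB is about 1.2 decimal terabytes, which is roughly 1.09 TiB.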

Create the logical volume

[root@backup ~]# lvcreate -l 286165 -n vol1 pool0
Logical volume "vol1" created

Format the volume

[root@backup ~]# mkfs -t ext3 /dev/pool0/vol1
mke2fs 1.35 (28-Feb-2004)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
146522112 inodes, 293032960 blocks
14651648 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=293601280
8943 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
102400000, 214990848

Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 24 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.
[root@backup ~]#
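
As a sanity check, the block count that mke2fs reports can be derived from the number of extents given to lvcreate. A short sketch using only figures from the output above (286165 extents of 4 MB each, 4096-byte blocks, 5% reserved); the variable names are mine, and it assumes the shell does 64-bit arithmetic.

```shell
#!/bin/sh
# Cross-check the mke2fs figures against the logical volume size.
total_pe=286165                    # extents passed to lvcreate -l
pe_bytes=$((4 * 1024 * 1024))      # 4 MiB per extent
block_bytes=4096                   # "Block size=4096" from mke2fs

expected_blocks=$((total_pe * pe_bytes / block_bytes))
reserved_blocks=$((expected_blocks * 5 / 100))

echo "blocks:   $expected_blocks"
echo "reserved: $reserved_blocks"
# prints:
# blocks:   293032960
# reserved: 14651648
```

Both numbers match the mke2fs output, so the filesystem covers the whole logical volume.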

Mount the volume

[root@backup ~]# mount /dev/pool0/vol1 /mnt/vol1

Rsync the backuppc config files

rsync -av -v /home/config/ /home/backuppc/

Note: If you get the following error: rsync: mkdir "/home/backuppc" failed: File exists (17), recreate the link from /home/backuppc to /mnt/vol1/backuppc.
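
Recreating the link can be done with ln. The helper below is a sketch, not part of BackupPC: the name relink is hypothetical, and it assumes /home/backuppc is meant to be a symbolic link pointing at /mnt/vol1/backuppc. It replaces an existing or dangling symlink but refuses to overwrite a real file or directory.

```shell
#!/bin/sh
# Point a symlink at a target. An existing symlink (even a dangling one) is
# replaced; a real file or directory at the link path is left alone.
relink() {
    target=$1
    link=$2
    if [ -e "$link" ] && [ ! -L "$link" ]; then
        echo "refusing: $link exists and is not a symlink" >&2
        return 1
    fi
    # -s symbolic, -f replace an existing link, -n do not follow it
    ln -sfn "$target" "$link"
}

# Intended use, as root, once /mnt/vol1 is mounted:
#   relink /mnt/vol1/backuppc /home/backuppc
```

Running ls -l /home/backuppc afterwards should show the link pointing at /mnt/vol1/backuppc.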

Start BackupPC

[root@backup home]# /mnt/vol1/backuppc/backuppc start
Starting BackupPC: [ OK ]

Using a Disk from Another Decommissioned RAID Array

Before you can reuse a disk from another RAID array, the disk’s superblock must be zeroed with the following command:

# mdadm --zero-superblock /dev/hdc1 --force
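Note: The superblock here is the md RAID metadata that mdadm stores on each member device to record which array the device belongs to. It is only loosely comparable to the file allocation table (FAT) on a Microsoft Windows filesystem, which is filesystem metadata rather than RAID metadata.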

Reloading the Hosts File

# /mnt/vol1/backuppc/backuppc reload

Installation | init.d Script

Per the README located in the extraction directory /root/backuppc/BackupPC-2.1.2/init.d, I did the following:

RedHat Linux:
============

When configure.pl is run, the script linux-backuppc is created. It
should be copied to /etc/init.d/backuppc:

cp linux-backuppc /etc/init.d/backuppc

After copying it, you can test it by running these commands as root:

/etc/init.d/backuppc start
/etc/init.d/backuppc status
/etc/init.d/backuppc stop

You should then run the following commands as root:

chkconfig --add backuppc
chkconfig --level 345 backuppc on
chkconfig --list backuppc

This will auto-start backuppc at run levels 3, 4 and 5.

Client | cygwin-rsyncd

Install cygwin-rsyncd on the Windows clients. I tried using SMB first, but it didn’t work. Per the BackupPC FAQ, rsyncd works better.

Windows XP Firewall Issue
Make sure that the Windows XP firewall, if turned on, allows inbound TCP connections on port 873 (rsync).

Edit rsyncd.conf on the workstation

[CAM]
#
# Exact DOS style path to the file or directory to be rsync accessible
#
path = e:/CAM

#
# A short description of the module. This is what is printed when
# using rsync to “browse” the server for what modules are available.
#
comment = CAM Data

Make sure that the [Module Name] matches this client’s config.pl $Conf{RsyncShareName}. Otherwise the backup will fail.
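
A quick way to see which module names a client actually exposes is to pull the bracketed section headers out of its rsyncd.conf and compare them with $Conf{RsyncShareName} by eye. A minimal sketch; the function name list_rsync_modules is hypothetical, and the file path is whatever the client uses, so it is passed as an argument rather than assumed. It can be run from a Cygwin shell on the client or against a copy of the file.

```shell
#!/bin/sh
# Print the module names (the text inside [brackets]) defined in an
# rsyncd.conf-style file, one per line. Trailing whitespace, including a
# carriage return from DOS line endings, is tolerated.
list_rsync_modules() {
    sed -n 's/^[[:space:]]*\[\([^]]*\)\][[:space:]]*$/\1/p' "$1"
}

# Usage (path is an example, pass the real location of the file):
#   list_rsync_modules rsyncd.conf
```

For the configuration above this prints CAM, which is the value the client's config.pl needs in $Conf{RsyncShareName}.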

CGI Authentication | AuthUserFile

In the Apache httpd.conf file, the backuppc directory section includes the AuthUserFile directive. This allows for user / password authentication to the BackupPC CGI application. To add a user for access, issue the following command as root in /var/www/cgi-bin:

htpasswd -c .backuppcpsswd username

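You’ll be prompted for a password, which is stored in hashed form in the .backuppcpsswd file. The -c option creates the file and overwrites any existing one, so leave it off when adding further users.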

Installation | httpd.conf

Edit the Apache httpd.conf file located in /etc/httpd/conf:

# The following was added for BackupPC, JCSUOMI 05/01/2006 @ 12:31PM

user backuppc
group backuppc
ServerName backup

<Directory /var/www/cgi-bin/>
SetHandler perl-script
PerlResponseHandler ModPerl::Registry
PerlOptions +ParseHeaders
Options +ExecCGI
Order deny,allow
Deny from all
Allow from 172.16.0 127.0.0
AllowOverride Indexes AuthConfig
AuthName "Backup Admin"
AuthType Basic
AuthUserFile /var/www/cgi-bin/.backuppcpsswd
Require valid-user
</Directory>

Note: I had problems with an Apache error stating that ExecCGI wasn’t enabled. The cause turned out to be a “nested” Options directive elsewhere in httpd.conf, in the stock cgi-bin section below. I commented out that Options line and the problem was resolved.

ScriptAlias /cgi-bin/ "/var/www/cgi-bin/"

#
# "/var/www/cgi-bin" should be changed to whatever your ScriptAliased
# CGI directory exists, if you have that configured.
#
<Directory "/var/www/cgi-bin">
AllowOverride None
# Disabled the following line for backuppc. JCSUOMI 05/01/2006
# Options None
Order allow,deny
Allow from all
</Directory>

Restart Apache using the following command:

apachectl restart