Troubleshooting RAID 1 in Solstice DiskSuite Software

I found this in my notes folder, compiled by a former colleague back in the days when we supported Sun servers.

Database Replica Errors

Problem: State database is corrupted or unavailable
Cause: Disk failure, disk I/O error
Symptom: Error message at boot time if the available state database replicas are <= 50% of the total number of replicas. The system comes up in single-user mode.
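
The 50% rule in the symptom above can be written as a small check. This is only a sketch: the function name replica_boot_mode and its two arguments are hypothetical, and it encodes nothing beyond the rule as stated here (half or fewer replicas available means single-user mode).

```sh
# replica_boot_mode TOTAL AVAILABLE   (hypothetical helper, not a DiskSuite command)
# Prints "multi-user" when more than half of the replicas are available,
# otherwise "single-user", following the <= 50% rule stated above.
replica_boot_mode() {
    total=$1
    available=$2
    # "available > total / 2", written as a multiplication so that
    # integer division does not round an odd total the wrong way.
    if [ $((available * 2)) -gt "$total" ]; then
        echo "multi-user"
    else
        echo "single-user"
    fi
}
```

For example, with four replicas split evenly across two disks, losing one disk leaves 2 of 4 available, so replica_boot_mode 4 2 prints single-user. That is the situation the steps below walk through.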

Suggested steps to follow:

1. At the ok prompt, issue the boot command. The system will enter single-user mode because of the broken database replicas.

ok > boot
...
Hostname: host1
metainit: host1: stale databases
Insufficient metadevice database replicas located.
Use metadb to delete databases which are broken.
Ignore any "Read-only file system" error messages.
Reboot the system when finished to reload the metadevice
database.
After reboot, repair any broken database replicas which were
deleted.
Type Ctrl-d to proceed with normal startup,
(or give root password for system maintenance): 
Entering System Maintenance Mode.

2. Use the metadb command to look at the metadevice state database. You can see which state database replicas are not available -- they are marked by "unknown" and the M flag.

# metadb -i
flags      first blk      block count
a m  p lu    16           1034                /dev/dsk/c0t0d0s7
a    p  l    1050         1034                /dev/dsk/c0t0d0s7
M    p       unknown      unknown             /dev/dsk/c0t1d0s7
M    p       unknown      unknown

3. Delete the state database replicas on the bad disk using the -d option. At this point, the root (/) file system is read-only. You can ignore the mddb.cf error messages:

# metadb  -d  -f  c0t1d0s7
metadb: demo: /etc/opt/SUNWmd/mddb.cf.new: Read-only file system.

Verify deletion:

# metadb  -i
flags        first blk       block count
a m  p  lu   16              1034            /dev/dsk/c0t0d0s7
a    p  l    1050            1034            /dev/dsk/c0t0d0s7

4. Reboot the system.

5. Use the metadb command to add back the state database replicas and verify that these replicas are correct.

# metadb -a -c 2 c0t1d0s7
# metadb -i
flags        first blk    block count
a m  p  luo  16           1034         /dev/dsk/c0t0d0s7
a    p  luo  1050         1034         /dev/dsk/c0t0d0s7
a       u    16           1034         /dev/dsk/c0t1d0s7
a       u    1050         1034         /dev/dsk/c0t1d0s7
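
The manual checks in steps 2 and 5 (spotting broken replicas, then confirming that each disk has its replicas back) can be scripted. The sketch below is an assumption-laden helper, not part of DiskSuite: replica_summary is a hypothetical name, and it relies only on what the listings above show, namely that replica lines end in a /dev/ path and that broken replicas carry the M flag and "unknown" block fields.

```sh
# replica_summary: read `metadb -i` style output on stdin and print,
# per device, how many replicas look healthy and how many look broken.
# Hypothetical helper; the parsing rules are assumptions taken from the
# sample listings in this article.
replica_summary() {
    awk '
        # Replica lines end in a /dev/ path; header lines do not.
        $NF ~ /^\/dev\// {
            dev = $NF
            seen[dev] = 1
            # Broken replicas show the M flag and "unknown" block fields.
            if ($1 == "M" || $0 ~ /unknown/)
                bad[dev]++
            else
                ok[dev]++
        }
        END {
            for (dev in seen)
                printf "%s ok=%d bad=%d\n", dev, ok[dev], bad[dev]
        }
    ' | sort
}
```

A typical use would be to pipe metadb -i into replica_summary. Lines that do not name a device (such as a replica row whose device column is missing) are skipped rather than guessed at.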

Metadevice Errors

Problem: Sub-mirrors are out of sync in "Needs maintenance" state
Cause: Disk problem or failure, improper shutdown, communication problems between two mirrored disks
Symptom: "Needs maintenance" errors in metastat output

Suggested steps to follow:

1. Replace the faulty disk.

2. Partition the new disk so that its layout matches the original disk. If you also need to recover the state database replicas, follow the steps in the section above.

3. Log in to the Solaris OS and issue the metastat command. You will see the results as shown below:

# metastat

 d0: Mirror
   Submirror 0: d10
     State: Needs maintenance
   Submirror 1: d20
     State: Okay
...
d10: Submirror of d0
   State: Needs maintenance
   Invoke: "metareplace d0 /dev/dsk/c0t3d0s0 <new device>"
   Size: 47628 blocks
   Stripe 0:
Device              Start Block  Dbase State        Hot Spare
/dev/dsk/c0t3d0s0          0     No    Maintenance

d20: Submirror of d0
   State: Okay
   Size: 47628 blocks
   Stripe 0:
Device              Start Block  Dbase State        Hot Spare
/dev/dsk/c0t2d0s0          0     No    Okay

4. The output shows that slice c0t3d0s0 in submirror d10 needs maintenance. Because the faulty disk was replaced at the same target, use metareplace with the -e option to enable the slice and resync it:

# metareplace -e d0 c0t3d0s0
Device /dev/dsk/c0t3d0s0 is enabled

Or, if the replacement disk is at a different target, name the new slice as a third argument so that metareplace swaps it in for the faulty one:

# metareplace d0 c0t3d0s0 <new device>
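
Reading through metastat output by eye gets tedious on a host with many mirrors, so the check in step 3 can be sketched as a filter. The function name components_needing_maintenance is hypothetical, and the parsing assumes the layout shown in the sample output above: a "dNN: Submirror of dN" line, followed by component rows of device, start block, dbase flag, and state.

```sh
# components_needing_maintenance: read `metastat` style output on stdin
# and print "mirror submirror device" for each component in Maintenance.
# Hypothetical helper; the column positions are assumptions taken from
# the sample output in this article.
components_needing_maintenance() {
    awk '
        # Track the submirror being described, e.g. "d10: Submirror of d0".
        $2 == "Submirror" && $3 == "of" {
            name = $1
            sub(/:$/, "", name)   # strip the trailing colon from "d10:"
            current = name
            parent = $4
        }
        # Component rows: device, start block, dbase, state, [hot spare].
        $1 ~ /^\/dev\// && $4 == "Maintenance" {
            print parent, current, $1
        }
    '
}
```

Each output line names the mirror, the submirror, and the slice, which is the information step 4 needs when choosing the arguments for metareplace. The sketch only reports; it deliberately does not run metareplace itself.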
