How do you rebuild a hardware RAID array from the operating system? Is it possible? Yes. Using MegaCLI or lsiutil, we can rebuild hardware RAID arrays for non-OS disks from within the OS. In my case, we lost one of the HDDs that was part of a RAID 0 array. It contained non-critical data, and we planned to restore the data from backup after replacing the faulty drive. RAID 0 will not rebuild automatically, since the array fails on a single disk failure. You need to construct the RAID 0 array from scratch after replacing the drive. Let's walk through such a scenario here.
- Operating System: Red Hat Enterprise Linux 6.0
- Hardware: x86
1. List the available SCSI devices.
UA-RHEL-6# lsscsi
[1:0:1:0]  disk  CISCO-UA  TBE2846RC       SC19  -
[1:0:2:0]  disk  CISCO-UA  TBE2846RC       SC19  -
[1:0:3:0]  disk  CISCO-UA  TBE2846RC       SC19  -
[1:0:4:0]  disk  CISCO-UA  TBE2846RC       SC19  -
[1:0:5:0]  disk  CISCO-UA  TBE2846RC       SC19  /dev/sdf
[1:0:6:0]  disk  CISCO-UA  TBE2846RC       SC19  /dev/sdg
[1:0:7:0]  disk  CISCO-UA  TBE2846RC       SC19  /dev/sdh
[1:1:1:0]  disk  LSILOGIC  Logical Volume  3000  /dev/sdj
[1:1:3:0]  disk  LSILOGIC  Logical Volume  3000  /dev/sdi
UA-RHEL-6#
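The entries showing “-” in the last column have no block device node assigned. A quick way to filter them out of the lsscsi output is sketched below (an illustrative helper, not part of lsscsi itself; pipe real `lsscsi` output into it):

```shell
# Print the SCSI addresses of lsscsi entries that have no /dev node
# (i.e. the last column is "-"). Reads lsscsi output on stdin.
missing_nodes() {
    awk '$NF == "-" { print $1 }'
}

# Example usage:  lsscsi | missing_nodes
```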
At the OS level, we can see that /dev/sdi has failed and the filesystem was showing I/O errors. From the hardware console, we can see that one of the HDDs that was part of the RAID 0 array has failed. We opened a case with the vendor to replace the HDD. The hardware vendor suggested rebuilding the RAID 0 array that failed due to the disk failure.
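Before opening the vendor case, you can confirm the failure from the kernel log. A minimal sketch (sdi is the failed device in this example, and the grep patterns are assumptions about typical kernel error messages):

```shell
# Filter kernel messages for I/O errors on a given disk.
# Reads log text on stdin, e.g.:  dmesg | disk_errors sdi
disk_errors() {
    grep -i "$1" | grep -iE "i/o error|medium error"
}
```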
Remove the failed device from the OS device tree using “echo 1 > /sys/block/sdi/device/delete”. This removes /dev/sdi from the system.
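The sysfs writes (detach the failed disk, then rescan the SCSI host so the replacement can be detected later) can be wrapped as below. This is a sketch: sdi and host1 are assumptions taken from the lsscsi output above, and the sysfs root is a parameter so the function can be dry-run against a fake directory tree.

```shell
# Detach a failed disk from the kernel and rescan its SCSI host.
# Args: device name, host name, optional sysfs root (default /sys).
remove_and_rescan() {
    dev=$1; host=$2; sys=${3:-/sys}
    echo 1 > "$sys/block/$dev/device/delete"          # drop the failed disk
    echo "- - -" > "$sys/class/scsi_host/$host/scan"  # rescan channel/id/lun
}

# Example (run as root):  remove_and_rescan sdi host1
```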
2. Verify the volume status. Use the “lsiutil” to check the volume status.
UA-RHEL-6# lsiutil
LSI Logic MPT Configuration Utility, Version 1.60, July 21, 2010
1 MPT Ports found
Port Name Chip Vendor/Type/Rev MPT Rev Firmware Rev IOC
1. /proc/mpt/ioc1 LSI Logic SAS1086E B3 105 011e0a00 0 --> Select the controller by selecting "1".
Select a device: [1 or 0 to quit] 1
1. Identify firmware, BIOS, and/or FCode
2. Download firmware (update the FLASH)
4. Download/erase BIOS and/or FCode (update the FLASH)
8. Scan for devices
10. Change IOC settings (interrupt coalescing)
13. Change SAS IO Unit settings
16. Display attached devices
20. Diagnostics
21. RAID actions --------------> Select the RAID actions by selecting "21".
22. Reset bus
23. Reset target
42. Display operating system names for devices
45. Concatenate SAS firmware and NVDATA files
59. Dump PCI config space
60. Show non-default settings
61. Restore default settings
66. Show SAS discovery errors
69. Show board manufacturing information
97. Reset SAS link, HARD RESET
98. Reset SAS link
99. Reset port
e Enable expert mode in menus
p Enable paged mode
w Enable logging
Main menu, select an option: [1-99 or e/p/w or 0 to quit] 21
1. Show volumes ------------------------------> Check the volume status by selecting "1"
2. Show physical disks
3. Get volume state
4. Wait for volume resync to complete
23. Replace physical disk
26. Disable drive firmware update mode
27. Enable drive firmware update mode
30. Create volume
31. Delete volume
32. Change volume settings
33. Change volume name
50. Create hot spare
51. Delete hot spare
99. Reset port
e Enable expert mode in menus
p Enable paged mode
w Enable logging
RAID actions menu, select an option: [1-99 or e/p/w or 0 to quit] 1
1 volumes are active, 2 physical disks are active
Volume 0 is Bus 0 Target 3, Type IS (Integrated Striping)
Volume Name:
Volume WWID: 0ed4cb5783a5b6dg
Volume State: failed, enabled
Volume Settings: write caching disabled, auto configure
Volume draws from Hot Spare Pools: 0
Volume Size 418164 MB, Stripe Size 64 KB, 3 Members
Member 0 is PhysDisk 4 (Bus 0 Target 9)
Member 1 is PhysDisk 3 (Bus 0 Target 11)
Member 2 is PhysDisk 0 ( - )
RAID actions menu, select an option: [1-99 or e/p/w or 0 to quit] ------> To Quit , enter "0" .
Here we can see that “Volume 0” is in a failed state. Physical Disk 0 is missing and needs to be replaced by the hardware vendor.
Once they replace the disk, the volume state will remain the same, but you will be able to see the newly inserted disk as shown below.
RAID actions menu, select an option:  [1-99 or e/p/w or 0 to quit] 1
1 volumes is active, 3 physical disks are active
Volume 0 is Bus 0 Target 3, Type IS (Integrated Striping)
  Volume Name:
  Volume WWID:  0ed4cb5783a5b6dg
  Volume State:  failed, enabled
  Volume Settings:  write caching disabled, auto configure
  Volume draws from Hot Spare Pools:  0
  Volume Size 418164 MB, Stripe Size 64 KB, 3 Members
  Member 0 is PhysDisk 4 (Bus 0 Target 9)
  Member 1 is PhysDisk 3 (Bus 0 Target 11)
  Member 2 is PhysDisk 0 (Bus 0 Target 5)
RAID actions menu, select an option:  [1-99 or e/p/w or 0 to quit]
In the above output, we can see that the system is now able to see “Target 5”, which was not available prior to the disk replacement. At this point all three disks are available for the RAID 0 array, but the array is still in a failed state. So let me delete the failed array.
3. Delete the failed array. From lsiutil -> RAID actions ->
RAID actions menu, select an option:  [1-99 or e/p/w or 0 to quit]
1. Show volumes
2. Show physical disks
3. Get volume state
4. Wait for volume resync to complete
23. Replace physical disk
26. Disable drive firmware update mode
27. Enable drive firmware update mode
30. Create volume
31. Delete volume --------------------------------------> Delete the volume.
32. Change volume settings
33. Change volume name
50. Create hot spare
51. Delete hot spare
99. Reset port
e Enable expert mode in menus
p Enable paged mode
w Enable logging
RAID actions menu, select an option:  [1-99 or e/p/w or 0 to quit] 31
Volume 0 is Bus 0 Target 3, Type IS (Integrated Striping)
Volume:  [0-1 or RETURN to quit] 0
All data on Volume 0 will be lost!
Are you sure you want to continue?  [Yes or No, default is No] yes
Zero the first block of all volume members?  [Yes or No, default is No]
Volume 0 is being deleted
RAID ACTION returned IOCLogInfo = 00000001
RAID actions menu, select an option:  [1-99 or e/p/w or 0 to quit]
4. Re-create the new RAID 0 array with the three physical disks.
RAID actions menu, select an option: [1-99 or e/p/w or 0 to quit]
1. Show volumes
2. Show physical disks
3. Get volume state
4. Wait for volume resync to complete
23. Replace physical disk
26. Disable drive firmware update mode
27. Enable drive firmware update mode
30. Create volume
31. Delete volume
32. Change volume settings
33. Change volume name
50. Create hot spare
51. Delete hot spare
99. Reset port
e Enable expert mode in menus
p Enable paged mode
w Enable logging
RAID actions menu, select an option: [1-99 or e/p/w or 0 to quit] 30
B___T___L Type Vendor Product Rev Disk Blocks Disk MB
1. 0 3 0 Disk CISCO-UA TBE2846RC SC16 286748000 140013
2. 0 5 0 Disk CISCO-UA TBE2846RC SC16 286748000 140013
3. 0 6 0 Disk CISCO-UA TBE2846RC SC16 286748000 140013
4. 0 7 0 Disk CISCO-UA TBE2846RC SC16 286748000 140013
5. 0 8 0 Disk CISCO-UA TBE2846RC SC16 286748000 140013
6. 0 11 0 Disk CISCO-UA TBE2846RC-89 B63K 286748000 140013
To create a volume, select 1 or more of the available targets
select 3 to 10 targets for a mirrored volume
select 1 to 10 targets for a striped volume
Select a target: [1-6 or RETURN to quit] 3 -------- > Enter the first disk Number
Select a target: [1-6 or RETURN to quit] 4 -------- > Enter the second disk Number
Select a target: [1-6 or RETURN to quit] 6 -------- > Enter the Third disk Number
Select a target: [1-6 or RETURN to quit] ---------> Just Press Enter
3 physical disks were created
Select volume type: [0=Mirroring, 1=Striping, default is 0] 1
Select volume size: [1 to 418164 MB, default is 418164]
A stripe size of 64 KB will be used
Enable write caching: [Yes or No, default is No]
Zero the first and last blocks of the volume? [Yes or No, default is No]
Skip initial volume resync? [Yes or No, default is No]
Volume was created
RAID actions menu, select an option: [1-99 or e/p/w or 0 to quit]
5. Check the volume status.
1. Show volumes
2. Show physical disks
3. Get volume state
4. Wait for volume resync to complete
23. Replace physical disk
26. Disable drive firmware update mode
27. Enable drive firmware update mode
30. Create volume
31. Delete volume
32. Change volume settings
33. Change volume name
50. Create hot spare
51. Delete hot spare
99. Reset port
e Enable expert mode in menus
p Enable paged mode
w Enable logging
RAID actions menu, select an option:  [1-99 or e/p/w or 0 to quit] 1
1 volume is active, 3 physical disks are active
Volume 1 is Bus 0 Target 6, Type IS (Integrated Striping)
  Volume Name:
  Volume WWID:  08703748d1be5b0e
  Volume State:  optimal, enabled
  Volume Settings:  write caching disabled, auto configure
  Volume Size 418164 MB, Stripe Size 64 KB, 3 Members
  Member 0 is PhysDisk 0 (Bus 0 Target 9)
  Member 1 is PhysDisk 3 (Bus 0 Target 7)
  Member 2 is PhysDisk 4 (Bus 0 Target 11)
RAID actions menu, select an option:  [1-99 or e/p/w or 0 to quit] 0
Main menu, select an option:  [1-99 or e/p/w or 0 to quit] 0
We have successfully re-created the RAID 0 volume using lsiutil.
6. Check the new logical RAID 0 volume using lsscsi.
UA-RHEL-6# lsscsi
[1:0:1:0]  disk  CISCO-UA  TBE2846RC       SC19  -
[1:0:2:0]  disk  CISCO-UA  TBE2846RC       SC19  -
[1:0:3:0]  disk  CISCO-UA  TBE2846RC       SC19  -
[1:0:4:0]  disk  CISCO-UA  TBE2846RC       SC19  -
[1:0:5:0]  disk  CISCO-UA  TBE2846RC       SC19  /dev/sdi
[1:0:6:0]  disk  CISCO-UA  TBE2846RC       SC19  /dev/sdg
[1:0:7:0]  disk  CISCO-UA  TBE2846RC       SC19  /dev/sdh
[1:1:1:0]  disk  LSILOGIC  Logical Volume  3000  /dev/sdj
[1:1:3:0]  disk  LSILOGIC  Logical Volume  3000  /dev/sdf
UA-RHEL-6#
/dev/sdi has been successfully replaced by /dev/sdf. “LSILOGIC Logical Volume” represents the RAID logical volume. Create a new filesystem on it and restore the lost data from backup.
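The final step can be sketched as below. The device name /dev/sdf, the ext4 filesystem type, the mount point, and the tar backup path are all assumptions; substitute your own.

```shell
# Create a fresh filesystem on the rebuilt RAID 0 volume, mount it,
# and restore the data from a tar backup archive.
restore_volume() {
    dev=$1; mnt=$2; backup=$3
    mkfs.ext4 -F "$dev"           # new filesystem (the old data is gone anyway)
    mkdir -p "$mnt"
    mount "$dev" "$mnt"
    tar -xpf "$backup" -C "$mnt"  # restore files, preserving permissions
}

# Example (run as root):  restore_volume /dev/sdf /data /backup/data.tar
```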
Hope this article is informative to you. Thank you for visiting UnixArena.