How to rebuild a hardware RAID array from the operating system? Is it possible? Yes. Using MegaCLI or lsiutil, we can rebuild a hardware RAID array for non-OS disks from within the OS. In my case, we lost one of the HDDs that was part of a RAID 0 array. It contained non-critical data, and we planned to restore the data from backup after replacing the faulty drive. RAID 0 will not rebuild automatically, since the array fails on a single disk failure. You need to construct the RAID 0 array from scratch after replacing the drive. Let's walk through such a scenario here.
- Operating System: Red Hat Enterprise Linux 6.0
- Hardware: X86
1. List the available SCSI devices.
UA-RHEL-6# lsscsi
[1:0:1:0]  disk  CISCO-UA  TBE2846RC       SC19  -
[1:0:2:0]  disk  CISCO-UA  TBE2846RC       SC19  -
[1:0:3:0]  disk  CISCO-UA  TBE2846RC       SC19  -
[1:0:4:0]  disk  CISCO-UA  TBE2846RC       SC19  -
[1:0:5:0]  disk  CISCO-UA  TBE2846RC       SC19  /dev/sdf
[1:0:6:0]  disk  CISCO-UA  TBE2846RC       SC19  /dev/sdg
[1:0:7:0]  disk  CISCO-UA  TBE2846RC       SC19  /dev/sdh
[1:1:1:0]  disk  LSILOGIC  Logical Volume  3000  /dev/sdj
[1:1:3:0]  disk  LSILOGIC  Logical Volume  3000  /dev/sdi
UA-RHEL-6#
At the OS level, we can see that /dev/sdi failed and the filesystem was throwing I/O errors. From the hardware console, we can see that one of the HDDs that was part of the RAID 0 array has failed. We opened a case with the vendor to replace the HDD. The hardware vendor suggested rebuilding the RAID 0 array that failed due to the disk failure.
Remove the failed device from the OS device tree using “echo 1 > /sys/block/sdi/device/delete”. Here, /dev/sdi will be removed from the system.
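The removal step above can be wrapped in a small sketch with a safety guard, since writing to the sysfs delete node immediately detaches the disk. The device name sdi is taken from this article's host and must be adjusted for your system; the DRY_RUN flag is my own addition to prevent accidental removal:

```shell
# Sketch: remove a failed disk from the kernel's device tree.
# DEV is this article's example device - change it for your host.
DEV=sdi
DRY_RUN=yes   # set to "no" only after double-checking $DEV

if [ "$DRY_RUN" = no ] && [ -w "/sys/block/$DEV/device/delete" ]; then
    # Writing 1 here tells the kernel to detach /dev/$DEV
    echo 1 > "/sys/block/$DEV/device/delete"
else
    echo "dry run: would remove /dev/$DEV"
fi
```

Re-run lsscsi afterwards to confirm the failed device is gone from the device list.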
2. Verify the volume status using “lsiutil”.
UA-RHEL-6# lsiutil

LSI Logic MPT Configuration Utility, Version 1.60, July 21, 2010

1 MPT Ports found

     Port Name         Chip Vendor/Type/Rev    MPT Rev  Firmware Rev  IOC
 1.  /proc/mpt/ioc1    LSI Logic SAS1086E B3     105      011e0a00     0

Select a device:  [1 or 0 to quit] 1           --> Select the controller by selecting "1".

 1.  Identify firmware, BIOS, and/or FCode
 2.  Download firmware (update the FLASH)
 4.  Download/erase BIOS and/or FCode (update the FLASH)
 8.  Scan for devices
10.  Change IOC settings (interrupt coalescing)
13.  Change SAS IO Unit settings
16.  Display attached devices
20.  Diagnostics
21.  RAID actions                              --> Select the RAID actions by selecting "21".
22.  Reset bus
23.  Reset target
42.  Display operating system names for devices
45.  Concatenate SAS firmware and NVDATA files
59.  Dump PCI config space
60.  Show non-default settings
61.  Restore default settings
66.  Show SAS discovery errors
69.  Show board manufacturing information
97.  Reset SAS link, HARD RESET
98.  Reset SAS link
99.  Reset port
 e   Enable expert mode in menus
 p   Enable paged mode
 w   Enable logging

Main menu, select an option:  [1-99 or e/p/w or 0 to quit] 21

 1.  Show volumes                              --> Check the volume status by selecting "1".
 2.  Show physical disks
 3.  Get volume state
 4.  Wait for volume resync to complete
23.  Replace physical disk
26.  Disable drive firmware update mode
27.  Enable drive firmware update mode
30.  Create volume
31.  Delete volume
32.  Change volume settings
33.  Change volume name
50.  Create hot spare
51.  Delete hot spare
99.  Reset port
 e   Enable expert mode in menus
 p   Enable paged mode
 w   Enable logging

RAID actions menu, select an option:  [1-99 or e/p/w or 0 to quit] 1

1 volumes are active, 2 physical disks are active

Volume 0 is Bus 0 Target 3, Type IS (Integrated Striping)
  Volume Name:
  Volume WWID:  0ed4cb5783a5b6dg
  Volume State:  failed, enabled
  Volume Settings:  write caching disabled, auto configure
  Volume draws from Hot Spare Pools:  0
  Volume Size 418164 MB, Stripe Size 64 KB, 3 Members
  Member 0 is PhysDisk 4 (Bus 0 Target 9)
  Member 1 is PhysDisk 3 (Bus 0 Target 11)
  Member 2 is PhysDisk 0 ( - )

RAID actions menu, select an option:  [1-99 or e/p/w or 0 to quit]     --> To quit, enter "0".
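Because lsiutil reads its selections from standard input, the same menu walk can be piped in for a quick scripted status check. This is only a sketch: the answer sequence below mirrors the transcript above (1 = controller, 21 = RAID actions, 1 = show volumes, 0 = back, 0 = quit) and may differ on controllers with other menu layouts:

```shell
# Sketch: non-interactive volume status check by piping menu
# answers into lsiutil (sequence taken from this article's session).
ANSWERS='1
21
1
0
0'
if command -v lsiutil >/dev/null 2>&1; then
    printf '%s\n' "$ANSWERS" | lsiutil | grep -i 'Volume State'
else
    echo "lsiutil is not installed on this host"
fi
```

A "Volume State: optimal, enabled" line means the array is healthy; "failed" means it needs attention.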
Here we can see that “Volume 0” is in the failed state. Physical Disk 0 is missing and needs to be replaced by the hardware vendor.
Once they replace the disk, the volume state will remain the same, but you will be able to see the newly inserted disk, as shown below.
RAID actions menu, select an option:  [1-99 or e/p/w or 0 to quit] 1

1 volumes is active, 3 physical disks are active

Volume 0 is Bus 0 Target 3, Type IS (Integrated Striping)
  Volume Name:
  Volume WWID:  0ed4cb5783a5b6dg
  Volume State:  failed, enabled
  Volume Settings:  write caching disabled, auto configure
  Volume draws from Hot Spare Pools:  0
  Volume Size 418164 MB, Stripe Size 64 KB, 3 Members
  Member 0 is PhysDisk 4 (Bus 0 Target 9)
  Member 1 is PhysDisk 3 (Bus 0 Target 11)
  Member 2 is PhysDisk 0 (Bus 0 Target 5)

RAID actions menu, select an option:  [1-99 or e/p/w or 0 to quit]
In the above output, we can see that the system detects “Target 5”, which was not available prior to the disk replacement. At this point all three disks are available for RAID 0, but the array is still in the failed state. So let's delete the failed array.
3. Delete the failed array. From lsiutil, go to RAID Actions:
RAID actions menu, select an option:  [1-99 or e/p/w or 0 to quit]

 1.  Show volumes
 2.  Show physical disks
 3.  Get volume state
 4.  Wait for volume resync to complete
23.  Replace physical disk
26.  Disable drive firmware update mode
27.  Enable drive firmware update mode
30.  Create volume
31.  Delete volume                             --> Delete the volume by selecting "31".
32.  Change volume settings
33.  Change volume name
50.  Create hot spare
51.  Delete hot spare
99.  Reset port
 e   Enable expert mode in menus
 p   Enable paged mode
 w   Enable logging

RAID actions menu, select an option:  [1-99 or e/p/w or 0 to quit] 31

Volume 0 is Bus 0 Target 3, Type IS (Integrated Striping)

Volume:  [0-1 or RETURN to quit] 0

All data on Volume 0 will be lost!
Are you sure you want to continue?  [Yes or No, default is No] yes
Zero the first block of all volume members?  [Yes or No, default is No]
Volume 0 is being deleted
RAID ACTION returned IOCLogInfo = 00000001

RAID actions menu, select an option:  [1-99 or e/p/w or 0 to quit]
4. Re-create the RAID 0 array with the three physical disks.
RAID actions menu, select an option:  [1-99 or e/p/w or 0 to quit]

 1.  Show volumes
 2.  Show physical disks
 3.  Get volume state
 4.  Wait for volume resync to complete
23.  Replace physical disk
26.  Disable drive firmware update mode
27.  Enable drive firmware update mode
30.  Create volume
31.  Delete volume
32.  Change volume settings
33.  Change volume name
50.  Create hot spare
51.  Delete hot spare
99.  Reset port
 e   Enable expert mode in menus
 p   Enable paged mode
 w   Enable logging

RAID actions menu, select an option:  [1-99 or e/p/w or 0 to quit] 30

     B___T___L  Type    Vendor    Product         Rev   Disk Blocks  Disk MB
 1.  0   3   0  Disk    CISCO-UA  TBE2846RC       SC16    286748000   140013
 2.  0   5   0  Disk    CISCO-UA  TBE2846RC       SC16    286748000   140013
 3.  0   6   0  Disk    CISCO-UA  TBE2846RC       SC16    286748000   140013
 4.  0   7   0  Disk    CISCO-UA  TBE2846RC       SC16    286748000   140013
 5.  0   8   0  Disk    CISCO-UA  TBE2846RC       SC16    286748000   140013
 6.  0  11   0  Disk    CISCO-UA  TBE2846RC-89    B63K    286748000   140013

To create a volume, select 1 or more of the available targets
select 3 to 10 targets for a mirrored volume
select 1 to 10 targets for a striped volume

Select a target:  [1-6 or RETURN to quit] 3    --> Enter the first disk number.
Select a target:  [1-6 or RETURN to quit] 4    --> Enter the second disk number.
Select a target:  [1-6 or RETURN to quit] 6    --> Enter the third disk number.
Select a target:  [1-6 or RETURN to quit]      --> Just press Enter.
3 physical disks were created

Select volume type:  [0=Mirroring, 1=Striping, default is 0] 1
Select volume size:  [1 to 418164 MB, default is 418164]
A stripe size of 64 KB will be used
Enable write caching:  [Yes or No, default is No]
Zero the first and last blocks of the volume?  [Yes or No, default is No]
Skip initial volume resync?  [Yes or No, default is No]
Volume was created

RAID actions menu, select an option:  [1-99 or e/p/w or 0 to quit]
5. Check the volume status.
 1.  Show volumes
 2.  Show physical disks
 3.  Get volume state
 4.  Wait for volume resync to complete
23.  Replace physical disk
26.  Disable drive firmware update mode
27.  Enable drive firmware update mode
30.  Create volume
31.  Delete volume
32.  Change volume settings
33.  Change volume name
50.  Create hot spare
51.  Delete hot spare
99.  Reset port
 e   Enable expert mode in menus
 p   Enable paged mode
 w   Enable logging

RAID actions menu, select an option:  [1-99 or e/p/w or 0 to quit] 1

1 volume is active, 3 physical disks are active

Volume 1 is Bus 0 Target 6, Type IS (Integrated Striping)
  Volume Name:
  Volume WWID:  08703748d1be5b0e
  Volume State:  optimal, enabled
  Volume Settings:  write caching disabled, auto configure
  Volume Size 418164 MB, Stripe Size 64 KB, 3 Members
  Member 0 is PhysDisk 0 (Bus 0 Target 9)
  Member 1 is PhysDisk 3 (Bus 0 Target 7)
  Member 2 is PhysDisk 4 (Bus 0 Target 11)

RAID actions menu, select an option:  [1-99 or e/p/w or 0 to quit] 0

Main menu, select an option:  [1-99 or e/p/w or 0 to quit] 0
We have successfully re-created the RAID 0 volume using lsiutil.
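If the new logical volume does not show up immediately, the SCSI bus can be rescanned from the OS before running lsscsi. This is a generic sketch, not from the original article; the "- - -" wildcard (channel, target, LUN) asks each host adapter to rescan everything:

```shell
# Sketch: ask every SCSI host adapter to rescan its bus so the
# kernel discovers the newly created logical volume.
RESCANNED=0
for scan in /sys/class/scsi_host/host*/scan; do
    if [ -w "$scan" ]; then
        echo "- - -" > "$scan"      # wildcard: all channels/targets/LUNs
        RESCANNED=$((RESCANNED + 1))
    fi
done
echo "Rescanned $RESCANNED SCSI host(s)"
```

After the rescan, the new volume should appear in lsscsi output, as shown in the next step.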
6. Check the new logical RAID 0 volume using lsscsi.
UA-RHEL-6# lsscsi
[1:0:1:0]  disk  CISCO-UA  TBE2846RC       SC19  -
[1:0:2:0]  disk  CISCO-UA  TBE2846RC       SC19  -
[1:0:3:0]  disk  CISCO-UA  TBE2846RC       SC19  -
[1:0:4:0]  disk  CISCO-UA  TBE2846RC       SC19  -
[1:0:5:0]  disk  CISCO-UA  TBE2846RC       SC19  /dev/sdi
[1:0:6:0]  disk  CISCO-UA  TBE2846RC       SC19  /dev/sdg
[1:0:7:0]  disk  CISCO-UA  TBE2846RC       SC19  /dev/sdh
[1:1:1:0]  disk  LSILOGIC  Logical Volume  3000  /dev/sdj
[1:1:3:0]  disk  LSILOGIC  Logical Volume  3000  /dev/sdf
UA-RHEL-6#
/dev/sdi has been replaced successfully by /dev/sdf. The “LSILOGIC Logical Volume” entries represent the RAID logical volumes. Create a new filesystem on the volume and restore the lost data from backup.
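The filesystem creation and restore can be sketched as below. The device, mount point, and backup path are example values for this article's host, and the CONFIRM flag is my own guard, since mkfs destroys whatever is on the device:

```shell
# Sketch: create a filesystem on the new RAID 0 volume and restore data.
# VOL, MNT and the backup path are examples - adjust for your host.
VOL=/dev/sdf
MNT=/data
CONFIRM=no    # set to "yes" only after double-checking $VOL

if [ "$CONFIRM" = yes ] && [ -b "$VOL" ]; then
    mkfs.ext4 "$VOL"            # destroys anything currently on $VOL
    mkdir -p "$MNT"
    mount "$VOL" "$MNT"
    # Restore from backup, e.g. a tar archive:
    # tar -xpf /backup/data.tar -C "$MNT"
else
    echo "skipping: set CONFIRM=yes and verify $VOL is the new volume"
fi
```

Remember to add the new filesystem to /etc/fstab if it should be mounted at boot.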
Hope this article is informative to you. Thank you for visiting UnixArena.