RAID Data Recovery
Whether you’re a small business owner with a small network attached storage (NAS) device or part of a corporate IT team looking after massive enterprise-class servers, you’re likely to experience a RAID crash eventually, and we are here for you.
RAID (Redundant Array of Independent Disks) technology is commonplace in the IT industry, underpinning the majority of business computers, servers and storage arrays. The prospect of suffering catastrophic data loss from a RAID failure is one of the most severe data recovery issues that users, small businesses and enterprise organizations might face.
RAID has proven to be a durable technology for data protection and is very reliable in normal circumstances. When problems do occur, however, there is a real risk of losing RAID parity and potentially every byte of data within the array, especially if the user does not have the expertise to recover the array and the data it contains.
If you are experiencing problems with your RAID configuration, we highly recommend you stop using the system (to prevent further damage) and contact Gillware about our RAID Recovery Services. Our expert data engineers are available to recover your RAID data, even from the most severely damaged arrays. We are standing by ready to help you.
What is RAID?
RAID has a number of built-in data protection measures designed to safeguard data residing on the physical set of disks that make up the array. It does this by slicing up the data and storing it in different locations on multiple disks. Most RAID configurations provide data redundancy to protect against disk failure, but some RAID arrays are designed purely for performance and offer no built-in redundancy.
RAID presents multiple disks to the operating system as a single logical data store, using disk mirroring and disk striping techniques. Mirroring is when the same data is written in full to more than one disk, so that a copy survives if one disk fails. Striping is when data is split into chunks that are written sequentially across the disks, resulting in much faster read/write performance.
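The difference between the two techniques can be illustrated with a minimal Python sketch. The function names `stripe` and `mirror`, the chunk size and the in-memory "disks" are assumptions made for illustration only; a real controller works on block devices and adds metadata, caching and error handling.

```python
def stripe(data: bytes, num_disks: int, chunk_size: int) -> list[bytes]:
    """Split data into fixed-size chunks and deal them round-robin across disks."""
    disks = [bytearray() for _ in range(num_disks)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        # Chunk 0 goes to disk 0, chunk 1 to disk 1, and so on, wrapping around.
        disks[chunk_index % num_disks] += data[offset:offset + chunk_size]
    return [bytes(d) for d in disks]


def mirror(data: bytes, num_disks: int) -> list[bytes]:
    """Write an identical, complete copy of the data to every disk."""
    return [bytes(data) for _ in range(num_disks)]


data = b"AAAABBBBCCCCDDDD"
striped = stripe(data, num_disks=2, chunk_size=4)
mirrored = mirror(data, num_disks=2)
```

In the striped layout each disk holds only part of the data, so losing either disk loses the whole volume. In the mirrored layout each disk holds everything, so either disk alone is enough to read the data back.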
A RAID controller creates an abstraction layer between the server and the hard disks. The abstraction layer is either a hardware- or software-based controller that manages I/O to the disk, handles caching, and adds extra protection in the event of a server crash. RAID controllers are pivotal in the overall operation of RAID devices, and many of the faults we discover are related to controller issues.
We are commonly sent the following types of RAID configurations to repair:
- RAID 0 (Disk striping) is all about speed. Each disk is used to concurrently read/write chunks of data spread between each disk. This setup significantly boosts disk performance but there is no data protection built-in, and if a disk fails, you will lose your data.
- RAID 1 (Mirroring) is all about data protection. Data is mirrored across at least two disks creating a 1:1 copy. As long as only one disk fails, the data is fully protected, and the array can be rebuilt by replacing the failed disk.
- RAID 5 (Striping with parity) is commonly used within enterprise storage systems, like SAN and NAS devices. Data is striped across all of the disks for high performance, and parity blocks are distributed across the disks alongside the data. If a single disk fails, the RAID can rebuild the missing data from the remaining data and parity blocks and restore consistency.
- RAID 10 (nested) gives high performance and data protection. The minimum configuration is 4 disks, arranged as mirrored pairs with data striped across the pairs. In a four-disk configuration, the array is guaranteed to survive one disk failure, and it can survive two as long as the failed disks belong to different mirrored pairs.
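The parity rebuild that RAID 5 relies on comes down to XOR arithmetic. The following is a minimal sketch for a single stripe, assuming three data blocks and one parity block held in memory. The helper name `xor_blocks` and the sample bytes are made up for illustration, and a real array rotates the parity block across the disks from stripe to stripe.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus the parity computed from them.
d0 = b"\x01\x02\x03\x04"
d1 = b"\x10\x20\x30\x40"
d2 = b"\xaa\xbb\xcc\xdd"
parity = xor_blocks([d0, d1, d2])

# Suppose the disk holding d1 fails. XOR-ing everything that survives
# (the other data blocks and the parity block) yields the missing block.
rebuilt_d1 = xor_blocks([d0, d2, parity])
```

Because XOR can only solve for one unknown per stripe, the same arithmetic cannot recover two missing blocks, which is why a second disk failure in RAID 5 is so serious.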
What Causes RAID Degradation?
Despite these built-in protections, RAID is not a substitute for a strong backup strategy – any RAID configuration can (and does) fail. There are many causes of RAID degradation or failure, but, as we will discover, our expert data engineers can often recover the data even after a serious RAID failure.
Gillware has been shipped some of the most heavily damaged RAID arrays from clients all around the world; each case is treated individually due to the wide range of potential problems. Nearly all of these issues are triggered by improper shutdowns, system crashes, power outages or hard drive failures.
The most common causes of failure we see are:
- A RAID controller failure – the controller card manages the I/O access between the hard disks and the operating system. A catastrophic failure, such as a power surge, or component failure can destroy a non-redundant RAID array, resulting in disks that cannot be read.
- Operating System not found – this common problem occurs when the RAID controller does not know how to boot the operating system; causes are wide-ranging but are typically the result of a configuration error or server hardware error.
- Multiple disk failures – if more disks fail than the RAID level can tolerate, data on the array can be damaged or corrupted. For example, if you had a RAID 10 configuration and three of four disks failed, it would destroy the array and the controller would not know how to rebuild the data.
- Server fault – a problem with the host server can also unintentionally destroy a RAID array. A common example is a motherboard failure or SATA controller damage on a local RAID setup.
- Foreign array – if a software or hardware controller crashes, it will sometimes detect the RAID as a foreign array. This is caused by an unexpected fault such as a firmware bug or errors in the controller software.
- RAID rebuild failure – after a failed disk is replaced in RAID 5 configurations, the controller will attempt to rebuild the RAID array. This rebuild can fail if the incorrect disk type is used or if a disk with down-level firmware is inserted as a hot spare.
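Whether an array survives a given set of disk failures depends on the RAID level, as a rough sketch can show. The function name `raid_survives` is hypothetical, and the code assumes that RAID 10 pairs disks as (0, 1), (2, 3) and so on; real controllers may pair disks differently, and this ignores controller and rebuild problems entirely.

```python
def raid_survives(level: str, num_disks: int, failed: set[int]) -> bool:
    """Return True if all data is still readable with the given disks failed."""
    if level == "0":
        # Striping only: any failure loses the array.
        return len(failed) == 0
    if level == "1":
        # Mirroring: at least one complete copy must remain.
        return len(failed) < num_disks
    if level == "5":
        # Single parity: at most one disk may be missing.
        return len(failed) <= 1
    if level == "10":
        # Assumed pairing: disks (0, 1), (2, 3), ... mirror each other.
        pairs = [(i, i + 1) for i in range(0, num_disks, 2)]
        return all(not (a in failed and b in failed) for a, b in pairs)
    raise ValueError(f"unsupported RAID level: {level}")


# Three of four disks failing always takes out a complete mirrored pair.
three_of_four = raid_survives("10", 4, {0, 1, 2})
# Two failures are survivable only when they hit different pairs.
different_pairs = raid_survives("10", 4, {0, 2})
same_pair = raid_survives("10", 4, {0, 1})
```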
How Does Gillware Recover RAID Data?
After the client has shipped the failing device to our data recovery lab located in Madison, Wisconsin, the engineers will catalogue it and perform an initial assessment to determine the root cause of the failure.
The experts at Gillware have extensive knowledge of all popular models of SAN, NAS and server hardware from all of the major manufacturers, including brands such as Synology, Dell, HP, IBM, XenServer, SnapServer, Buffalo, Drobo, and FreeNAS, to name a few.
Server RAID Data Recovery
A typical server will have locally attached storage with the Operating System and a few core applications installed. Data is commonly mirrored or striped with parity to create redundancy. RAID 1, RAID 5 and RAID 10 configurations are commonly deployed in these scenarios.
A fault on the local storage would result in the server not booting. You may encounter errors like “Operating System Not Found,” or you may see specific errors relating to a damaged data array. The most common cause here is the RAID controller failing. If a RAID controller fails, you can lose the configuration data of the entire RAID. This can result in what appears to be a dead server where all the data is missing.
Gillware can reverse-engineer the array configuration and write a custom emulated RAID controller from the metadata recovered from the original hard disks. Emulation creates a software (virtual) controller that can attach the data blocks together into a readable format. Our team analyzes what data is recoverable and we will make changes to the configuration if bits of data are still missing.
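To give a feel for what reverse-engineering an array configuration involves, here is a toy Python sketch. It is not Gillware's method or tooling. It assumes a simple striped layout with no parity, a known chunk size, member images already held in memory, and a known signature at the start of the volume that spans one full stripe. The names `destripe` and `guess_order` are hypothetical. Real arrays add parity rotation, data offsets and vendor-specific metadata.

```python
from itertools import permutations


def destripe(images: list[bytes], order: tuple[int, ...], chunk_size: int) -> bytes:
    """Interleave chunks from the member images, taken in the given disk order."""
    out = bytearray()
    length = min(len(images[i]) for i in order)
    for offset in range(0, length, chunk_size):
        for disk in order:
            out += images[disk][offset:offset + chunk_size]
    return bytes(out)


def guess_order(images: list[bytes], chunk_size: int, signature: bytes):
    """Try every disk order; return the first whose volume starts with the signature."""
    for order in permutations(range(len(images))):
        if destripe(images, order, chunk_size).startswith(signature):
            return order
    return None


# Three member images whose original slot order has been lost.
images = [b"ME01data", b"RAIDpayl", b"VOLUoad-"]
order = guess_order(images, chunk_size=4, signature=b"RAIDVOLUME01")
volume = destripe(images, order, chunk_size=4)
```

Brute force works here only because the example is tiny. With more disks, unknown chunk sizes and parity layouts, the search space grows quickly, so in practice on-disk metadata is typically used to narrow down the candidates.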
Our in-house proprietary recovery software, HOMBRE, will mount the virtual controller and present the metadata in a readable format to an alternative Gillware server, which will enable our engineers to copy the data off the damaged disk array. This is done over an ultra-fast network connection to speed up the process and reduce the risk of failure.
Once we have made the data safe, we can run further checks to determine the root cause of the failure. We have vast reserves of spare parts and we can swap components that appear to be failed to help pinpoint exactly what happened. In most circumstances, we recommend that our clients replace the original hardware. We then arrange to securely transport the recovered data back to the customer.
NAS RAID Data Recovery
NAS devices are very popular as an affordable network attached storage solution. NAS acts as a private cloud for a company where information is shared easily and securely among employees. NAS devices can keep their owners informed of any hardware issues. An email alert system can be set up so that the NAS sends out an automated email when a drive fails. Many manufacturers will also include instructions in the automated “drive failure” email detailing the steps you must take to get a replacement drive.
NAS uses a wide range of RAID configurations. Often brands such as Netgear and Synology use a proprietary RAID setup; therefore, Gillware might have to approach NAS data recovery slightly differently. The most common issue we discover is that of hardware failure on the NAS, whether it be a disk or a component inside the NAS.
If a hard disk failed many months ago without the customer noticing and another disk fails today, the customer is faced with a total data loss scenario. In these types of circumstances, Gillware will perform testing and fixes on the NAS appliance to ensure it is fully functioning; we then look at the disks, as the fault will typically be caused by a disk.
Once we have determined the fault, we can use specialist hardware in one of our certified clean rooms to open the drive and inspect it. We might find issues such as a parked disk read head or a damaged disk platter.
Storage Area Network (SAN) RAID Data Recovery
If a client SAN fails, our engineers know that we have a real challenge on our hands to get the data back. SANs are designed by the manufacturers to be highly fault tolerant and resilient to hardware failures.
A SAN device is used as the backbone of many businesses’ storage needs and stores valuable production data, often entire virtual server infrastructure and core business systems. Enterprise class solutions such as IBM Storwize, Dell EqualLogic and NetApp are popular choices for SAN storage.
Again, failed disks are usually the primary cause of SAN RAID failures; when a single drive in a RAID 5 fails, the array has to regenerate the content of the drive using the parity data on the rest of the drives. Often we see RAID 5’s health stated as “degraded.”
If this is not spotted, the failed drive becomes a stale drive. As time passes and the degraded array continues its operations, the data left on the failed drive becomes increasingly out-of-date. If that drive is later forced back into the RAID array, its stale data can be mixed with current data and cause massive data corruption.
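The danger of a stale drive can be shown with the same XOR arithmetic that RAID 5 parity uses. This is a minimal sketch for one stripe of a three-disk array, with made-up byte values and a hypothetical helper `xor_blocks`. It assumes the controller keeps parity up to date while the array runs degraded.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Healthy stripe: two data blocks and their parity.
d0_old = b"\x11\x11"
d1 = b"\x22\x22"
parity_old = xor_blocks([d0_old, d1])

# The disk holding d0 drops out. The array runs degraded and later
# overwrites d0's contents; only the parity disk can record the change.
d0_new = b"\x55\x55"
parity_new = xor_blocks([d0_new, d1])

# Degraded reads are still correct: d0 is derived from d1 and current parity.
degraded_read = xor_blocks([d1, parity_new])

# Now d1's disk fails too, and the stale d0 disk is forced back online.
# Rebuilding d1 from stale d0 and current parity gives the wrong answer.
bad_d1 = xor_blocks([d0_old, parity_new])
```

The rebuilt block matches neither the old nor the new contents, and nothing in the arithmetic flags the mismatch. This is one reason why forcing an old member back into an array is risky.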
In these types of failures, we need to make the failed drive healthy enough for the RAID to acknowledge the device. This work can be undertaken by our clean room engineers who will physically repair the disk. Once the RAID accepts the disk, our data engineers must act quickly to clone the RAID data before the RAID completely fails.
If you need RAID data recovery, you need it done by experts. Here at Gillware, we provide fast, affordable, and customer-friendly RAID data recovery services. We offer our expert services with a financially risk-free guarantee. If we can’t get your data back, you don’t pay. Period. We offer free evaluations in our cleanroom lab, and even provide free inbound shipping. That way, you can be sure that we can recover your data and not waste your time or your money.
Our data recovery experts work in an ISO-5 Class 100 certified cleanroom, to ensure there is no further damage to your RAID equipment. We have SOC 2 Type II audited data recovery facilities, which means you can put your trust in us to handle your data with the utmost care and discretion. Gillware’s RAID data recovery experts can recover your data for you. Contact us today to get a RAID data recovery estimate and arrange a free evaluation with our engineers.