The client in this data recovery case came to us to recover data from their crashed Dell T420 server. The server used a RAID-5 configuration with eight two-terabyte Seagate hard drives. Because RAID-5 sets aside one drive’s worth of capacity for parity, that came to 14 terabytes of usable storage space. This was plenty of space to store their SQL database, as well as their other assorted files. And with RAID-5’s single-disk fault tolerance, their data seemed well-protected.
But, as we’ve discussed quite a few times on the Gillware blog, RAID fault tolerance on its own isn’t always enough to protect your data from disaster. And, unfortunately, it wasn’t enough here. After the Dell T420 server crashed, our client-to-be took a look at it. BIOS troubleshooting dialogues showed two culprits for the server failure. Six of the eight hard drives were online, as they should have been. But the two remaining hard drives had crashed. One came with the error message, “State Failed”. The other came with the error message, “State Foreign”.
With two hard drives down, the RAID-5 server’s XOR parity-based fault tolerance couldn’t plug up all of the holes in the server’s logical volume, causing a fatal server crash. To get their data back and their small business up and running, the owner called upon the server data recovery experts here at Gillware.
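To make the parity limit concrete, here is a minimal sketch of XOR parity on a single stripe. It assumes a toy four-drive array with four-byte blocks rather than the real eight-drive server, and the helper name `xor_blocks` is ours, purely for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe on a toy 4-drive array: three data blocks plus one parity block.
data = [b"SQL1", b"SQL2", b"SQL3"]
parity = xor_blocks(data)
stripe = data + [parity]

# One drive lost (drive 1): XOR the three survivors to get its block back.
one_lost_survivors = [stripe[0], stripe[2], stripe[3]]
rebuilt = xor_blocks(one_lost_survivors)

# Two drives lost (drives 0 and 1): the survivors only yield the XOR of the
# two missing blocks, which is not enough to separate one from the other.
two_lost_survivors = [stripe[2], stripe[3]]
combined = xor_blocks(two_lost_survivors)

print(rebuilt)
print(combined == xor_blocks([stripe[0], stripe[1]]))
```

With one block missing, the XOR of the survivors is exactly the lost block. With two missing, all that parity can offer is the two lost blocks mixed together, which is why the logical volume went down.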
Dell T420 Data Recovery Case Study: State Foreign Error
Server Model: Dell T420
RAID Controller: PERC H700
RAID Level: RAID-5
Drive Model: Seagate ST2000DM001 (x8)
Total Capacity: 14 TB
Operating/File System: Windows Server 2008
Data Loss Situation: Two drives went offline. BIOS showed one drive with a “State Foreign” error and the other with a “State Failed” error.
Type of Data Recovered: SQL database
Binary Read: 100%
Gillware Data Recovery Case Rating: 10
When a hard disk in your server comes up as “Failed” in the BIOS, it’s easy enough to understand what the BIOS is telling you. A failed hard disk is one that’s up and died on you, in one of the many ways hard disk drives can die.
What does it mean when a hard disk has a “Foreign” status error? On other server models, the same condition can show up as a “Foreign Configuration”. This, too, signifies a kind of hard disk failure, but a slightly different kind than the “Failed” status does.
As British novelist Leslie Poles Hartley famously began his 1953 novel The Go-Between, “The past is a foreign country: they do things differently there.”
Sometimes your hard drive can be a foreign country, too. It does things differently.
A computer expects a hard drive to be set up a certain way in order to read it. If the drive’s logical architecture becomes damaged or corrupted, that drive will seem to the system to be… well, a foreign country. And the computer doesn’t know how to handle it.
Plug a hard drive with a “Foreign Configuration” error into your computer and it’ll probably prompt you to format the drive. If there’s nothing physically wrong with the HDD, the culprit is probably a corrupted boot sector or other piece of important logical metadata that makes the hard drive seemingly blank. Of course, all your data still lives on there (which is why if your hard drive suddenly starts asking you to format it out of the blue, you shouldn’t do it).
Fortunately, while your hard drive may be a foreign country, here in our data recovery lab, we have the tools and expertise needed to speak its language.
When this Dell T420 server came into our data recovery lab, our expert engineers diagnosed the two bad hard drives. The one with the “Failed” status suffered from bad read/write heads, causing some damage to its platters. By running the platters through our burnishing tools and replacing the heads with a fresh set, our engineers could successfully read and image 95% of the drive’s contents.
As for the hard disk drive with the “Foreign” status, it was physically healthy. But a logical error had turned its data into what the Dell T420 server could only assume was gibberish (although we knew better). Our engineers successfully imaged the entire hard drive and sent it on its way to our RAID recovery technicians. To recover data from this RAID-5 array, our RAID recovery experts had to carefully examine the RAID metadata for clues as to how the drives fit together. Then, we had to stitch them together properly according to those clues.
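As a rough illustration of what “fitting the drives together” involves, the sketch below maps a logical block number to the drive that holds it. It assumes a left-symmetric RAID-5 layout, which is only one of several common layouts and is not necessarily the one this controller used. In a real case the drive order, stripe size, and parity rotation are the things the metadata has to reveal. The function name `locate_block` is hypothetical.

```python
def locate_block(logical_block, n_drives):
    """Return (stripe, data_drive, parity_drive) for a logical block.

    Assumes a left-symmetric RAID-5 layout: parity starts on the last drive
    and moves one drive to the left on each stripe, and data blocks start on
    the drive just after the parity drive, wrapping around.
    """
    data_per_stripe = n_drives - 1
    stripe = logical_block // data_per_stripe
    offset = logical_block % data_per_stripe
    parity_drive = (n_drives - 1) - (stripe % n_drives)
    data_drive = (parity_drive + 1 + offset) % n_drives
    return stripe, data_drive, parity_drive

# Walk the first two stripes of an eight-drive array.
for block in range(14):
    print(block, locate_block(block, 8))
```

If the assumed drive order or rotation is wrong, blocks land in the wrong places and the file system on top of the array comes out scrambled, which is why the clues in the metadata matter so much.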
Putting the server back together only required one of the two failed drives, though, due to RAID-5’s parity. This is good, because when two drives fail, they rarely do so at the same time. After the first drive drops out, the server keeps running and keeps writing new data to the remaining drives. As a result, the first drive to fail ends up filled with “stale” data, which we want to avoid using to repair the RAID array as long as we have a choice in the matter.
As it turned out, the mechanically failed drive had failed first. This meant our RAID recovery technicians could safely exclude it from the recovery efforts from here on out. With the remaining drives, we had a 100% read of the RAID-5 array and could successfully salvage their SQL database. This Dell T420 server data recovery case garnered a perfect 10 on our case rating scale.
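The reason stale data matters can be shown with a small sketch. It again assumes a toy four-drive stripe with made-up block contents, not the actual server. Drive 0 drops out first, the array keeps writing in degraded mode, and then a second drive goes offline.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Toy 4-drive stripe: drives 0-2 hold data, drive 3 holds parity.
drives = [b"old0", b"dat1", b"dat2"]
drives.append(xor_blocks(drives))

# Drive 0 drops out first. The array keeps running degraded, so a later write
# aimed at drive 0's block is folded into the parity on drive 3 instead.
stale_copy = drives[0]          # what is physically left on drive 0
current_block = b"new0"         # what the array now believes drive 0 holds
drives[3] = xor_blocks([current_block, drives[1], drives[2]])

# Later, drive 1 goes offline as well, though its contents are still current.
# Option A: leave out stale drive 0 and rebuild its block from drives 1, 2, 3.
rebuilt_from_fresh = xor_blocks([drives[1], drives[2], drives[3]])

# Option B: leave out drive 1 and rebuild its block using the stale drive 0.
rebuilt_from_stale = xor_blocks([stale_copy, drives[2], drives[3]])

print(rebuilt_from_fresh)              # the up-to-date block for drive 0
print(rebuilt_from_stale == b"dat1")   # False: the stale block corrupts it
```

Leaving out the drive that failed first gives back the current data. Mixing that drive in produces blocks that match nothing the server ever wrote.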