ESXi Data Recovery

Virtual machines are created and managed by a piece of software called a “hypervisor”. There are two kinds of hypervisors: Type I and Type II. A Type II hypervisor is a program or application that runs inside your operating system, just like your web browser. Type I hypervisors run at a level right underneath the operating system. This gives them exclusive access to nearly every aspect of their host’s hardware. Type I hypervisors like VMWare ESXi are designed with enterprise users in mind. If your business or organization needs data recovered from an ESXi virtual environment, our ESXi data recovery experts can help.

How Does VMWare ESXi Work?

One of the jobs of a hypervisor is to manage your host machine’s resources on your guest machines’ behalf. Type I and Type II hypervisors manage this in different ways, and even hypervisors of the same type take different routes to reach their goals. Like Microsoft’s Hyper-V, VMWare ESXi is a Type I “bare metal” hypervisor. While the two operate in many similar ways, ESXi takes a different approach.

ESXi is a lighter, more compact version of VMWare’s ESX hypervisor. ESXi installs what VMWare refers to as “VMkernel” onto the bare server. VMkernel is a microkernel, meaning it has only the bare minimum of features needed to comprise an operating system. Unlike ESX, which uses 2 GB of disk space, ESXi has a disk “footprint” of only 32 megabytes. ESXi can be installed to and booted from a USB drive or SD card instead of the server’s own disks.

VMkernel has direct access to the server’s CPU and memory, as well as other hardware devices. VMkernel formats the storage space in the server with the proprietary VMFS filesystem. VMFS is a cluster file system, and allows multiple hosts to access the same logical unit number simultaneously. In this space, the user can create as many virtual machines as they want.

On the outside, one VMWare ESXi virtual machine is a single large VMDK file. On the inside, that file appears to be an entire computer. It has its own file system and operating system. To the end user, it behaves just like a normal computer.
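
To make the idea of a virtual disk file more concrete, here is a minimal Python sketch. It assumes the common ESXi layout in which a small text descriptor names a much larger data extent. The sample descriptor, the file name `example-flat.vmdk`, and the helper `parse_extents` are hypothetical, made up for illustration only.

```python
import re

# An extent line in a text VMDK descriptor lists: access mode,
# size in 512-byte sectors, extent type, and the data file name.
EXTENT_RE = re.compile(r'^(RW|RDONLY|NOACCESS)\s+(\d+)\s+(\w+)\s+"([^"]+)"')

def parse_extents(descriptor_text):
    """Return one dict per extent line found in a VMDK descriptor."""
    extents = []
    for line in descriptor_text.splitlines():
        match = EXTENT_RE.match(line.strip())
        if match:
            access, sectors, kind, filename = match.groups()
            extents.append({
                "access": access,
                "sectors": int(sectors),
                "type": kind,
                "file": filename,
                "bytes": int(sectors) * 512,
            })
    return extents

# Hypothetical sample descriptor (assumption: not taken from a real case).
SAMPLE = '''# Disk DescriptorFile
version=1
CID=fffffffe
parentCID=ffffffff
createType="vmfs"

# Extent description
RW 4194304 VMFS "example-flat.vmdk"
'''

extents = parse_extents(SAMPLE)
for extent in extents:
    print(extent["file"], extent["bytes"], "bytes")
```

The extent size is what tells a recovery tool how much raw data the guest's "hard drive" should contain.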

VMWare ESXi Purple Screen of Death

In the event of a critical error and failure of the VMkernel, ESXi can display what users have nicknamed the “Purple Screen of Death”. These can occur due to any sort of hardware failure or kernel panic. Some can be fixed by a simple reboot. Other problems can be fixed by replacing a faulty memory stick, the CPU, or the motherboard. Sometimes the problem has to do with the disks your virtual machines are stored on. Often the problem originates upstream of ESXi, when a SAN is unable to properly present an iSCSI target.

VMWare ESXi Data Recovery

VMWare ESX and ESXi are enterprise-class hypervisors. This means they typically see use on enterprise-class servers and SANs like the Dell PowerEdge or Synology RackStation. The servers we see for ESXi data recovery typically contain anywhere from four to two dozen hard drives. These drives are usually arranged in a RAID-5 or RAID-6 array, or a nested RAID-10. Nested RAID arrays with extreme fault tolerance are less common, but we do see them on occasion.

There are many ways a server or SAN can fail. When these failures happen due to the hard drives inside them, you could lose valuable data from your ESXi virtual machines. It may seem unlikely that your RAID-6 or RAID-10 server might see enough drives fail to make it crash. But at Gillware, we’ve learned over thousands of server recovery cases that it isn’t so unlikely. Servers fail every day. Even if your server has two or even three drives’ worth of redundancy, it can fail. We’ve seen cases in which four drives in a RAID-10 server failed at once because of a power surge.
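
To see why a parity array survives one failed drive but not two, here is a minimal Python sketch of the XOR parity idea behind RAID-5. The three data blocks and the helper `xor_blocks` are made up for illustration. Real controllers also rotate parity across the drives and work with much larger blocks.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe: three data blocks plus one parity block (illustrative values).
data = [b"VMDK", b"DATA", b"DISK"]
parity = xor_blocks(data)

# Lose the second drive: XOR of the survivors and parity restores its block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])
```

If a second drive in the same stripe is also unreadable, there are two unknowns and only one parity equation, so that stripe cannot be rebuilt from parity alone.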

Physical disk failure is not the only risk. A whole host of logical problems can lead to data loss as well. VMDK files can be deleted, or reset and reformatted. Data corruption can occur to or within ESXi VMDK files as well.

An error message such as “WARNING: FS3: 1575: Lock corruption detected at offset 0xc10000” in your vmkernel.log file can be evidence of hard drive failure.
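
As a rough illustration, a few lines of Python can scan log text for this kind of warning. The pattern below is based only on the message quoted above. The helper name `find_lock_corruption` and the sample lines are hypothetical, and the exact prefix that precedes the message on each line of a real vmkernel.log is not assumed.

```python
import re

# Matches the warning quoted above and captures the hexadecimal offset.
LOCK_RE = re.compile(
    r"WARNING: FS3: \d+: Lock corruption detected at offset (0x[0-9a-fA-F]+)"
)

def find_lock_corruption(log_lines):
    """Return the offsets (as integers) of every lock-corruption warning."""
    offsets = []
    for line in log_lines:
        match = LOCK_RE.search(line)
        if match:
            offsets.append(int(match.group(1), 16))
    return offsets

# Hypothetical sample input; a real log would be read from a file instead.
sample_lines = [
    "WARNING: FS3: 1575: Lock corruption detected at offset 0xc10000",
    "an unrelated log line",
]

offsets = find_lock_corruption(sample_lines)
print([hex(offset) for offset in offsets])
```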

Two Stages of Data Recovery

When a server or SAN comes to us for ESXi data recovery servicing, there are two stages to the recovery. The first stage involves recovering the VMDK files themselves. The second stage involves recovering the data from the ESXi virtual environments. To recover the VMDK files, our RAID data recovery experts have to repair the failed hard drives and rebuild the array.

Our engineers strive to create write-blocked forensic images of every bit on each drive. But data recovery doesn’t always work out so smoothly. In many cases, all but a handful of drives are completely healthy, but the ones that aren’t require extensive work in our cleanroom area. They may have varying degrees of damage to their platters. This damage can make a 100% recovery impossible.
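
The idea of imaging a drive while keeping track of what could not be read can be sketched in Python as follows. This is an illustration only, not the tooling described above. The function name `image_with_map`, the chunk size, and the in-memory stand-ins for a source drive and an image file are all assumptions.

```python
import io

def image_with_map(source, dest, size, chunk=4096):
    """Copy `size` bytes from source to dest in chunks.

    Unreadable chunks are zero-filled in the image and recorded as
    (start, end) byte ranges, end exclusive, so later stages know
    exactly which parts of the image cannot be trusted.
    """
    bad_ranges = []
    offset = 0
    while offset < size:
        length = min(chunk, size - offset)
        try:
            source.seek(offset)
            data = source.read(length)
            if len(data) != length:
                raise OSError("short read")
        except OSError:
            data = b"\x00" * length
            bad_ranges.append((offset, offset + length))
        dest.write(data)
        offset += length
    return bad_ranges

# Stand-ins for a healthy drive and its image file (assumption: in-memory).
healthy = io.BytesIO(b"A" * 10000)
image = io.BytesIO()
bad = image_with_map(healthy, image, 10000)
print(len(image.getvalue()), bad)
```

The source is only ever read, never written, which mirrors the write-blocked approach. The list of bad ranges is the kind of map that later stages can consult.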

Our RAID engineers reconstruct the array using our forensic images. Our technicians analyze the RAID metadata and write custom software to recreate the array. There can be gaps in the data, depending on whether any portions of the array were unrecoverable. Even if 99.9% of the array was recovered, the missing 0.1% could be anywhere. So we aren’t done yet.
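
A simplified Python sketch shows how gaps on individual drives translate into gaps in the rebuilt array. It assumes each drive's unreadable areas have already been expressed as stripe numbers. The function name `lost_stripes` and the sample data are hypothetical. A stripe is lost only when more members are unreadable at that position than the array's redundancy can cover.

```python
def lost_stripes(bad_stripes_per_drive, redundancy=1):
    """Return the stripe numbers that cannot be rebuilt.

    bad_stripes_per_drive: one set of unreadable stripe numbers per drive.
    redundancy: how many missing members a stripe can tolerate
                (1 for RAID-5, 2 for RAID-6).
    """
    counts = {}
    for bad in bad_stripes_per_drive:
        for stripe in bad:
            counts[stripe] = counts.get(stripe, 0) + 1
    return sorted(s for s, n in counts.items() if n > redundancy)

# Four drives with illustrative damage (assumption: made-up stripe numbers).
drives = [{3, 7}, {7, 9}, set(), {7, 9}]

raid5_gaps = lost_stripes(drives, redundancy=1)
raid6_gaps = lost_stripes(drives, redundancy=2)
print(raid5_gaps, raid6_gaps)
```

Stripe 3 is damaged on only one drive, so parity can fill it in. Stripes damaged on several drives at once are where the missing data ends up.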

The next step in the ESXi data recovery process is to turn all of your critical virtual machines into physical machines. We mount the VMDK files onto our own hard drives and analyze them using our proprietary forensic software. We use the status mapping from the recovered RAID in order to get as accurate a result as possible. The final step is to comb through the formerly-virtual machines and test the recovered data. Our ESXi data recovery engineers can see which files have been recovered, which haven’t, and which have been partially recovered. We can even determine the level of file corruption.
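
The way a map of unrecovered areas can be used to grade individual files can be sketched in Python. This is a simplified illustration, not the proprietary software mentioned above. The function name `classify_file` is hypothetical, and the sketch assumes that file extents and unrecovered ranges are given as non-overlapping (start, end) byte offsets with the end exclusive.

```python
def classify_file(extents, bad_ranges):
    """Grade one file against the unrecovered ranges of the volume.

    Returns a (status, damaged_fraction) pair.
    """
    total = sum(end - start for start, end in extents)
    damaged = 0
    for f_start, f_end in extents:
        for b_start, b_end in bad_ranges:
            overlap = min(f_end, b_end) - max(f_start, b_start)
            if overlap > 0:
                damaged += overlap
    if damaged == 0:
        return "recovered", 0.0
    if damaged >= total:
        return "not recovered", 1.0
    return "partially recovered", damaged / total

# Illustrative values: a 100-byte file and different unrecovered ranges.
print(classify_file([(0, 100)], [(200, 300)]))
print(classify_file([(0, 100)], [(50, 150)]))
print(classify_file([(0, 100)], [(0, 100)]))
```

The damaged fraction gives a first estimate of how corrupt a partially recovered file is. Whether the file is still usable depends on which bytes were lost.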

Why Choose Gillware to Recover Data from ESXi Virtual Machines?

At Gillware, we have ESXi data recovery experts who understand exactly how ESXi works on a fundamental level. Our experts have handled thousands of data recovery cases. They’ve racked up tens of thousands of hours of experience over the years. We make our data recovery experts’ skills available with no upfront charges. In fact, our entire ESXi data recovery process is financially risk-free.

We charge no fees, upfront or otherwise, for evaluation, and even cover inbound shipping. The evaluation process typically takes less than two business days. Afterward, we present you with a price quote and probability of success. We only move on with the recovery if you approve the quote. And we don’t send you a bill until we’ve recovered your critical data. There are no fees if you back out after the evaluation or if we don’t recover your important data. When the ESXi data recovery process is complete, we then extract your data to a healthy, password-protected hard drive. We ship the hard drive to you, and to make sure your data is secure, only you get the password.

We also offer expedited emergency ESXi data recovery services. Evaluations for expedited ESXi data recovery cases are finished in a matter of hours. Emergency ESXi data recovery cases can be turned around in less than two business days. There is an additional charge for expedited service added to the bill. But we still stand by our financially risk-free, “no data, no charge” policy.

Still not convinced? Check out some of these case studies for ESXi data recovery...

PSOD Data Recovery

When you use VMWare ESXi for your server, you may be unfortunate enough to encounter a PSOD. A PSOD, or “Purple Screen of Death”, is a diagnostic […]

VMFS Recovery Case Study: RAID-5 Failure

In this data recovery case study, the client had a failed RAID-5 array in their server. The array consisted of four enterprise-grade Seagate Constellation hard drives […]

IBM Storwize Data Recovery Case Study: Storwize V3700 Business Server

This crashed server came to Gillware all the way from Spain. The client had an IBM Storwize V3700 server machine with eighteen enterprise-grade hard drives that […]

Dell EqualLogic PS4100 Data Recovery Case Study: 24-Drive RAID-50

The client in this case had a Dell EqualLogic PS4100 SAN. This SAN was filled with 24 300-GB enterprise-grade Seagate hard drives. These 24 hard drives […]

VMWare ESXi Data Recovery Case Study: 4-Drive RAID-5 Array

In this case, our client had four Western Digital WD6000BKHG-18A29V0 hard drives arranged in a RAID-5 array. On that array, they had stored several virtual machines. […]
