

Four Tough Lessons of System Recovery

by KIVILCIM Hindistan

Last week, I received a brand new laptop with 1.5GB of RAM, a 100GB SATA hard disk, and a 15.4-inch widescreen BrightView display. It has basically all the technical gizmos that can spoil a new employee.

The computer came to me with Windows XP Pro installed. My game plan was to transfer my files via a USB disk to the NTFS partition, then transfer my second partition, which is Debian Sarge (so-called) Unstable, and keep up with my regular business.

My weapon of choice, when I have to use Windows, is VMware Workstation, configured to work with the real partitions rather than a loop filesystem. This means that if I change anything from inside the virtual machine, the changes are still there when I boot Debian.

So, I started VMware as usual, configured it to use the physical hard disk, and began my operation.

I used a USB disk to transfer my old system. After that, I began to erase my old disk, which contained the NTFS system partition (C), an NTFS data partition (D), and the Debian partition.

While doing this, I first erased my D and Debian partitions with cfdisk and wrote the changes to the disk.

After I exited cfdisk, I caught a glimpse of hda1, which troubled me and left me staring at the empty black screen with the root cursor, wondering what was wrong. The thing was, the device should have been sda, the mounted USB drive, not hda, the laptop's native disk.

I turned red. I had just wiped the partition that contained my backup data and the installation files of my laptop. Fortunately, my boot partition was still there, so I just had to collect my backup data (some 60GB) from different computers and copy them again, which looked like half a day or so of work.

My First Attempt

I checked the files and saw that they were still there. Apparently, Windows does not re-read the partition table until it is rebooted. If I moved the most important files to my USB disk, I'd have something left. Unfortunately, I could only take half of the data because my USB backup disk was already crowded with junk.

After that I tested my theory. Windows booted fine, but there was no D drive or Debian partition.

Then I began thinking (this is where things turned worse, honest). The partition was there and I had not overwritten anything, so it shouldn't be difficult to rescue my files. Right?

I began looking for a program to scout my hard disk. Unfortunately, the program I found was not under active development, and the only version I found was a cracked copy. Nevertheless, I started the program and it found the partitions as expected. It told me that there was something wrong and asked if it could correct things. Of course I wanted that. Then it asked me if I was sure. Sure, sure--I sure wanted to rescue my missing partition.

And then it was okay. I just had to reboot to see... that now even Windows would not boot. It started booting with all good intentions (I'm sure), but then some error screen (which put the famous Guru Meditation screen of the good old Amiga to shame) appeared and I was ruined.

Recovering My Data, Badly

Now I not only needed to collect my backup data, but also to re-install Windows XP, which included finding a series of exotic drivers.

In these kinds of situations, I always remember a famous quote attributed to Albert Einstein: "If I had 60 minutes to rescue the world, I'd spend 59 minutes to define the problem and 1 minute to solve it." I don't claim to understand the real wisdom of this quote, but because I'm a bit on the lazy side, I like it. I humbly think that real laziness is not one of the seven deadly sins, but a virtue to earn via two mandatory tools: avoiding work cunningly and getting the job done properly at the same time.

As with every computer change, I had to transfer my vital files from the old computer to the new one. I usually have two operating systems. One is Windows for office things and the other is Debian GNU/Linux for security tools, etc. Every time I switch computers, I prefer to install Windows fresh and keep my good old Debian. Having switched laptops three times in the last six months, I've developed a nice method: after I finish installing Windows, I install VMware Workstation and from that create a virtual computer that uses the physical hard disk.

This allows me to keep working on Windows and at the same time install and transfer the Linux partition, even booting it and cleaning some glitches from the new network and other settings.

This method also has another advantage; if you suspect that you have messed something up, such as LILO or partitioning, you can easily try to boot (still from VMware) and see if everything is fine. If not, you have an already booted computer that has a network connection, CD-writer, etc., ready for backup and/or recovery procedures.

What could have gone wrong?

Almost everything, I should say.

This time I had a wonderful USB 2.0 jacket with transfer rates up to 25MB/second, which really eased my file transfers. I simply unscrewed the old laptop's hard disk, put it into the USB jacket, and plugged it into my brand new laptop.

Everything was fine as I transferred files to my NTFS partition. After I finished, I booted Knoppix 5.0 from VMware (with the physical disks mounted, as I've mentioned), and began to transfer my Debian partition, which also went wonderfully.

After everything had transferred, I wanted to erase the partitions in the old laptop, so that the new user would have a clean hard disk, but also to make sure that my files were gone for good. Windows would not fdisk the Linux partition, so I decided to use Knoppix for this too.

I started cfdisk and began erasing the two partitions (my NTFS data partition and the Debian partition). After I erased them both, my intention was to write the partition table back, then make new partitions and fill them with garbage data, which, in my case, would be caution enough.
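The "fill them with garbage data" step can be sketched in a few lines of Python. This is a minimal, single-pass sketch and not the procedure actually used here: the function name `overwrite_with_random` and the chunk size are made up for illustration, and it assumes the target (a regular file or, on Linux, a block device node) can be opened for writing.

```python
import os

def overwrite_with_random(path, chunk_size=1024 * 1024):
    """Overwrite every byte of an existing file or block device with random data.

    Returns the number of bytes written. The target is never grown.
    (Hypothetical helper for illustration only.)
    """
    with open(path, "r+b") as target:
        # Seeking to the end reports the size of the target.
        size = target.seek(0, os.SEEK_END)
        target.seek(0)
        written = 0
        while written < size:
            # Never write past the original end of the target.
            block = os.urandom(min(chunk_size, size - written))
            target.write(block)
            written += len(block)
        # Push the data out of Python's and the kernel's buffers.
        target.flush()
        os.fsync(target.fileno())
    return written
```

Because the function destroys whatever `path` points at, it is worth trying on a scratch file first, and worth being very sure which device name is passed in.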

Re-Losing My Data

As always, cfdisk wanted me to confirm writing the partition table back by typing each and every letter of the word "yes", which I did without thinking.

After that, I exited cfdisk. At that last glance, I saw a small, irritating detail. The hard disk that I was erasing (USB) should have been sda or sdb, but the screen showed hdb.

At that moment, I realized that I had just erased the partition in my new laptop that had my old backups and .CAB files for the installation. It was no big deal; I just had to transfer some 40GB or so, but I was definitely upset at how I could make a mistake like that.

Out of curiosity, I clicked on D and saw that my files were still there. I realized that Windows does not re-read the partition table unless it reboots or does something to the partition directly. This fascinated me while I copied my most important files to my USB disk. I was on a lousy 11Mbit wireless network, so I could not think about backing up some 40GB to a network share. I also did not have enough space on my spare USB disk.
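The sda-versus-hdb mix-up is the kind of slip a small pre-flight check can catch before a partitioning tool is ever started. The sketch below is an illustration, not something that was run on this laptop: it assumes a reasonably modern Linux sysfs layout, in which `/sys/block/<name>` resolves to a path that contains a `usbN` component for USB-attached disks. The function names `is_usb_disk` and `confirm_target` are hypothetical.

```python
import os
import re

def is_usb_disk(device_name, sysfs_block="/sys/block"):
    """Return True if sysfs places the block device behind a USB bus.

    Assumption: the resolved sysfs path of a USB disk contains a
    component such as "usb1", as it does on current Linux kernels.
    """
    link = os.path.join(sysfs_block, device_name)
    if not os.path.exists(link):
        raise FileNotFoundError(f"no such block device: {device_name}")
    real = os.path.realpath(link)
    return any(re.fullmatch(r"usb\d+", part) for part in real.split(os.sep))

def confirm_target(device_name, sysfs_block="/sys/block"):
    """Return the device name, or raise if the disk is not USB-attached."""
    if not is_usb_disk(device_name, sysfs_block):
        raise RuntimeError(
            f"{device_name} is not USB-attached; refusing to touch it"
        )
    return device_name
```

Calling `confirm_target("sda")` before erasing anything turns "I thought it was the USB disk" into an error message instead of a lost partition. Inside a virtual machine the device topology may look different from the host's, so the check is only as good as what the guest kernel reports.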

Destroying My Boot

After that, I booted the machine to see the damage.

As expected, C was there but D and the Debian partition were missing. In fact, all the data was intact and nothing was overwritten. The computer merely did not know where to find them. As I said, being on the lazy side and seeing myself as a technology-savvy work avoider with a computer, I began to search for a program that would find the exact physical location data of the missing partitions so that I could restore them. Because I was using Windows, I tried to find freeware to solve my problem.

The truth was, there wasn't any freeware. I came across gpart several times. This is a GPL-licensed console program that does exactly what I wanted, on Linux. The true problem was that I was not properly lazy. I was more under the influence of a spoiled kind of laziness. gpart would only supply me with the partition table details and then leave me to rebuild those partitions by hand, so I passed it over. This was my second and biggest mistake.
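To give a feel for what "finding the missing partitions" involves, here is a rough Python sketch of a signature scan. It is not gpart's algorithm, which validates its guesses far more carefully; it only looks for two well-known markers: the `NTFS    ` OEM ID at byte 3 of an NTFS boot sector, and the ext2/ext3 superblock magic 0xEF53 at byte 1080 from the start of the filesystem. The function name `scan_for_filesystems` is made up, and the 512-byte sector size is an assumption.

```python
SECTOR = 512
NTFS_OEM_ID = b"NTFS    "   # bytes 3..10 of an NTFS boot sector
EXT_MAGIC = b"\x53\xef"     # 0xEF53 stored little-endian
# The ext2/ext3 superblock starts 1024 bytes into the filesystem and the
# magic sits 56 bytes into the superblock: 1080 bytes = 2 sectors + 56.

def scan_for_filesystems(path):
    """Return (start_sector, kind) candidates found in a disk or image.

    Read-only. Hits are only candidates: the two-byte ext magic can
    match by accident, so every result needs checking by hand.
    """
    hits = []
    with open(path, "rb") as disk:
        sector_no = 0
        while True:
            sector = disk.read(SECTOR)
            if len(sector) < SECTOR:
                break
            if sector[3:11] == NTFS_OEM_ID:
                hits.append((sector_no, "ntfs"))
            # A magic found in this sector belongs to a filesystem
            # that would start two sectors earlier.
            if sector_no >= 2 and sector[56:58] == EXT_MAGIC:
                hits.append((sector_no - 2, "ext2/3"))
            sector_no += 1
    return hits
```

The start sectors it reports are exactly the "partition table details" mentioned above; turning them back into table entries is still manual work.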

After an hour of Internet mining, I came across a commercial program that claimed to do just what I wanted. Unfortunately, I was not ready to pay $50 for such a program, so I found an obsolete, unsupported version. I downloaded the program and started it.

It diagnosed my problem correctly and started to fix the partition table with the real values. After 50 seconds it reported everything was okay.

The performance had convinced me, so I confidently rebooted... to a blinking black screen.

Now I was done. I had not only lost D and my Debian partition, but I had destroyed the partition table somehow and my laptop would not boot from its hard disk.
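What "the partition table" means on a disk like this one can be made concrete with a short read-only Python sketch. It assumes the classic MBR layout: four 16-byte entries starting at byte 446 of the first sector, followed by the boot signature 0x55AA at bytes 510 and 511. The function name `read_partition_table` is hypothetical, and extended partitions are not followed.

```python
import struct

def read_partition_table(path):
    """Return the non-empty primary partition entries of an MBR.

    Read-only sketch: opens the disk or image, reads the first
    512 bytes, and decodes the four primary entries.
    """
    with open(path, "rb") as disk:
        mbr = disk.read(512)
    if len(mbr) < 512 or mbr[510:512] != b"\x55\xaa":
        raise ValueError("no valid MBR boot signature (0x55AA) found")

    entries = []
    for index in range(4):
        raw = mbr[446 + 16 * index : 446 + 16 * (index + 1)]
        # status, 3 CHS bytes (skipped), type, 3 CHS bytes (skipped),
        # first sector (LBA), and sector count, all little-endian.
        status, ptype, lba_start, sectors = struct.unpack("<B3xB3xII", raw)
        if ptype == 0:
            continue  # type 0 marks an unused slot
        entries.append({
            "slot": index + 1,
            "bootable": status == 0x80,
            "type": ptype,          # e.g. 0x07 NTFS/HPFS, 0x83 Linux
            "start": lba_start,
            "sectors": sectors,
        })
    return entries
```

Printing these entries before and after running any repair tool gives something to compare, and the first 512 bytes are small enough to copy aside beforehand.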
