Is it a Hardware Problem or a Software Problem?
Last night I was working with an older HP laptop and things hit the fan very quickly. I discovered I couldn’t write a file to disk, and before I knew it the system had dropped to a black screen and needed the power cut to reboot. I immediately realized that there was a serious hardware problem. Then I spent the better part of the evening trying to salvage everything from the old drive to a new drive, only to have the new drive seemingly show the same symptoms. It is a laptop, of course, so I assumed the culprit was the bus shared by the drives or the cdrom. I pulled out the cdrom and the system seemed to behave itself just fine. This morning, as I was checking the last of the package updates, it dawned on me how different things would have been if it were a Windows laptop. The laptop runs Linux, and when the system froze I immediately assumed it was either the hard drive or the drive adapter to the mainboard. Why?
I’ve been using Linux as my primary OS for 10 years now. I have supported Windows throughout that time, but to be quite honest I haven’t seen a Linux system freeze that wasn’t either 1) in the middle of a new install (many times from invalid install media) or 2) caused by failing hardware. Windows, though, is another story entirely. I can’t count the number of times I’ve seen Windows spit out a memory dump and then spent hours researching the cryptic numbers in its fatal exception codes, only to wind up 3-4 hours later still wondering whether it was failing hardware or a software problem.
Don’t get me wrong, Windows really has come a long way. I think the thing that drove me over the edge was Millennium Edition on my laptop. It lost a sound card for no good reason and ate up about 10 hours of my time trying to get sound back. Have I had problems getting hardware to work with Linux? At times. Usually it’s a case of a piece of hardware either not being supported or requiring manual intervention to support. I can’t think of anything that has just stopped working for no good reason due to the software or driver configuration. I can’t recall an update that screwed up sound, or a database, or … well, any of those software peculiarities that seem to be a fact of life with Windows. The sad thing is most people think “that’s just computers for you”. True, computers are imperfect, software and hardware alike, but you shouldn’t have to pull your hair out trying to identify whether you have a hardware or software issue.
You shouldn’t have an operating system that is so buggy that you can’t discern when hardware or software is failing.
For example, we have an OpenVPN server deployed. It’s a very special OpenVPN server. It sits on a little Asus Eee Box with a laptop hard drive and it acts as the hub for a VPN of surveillance cameras. There’s a lot of special sauce in the firewall rules of the VPN client router boxes for this, but that’s another story. (VPN addresses get mapped to the local LAN address of machines attached to the client VPN boxes – i.e. 10.10.0.34 gets translated to 192.168.1.34 for VPN communication.)
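For the curious, the address mapping in that parenthetical can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual firewall rules on those boxes: the two /24 prefixes are my guess from the single example above, and the function name vpn_to_lan is made up for the sketch.

```python
import ipaddress

# Assumed prefixes, inferred from the one example in the text
# (10.10.0.34 <-> 192.168.1.34). The real networks may differ.
VPN_NET = ipaddress.ip_network("10.10.0.0/24")
LAN_NET = ipaddress.ip_network("192.168.1.0/24")


def vpn_to_lan(addr, vpn_net=VPN_NET, lan_net=LAN_NET):
    """Translate a VPN address to the LAN address with the same host part."""
    if vpn_net.prefixlen != lan_net.prefixlen:
        raise ValueError("a 1:1 mapping needs networks of the same size")
    ip = ipaddress.ip_address(addr)
    if ip not in vpn_net:
        raise ValueError(f"{addr} is not inside {vpn_net}")
    # Keep the host bits, swap the network prefix.
    host_bits = int(ip) - int(vpn_net.network_address)
    return str(lan_net.network_address + host_bits)


print(vpn_to_lan("10.10.0.34"))
```

The point is simply that the host part of the address is preserved while the network part is swapped, so each camera keeps a predictable address on both sides.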
Anyway. It pumps lots of video 24/7. Lots – gigabytes a month. Sometime in the last month we had to service the box. One of the clients couldn’t connect to it for some reason. 4 out of 5 video streams were working just fine, but the 5th couldn’t log in. SSH dropped the connection as soon as I tried to connect. Booting from a rescue USB stick, I found a file system that was literally crumbling. I don’t know why, but the disk had numerous issues (we suspect it may have been dropped…): there were missing links to files, symlinks to the wrong places, and some items that should have been files were flagged as directories. Hosed doesn’t begin to describe what it looked like. We backtracked to when the first problems showed up, and it had been running for around a month like a zombie, still faithfully delivering those 4 video streams that were still connected. Many important pieces of the filesystem were gone and it was still doing its job. Amazing.
I remember once upon a time seeing a DOS install freeze at the command prompt in the midst of typing a command, and I remember knowing then that the hardware was not well. Windows has done much to muddy the waters. XP was certainly more reliable than its Win9x or WinME predecessors. Vista was a disaster, though, and is in the same class as WinME. Windows 7 has been better by all accounts, but for me it is long past due. I’m way past caring about the next Windows release.
So, I’ve been looking at potentially replacing a laptop, and I have no use for Windows. I’ve been using this good list of places that sell Linux on laptops. Unfortunately the big names either don’t sell them or don’t offer much variety (Dell – I’m looking at you – really a grand total of 3 choices last time I looked. I think one or two may have been netbooks. I don’t want a netbook – I want a true workstation.) Of course, I could pick out a Dell and buy it, but then I’d be paying the Windows tax for something I won’t really use.
I may not need to purchase one yet, though. After pulling out the CD drive things seemed to work (in the logs I saw a lot of spurious ata errors – I suspect the drive may have been the source). I swapped it with the drive from another long-dead laptop and it seemed to work just fine (plus that other one was an actual burner as well).
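If you want to do the same kind of log check, here is a rough Python sketch that tallies ata error lines per port, which is a quick way to see whether one device is doing all the complaining. The sample lines are made up for illustration in the general style of kernel ata messages (they are not from my logs), and the keyword list and the name count_ata_errors are my own assumptions, so adjust them to whatever your logs actually say.

```python
import re
from collections import Counter

# Matches kernel-style lines such as "ata2.00: failed command: READ DMA".
ATA_LINE = re.compile(r"\b(ata\d+)(?:\.\d+)?: (.*)")
# Assumed keywords that mark a line as trouble; tune for your own logs.
ERROR_HINTS = ("exception", "failed command", "error", "resetting link")


def count_ata_errors(lines):
    """Count suspicious ata lines per port (ata1, ata2, ...)."""
    counts = Counter()
    for line in lines:
        match = ATA_LINE.search(line)
        if match and any(hint in match.group(2).lower() for hint in ERROR_HINTS):
            counts[match.group(1)] += 1
    return counts


# Illustrative sample only, not real output from the laptop in this post.
SAMPLE = [
    "[  101.1] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen",
    "[  101.1] ata2.00: failed command: READ DMA",
    "[  101.2] ata2: hard resetting link",
    "[  102.0] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)",
    "[  103.0] usb 1-1: new high-speed USB device number 2 using ehci-pci",
]

for port, hits in count_ata_errors(SAMPLE).most_common():
    print(port, hits)
```

On a real system you would feed it the lines of the kernel log instead of SAMPLE. If nearly all the hits land on the port the CD drive hangs off, that is a decent hint about where to start pulling parts.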
By the way, this shows you how old the laptops I use are. My estimate is that one is about 7-8 years old. It has the current release of Ubuntu on it. It acts as a remote client for my MythTV server (think media center…), so I can watch TV on it, nice big screen… and of course it runs the latest Firefox/OpenOffice/Chrome/Calibre/etc. Try that with a 7-8 year old Windows machine (fully updated to the latest version of Windows). Good luck.
Yes, I’ll probably need to look for a new laptop soon anyway as these are older. The one that I had an issue with last night is my spare at the moment. If I lose that or the one I’m typing on, I’ll need to go ahead with a new one. When I do, though, I plan to look for one that I expect to last for another 7-8 years (if not more… with multicore machines we seem to be plateauing in the demand for processing power from most computing tasks). Yes, I suppose I could upgrade every 3 years like Microsoft wants me to, so that I can run their newest version, which has been optimized to be bloated and slow on older hardware. I can just imagine that Windows 8 will require at least 2-4 cores and 8GB of memory just to show you the bootloader.
Sorry … I digress.
To get back to the original point of the story: last night I spent about 5 hours on the system. The drive failed, I booted and installed a new OS on a new drive, and spent probably 3 of those 5 hours copying 60 GB of data from one drive to the other. The last two hours were spent trying to figure out how to get the fool thing to boot from USB, since I ran into problems with the CD booting. (A couple of the USB ports strangely do not support booting, though at least one pair does.) Then came pulling the CD drive and testing again, and finally letting it start installing all the packages that were installed on the old drive, which has now been reclassified from bad to “?bad? – test” since I’m now blaming the cdrom drive for the issues. To be honest, if it had been Windows, after 5 hours I’d still be trying to figure out if it was software or hardware. (I would have had to use Linux boot media on another system to get at the data on there and copy it off.) I probably would not have been able to get all of XP’s updates installed yet – this system did ship with XP. I would likely have spent a good part of this evening on the fresh XP install and have found out tonight that it happened again. At that point I’d be looking at drivers for the hardware on the HP to see if there were any known problems or available updates (for the drivers or firmware). Then by tomorrow night I probably would have felt as though the well of ideas about which driver could be at fault had run dry, and might have started pulling out the cdrom drive. (Unless it had just obviously failed earlier.)
What’s interesting is the drive worked well enough to install from once, boot up into the newly installed system, and do a good amount of data copying and updates. If it were XP I probably would have spent some time looking at the recent Windows updates as well to see if there were any horror stories there, and running antivirus and antimalware (and antirootkit) scans to make sure the system was clean (assuming it would stay up long enough for them to run).
Why would I have been off chasing software issues when it really was a hardware problem? Very simply, my experience with Windows over the past 15+ years has been that squirrelly behavior is typically due to Windows, and only sometimes to hardware. I’ve seen exceptions and I’ve seen some of the symptoms of those exceptions.
As I was thinking about that this morning, I really renewed my appreciation for Linux, Ubuntu and the fantastic operating system environment I use. I dare say millions (if not billions) of man-hours of development have produced what I feel is the best, most stable tool that I use daily.