Monday 19 April 2021

Why is Backup so hard?

Performing a backup never used to be hard; in fact, it used to be the easiest job a sysadmin could perform. So, the ever-present question many sysadmins end up asking is: why is something that used to be so easy now so difficult? The next, equally valid, question is:

Why is it also so EXPENSIVE?

As with most things, we are dealing with a history lesson.

In the early days of mainframes, you dealt with a monolithic system. Everything (and I mean everything!) came from a single vendor - and that included a backup system.

How the backup system actually worked was never much of a question. It just worked OOTB (Out Of The Box). Despite this, there was a reasonable degree of busy work involved. In fact, the primary job of the sysadmin was backups.

Consider the situation: you have several rows of IBM 72x TBUs all purring away madly. Now the tape software may know what tape it is writing to, but these are reel-to-reel tapes. There are no barcode labels to be read and no link between the media header on the tape and what you write on the tape (and its box). You have to be super organised. A bell (literally a bell!) will go off on the operator console, and you read the message to remove tape xyz01 from TBU 23 and replace it with tape qwe12, etc. So you run around like a blue-assed fly to make sure the tapes are properly stored in sequence and you know where all the catalog tapes are, just in case the storage decides to die - a not uncommon occurrence.

In fact, in the days of very expensive storage, tape was considered the mainstay. Several units were often dedicated to "offline" and "standby" storage. The only difference between the two was that standby storage was on tapes that hadn't been removed yet. Often the TBUs would double up - standby during the day and backup operation at night. Tape was cheap, disk was expensive.

The popularity of UNIX destroyed proprietary backup systems. On a UNIX system, a TBU was just another device to be written to or read from. The 'tar' utility (Tape ARchiver) turned an entire subdirectory structure into a single file for writing to (or reading from) a tape device. For raw block level backups of a disk volume, "dump" and "restore" were popular from the early days of UNIX. All of these commands were frequently used in scripts. Sysadmins developed their own scripts and cronjobs. Usually, very little effort was put into documenting these scripts. The sysadmin 'knew' what was going on - and that was all that mattered.
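To make the "subdirectory structure into a single file" idea concrete, here is a minimal modern sketch using Python's standard tarfile module. It is an illustration only, not one of the period scripts: the function names and the directory and file names are made up for the example, and the archive is written to an ordinary file where a sysadmin of the day would have pointed tar at the tape device.

```python
import tarfile
import tempfile
from pathlib import Path


def archive_tree(source_dir: Path, archive_path: Path) -> list[str]:
    """Pack source_dir into one uncompressed tar file and list its members."""
    # Write the whole directory tree into a single archive file.
    with tarfile.open(archive_path, "w") as tar:
        tar.add(source_dir, arcname=source_dir.name)
    # Read the archive back to show what it contains.
    with tarfile.open(archive_path, "r") as tar:
        return sorted(tar.getnames())


def demo() -> list[str]:
    """Build a tiny throwaway tree (hypothetical names) and archive it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "home"
        (root / "alice").mkdir(parents=True)
        (root / "alice" / "notes.txt").write_text("remember to label the tape\n")
        (root / "motd").write_text("backups run at 02:00\n")
        return archive_tree(root, Path(tmp) / "home.tar")


if __name__ == "__main__":
    for name in demo():
        print(name)
```

The point is simply that directories and files alike end up as entries in one sequential stream, which is exactly what a tape wants.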

As the desktop computer began pushing into the business computer market, little thought was initially given to backups. You made copies of floppies and, if you were lucky enough to have a hard disk drive, you just manually copied files onto floppy disk. Some early backup systems began to appear, like Fastback, which offered full duplex support and compression.

As the size of desktop computer HDDs increased, the ability to 'back up to floppy' dropped off considerably. Once again, proprietary backup systems began to appear. One of these was the 4mm Colorado Jumbo, which was an inexpensive TBU that connected to the FDD interface - meaning a separate controller did not need to be purchased. Colorado bundled some backup software - but it wasn't particularly good.

Around this time, Novell Netware 2 and later Netware 3.12 were increasing in popularity. Vendors would sell the 5-user version of Netware and install it on a glorified workstation. It was common to install either a Colorado Jumbo TBU or a more expensive QIC TBU. The latter would require a SCSI card.

However, Netware didn't have native backup software.

About the only backup software that was available for Netware was ArcServe. 

It was horrible.

Sure, it wasn't too bad at backing up your files. It just had a hard time restoring them. It also had a habit of crashing servers. On Netware this was called an ABEND. Pretty soon, people hit on the idea of setting up a dedicated backup server, so if the backup server crashed, it wasn't too bad. People also found that particular SCSI cards and TBUs made Arcserve more stable than others did. Particular servers were also kinder. Unfortunately, this "stability" came at a price.

Arcserve wasn't too expensive - which was its only real saving grace, apart from the complete lack of any alternative. When Windows NT came out, Arcserve made a version for it - which managed to port all the same stability problems to the new platform.

Then came Backup Exec. It was a breath of fresh air! Simple. Stable. Affordable.

Okay, it was more expensive, but you didn't need a dedicated backup server. In fact, there was a cut down version that came with a SCSI card and a Colorado Travan TBU. It all worked out of the box for under $500! Overnight, pretty much everyone switched to Backup Exec.

Successive versions added more options: open file, DR, Groupwise, BTree, NDS, Windows, Unix, Linux. However, the stability gradually fell away. It was still good, but the dedicated backup server was resurrected. The price increased too. It doubled, then tripled. Soon, the best thing you could really say about Backup Exec was that it was better than ArcServe - which by now had been purchased by Computer Associates and was called ArcserveIT.

Veritas (who owned Backup Exec) spawned another backup product very similar to Backup Exec - NetBackup. It became the standard for heterogeneous backup. To celebrate this, they added a zero to the price tag.

[Image: NetBackup Management Console]

Then came a sequence of enterprise backup products. They were all better than Backup Exec. More stable. Heterogeneous. Great support. Agents for everything you could imagine. Policy based. But by this time, they cost in the tens of thousands of dollars.

They were Syncsort, CommVault, HP Data Protector, Portlock. No sooner did one product come out than another appeared that was better, with extra features.

Then the game changer of them all came out: Veeam.

It was like Backup Exec all over again. Relatively cheap. Designed primarily for virtual machines, it could do what all the others struggled with: restore a complete working server in minimal time.

Pretty soon, Veeam became the dominant backup software. It was a little feature poor at first, but you could do cloud based backup - meaning you didn't need to buy expensive tapes or TBUs. Veeam charged very little for cloud storage.

Slowly, as successive versions of Veeam came out, features were added. However, costs began to go up again, and reliability and stability began to drop. Veeam also started increasing the cost of cloud restore operations - so whilst backing up to the cloud was cheap, restoring from it cost a fortune!

Now, organisations using Veeam are casting around for alternatives. There's EMC Networker, Altaro and Nakivo, plus there are appliance-based systems like Datto that work on a different paradigm. They all have one thing in common:

They are very expensive.

The goal seems to be to create hassle-free, simple backups. Over time, software companies seem to forget that and overload their software with features few people use.

It seems to me the solution is to create a two-tier system: the simple software that always works and the heterogeneous one that deals with all of the weird and wonderful situations.

That sounds like a simple solution; unfortunately, it's always the low-cost product that makes the profits and drives innovation. And nobody wants to run two pieces of software. This means the larger businesses that pay the most for your product don't use the cheaper versions. When Backup Exec came out, all the large organisations stuck with ArcServe. The same is true of Veeam. NetBackup continues to hold sway in larger organisations. A quick perusal of the NetBackup support forums gives the distinct impression that those who administer NetBackup have only one job to do: backup support. If that's your only job, you don't really care that much if it's difficult to administer and requires arcane knowledge and sophisticated scripting skills. It also doesn't matter that it's hyper-expensive. In fact, that expense is a good thing, as it masks your salary in the TCO.

So, I fear history is doomed to continually repeat itself. Think about that the next time a new piece of backup software appears on the market that seems too good to be true.
