Disasters

Peter Scott

Jul 11, 2005

We are currently putting the final stages of a DW migration for one of our customers to bed. They have a small DW (just over half a terabyte) on old and unsupportable equipment – it was good stuff when it was new, but now…

Like all conscientious developers, we insist on a full recovery test prior to go-live. This has two purposes: to prove our recovery process works and to benchmark how long it takes to get the data back. One of my objectives in designing the recovery process was to have something that was reliable, easy to manage and quick to recover. The old system’s backup failed on all three counts. Last year they had a serious problem, made worse by a hardware engineer trashing half of the disk controllers – the only way back was a full restore. The original system was designed to back up quickly; unfortunately, the reverse was not true. Data was written across multiple tapes, and the tapes were often in the wrong drives at the wrong time. To restore from tape took 4 days (or 8 times longer than backing it up). On the new system it took 3.5 hours, a significant improvement!

Another recovery test last week also went well: 2.4 TB restored to a DR system. That took a lot longer, especially when you take into account the time to get the DR system delivered to our data centre and for the systems team to load the whole lot from tape, but it was still less than two days. Pretty good really; I was pleased and so was the customer.
