Starting out in DW

A little while back I posted a piece on how I got into databases in general and data warehouses in particular. One reader asked if I could reveal how to start out in data warehousing. Rather than posting a direct reply to the question, I thought that I would write it as a new post – mainly so that others can chip in with their advice (or to tell me that I am wrong).

Firstly, I am not a DBA. At heart, I am a developer, and as such I am ill suited to tell people how to become a good DBA, especially in specialist areas such as data warehouses. But I have worked in data warehousing for quite a while now and have managed developers and DBAs for a long time so (maybe) know a few things.

In the scheme of things, data warehouses are still not that common; there are far, far fewer DW systems than OLTP ones. In part this could be because storage used to be very costly, or, perhaps, older databases had less friendly features for large data set manipulation. But fewer systems mean fewer DBAs and inevitably less opportunity for newcomers to learn.

There is a core set of competencies needed by any DBA: software installation, patching, instance management, backup and recovery etc. There is a point however where the skill requirements (or more correctly, skill application) of a DW DBA diverge from those of a DBA working on systems characterised by short transaction times. Tuning, summary management, table design, and indexing strategies immediately spring to mind as step-changes in the DBA mindset.

Although books such as Oracle’s Data Warehousing Guide and David Aldridge’s excellent piece give the background, practical experience is important, some things can be observed on a small system (I use a laptop PC to test concepts) – after all you don’t need a billion rows to test materialized views, dimensions and query rewrite. But using a laptop to look at look at balancing read/write performance across many disks is not on. For this you need a real system and access to an experienced mentor. This mentor could be an experienced work colleague or even a virtual mentor - the sort you find in the better fora

(to be revised and expanded when I am less tired)