Data Warehouse bits and bobs
One snippet of news that I picked up on this week is that Netezza is going public with its IPO; this is the first of the appliance vendors to do so (someone will tell me I'm wrong on this).
Elsewhere I read some research that puts some 50% of data warehouses in the one terabyte or less size range; less than 5% are over 25 TB. Looking at the graphs in the article, less than three-quarters of all data warehouses are six or more terabytes in size. Other sources tell us that the biggest databases are much larger than those of two or four years ago. But in reality these bigger-yet databases tend to be "new entries" in the size league; existing data warehouses tend not to grow rapidly. There may be some exceptions, of course, where someone was part way through a phased DW implementation and has just rolled out a whole new class of data, or perhaps has just acquired a new company and has amalgamated two large data stores, but the size of most DW systems is either static or growing modestly (in part this could be attributed to rolling windows of data, with new data volumes being similar to the amount being aged out).
The large number of "small" data warehouses is a positive sign for people like me, and the fact that they are small is also, perhaps, an indicator that they are built on traditional relational database technology such as Oracle and Microsoft SQL Server. But a worrying statistic is the number of people who are dissatisfied with the performance and data quality of what they have.
Perhaps I should start a business reconditioning existing data warehouses.