Comparing Informatica And OWB
A couple of projects that I've worked on recently chose Oracle Warehouse Builder over Informatica's PowerCenter. It was interesting, therefore, to see a new article by Rajan Chandras that looked at the latest version of PowerCenter, and to see how Informatica's offering compares with OWB10g.
According to "(Re)Enter The Leader", PowerCenter has a similar architecture to OWB:
"PowerCenter 7.0 (I'll call it PC7) is an ETL tool in the classic mold: data extract, transform, and load logic is constructed in a (mostly) sequential arrangement of graphical objects that flow from source to target. The objective is conceptually simple: Read data from source, transform it as needed, and write it to target. Reality is a little more complex, of course, and the construction of logic happens at three levels.
At the lowest level, individual graphical objects can be sources, targets, or transformations (sources and targets can be themselves considered as special types of transformations). A source transformation is used to read from a data source, and supply that data in sequential row-wise fashion for subsequent processing. At the other end of the logic stream, the target transformation receives data (again, in row-wise order) and writes it out to recipient data structures. The remaining intermediate transformations do just that: transform data values as required.
Sources, targets, and transformations are assembled in a daisy chain to form the next level of processing, which in PC7 is called the "mapping." A mapping is the end-to-end flow of logic, from one or more source transformation to one or more target transformations.
The execution of the mapping, called the "workflow" in PC7, provides the third level of the overall logic. The workflow provides for the execution of multiple mappings and dependencies among mappings. In standard programming terms, the transforms are the syntax and components of the program, the mapping is the overall program itself, and the workflow is the execution and production of one or more programs.
There are PC7 components that correspond to these levels. The PowerCenter Designer is the programming integrated development environment (IDE), where you "assemble" all the sources, targets, and transformations to create a mapping. The PowerCenter Workflow Manager is used to build a workflow around the mapping. The Workflow Monitor provides production support capabilities for the workflow. In addition, there are the PowerCenter Repository Manager and the Repository Server Manager, which provide administration capabilities for the PC7 Repository (more on this a little later)."
This all sounds familiar. PowerCenter's graphical objects correspond to the source and target objects you drop onto a mapping canvas in OWB, together with the PL/SQL transformations; mappings are the same concept in the two tools; and PowerCenter's workflow corresponds to the workflow interface in OWB10g.
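To make the three levels concrete, here's a minimal sketch in Python. It isn't PowerCenter's or OWB's actual API; every name in it (`run_mapping`, `run_workflow`, `drop_inactive` and so on) is made up for illustration. Transformations pass rows along one at a time, a mapping chains them from a source to a target, and a workflow runs mappings in order.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]
Transformation = Callable[[Iterator[Row]], Iterator[Row]]


def source(rows: Iterable[Row]) -> Iterator[Row]:
    """Level 1 (source): supply rows one at a time."""
    yield from rows


def drop_inactive(rows: Iterator[Row]) -> Iterator[Row]:
    """Level 1 (transformation): filter out inactive rows."""
    for row in rows:
        if row["active"]:
            yield row


def uppercase_name(rows: Iterator[Row]) -> Iterator[Row]:
    """Level 1 (transformation): change a data value."""
    for row in rows:
        yield {**row, "name": str(row["name"]).upper()}


def run_mapping(src_rows: Iterable[Row],
                transformations: List[Transformation],
                target: List[Row]) -> int:
    """Level 2 (mapping): chain source -> transformations -> target."""
    stream = source(src_rows)
    for transform in transformations:
        stream = transform(stream)
    count = 0
    for row in stream:  # the target receives rows in order
        target.append(row)
        count += 1
    return count


def run_workflow(mappings: List[Callable[[], int]]) -> List[int]:
    """Level 3 (workflow): run mappings in order; here the dependency is just list order."""
    return [mapping() for mapping in mappings]


raw = [
    {"name": "alice", "active": True},
    {"name": "bob", "active": False},
]
staging: List[Row] = []
warehouse: List[Row] = []

# The second mapping depends on the first, because it reads from staging.
workflow_result = run_workflow([
    lambda: run_mapping(raw, [drop_inactive], staging),
    lambda: run_mapping(staging, [uppercase_name], warehouse),
])
```

The real tools add a repository, scheduling and monitoring on top, but the layering of objects, mappings and workflows is the same idea.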
Reading the rest of the article, some interesting similarities and differences were:
- Informatica PowerExchange seems similar to Oracle gateways, but with connectors to PeopleSoft and Siebel in addition to the SAP connectivity that both tools offer. Support is good in both tools for non-Oracle databases (DB2, SQL Server, Teradata, Sybase and so on).
- One major difference is that OWB will only populate Oracle 8i, 9i or 10g data warehouses, whilst Informatica works against any major vendor (thanks Duncan for pointing that one out, one of those 'so obvious if you're used to OWB, I forgot to mention it' moments...)
- Both tools allow you to build reusable components for transforming data, with PowerCenter's being specific to the tool whilst Oracle's are regular PL/SQL functions and procedures.
- Informatica, like Oracle, are making a big noise about grid computing. "PC7 offers server grid capabilities, too, by which PowerCenter can distribute loads across heterogeneous Unix, Windows, or Linux-based computing platforms. Although grid capabilities may seem exciting, I don't believe they match real-world need for grid computing yet, and I wouldn't recommend using them in place of other industry grid solutions."
- The main architectural difference between PowerCenter and OWB is that PowerCenter has its own ETL engine, which sits on top of the source and target databases and does its own data movement and transformation, whilst OWB uses SQL and the built-in ETL functions in 9i and 10g to move and transform data. Interestingly, the article observes that the Informatica approach can be slower than the approach used with OWB. "Also, be aware that ETL tools are in general a slower (if more elegant) alternative to native SQL processing (such as Oracle PL*SQL or Microsoft Transact SQL)."
- PowerCenter's use of web services and distributed computing looks more developed than OWB's. "PowerCenter Web services are managed through the Web Services Hub, another component of the architecture, which supports standards such as Simple Object Access Protocol (SOAP), Web Services Description Language (WSDL), and Universal Description, Discovery, and Integration (UDDI). The architectural components can be collocated on a single server or spread across diverse servers, which allows solution parallelism, flexibility, and scalability."
- PowerCenter starts at around $200,000 (yikes!), although there is a "Flexible pricing model". OWB is licensed as part of 10gDS, which is around $5000 per named user, although you'll need the Enterprise Edition of the 8i, 9i or 10g database to provide the ETL functionality.
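The architectural difference in the list above (a separate ETL engine versus SQL run inside the database) is easier to see with a small example. This is only a rough sketch: SQLite stands in for the Oracle database, the tables and the exchange rate are made up, and neither half is what PowerCenter or OWB actually generates. The first half pulls rows out, transforms them in the client program and writes them back; the second does the same work as a single set-based INSERT ... SELECT that never leaves the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_orders (id INTEGER, amount REAL, currency TEXT);
    CREATE TABLE tgt_engine (id INTEGER, amount_gbp REAL);
    CREATE TABLE tgt_sql    (id INTEGER, amount_gbp REAL);
    INSERT INTO src_orders VALUES (1, 100.0, 'USD'), (2, 50.0, 'GBP'), (3, 10.0, 'USD');
""")

USD_TO_GBP = 0.5  # made-up rate, purely for illustration

# Engine style: extract rows, transform them outside the database, load them back.
rows = conn.execute("SELECT id, amount, currency FROM src_orders").fetchall()
transformed = [
    (order_id, amount * USD_TO_GBP if currency == "USD" else amount)
    for order_id, amount, currency in rows
]
conn.executemany("INSERT INTO tgt_engine VALUES (?, ?)", transformed)

# In-database style: one set-based statement, executed by the database itself.
conn.execute(
    """
    INSERT INTO tgt_sql (id, amount_gbp)
    SELECT id,
           CASE WHEN currency = 'USD' THEN amount * ? ELSE amount END
    FROM src_orders
    """,
    (USD_TO_GBP,),
)

engine_rows = conn.execute("SELECT id, amount_gbp FROM tgt_engine ORDER BY id").fetchall()
sql_rows = conn.execute("SELECT id, amount_gbp FROM tgt_sql ORDER BY id").fetchall()
```

Both targets end up with the same rows. The difference is where the work happens: with the engine approach every row makes a round trip through the client, which is one plausible reason for the slowdown the article mentions.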
Historically, customers chose Informatica when they had lots of different database sources and targets, and the transformations between them were complex and interwoven. At one point, if you wanted a graphical interface for building transformations, tools such as Informatica, Constellar, Genio and so on were the only game in town, and you were looking at a minimum investment of between $50,000 and $100,000 to deploy these tools. The introduction of DTS by Microsoft and OWB by Oracle suddenly changed the market by providing much of the functionality of these tools as either a free database component or as a low-cost add-on. Vendors like Informatica have responded by introducing additional new features (such as web services integration, distributed loading and transformation, and so on), but if you have a fairly straightforward need to graphically extract, transform and load data, you'll probably find the vast majority of your needs are now met by tools like OWB, at a far lower cost.
Interestingly, Informatica also have their own BI query tool called PowerAnalyzer. Sold separately from PowerCenter but designed to be used in tandem with their ETL tool, PowerAnalyzer is a web-based query tool that creates ROLAP queries against Oracle, IBM, Microsoft and Sybase datasources. Designed to be deployed using J2EE application servers, it also comes with an Excel interface and, as Seth Grimes reports for Intelligent Enterprise, a range of prebuilt analytic applications:
"The analytics dimension features conventional query and reporting and online analytic processing-style slice-and-dice analysis, and also optional packaged modules for CRM, financial, HR, supply chain, and cross-functional analysis. Its ability to visually model analytic workflows is one that's not yet common. It's intended to facilitate root-cause analysis, although this capability appears to be limited by the packaged analytic framework and other architectural strictures. For example, the highly promoted Excel support doesn't include database write-back. It also lets analysts embed business logic in private spreadsheets rather than in a repository, which can prove limiting when logic locked in a spreadsheet isn't visible to the workflow modeler and other non-Excel interface elements. Last, PowerAnalyzer delivers only the mainstream data-analysis functions that are found in competing BI tools, underscoring Informatica's view that integration and usability rather than analytic depth are the keys to market share.
PowerAnalyzer 4.0 is a credible entry in the larger BI market and will prove compelling for organizations that require an easy, nondisruptive path for integrating mainstream analytics into an existing computing environment."
That said, PowerAnalyzer is rarely (if ever) deployed by anyone other than Informatica PowerCenter customers, limiting it to a fairly small audience who have a particular need for high-end ETL and specific industry analytic templates. Looks interesting, though, albeit with a hefty price tag.