Getting Started With Informatica PowerCenter 8.1.1
In the previous postings in my Oracle BI Apps series, I've looked at the BI Apps architecture, installation and configuration, and performing the initial data load. As one of the key benefits of the BI Apps is its extensibility, one of the first things I'd want to know if I was a customer is exactly how you go about extending it. Oracle have a published methodology for adding new data into the packaged data warehouse both in terms of new rows, new columns and even new data sources, but as this is all based around the Informatica ETL tool, that Siebel originally did an OEM deal with and Oracle now support at least until they can get ODI up to speed, the first thing you're really going to need to know is how to work with Informatica. The good news is that if you're familiar with Oracle Warehouse Builder the two tools are actually fairly similar, at least in terms of the developer GUI, and so if you're comfortable with one, as I am, you'll quickly pick up the other.
So what I thought I'd do then, before diving into the methodology used for extending the data warehouse, is run through an end-to-end example of building some Informatica Mappings, creating a workflow and using it to load some data into a set of Oracle tables. The data and scenario I'm going to use is actually from our OWB class that we teach, where at one point we load data from a set of normalized tables and file into a set of staging tables - this will allow me to compare building the scenario in OWB and then in Informatica.
Informatica PowerCenter comes in both server and client versions, with all of the design work being carried out in the client. To start off the project then, I connect to the PowerCenter Repository Manager and create a new folder in my repository, which performs a similar function to a project in OWB.
Once you create your folder, you then switch over within the same application to the "Source Analyzer". This is a feature of the Repository Manager, like the Oracle Module wizards in OWB, that connects to a set of relational tables and imports the metadata into the repository. This seems to work off of ODBC rather than native Oracle connections, however when mappings run later on they can be configured to use OCI in the same way that OBIEE lets you. Suffice to say that connecting to non-Oracle sources is very easy, much easier than OWB or even ODI where you have to locate, and then fiddle around with specific JDBC drivers before you can work with these sources in a project.
Flat files are also pretty straightforward to bring in, with an Excel-style import wizard that performs much the same function as the OWB one.
Once you've imported in your sources and targets, you can review the list of tables in the Repository Manager.
When I started thinking about joins it got me thinking about how, under the covers, Informatica is doing the data integration and loading. With OWB, the database does all the integration using SQL selects, inserts, joins, table functions, merge commands and so on, whereas in Informatica you've got a separate hub engine that does the work. Now in some ways that seems to offer advantages - you can just drag and drop any data sources and targets together and the Informatica Integration Service just gets on with working out the optimal, row-by-row way to get data from A to B. With OWB, you've got to be careful about the SQL that you're getting OWB to generate which offers advantages and disadvantages - in the hands of an expert you can built some pretty fast, efficient integration, but it's fairly easy to get OWB to generate some pretty hairy SQL if you've got multiple extract and load stages in a single mapping. I suspect with Informatica the environment is more controlled, more predictable but looses some of the capability to produce super-fast ETL routines that you could potentially get with OWB. I'll have to see how things transpire once we start using the tool on more projects.
So once you've pulled all your mappings together it's time to create a workflow, which in OWB terms is a process flow. Creating a workflow is pretty similar to performing the same task in OWB, you start off by adding tasks to the workflow process, connect them to the relevant data sources (this is where you can switch to an OCI connection) and string them all together.
So, that's a simple set of mappings taken place. Of course there's a lot more to Informatica than just this, but I think now we're ready to use the tool in earnest to start customizing the BI Apps data warehouse. More on that in a couple of days.