Why Use ODI over PL/SQL?

Pipelines
Photo by Victor / Unsplash

If you had to write ETL or data pipelines to transform data in a structured data warehouse or lakehouse - would you choose Oracle Data Integrator (ODI) or PL/SQL?

Let me elaborate.

I'm assuming you are working in OCI and the target is an Autonomous Data Warehouse (ADW). If you are loading data into an Oracle Cloud database, there is no license cost for ODI, however, you do need to pay for a compute node.

For the use case of bulk or batch processing, I can think of the following reasons why ODI is a better choice than PL/SQL.

Design Patterns and Standards

ODI pushes developers down a route of more standardised design patterns.

This means that the operators it supports are common in bulk data transformation. Examples are filters, joins, aggregation and expressions. The resulting code is likely to execute well with larger volumes of data. Conversely row-based processing and updates, which can both be detrimental to bulk data transformations, are ‘harder’ to implement.

This approach limits convoluted designs. If the tool won’t support the approach you are taking, it is worth asking whether that is the right approach.

Breaking ETL or data transformation down into a set of understandable operators makes it easier for the tool to be understood and used by a wider audience. If you know the typical set operations for data and you can build mappings in ODI. For analysts and other “non data engineers” looking at the code, it is a lot easier to understand the business logic looking at an ODI Mapping than a PL/SQL Procedure.

Logical and Physical Separation

Taking this a step further, ODI provides logical and physical separation. This means programming is declarative. You declare what you want to happen, not how it is done. If you are writing PL/SQL you have to write what needs to be done and how it is done.

There are a number of benefits to this approach.

(1) Is it closer to a low-code environment. A lot of programming and development environments aim to lessen the work of the developer. They offer frameworks to expedite code generation and abstract the developer from the low-level constructs in the code. The benefits are greater productivity, easier debugging, fewer errors and reduced maintenance costs.

(2) As the Oracle database evolves and new features become available, the code generated by ODI can automatically adapt to take advantage of them. Similarly, if the type of storage of your target changes, then the code will automatically reflect this. Again, this leads to greater productivity and easier maintenance.

(3) Repeated patterns such as merges and Slow Changing Dimensions can be configured instead of coded. This drastically reduces the developer’s workload, and provides a consistent implementation across the entire platform.

(4) Since the code is separated from the physical connections, it is possible to use different source systems and targets by utilising a context. This provides greater flexibility when testing different data sets.

Instrumentation, Error Handling and Resumability

ODI provides built-in logging, error handling and resumability. Again, this is something that your development team does not have to hand code and maintain.

It is possible to implement these functions in PL/SQL, for example, there is an open source version of the logger package, however these need to be manually built in to the code. However stringent your development standards, it is unlikely that everyone will follow them and adopt the same approach.

In addition, as the ODI ‘code’ is stored in a repository you can derive the lineage for ODI mappings. This is near impossible in PL/SQL.

Sources and Connections

ODI provides a large number of out-of-the-box connections to a wide range of data sources. With PL/SQL, you are limited to file access and database links.

In addition, as per the section of Logical and Physical Separation, these need to be hand-coded and manually maintained in PL/SQL.

Orchestration

ODI provides an orchestration layer to enable mappings to be executed in parallel and for logic to be applied after the execution of each step.

This feature is the foundation for resumability and can enable you to optimise the resources on the server to maximise throughput and hence minimise ETL runtimes.

Code Promotion

ODI is setup in production with only compiled code. This helps reduce developers skipping the release process and hot fixing directly in the production environment.

But What About Code vs GUIs

One argument for using code over a GUI-based tool is that an experienced developer can write code faster than using a GUI. This is often true when lots of repeated tasks are required. Code can be copied and pasted or code generation can be automated.

ODI provides reusable mappings in place of copy and paste. You can automate code generation using Groovy and the SDK. We have successfully built tools that allow you to generate ingestion mappings for all the files in a directory, for example.

For that reason, I would argue there is no greater benefit for automating PL/SQL, and for all the reasons above, generating mappings in ODI is more beneficial than generating PL/SQL procedures.

What About LLMs

This is an emerging area. It is now possible to generate code using LLMs. It would be much easier to generate PL/SQL than ODI. PL/SQL is much more ubiquitous and is a text-based language, there are lots of training examples on the internet.

I would have concerns about the correctness and standardisation of code. It would be easy to end up with a codebase that doesn’t follow a consistent approach and would be an overhead to maintain.

Usability

One downside of running ODI in the OCI is how and where to run the ODI Client. Networking rules can mean that Remote Desktop access is required.

It can be slow and frustrating. Code generation can take time if there is network latency between the repository database and the ODI Client.

Enter Oracle Data Transforms

Oracle Data Transforms is Oracle's cloud-native port of ODI. It is currently available, however not all the ODI functionality and features have yet been implemented.

When Transforms reaches parity with ODI, you will be able to get the benefits of ODI listed above, with a native, browser-based cloud interface.

As Transforms is integrated with ADW then there is no compute node to maintain and, as per other cloud services, upgrading and patching will be done automatically.