Data Transforms - Transform your data and roll out!

In the vast digital universe, where data flows like Energon, one principle holds true: Data is only as powerful as the form it takes. Like the legendary Autobots, who transform from vehicles into mighty warriors, data must also undergo a transformation to unlock its true potential. Enter the world of Data Transforms—the real-life equivalent of Autobot power, turning raw data into actionable insights and driving businesses to victory in the battle for digital dominance.

If you've heard of Oracle's enterprise-grade tool Oracle Data Integrator (ODI) Studio, Data Transforms is its lightweight, cloud-first counterpart: a browser-based front end that currently exposes a subset of ODI's functionality but, like the cloud itself, keeps evolving.

With minimal setup and a low-code approach, you can cover your ETL needs within minutes 🤯 Data analysts rejoice: you can now do it all yourself in a low-code tool that is ready and waiting for you... But is it ready for the complexity that Data Engineers need to address? Let's find out...

What's what?

Some of the terminology may be confusing, especially if you have used Oracle Data Integrator Studio! Data Transforms is a cloud-oriented derivative whose terminology has been changed to reflect its target audience. Don't worry, I won't be a Decepticon and hide what everything means! Here is a table mapping the terminology and explaining what each piece does:

| Oracle Data Integrator Studio | Oracle Data Transforms | Function |
| --- | --- | --- |
| Data Server | Connection | Represents the physical object where data is stored (e.g., database instances, file servers). |
| Technology | Connection Type | Specifies the technology type (e.g., Oracle, IBM DB2). |
| Physical Schema | Schema | Represents a database schema, folder, or object storage bucket. |
| Data Store | Data Entity | A tabular representation of a data structure, such as database tables or files. |
| Reverse Engineering | Import Data Entity | Obtains metadata for a set of objects in a schema. |
| Model | Data Entity | Contains the imported objects, defining rules for importing and filtering data. |
| Project | Project | Contains transformation design components like mappings, jobs, and workflows. |
| Mapping | Data Flow | Defines how data is moved and transformed from one data store to another. |
| Package | Workflow | Specifies the sequence of execution for mappings, including failure conditions. |
| Job/Session | Job | Represents the execution of a package, mapping, or reverse engineering task. |
| Schedule | Schedule | Defines the rules for when a job or session executes. |
💡
You may notice some things missing from this table. That is because, as an easy-to-use ETL tool, Data Transforms doesn't have the full functionality of its enterprise-grade compatriot ODI, such as Load Plans or Knowledge Modules! This won't be forever, mind you, and we expect them to become available within the next 12 months. If you need more complex functionality now, you'll need to set up the more complex but more capable ODI.

Where it is available:

  1. On the OCI Marketplace as Data Integrator: Web Edition.
  2. As a Data Transforms instance registered via an Autonomous Database.
  3. Built into an Autonomous Database under Database Tools.

Now that you know the what, let's roll out and get to the how of Data Transforms!

Now comes the easy part, setting up!

Setting Up

  1. Once logged into your Data Transforms tool, you need to identify your data sources via Connections.
  2. In Connections you will need to input your credentials, either via a Wallet for the Autonomous Database you are planning to transform data to/from, or for one of the many other connection types supported, which you can see here (there's a quick sketch of what a wallet connection involves after this list).
  3. You will need to import your Data Entities, which are your tables and views. You can do a mass import of all your tables and views, or use the Mask to pick out the specific tables you want.
  4. Create a Project, which is where your Data Flows, Workflows etc. will be housed.
  5. With these Data Entities you can create:
    1. Data Load - source to target Extract and Load, removing the transformation aspect for a quick transfer of data.
    2. Data Flow - full ETL, where data moves from source to target with your desired transformations, inclusions, exclusions etc. applied along the way.
    3. Workflow - This needs Data Flows to work, but you can arrange for some or all of your Data Flows to run in a specific order, or under conditions of your choosing. For example, if one Data Flow fails, run an alternative one.
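
To make the Connection and Data Entity steps a little more concrete, here is a minimal Python sketch (using the python-oracledb driver) of roughly what the tool does for you behind the scenes: connecting to an Autonomous Database with a wallet and reading a table's metadata. It is illustrative only; the schema, paths, passwords and table name are placeholders I've made up, and in Data Transforms you do all of this through the browser rather than in code.

```python
# Hypothetical illustration only - Data Transforms handles all of this in its UI.
# The credentials, wallet paths and table name below are made-up placeholders.
import oracledb

connection = oracledb.connect(
    user="DT_DEMO",                          # placeholder schema name
    password="your-password-here",           # placeholder
    dsn="mydb_low",                          # TNS alias from the wallet's tnsnames.ora
    config_dir="/path/to/wallet",            # unzipped wallet directory
    wallet_location="/path/to/wallet",       # same directory when using thin mode
    wallet_password="wallet-password-here",  # placeholder
)

# Roughly what "importing a Data Entity" captures: the table's column metadata.
with connection.cursor() as cursor:
    cursor.execute(
        "SELECT column_name, data_type FROM user_tab_columns "
        "WHERE table_name = :t ORDER BY column_id",
        t="CUSTOMERS",                       # placeholder table name
    )
    for column_name, data_type in cursor:
        print(column_name, data_type)

connection.close()
```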

Time to Transform!

Now that you have the knowledge of the hows and whats, you can get your data Extracted, Transformed and Loaded in no time! But that's not all; it isn't limited to basic ETL, so here are some Data Transforms features I'd like to highlight:

  • No Code Interface - making it easier for users who may not have technical expertise in coding or complex ETL logic.
  • Simplified Data Flow Management - users can easily define and execute data flows without the need to define complex models or relationships between objects.
  • Built-in Data Cleansing and Enrichment - which includes detecting missing values, formatting data, and standardising inputs without requiring custom logic (a hand-rolled equivalent is sketched after this list).
  • Seamless integration with Oracle Cloud Infrastructure - it provides out-of-the-box connectors for various Oracle Cloud databases and services (e.g., Oracle Autonomous Database, Oracle Object Storage).
  • Real Time Data Streaming - you can handle this from cloud sources such as Oracle GoldenGate, providing seamless integration for streaming data pipelines.
  • Built-in Scheduling and Monitoring - integrates directly with the OCI monitoring suite, allowing for easier management of job execution and tracking.
  • Pre-built Connectors for Common Data Sources - these connectors are optimised for cloud usage, allowing users to quickly ingest and transform data.
  • Automatic Schema Evolution - it adjusts to changes in data structures without manual intervention!
  • In-built Machine Learning and Data Analytics - these capabilities integrate directly within the platform, enabling users to apply ML models during the data transformation process.
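
As a rough idea of what those built-in cleansing steps cover, here is a small Python sketch doing the same job by hand: trimming whitespace, standardising case and filling missing values. The records and rules are assumptions purely for illustration; in Data Transforms you apply the equivalent steps through the UI rather than writing code like this.

```python
# Hand-rolled equivalent of built-in cleansing steps, for illustration only.
# The sample records and default values below are made up.

raw_customers = [
    {"name": "  optimus prime ", "country": "US", "revenue": 1200.0},
    {"name": "Bumblebee",        "country": None, "revenue": None},
]

def cleanse(record):
    return {
        # Trim whitespace and standardise case
        "name": (record["name"] or "").strip().title(),
        # Fill a missing country with an explicit default
        "country": record["country"] or "UNKNOWN",
        # Treat a missing revenue figure as zero
        "revenue": record["revenue"] if record["revenue"] is not None else 0.0,
    }

cleansed = [cleanse(r) for r in raw_customers]
print(cleansed)
```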

It has a surprising amount of functionality for a cloud-based package. While it doesn't yet have everything ODI offers, such as Load Plans, Knowledge Modules or logging, it can do a lot in a very quick and easy-to-use way.

In Conclusion

It has ease of use and quick setup on its side. However, as a cut-down version of ODI, made to be equally at home in the hands of a Data Analyst as in those of a Data Engineer, it cannot yet handle anything overly complex at its current stage of functionality.

It is still being improved upon, though currently compared to ODI it is still missing three areas that might be necessary for a Data Engineer:

  1. Load Plans - the ability to run your ETL with parallelisation and resumability (for example), along with the always-essential capability of in-depth monitoring of the results (a rough sketch of what this involves follows this list).
  2. Automation through scripting - CI/CD use cases and automated mapping generation, which can make large-scale operations involving a lot of ETL much faster.
  3. Knowledge Modules - If you have these customised to suit your ETL process, then this is quite a large gap that can't be filled by Data Transforms at the moment.
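
To show what point 1 means in practice, here is a minimal Python sketch of the kind of parallel, resumable orchestration a Load Plan gives you, and which you would currently have to hand-roll around Data Transforms. The step names and checkpoint file are assumptions made purely for illustration.

```python
# Illustration only: a toy "load plan" with parallel steps and resumability.
# The step names and checkpoint file are made up for this example.
import json
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

CHECKPOINT = Path("loadplan_checkpoint.json")

def run_step(name):
    # Stand-in for triggering one data flow / mapping
    print(f"running {name}")
    return name

def load_done():
    return set(json.loads(CHECKPOINT.read_text())) if CHECKPOINT.exists() else set()

def save_done(done):
    CHECKPOINT.write_text(json.dumps(sorted(done)))

def run_plan(steps):
    done = load_done()                                # resumability: skip completed steps
    pending = [s for s in steps if s not in done]
    with ThreadPoolExecutor(max_workers=4) as pool:   # parallelisation
        for finished in pool.map(run_step, pending):
            done.add(finished)
            save_done(done)                           # checkpoint after each step

run_plan(["load_customers", "load_orders", "load_products"])
```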

With this in mind, it is still a great tool for those needing to do data transformation quickly and easily, but for more complex requirements ODI will still be your prime choice.

So for those of you in need of quick ETL, now you know of something new, and not just Optimus: you too get to unleash the power of the Matrix... I mean Data Transforms!

You can do both!