Python, XML and the OAS RPD

The Oracle Analytics Server (OAS) repository data file (RPD) is a powerful tool for creating and presenting highly functional and structured data models. However, sometimes we are asked to do something that isn't straightforward using the out-of-the-box functionality. In this blog post, we will walk you through a request to make the security information accessible and comparable between different branches of the OAS RPD.

Why extract information from the OAS RPD?

The information is accessible in the RPD but not so easy to extract in a way that is comparable. We certainly don't want to introduce a manual process, and the available out-of-the-box extraction methods don't provide us with the information in the form we need. But there's good news: the RPD is available in XML format, which contains everything available in the binary file.

Converting the RPD to XML format

The files we want to compare are stored in the binary format, so we first need to convert them to XML. This is possible with the biserverxmlgen.cmd command. This is a command line tool available with OAS, you can read about the available tools on this page https://docs.oracle.com/en/middleware/bi/analytics-server/metadata-oas/oracle-bi-server-command-line-utilities.html#GUID-3C552C76-E2C0-4355-AC76-2BFE47907F02.

To export the file.rpd we can use the command line and run:

biserverxmlgen.cmd -R c:\folder\file.rpd -P rpd_password -O c:\folder\file.xml

In this command we pass some values:

  • -R is used to supply the path to the RPD file
  • -P is used to supply the RPD password to the command
  • -O is used to provide the path you would like to store the xml

Understanding the File Format

Now we have a huge XML file containing all of the metadata. This file is well formed but can be a little complex to follow. We’ve spent many years working with this metadata and can help with any requirements you think might be satisfied by this metadata. Be sure to reach out if we can assist you!

For this page we’ll keep it simple and will focus on the parts of interest. Lets start by finding the application roles within the XML. We can search for something known and find how it is held within the file format. Lets check out BIConsumer in the file:

<Group name="BIConsumer" id="4206:58492" uid="40000005-05e6-178f-b61f-0a4015350000">
<RefPrivilegePackage id="4209:58301" uid="000006f9-1071-0000-714b-e31d00000000" qualifiedName="&quot;PrivPack_4209:14031339688967&quot;"/>
</Group>

We can see that it is held in the Group tag. So we want to create a list of application roles from this element. But there is nothing in the Group tag that tells us about objects so we need to identify the relationship. RefPrivilegePackage looks interesting, lets follow this and see where we end up. The uid stands for unique identifier and searching for that leads us to:

<PrivilegePackage name="PrivPack_4209:14031339688967" id="4209:58301" uid="000006f9-1071-0000-714b-e31d00000000">
<ObjectPrivileges>
...
</ObjectPrivileges>
</PrivilegePackage>

This is interesting we can see we found the PrivilegePackage for the application role and this contains the ObjectPrivileges. It’s simply a case of rinse and repeat until you’ve built up an understanding of the relationships.

That is all very well but isn’t this even more complex than reading through the Administration Tool! Well, yes, it is at this point. But we have exposed the metadata as text and identified a pattern to extract the bits of information we need. Automating this is a case of capturing that pattern into a script. The data engineering team at Rittman Mead are rather fond of Python for these types of tasks. It’s such a versatile language to quickly automate tasks.

Using Python for Automation

Python provides us with a few very useful libraries we can take advantage of, first of all we want to convert the RPD into XML. We can run the OAS command line tool from Python using a library called subprocess. Then we want to analyse the XML to extract the bits of information we need, Python provides us with a library called xml. Lets bring all the information we know about the steps and the XML structure and produce a script.

import sys, subprocess
import xml.etree.ElementTree as ET

rpd_path = sys.argv[1]
password = sys.argv[2]
xml_path = sys.argv[3]
out_path = sys.argv[4]
obiee_xmlgen = r'C:\oas_client_install_path\bi\bitools\bin\biserverxmlgen.cmd'

subprocess.call([obiee_xmlgen, '-R', rpd_path, '-P', password, '-O', xml_path])

tree = ET.parse(xml_path)
root = tree.getroot()

xml_groups = root.findall("*/Group")

groups = []

csv_groups = []

for xml_group in xml_groups:
    group = {}
    group['type'] = 'Group'
    group['name'] = xml_group.attrib['name']
    group['id'] = xml_group.attrib['id']
    group['children'] = []
    
    xml_refs = xml_group.findall('RefPrivilegePackage')
    
    for xml_ref in xml_refs:
        ref = {}
        ref['type'] = 'RefPrivilegePackage'
        ref['qualifiedName'] = xml_ref.attrib['qualifiedName']
        ref['id'] = xml_ref.attrib['id']
        ref['children'] = []
        
        xml_privspacks = root.findall(f"*/PrivilegePackage[@id='{ref['id']}']")
        
        for xml_privspack in xml_privspacks:
            privspack = {}
            privspack['type'] = 'PrivilegePackage'
            privspack['name'] = xml_privspack.attrib['name']
            privspack['id'] = xml_privspack.attrib['id']
            privspack['children'] = []
            
            xml_refobjprivs = xml_privspack.findall('*/RefObjectPrivilege')
            
            for xml_refobjpriv in xml_refobjprivs:
                refobjpriv = {}
                refobjpriv['type'] = 'RefObjectPrivilege'
                refobjpriv['qualifiedName'] = xml_refobjpriv.attrib['qualifiedName']
                refobjpriv['id'] = xml_refobjpriv.attrib['id']
                refobjpriv['children'] = []
                
                xml_objprivs = root.findall(f"*/ObjectPrivilege[@id='{refobjpriv['id']}']")
                
                for xml_objpriv in xml_objprivs:
                    objpriv = {}
                    objpriv['type'] = 'ObjectPrivilege'
                    objpriv['name'] = xml_objpriv.attrib['name']
                    objpriv['id'] = xml_objpriv.attrib['id']
                    objpriv['children'] = []
            
                    xml_refobjs = xml_objpriv.findall("*/RefObject")
            
                    for xml_refobj in xml_refobjs:
                        refobj = {}
                        refobj['type'] = 'RefObject'
                        refobj['qualifiedName'] = xml_refobj.attrib['qualifiedName']
                        refobj['id'] = xml_refobj.attrib['id']
                        refobj['children'] = [] 
                        objpriv['children'].append(refobj)
                        
                        csv_groups.append([group['name'], refobj['qualifiedName']])
                
                    refobjpriv['children'].append(objpriv)

                privspack['children'].append(refobjpriv)
            
            ref['children'].append(privspack)
        
        group['children'].append(ref)
        
    groups.append(group)
    
with open(out_path, 'w') as csv:
    csv.write('Group,Object,\n')
    for row in csv_groups:
        for cell in row:
            csv.write(cell)
            csv.write(',')
        csv.write('\n')

This script imports three modules: sys, subprocess, and xml.etree.ElementTree. It also defines four variables: rpd_path, password, xml_path, and out_path all of these variables are assigned a value from parameters passed on the command line.

Note that on line 8 you need to provide the location of your OAS client installation. This is so the script can locate the command line tool.

The script then calls the biserverxmlgen command line tool using the subprocess module, passing in the three of the variables as arguments to generate an XML file from an Oracle BI repository file.

The script then uses the ElementTree module to parse the generated XML file and obtain the root element.

Next, the script extracts all groups from the XML file and stores them in a list. For each group, the script extracts all referenced privilege packages and their associated object privileges and object references.

The script then constructs a nested dictionary for each group, privilege package, object privilege, and object reference, with each dictionary containing a "type" key and a "children" key. The "type" key contains the type of the dictionary element, such as "Group" or "ObjectPrivilege," and the "children" key contains a list of child elements.

The script also extracts the name and ID of each group, privilege package, object privilege, and object reference and stores them in their respective dictionaries.

Finally, the script writes the name of each group and object reference to a CSV file specified by the out_path variable. The CSV file has two columns: "Group" and "Object".

Conclusion

All of this work for just two columns! Of course, the beauty and value of getting those two columns is that it now can be integrated into a CICD pipeline. The CICD pipeline can be used to monitor changes in the RPD security. Any changes can be flagged to the team for review. Sometimes we can achieve great things with small bits of information.