Rittman Mead's Data Dictionary helps organisations understand their data, giving them better insight from, and more control over, their data estate.
The proliferation of data and analytics systems across organisations means users have more tools, more raw data and more aggregated data at their disposal. While this is fundamentally a good thing, it can lead to some challenges:
This can result in:
The Data Dictionary performs two main functions:
The Data Dictionary provides a catalogue where business descriptions, aliases, relationships, and other metadata can be stored against data fields. Data fields may include database columns, columns in CSV files, elements in structured data or attributes in reporting systems.
Data representing a single business definition is often stored in multiple places across an organisation's data estate. Knowing where it is stored can significantly reduce the time needed for analysis or discovery, and stops users from repeatedly refactoring or rebuilding data sets.
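To make the idea concrete, the following is a minimal sketch of a catalogue entry and an alias lookup. It is not the Data Dictionary's actual data model or API: the `DataField` and `Catalog` classes, their attribute names and the example locations are all hypothetical, chosen only to illustrate storing metadata against fields and finding every place a business definition lives.

```python
from dataclasses import dataclass, field


@dataclass
class DataField:
    """One catalogued data field (hypothetical structure, for illustration only)."""
    location: str                  # where the field lives, e.g. "warehouse.sales.net_revenue"
    kind: str                      # "database column", "csv column", "report attribute", ...
    description: str = ""          # business description
    aliases: set[str] = field(default_factory=set)   # business names for the field
    related: set[str] = field(default_factory=set)   # locations of related fields


class Catalog:
    """In-memory catalogue keyed by field location (an assumption of this sketch)."""

    def __init__(self) -> None:
        self._fields: dict[str, DataField] = {}

    def add(self, entry: DataField) -> None:
        self._fields[entry.location] = entry

    def find(self, business_name: str) -> list[DataField]:
        """Return every field that carries the given alias, ignoring case."""
        needle = business_name.lower()
        matches = [
            f for f in self._fields.values()
            if needle in {alias.lower() for alias in f.aliases}
        ]
        return sorted(matches, key=lambda f: f.location)
```

With entries registered for a warehouse column and a CSV column that both carry the alias "Net Revenue", `catalog.find("net revenue")` would return both, so a user can see every copy of the definition before building a new data set.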
Organisations typically have complex data estates that have grown organically over many years. There could be many versions of a sales metric, each based on a different calculation or time window. If users cannot tell them apart, they can easily select the wrong one when building reports or data pipelines, resulting in incorrect outputs, mistakes, rework and potentially business decisions made on an erroneous basis.
Organisations can use a Data Dictionary to publish certified data sets, and in some cases to provide overlaid description text to reporting systems, so business definitions are always available to users reviewing reports and dashboards.
The Data Dictionary can become an integral part of a data governance initiative.
First, the Data Dictionary provides a framework for data governance procedures pertaining to the definition of metadata. Rules can be set up so that only authorised users can change specific fields, and so that changes require approval.
Second, it provides a mechanism for tracking personal data, for example data that can identify individuals, and so helps the organisation comply with data protection legislation such as GDPR.
Finally, it can load existing metadata definitions and catalogues to ensure that previous data governance efforts, often in Excel, are not wasted, and can now be centralised.
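The three governance functions above can be sketched in a few lines. This is an illustration under stated assumptions, not the product's implementation: the CSV column names (`location`, `description`, `personal_data`), the `FieldDefinition` class and the functions `load_definitions`, `personal_data_fields` and `apply_change` are all hypothetical, and the rule that an approver must differ from the requester is an assumption of this sketch.

```python
import csv
import io
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldDefinition:
    location: str
    description: str
    personal_data: bool = False   # True if the field holds personal data


def load_definitions(csv_text: str) -> dict[str, FieldDefinition]:
    """Load existing definitions from a spreadsheet exported as CSV.

    Assumed columns: location, description, personal_data ("yes" or "no").
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["location"]: FieldDefinition(
            location=row["location"],
            description=row["description"],
            personal_data=row["personal_data"].strip().lower() == "yes",
        )
        for row in reader
    }


def personal_data_fields(definitions: dict[str, FieldDefinition]) -> list[str]:
    """List the locations of all fields flagged as personal data."""
    return sorted(d.location for d in definitions.values() if d.personal_data)


def apply_change(
    definitions: dict[str, FieldDefinition],
    location: str,
    new_description: str,
    requested_by: str,
    approved_by: Optional[str],
    editors: set[str],
    approvers: set[str],
) -> bool:
    """Update a description only if the requester is an authorised editor
    and a different, authorised user has approved the change."""
    if requested_by not in editors:
        return False
    if approved_by is None or approved_by not in approvers or approved_by == requested_by:
        return False
    definitions[location].description = new_description
    return True
```

In this sketch a change with no approver, or one requested by a user outside `editors`, is rejected and the stored definition is left untouched.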