Data Dictionary

Rittman Mead's Data Dictionary enables organisations to better understand their data, resulting in better insight from and more control over their data estate.

Background

The proliferation of data and analytics systems across organisations means users have more tools, more raw data and more aggregated data at their disposal. While this is fundamentally a good thing, it can lead to some challenges:

  • users are unsure of which attributes and metrics to use;
  • users repeatedly rework cleansing, joining and transformation operations.

This can result in:

  • the generation of invalid analysis and reports through the use of incorrect underlying data, or by incorrectly joining and transforming the underlying data;
  • time wasted by the repetition of the same operation;
  • the organisation having very little governance and control of their data estate.

Objectives

The Data Dictionary performs two main functions:

  • It permits better exploration and analysis of an organisation's data by its users; and
  • It provides a level of governance around the use of information across an organisation.

Exploration

The Data Dictionary provides a catalogue where business descriptions, aliases, relationships and other metadata can be stored against data fields. Data fields may include database columns, columns in CSV files, elements in structured data or attributes in reporting systems.
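To make the idea of storing metadata against data fields concrete, here is a minimal sketch in Python. It is illustrative only and does not describe the product's actual schema or API: the names DataField, Catalogue and search, and the example sources, are assumptions made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataField:
    """One catalogued field (hypothetical structure, not the product's schema)."""
    source: str                    # e.g. a database table, CSV file or report
    name: str                      # physical column or attribute name
    description: str = ""          # business description
    aliases: list[str] = field(default_factory=list)
    related: list[str] = field(default_factory=list)  # keys of related fields

    @property
    def key(self) -> str:
        # Unique identifier built from where the field lives and what it is called.
        return f"{self.source}.{self.name}"


class Catalogue:
    """In-memory store of data fields, keyed by their unique key."""

    def __init__(self) -> None:
        self._fields: dict[str, DataField] = {}

    def add(self, data_field: DataField) -> None:
        self._fields[data_field.key] = data_field

    def search(self, term: str) -> list[DataField]:
        """Return fields whose name or any alias equals term, ignoring case."""
        wanted = term.lower()
        return [
            f for f in self._fields.values()
            if wanted == f.name.lower() or wanted in (a.lower() for a in f.aliases)
        ]


# Two physical fields in different systems that share one business alias.
catalogue = Catalogue()
catalogue.add(DataField("warehouse.sales", "net_rev",
                        "Revenue after discounts", aliases=["Net Revenue"]))
catalogue.add(DataField("exports/sales.csv", "NetRevenue",
                        "Revenue after discounts", aliases=["Net Revenue"]))

matches = [f.key for f in catalogue.search("net revenue")]
```

Searching on the shared alias returns both physical locations, which is the kind of lookup that lets a user find every place a business definition is stored.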

Often, data representing a single business definition is stored in multiple places across an organisation's data estate. Identifying and understanding where this data is stored can significantly reduce the time needed for analysis or discovery and prevent users from continually refactoring or rebuilding data sets.

Organisations typically have complex data estates that have grown organically over many years. There could be many versions of a sales metric, each based on a different calculation or time window. If users cannot tell them apart, they can easily select the wrong one when building reports or data pipelines, resulting in incorrect outputs, mistakes, rework and potentially business decisions made on an erroneous basis.

Organisations can use a Data Dictionary to publish certified data sets, and in some cases to provide overlaid description text to reporting systems, so business definitions are always available to users reviewing reports and dashboards.

Governance

The Data Dictionary can become an integral part of a data governance initiative.

First, the Data Dictionary provides a framework for data governance procedures pertaining to the definition of metadata. Rules can be set up so that only authorised users can change specific fields, and so that changes require approval.
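As a rough illustration of such rules, the sketch below models a change to a field description that must come from an authorised editor and be approved by someone else before it takes effect. Everything here is hypothetical: the EDITORS and APPROVERS tables, the ChangeRequest class and the user names are assumptions for the example, not the product's configuration or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChangeRequest:
    """A proposed change to one field's business description (hypothetical)."""
    field_key: str
    new_description: str
    requested_by: str
    approved_by: Optional[str] = None


# Hypothetical rule tables: who may edit, and who may approve, each field.
EDITORS = {"warehouse.sales.net_rev": {"alice"}}
APPROVERS = {"warehouse.sales.net_rev": {"bob"}}

# Current metadata, keyed by field.
descriptions = {"warehouse.sales.net_rev": "Revenue"}


def approve(request: ChangeRequest, approver: str) -> None:
    """Record an approval, provided the approver is authorised and independent."""
    if approver not in APPROVERS.get(request.field_key, set()):
        raise PermissionError(f"{approver} may not approve {request.field_key}")
    if approver == request.requested_by:
        raise PermissionError("requests cannot be self-approved")
    request.approved_by = approver


def apply_change(request: ChangeRequest) -> None:
    """Apply the change only if the requester is authorised and it is approved."""
    if request.requested_by not in EDITORS.get(request.field_key, set()):
        raise PermissionError(f"{request.requested_by} may not edit {request.field_key}")
    if request.approved_by is None:
        raise PermissionError("change has not been approved")
    descriptions[request.field_key] = request.new_description


request = ChangeRequest("warehouse.sales.net_rev", "Revenue after discounts", "alice")
approve(request, "bob")
apply_change(request)
```

Keeping the permission checks in one place means an unapproved or unauthorised change fails loudly and leaves the stored definition untouched.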

Second, it provides a mechanism for tracking personal data, such as fields that can identify individuals, which helps the organisation meet its obligations under data protection legislation such as GDPR.

Finally, it can load existing metadata definitions and catalogues so that previous data governance efforts, often held in Excel, are not wasted and can be centralised.

Benefits

  • A better understanding of the data stored in your data warehouse.
  • A reduction in change requests and the development of duplicate items.
  • A clear understanding of the lineage between your database and reports.
