The augmented medallion architecture - Data Quality layer

"Data Quality" layer

Published on : April 15, 2025

|   Lastly edited on : April 15, 2025

|   7 minutes read

Lirav DUVSHANI

Lirav DUVSHANI


The Data Quality layer - Where all data anomalies can be tracked

Data Quality Overview

The main purpose of this layer is to have a dedicated layer to provide data quality issues detection and correction features.


Characteristics of the layer

  • Data Validation & Cleaning
    • This layer provides the data for data validation and cleaning
  • At the hands of data stewards and owners
    • This separate layer has a dedicated governance strategy and is managed by data stewards and owners
  • Allowed Data Structures
    • Structured
  • Granularity of data
    • The data can be detailed and/or aggregated
  • Useful metadata information
    • The user/team responsible for the data quality check
    • The detailed information of the check

Good practices

The data quality layer should provide extensive information about the checks done and the results to enable clear and easy access by data stewards and owners.


Keeping track of all the checks done in the past, enable tracking of the level of quality of data. KPIs like a Validity Score can be tracked to follow the progress done on the data quality.

Examples

Example of a data quality check result - From Silver

Here is an example of a data quality check result on Purchase Order cleaning rule. As you'll see the order in the Silver table have been cleaned. And most of the cleaning operations are technical and don't require to notify anyone.


But two rules need to be notified to the owner of the source :

  • The defaulting to 0 quantity on an order without an order unit of measure (Default Check Status : "ERROR")
  • No cost amount for a paid order (Default Check Status : "WARNING")

For this example, if this happens, we consider that the responsible team is the Purchasing team.


Silver Table - Orders after cleaning


Purchase Order IDCustomer NameProduct NameOrder StatusOrder Unit Of MeasureOrder QuantityCost CurrencyAmount
PO-1Customer 1Product APaidPC3EUR0
(Issue as the purchase order has been paid)
PO-2Customer 2Product CReceivedPC1002USD35,99
PO-3Customer 2Product CPlanned#
(defaulted from ∅)
0
(Cleaned from 30)
GBP20,00
PO-4Customer 1Product APlanned#
(defaulted from ∅)
0
(defaulted from ∅)
EUR0 (defaulted from ∅)

Data Quality Table - Order Data Quality Checks


Purchase Order IDOrder StatusCheckCheck StatusCheck OwnerValue in source
PO-1PaidNo cost amount although purchase order has been paidWARNINGPurchase Team0
PO-3PlannedDefaulting of order quantity
when no order unit of measure
ERRORPurchase Team30

Here, the data quality table contains the checks in error, with all relevant information about the purchase order and about the check status and responsible.

Example of a data quality check result - From Gold

Here is an example of a data quality check result on aggregated tables.


This example has 2 Gold tables both proving the revenue by customer from 2 different sources. The first one is the ERP system and the second one is the financial consolidation system.
The data quality check here is to ensure consistency of the revenue KPI between both systems.

For this example, if this happens, we consider that the responsible team is the Finance team and the check status is "ERROR".


Gold Table - Revenue from ERP


MonthCustomer NameCompany CurrencyCompany Amount
2025-01Customer 1EUR10,02
2025-01Customer 2EUR88,10
2025-01Customer 3EUR152,00
2025-02Customer 2EUR124,05
2025-02Customer 1EUR96,36

Gold Table - Revenue from Finance Consolidation


MonthCustomer NameCompany CurrencyCompany Amount
2025-01Customer 2EUR88,10
2025-01Customer 3EUR152,00
2025-02Customer 2EUR125,05
2025-02Customer 1EUR96,36

Data Quality Table - Revenue Consistency between ERP and Finance Consolidation


MonthCustomer NameCheckDetailled ErrorCheck StatusCheck OwnerValue Difference
2025-01Customer 1Revenue Consistency CheckCustomer is not available in the finance consolidationERRORFinance Team10,02
2025-02Customer 2Revenue Consistency CheckDifference amount between ERP and finance consolidationERRORFinance Team1

Here, the data quality table contains the checks in error, with all relevant information about the purchase order and about the check status and responsible.




This article is part of a series of article describing the augmented medallion architecture.