Data contracts specification
The following is the template for a data contract:
---
kind: DataContract # (1)
status: draft # (2)
template_version: 0.0.1 # (3)
dataset: sale_txn # (4)
type: Table # (5)
description: This is the ... # (6)
data_source: snowflake # (7)
tags: # (8)
  - name: PII
  - name: GDPR
certificate: DRAFT # (9)
columns:
  - name: txn_ref_dt # (10)
    business_name: transaction date # (11)
    description: null # (12)
    is_primary: false # (13)
    data_type: date # (14)
    logical_type: date
    invalid_format: null # (15)
    valid_format: null # (16)
    invalid_regex: null # (17)
    valid_regex: null # (18)
    missing_regex: null # (19)
    invalid_values: [] # (20)
    valid_values: [] # (21)
    missing_values: [] # (22)
    not_null: true # (23)
    valid_length: null # (24)
    valid_max_length: null # (25)
    valid_min: null # (26)
    valid_max: null # (27)
    valid_min_length: null # (28)
    unique: false # (29)
    reference: # (30)
      dataset: iso_3166-2
      column: available_dates
      samples_limit: 20
quality:
  tool_name: DQ Platform
  checks: # (31)
    - missing_count(txn_ref_dt) = 0
    - missing_count(txn_ref_dt) = 100
    - current_time - date(record_date) < 5
...
- Must always be DataContract.
- State of the contract:
  - draft: contract is still being defined (work in progress)
  - verified: contract is published and ready to be used
- Version of the template for the data contract.
- Name of the asset as it exists inside Atlan.
- Type of the asset in Atlan:
  - Table: a database table
  - View: a database view
  - MaterialisedView: a materialized view in a database
- (Optional) Description of this dataset, for documentation purposes.
  Read-only: any changes you make will not be synced back to Atlan.
- Name that must match a data source defined in your config file.
- (Optional) List of the names of tags for this dataset, for documentation purposes.
  Read-only: any changes you make will not be synced back to Atlan.
- (Optional) Status of the dataset:
  - DRAFT: dataset is still being defined (work in progress)
  - VERIFIED: dataset is trusted and ready to be used
  - DEPRECATED: dataset should no longer be trusted or used

  Read-only: any changes you make will not be synced back to Atlan.
- Name of the column as it is defined in the source system (often technical).
- (Optional) Alias for the column, to make its name more readable.
  Read-only: any changes you make will not be synced back to Atlan.
- (Optional) Description of this column, for documentation purposes.
  Read-only: any changes you make will not be synced back to Atlan.
- (Optional) When true, this column is the primary key for the table.
  Read-only: any changes you make will not be synced back to Atlan.
- Physical data type of values in this column.
- (Optional) Format of data to consider invalid.
- (Optional) Format of data to consider valid.
- (Optional) Regular expression to match invalid values.
- (Optional) Regular expression to match valid values.
- (Optional) Regular expression to match missing values.
- (Optional) Enumeration of values that should be considered invalid.
- (Optional) Enumeration of values that should be considered valid.
- (Optional) Enumeration of values that should be considered missing.
- (Optional) When true, this column cannot be empty (without values).
- (Optional) Fixed length for a string to be considered valid.
- (Optional) Maximum length for a string to be considered valid.
- (Optional) Minimum numeric value considered valid.
- (Optional) Maximum numeric value considered valid.
- (Optional) Minimum length for a string to be considered valid.
- (Optional) When true, this column must have unique values.
- (Optional) Values in this column must match values in the referenced dataset and column.
- (Optional) List of checks to run to verify data quality of this dataset.
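
To illustrate how the optional validity keys described above might combine, here is a minimal sketch of a columns section with two columns. The column names (country_code, txn_amount), the data_type values, and every rule value are hypothetical and chosen only for this example. The regular-expression dialect that applies depends on the tool that evaluates the contract, so treat the pattern as an assumption too.

```yaml
# Illustrative fragment only: names, types, and rule values are hypothetical.
columns:
  - name: country_code            # hypothetical string column
    data_type: varchar            # placeholder; use the type your source system reports
    not_null: true                # column cannot be empty
    valid_length: 2               # fixed length for a valid string
    valid_regex: "^[A-Z]{2}$"     # assumed pattern: two uppercase letters
    missing_values: ["N/A", ""]   # values to count as missing
    unique: false
  - name: txn_amount              # hypothetical numeric column
    data_type: number             # placeholder; use the type your source system reports
    not_null: true
    valid_min: 0                  # minimum numeric value considered valid
    valid_max: 100000             # maximum numeric value considered valid
```

This sketch assumes that optional keys can simply be left out when unused. The template itself lists every key explicitly with a null or empty value, which is the safer choice if you are unsure how omitted keys are handled.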