diepvries.data_vault_load module

Module for a Data Vault load.

class diepvries.data_vault_load.DataVaultLoad(extract_schema, extract_table, staging_schema, staging_table, extract_start_timestamp, target_tables, source=None)

Bases: object

Load data in a Data Vault.

__init__(extract_schema, extract_table, staging_schema, staging_table, extract_start_timestamp, target_tables, source=None)

Instantiate a DataVaultLoad object and calculate additional fields.

Parameters:

  • extract_schema (str) – Schema where the extraction table is stored.

  • extract_table (str) – Name of the extraction table.

  • staging_schema (str) – Schema where the staging table should be created.

  • staging_table (str) – Name of the staging table.

  • extract_start_timestamp (datetime) – Moment when the extraction started (when we started fetching data from source).

  • target_tables (List[DataVaultTable]) – Tables that will be populated by the current staging table.

  • source (Optional[str]) – Source system, API, or database. If source is not passed as an argument, the process assumes that a source field (named according to the METADATA_FIELDS naming conventions) exists in the target table.


Raises:

ValueError – When extract_start_timestamp is not linked to a timezone (that is, it is a naive datetime).
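The timezone requirement on extract_start_timestamp can be illustrated with a minimal sketch. The helper require_timezone below is hypothetical and is not part of diepvries; it only shows the usual way to tell an aware datetime from a naive one, under the assumption that the constructor performs a check of this kind.

```python
from datetime import datetime, timezone


def require_timezone(timestamp: datetime) -> datetime:
    """Hypothetical helper: reject naive datetimes, return aware ones unchanged."""
    # A datetime is aware only if tzinfo is set AND utcoffset() is not None.
    if timestamp.tzinfo is None or timestamp.utcoffset() is None:
        raise ValueError("extract_start_timestamp must be timezone-aware")
    return timestamp


# Aware timestamp: suitable as extract_start_timestamp.
aware = require_timezone(datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc))

# Naive timestamp (no tzinfo): rejected with ValueError.
try:
    require_timezone(datetime(2024, 1, 1, 12, 0))
    naive_rejected = False
except ValueError:
    naive_rejected = True
```

In practice, creating the timestamp with datetime.now(timezone.utc) rather than datetime.now() avoids the naive case.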


__str__()

Representation of a DataVaultLoad object as a string.

This helps with the tracking of logging events per entity.

Return type:

str

Returns:

String representation of this DataVaultLoad instance.

property sql_load_script: List[str]

Generate the SQL script to load the current Data Vault model.

It is a list of SQL commands.

Return type:

List[str]

Returns:

SQL script that should be executed to load the current Data Vault model, with one entry per table to load.
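Because sql_load_script is a list of SQL commands, a caller would typically execute the entries in order. The sketch below assumes this usage pattern; it uses Python's built-in sqlite3 module purely as a stand-in database, and both run_load_script and example_script are hypothetical names, not part of diepvries. The real statements generated by the property will differ.

```python
import sqlite3
from typing import List


def run_load_script(connection: sqlite3.Connection, statements: List[str]) -> int:
    """Execute each statement in order and return how many were executed."""
    executed = 0
    # The connection context manager commits on success and rolls back
    # the open transaction if a statement raises.
    with connection:
        for statement in statements:
            connection.execute(statement)
            executed += 1
    return executed


# Hypothetical stand-in for the list returned by sql_load_script.
example_script = [
    "CREATE TABLE h_customer (customer_id TEXT PRIMARY KEY)",
    "INSERT INTO h_customer (customer_id) VALUES ('c-001')",
]

connection = sqlite3.connect(":memory:")
statements_run = run_load_script(connection, example_script)
row_count = connection.execute("SELECT COUNT(*) FROM h_customer").fetchone()[0]
```

Executing the entries sequentially keeps the order in which the list was generated, which matters if later statements depend on earlier ones.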

property staging_create_sql_statement: str

Generate the SQL query to create the staging table.

All required placeholders are calculated so that they match the SQL template (see template_sql/staging_table_ddl.sql).

Return type:

str

Returns:

SQL query to create the staging table.
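The general idea of filling placeholders in a SQL template can be sketched as follows. The template text, the placeholder names, and the function render_staging_ddl are all assumptions made for illustration; the actual contents of template_sql/staging_table_ddl.sql are not reproduced here and are likely more involved.

```python
from string import Template

# Hypothetical template; NOT the real staging_table_ddl.sql.
STAGING_DDL_TEMPLATE = Template(
    "CREATE OR REPLACE TABLE ${staging_schema}.${staging_table} AS "
    "SELECT * FROM ${extract_schema}.${extract_table}"
)


def render_staging_ddl(
    extract_schema: str,
    extract_table: str,
    staging_schema: str,
    staging_table: str,
) -> str:
    """Fill every placeholder; substitute() raises KeyError if one is missing."""
    return STAGING_DDL_TEMPLATE.substitute(
        extract_schema=extract_schema,
        extract_table=extract_table,
        staging_schema=staging_schema,
        staging_table=staging_table,
    )


rendered = render_staging_ddl(
    extract_schema="dv_extract",
    extract_table="order_customer",
    staging_schema="dv_stg",
    staging_table="order_customer_20240101",
)
```

Using substitute() rather than safe_substitute() means a forgotten placeholder fails loudly instead of leaving a literal ${...} in the generated SQL.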

property target_tables: List[DataVaultTable]

Get target tables.

Return type:

List[DataVaultTable]

Returns:

List of target tables.