diepvries.data_vault_load module

Module for a Data Vault load.

class diepvries.data_vault_load.DataVaultLoad(extract_schema, extract_table, staging_schema, staging_table, extract_start_timestamp, target_tables, source=None)

Bases: object

Load data in a Data Vault.

__init__(extract_schema, extract_table, staging_schema, staging_table, extract_start_timestamp, target_tables, source=None)

Instantiate a DataVaultLoad object and calculate additional fields.

Parameters:
  • extract_schema (str) – Schema where the extraction table is stored.

  • extract_table (str) – Name of the extraction table.

  • staging_schema (str) – Schema where the staging table should be created.

  • staging_table (str) – Name of the staging table.

  • extract_start_timestamp (datetime) – Moment when the extraction started (when we started fetching data from source).

  • target_tables (List[DataVaultTable]) – Tables that will be populated by the current staging table.

  • source (Optional[str]) – Source system/API/database. If source is not passed, the process assumes that a source field (named according to the METADATA_FIELDS naming conventions) exists in the target table.

Raises:

ValueError – When extract_start_timestamp has no timezone attached (it is a naive datetime).
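
The timezone requirement above can be checked before constructing the object. The following is a minimal sketch using only the Python standard library; the helper name is_timezone_aware is hypothetical and not part of diepvries, and the check follows Python's own definition of an aware datetime rather than the library's internal validation, which is not shown here.

```python
from datetime import datetime, timezone


def is_timezone_aware(moment: datetime) -> bool:
    """Return True when `moment` carries usable timezone information."""
    # Hypothetical helper: mirrors the "aware" definition in the datetime docs.
    return moment.tzinfo is not None and moment.utcoffset() is not None


# A naive timestamp: no tzinfo attached.
naive_timestamp = datetime(2024, 1, 1, 12, 0, 0)

# An aware timestamp: explicitly tied to UTC.
aware_timestamp = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)

# Attaching UTC to a naive value that is already known to be expressed in UTC.
fixed_timestamp = naive_timestamp.replace(tzinfo=timezone.utc)
```

Passing a value like aware_timestamp as extract_start_timestamp should satisfy the requirement, while a value like naive_timestamp is the case described under Raises.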

__str__()

Representation of a DataVaultLoad object as a string.

This helps with the tracking of logging events per entity.

Return type:

str

Returns:

String representation of this DataVaultLoad instance.

property sql_load_script: List[str]

Generate the SQL script to load the current Data Vault model.

It is a list of SQL commands.

Returns:

SQL script that should be executed to load the current Data Vault model, with one entry per table to load.
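
Because the property returns a plain list of SQL commands, it can be executed one statement at a time through any DB-API cursor. The sketch below illustrates that pattern under loud assumptions: run_script is a hypothetical helper, sqlite3 is only a stand-in for the real warehouse connection, and stand_in_script is hand-written sample data in place of the value of sql_load_script (the statements generated by the library are not expected to run on SQLite).

```python
import sqlite3
from typing import Iterable


def run_script(connection: sqlite3.Connection, statements: Iterable[str]) -> None:
    """Execute statements in order and commit once all have succeeded."""
    cursor = connection.cursor()
    try:
        for statement in statements:
            cursor.execute(statement)
        connection.commit()
    except Exception:
        # Undo any uncommitted changes and surface the failing statement.
        connection.rollback()
        raise
    finally:
        cursor.close()


# Stand-in for the list returned by sql_load_script (assumption: sample data).
stand_in_script = [
    "CREATE TABLE hub_customer (customer_id TEXT)",
    "INSERT INTO hub_customer VALUES ('c-1')",
]

connection = sqlite3.connect(":memory:")
run_script(connection, stand_in_script)
```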

property sql_load_scripts_by_group: List[List[str]]

Generate the SQL scripts to load the current Data Vault model.

Scripts are grouped by their loading order. Within a group, queries can be run in parallel.
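
The grouping described above suggests a simple execution pattern: process the groups one after another, and fan out the queries of a single group to worker threads. The following is a minimal sketch of that idea, not the library's own runner; run_scripts_by_group and the run_query callable are hypothetical names, and the caller is assumed to supply a run_query that is safe to call from several threads (with a real database this usually means one connection or cursor per call).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def run_scripts_by_group(
    scripts_by_group: List[List[str]],
    run_query: Callable[[str], None],
    max_workers: int = 4,
) -> None:
    """Run groups in order; run the queries inside one group concurrently."""
    for group in scripts_by_group:
        # Leaving the `with` block waits for every query of this group.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            # list() consumes the results and re-raises the first query error,
            # so a later group never starts after a failed earlier group.
            list(pool.map(run_query, group))


# Stand-in data shaped like List[List[str]] (assumption: not real output).
sample_groups = [["hub_a", "hub_b"], ["link_ab"], ["sat_a", "sat_b"]]
executed: List[str] = []
run_scripts_by_group(sample_groups, executed.append)
```

In this sketch the order inside a group is not guaranteed, but every query of one group finishes before the next group begins.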

property staging_create_sql_statement: str

Generate the SQL query to create the staging table.

All placeholders required by the SQL template are calculated (see template_sql/staging_table_ddl.sql).

Returns:

SQL query to create staging table.

property target_tables: List[DataVaultTable]

Get target tables.

Returns:

List of target tables.