Table Types

  • Extraction table: A table storing all data extracted from a source. This is the input for the framework.

  • Staging table: The table created from the extraction table and containing all DV specific fields used to load every target table (hubs, links and satellites).

  • Hub: Table that keeps record of all distinct business keys of a specific entity.

  • Link: Table that keeps record of all distinct relationships between one or more hubs.

  • Satellite: Table that contains all properties of a link or a satellite. The data in this table is totally historized. Each row has a start and end timestamp. If a change is detected, a new record is inserted and the start and end timestamps adjusted accordingly.

    • Effectivity Satellite: A special type of link satellite, that defines which link relationship is active at a given point in time, based on a subset of link fields.

Field Types

  • Business key: Operational identifier of an entity.

  • Child key: A field that exists in the source system, is not a business key, but is used to define a relationship between two entities.

  • Hashkey: generated identifier, equivalent to a hashed version of the table business keys.

  • Hashdiff: generated field, equivalent to a data row identifier (hashed version of all business keys + child_keys + descriptive field). This field is used to identify which records were changed between loads.

  • Metadata: Fields that represent Data Vault specific metadata: for now, the three fields defined as metadata are the following:

    • r_timestamp: start timestamp of a record.

    • r_timestamp_end: end timestamp of a record (only applicable to satellites).

    • r_source: source system (API/database/etc).