Glossary¶
Table Types¶
Extraction table: A table storing all data extracted from a source. This is the input for the framework.
Staging table: The table created from the extraction table and containing all DV specific fields used to load every target table (hubs, links and satellites).
Hub: Table that keeps record of all distinct business keys of a specific entity.
Link: Table that keeps record of all distinct relationships between one or more hubs.
Satellite: Table that contains all properties of a link or a satellite. The data in this table is totally historized. Each row has a start and end timestamp. If a change is detected, a new record is inserted and the start and end timestamps adjusted accordingly.
Effectivity Satellite: A special type of link satellite, that defines which link relationship is active at a given point in time, based on a subset of link fields.
Field Types¶
Business key: Operational identifier of an entity.
Child key: A field that exists in the source system, is not a business key, but is used to define a relationship between two entities.
Hashkey: generated identifier, equivalent to a hashed version of the table business keys.
Hashdiff: generated field, equivalent to a data row identifier (hashed version of all business keys + child_keys + descriptive field). This field is used to identify which records were changed between loads.
Metadata: Fields that represent Data Vault specific metadata: for now, the three fields defined as metadata are the following:
r_timestamp
: start timestamp of a record.r_timestamp_end
: end timestamp of a record (only applicable to satellites).r_source
: source system (API/database/etc).