Naming conventions

diepvries relies on naming conventions for tables and fields. This brings many advantages, such as a simple system to easily organize and filter hundreds of tables, and the ease of string manipulations to generate queries.

Tables

Hubs

Hubs are prefixed with h_ and are named after the entity they represent: h_<entity>. Conventionally, entity names are singular.

Role-playing hubs are implemented as views on top of the physical hub, and are suffixed with what they represent.

Satellites

Hub satellites are prefixed with hs_ and are named after the hub they link to, with an optional suffix: hs_<entity>[_suffix].

Using suffix is a recommended practice, for traceability purposes, as well as for distinguishing multiple satellites linked to the same hub.

Link satellites are prefixed with ls_ and are named after the link they link to, with an optional suffix: ls_<entity_1>_<entity_2>...<entity_n>[_suffix].

The same suffix recommendations apply for hub and link satellites.

Finally, effectivity satellites are suffixed with _eff.

Fields

Fields also follow naming conventions. Moreover, they are always ordered in the same way: hashkeys, business keys, child keys, metadata fields, descriptive fields.

Hashkeys

The hashkey of an entity is named after the table, and suffixed with _hashkey: <table_name>_hashkey.

Links and satellites reference hashkeys from different parent entities (as foreign keys), sharing the same hashkey name.

Business keys

The business key of an entity is named after the entity, and suffixed with _id: <entity>_id.

In the case of multiple business keys for a single entity, additional words should be used: <entity>_<business_key_name>_id.

Child keys

The child keys of an entity are named after their field names, and prefixed with ck_: ck_<field_name>.

Child keys are optional.

Metadata fields

Multiple metadata fields are in use for various Data Vault entities. They are all prefixed with r_, except for s_hashdiff, which is a bit special.

  • r_timestamp: Used in every entity, it represents the record creation timestamp.

  • r_timestamp_end: Used in satellites, it represents the record end of validity timestamp.

  • r_source: Used in every entity, it stores the source of the record, i.e. from where it was captured.

  • s_hashdiff: Used in satellites, it stores the hashdiff of the record.

Descriptive fields

Descriptive fields are all the other fields. Their names should be the same as the source field names, except if a name conflicts with an SQL reserved keyword, or one of the fields described above.