
Transformations that allow obtaining star schemas from flat tables.
Starting from a flat table, a dimensional model is defined by specifying the attributes that make up each dimension and the measures in the facts. The result is a dimensional_model object. The model is defined through the following dimensional model definition functions:
A star schema is defined from a flat table and a dimensional model definition. Once defined, a star schema can be transformed by defining role-playing dimensions, changing the naming style of its elements, or changing the type of the dimension attributes. These operations are carried out through the following star schema definition and transformation functions:
From several star schemas, a constellation can be defined in which the star schemas share common dimensions. Dimensions with the same name must be shared. A constellation is defined by the following constellation definition function:
Once the star schemas and constellations are defined, data cleaning operations can be carried out on dimensions. There are three groups of functions: one to obtain dimensions of star schemas and constellations; another to define data cleaning operations over dimensions; and one more to apply operations to star schemas or constellations.
Obtaining dimensions:
Update definition functions:
Modification application functions:
When new data is obtained, an incremental refresh can be carried out on both the dimensions and the facts. Incremental refresh can be applied to both star schemas and constellations, using the following functions:
Once the data has been properly structured and transformed, it can be exported to be queried with other tools. Several export formats are available for both star schemas and constellations, using the following functions:
From flat tables, star schemas can be defined, and these can in turn form constellations (star schema and constellation definition functions). Dimensions contain data without duplicates, and data cleaning operations can be applied to them (data cleaning functions). When new data is obtained, the existing data must be refreshed with it by means of incremental refresh operations (incremental refresh functions). Finally, the results can be exported to be queried with other tools (results export functions).
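
To make the workflow above concrete, the following is a minimal conceptual sketch in plain Python. It is not the package's API: the names `build_star` and `refresh`, the sample rows, and the choice to sum measures when refreshed facts already exist are all assumptions made for illustration. It shows a flat table being split into dimension tables without duplicates plus a fact table, followed by an incremental refresh with new rows.

```python
from collections import defaultdict


def refresh(star, new_rows):
    """Fold flat rows into an existing star (hypothetical helper).

    Unseen dimension records get the next surrogate key; measures of
    facts that share the same keys are summed (an assumed policy).
    """
    for row in new_rows:
        keys = []
        for name, attrs in star["dimensions"].items():
            record = tuple(row[a] for a in attrs)
            table = star["dim_tables"][name]
            # setdefault keeps the existing key for a known record,
            # so each dimension holds its records without duplicates.
            keys.append(table.setdefault(record, len(table) + 1))
        totals = star["facts"][tuple(keys)]
        for i, measure in enumerate(star["measures"]):
            totals[i] += row[measure]


def build_star(flat_rows, dimensions, measures):
    """Build a star from flat rows (hypothetical helper).

    dimensions: dict of dimension name -> list of attribute names.
    measures:   list of measure column names.
    """
    star = {
        "dimensions": dimensions,
        "measures": measures,
        # one table per dimension: attribute tuple -> surrogate key
        "dim_tables": {name: {} for name in dimensions},
        # fact table: tuple of surrogate keys -> list of measure totals
        "facts": defaultdict(lambda: [0] * len(measures)),
    }
    refresh(star, flat_rows)
    return star


# Illustrative flat table (made-up data).
flat = [
    {"city": "Granada", "country": "Spain", "year": 2020, "sales": 10},
    {"city": "Granada", "country": "Spain", "year": 2021, "sales": 5},
    {"city": "Lisbon", "country": "Portugal", "year": 2020, "sales": 7},
]

# Dimensional model: which attributes form each dimension, which
# columns are measures.
star = build_star(
    flat,
    dimensions={"where": ["city", "country"], "when": ["year"]},
    measures=["sales"],
)

# Incremental refresh: one new fact and one fact that already exists.
refresh(star, [
    {"city": "Lisbon", "country": "Portugal", "year": 2021, "sales": 3},
    {"city": "Granada", "country": "Spain", "year": 2020, "sales": 4},
])
```

After the refresh, the dimension tables are unchanged because no new dimension records appeared, while the fact table gains one row and the existing Granada 2020 fact is updated. Other refresh policies for existing facts (for example replacing instead of summing) would change only the last loop of `refresh`.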