- Process of creating a new cube.
- Cell
- A unique combination of Dimension values. Each Cell results in a single row in the input and output DataFrames.
- Cube
- A combination of multiple datasets that models a data-cube-like construct. The core data structure of kartothek cube.
- Dataset ID
- The ID of a dataset that belongs to the cube, without the Uuid Prefix.
- Dimension
- Part of the address of a certain cube Cell. Usually referred to as Dimension Column. Different Dimensions should describe orthogonal attributes.
- Dimension Column
- DataFrame column that contains values for a certain Dimension.
- Dimension Columns
- Ordered list of all Dimension Columns for a Cube.
- Process of adding new datasets to an existing cube.
- Index Column
- Column for which additional index structures are built.
- Kartothek Dataset UUID
- Name that makes a dataset unique in a store. It consists of the Uuid Prefix and the Dataset ID, joined as <UUID Prefix>++<Dataset ID>.
- Logical Partition
- Partition that was created by the partition_by arguments to the Query.
- Physical Partition
- A single chunk of data that is stored to the blob store. May contain multiple Parquet files.
- Partition Column
- DataFrame column that contains one part that makes a Physical Partition.
- Partition Columns
- Ordered list of all Partition Columns for a Cube.
- Projection
- Process of dimension reduction of a cube (like a 3D object casting a shadow on a wall). Only works if the involved payload exists solely in the subdimensional space, since no automatic aggregation is supported.
- Dataset that provides the ground truth about which Cells are in a Cube.
- Store Factory
- A callable that does not take any arguments and creates a new simplekv store when being called. Its type is
- Query
- A request for data from the cube, including things like “payload columns”, “conditions”, and more.
- Query Execution
- Process of reading out data from a Cube, aka the execution of a Query.
- Query Intention
- The actual intention of a Query, e.g.:
- if the user queries “all columns”, the intention includes the concrete set of columns
- if the user does not specify the dimension columns, the intention uses the cube's Dimension Columns (aka “no Projection”)
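
The two Query Intention cases above can be sketched in plain Python. This is only an illustration of the idea, not kartothek's implementation: the function `resolve_intention`, the two module-level constants, and the column names are all hypothetical.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical cube description; names are made up for illustration.
CUBE_DIMENSION_COLUMNS = ("city", "day")
CUBE_ALL_COLUMNS = ("city", "day", "avg_temp", "max_temp")


def resolve_intention(
    columns: Optional[Sequence[str]] = None,
    dimension_columns: Optional[Sequence[str]] = None,
) -> Tuple[Tuple[str, ...], Tuple[str, ...]]:
    """Turn an under-specified query into explicit column sets.

    Returns a pair (dimension columns, requested columns).
    """
    # "All columns" (None here) becomes the concrete set of columns.
    resolved_columns = CUBE_ALL_COLUMNS if columns is None else tuple(columns)

    # No dimension columns given: fall back to the cube's own
    # Dimension Columns, i.e. no Projection takes place.
    if dimension_columns is None:
        resolved_dimensions = CUBE_DIMENSION_COLUMNS
    else:
        resolved_dimensions = tuple(dimension_columns)

    return resolved_dimensions, resolved_columns
```

Calling `resolve_intention()` with no arguments yields the full cube dimension columns and the full column set, while passing `dimension_columns=["city"]` expresses a Projection onto a single dimension.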
- Uuid Prefix
- Common prefix for all datasets that belong to a Cube.
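
As a small illustration of how Uuid Prefix, Dataset ID, and Kartothek Dataset UUID relate, the following sketch joins and splits the parts using the `++` separator from the glossary. The helper names `make_dataset_uuid` and `split_dataset_uuid` are hypothetical and not part of kartothek; the sketch also assumes that the prefix itself does not contain `++`.

```python
from typing import Tuple

# Separator taken from the "<UUID Prefix>++<Dataset ID>" pattern.
SEPARATOR = "++"


def make_dataset_uuid(uuid_prefix: str, dataset_id: str) -> str:
    """Join a Uuid Prefix and a Dataset ID into a dataset UUID."""
    return f"{uuid_prefix}{SEPARATOR}{dataset_id}"


def split_dataset_uuid(dataset_uuid: str) -> Tuple[str, str]:
    """Split a dataset UUID back into (Uuid Prefix, Dataset ID).

    Splits at the first separator, so this assumes the prefix
    does not contain the separator itself.
    """
    prefix, sep, dataset_id = dataset_uuid.partition(SEPARATOR)
    if not sep:
        raise ValueError(f"not a cube dataset UUID: {dataset_uuid!r}")
    return prefix, dataset_id
```

All datasets of one cube share the same prefix, so filtering store entries by `split_dataset_uuid(...)[0]` would group them per cube under these assumptions.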