kartothek - unified metadata for datasets
Date: Jun 18, 2019
A dataset is a collection of files with the same schema that reside in storage.
kartothek offers a metadata definition to handle these datasets
efficiently. In addition, the
kartothek.io module provides building
blocks to create and modify these datasets. I/O, tracking of
dataset partitions, and selecting subsets of data are handled transparently.
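To make the idea concrete, here is a minimal sketch in plain Python of what a dataset metadata definition might track. It is not kartothek's API or storage format: the function names `store_dataset` and `read_dataset`, the `metadata.json` file, the example schema, and the CSV layout are all assumptions made for illustration. The sketch only shows the general pattern of files sharing one schema, a metadata record listing the partitions, and a read that touches only the requested partitions.

```python
import csv
import json
from pathlib import Path

# Hypothetical schema shared by every file in the dataset.
SCHEMA = ["country", "city", "population"]


def store_dataset(root, rows, partition_on):
    """Write one CSV file per partition value and record the files in metadata.json."""
    root = Path(root)

    # Group rows by the value of the partitioning column.
    groups = {}
    for row in rows:
        groups.setdefault(row[partition_on], []).append(row)

    # Write each group to its own file; every file uses the same schema.
    partitions = {}
    for value, group in sorted(groups.items()):
        path = root / f"{partition_on}={value}.csv"
        with path.open("w", newline="") as fh:
            writer = csv.DictWriter(fh, fieldnames=SCHEMA)
            writer.writeheader()
            writer.writerows(group)
        partitions[value] = path.name

    # The metadata ties the files together: schema plus partition-to-file mapping.
    metadata = {
        "schema": SCHEMA,
        "partition_on": partition_on,
        "partitions": partitions,
    }
    (root / "metadata.json").write_text(json.dumps(metadata, indent=2))
    return metadata


def read_dataset(root, only=None):
    """Read rows back, opening only the partitions listed in `only` (all if None)."""
    root = Path(root)
    metadata = json.loads((root / "metadata.json").read_text())
    rows = []
    for value, filename in metadata["partitions"].items():
        if only is not None and value not in only:
            continue  # skip files that cannot contain the requested subset
        with (root / filename).open(newline="") as fh:
            rows.extend(csv.DictReader(fh))
    return rows
```

Calling `read_dataset(root, only={"DE"})` opens only the file recorded for the `DE` partition, which is the kind of subset selection the metadata makes possible without scanning every file. Note that values read back through `csv` are strings, since this sketch does not store column types.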
What is a (real) Kartothek?
A Kartothek (or, in more modern terms, a Zettelkasten or Katalogkasten) is a tool for organizing (high-level) information extracted from a source of information.
- Getting started
- Partition Indices
- Type System
- In- / Output
- DataFrame Serialization
- Module Reference