KeyValueStore Interface

All storage interactions use simplekv.KeyValueStore as a storage layer abstraction. This gives convenient access to many common key-value stores (ABS, S3, GCS, local filesystem, etc.) and makes it easy to switch between storage backends, which simplifies test setups.

Generally, all of our public functions that take a store argument accept several different input types. Any store inheriting from KeyValueStore is accepted, provided it implements the pickle protocol. However, some stores cannot sensibly be distributed across processes or network nodes. A prime example is simplekv.memory.DictStore, which uses a plain Python dictionary as its backend. It is technically possible to (de-)serialize such a store, but once it is deserialized in another process or on another node, the store loses its meaning because each process or node holds its own isolated copy. Kartothek does not verify the semantics of a given store; it only verifies whether the store implements the pickle protocol.
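The isolation problem described above can be reproduced with the standard library alone. The following is a minimal sketch, not kartothek or simplekv code: a plain dict stands in for the backing dictionary of an in-memory store, a pickle round trip stands in for sending the store to another process, and the key names are made up for illustration.

```python
import pickle

# Assumption: a plain dict stands in for the backing dictionary of an
# in-memory store. The keys are purely illustrative.
backend = {"dataset/table/part-0": b"payload"}

# Sending a store to another process or node amounts to a pickle round trip.
received = pickle.loads(pickle.dumps(backend))

# The receiving side works on its own copy of the data ...
received["dataset/table/part-1"] = b"more payload"

# ... so the write never reaches the original dictionary.
write_visible_in_original = "dataset/table/part-1" in backend
print(write_visible_in_original)
```

The round trip succeeds, which is all that a pickle-protocol check can detect. Whether the result is still meaningful depends on the backend: a store pointing at shared storage keeps referring to the same data after deserialization, while an in-memory store does not.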

For all cases where the KeyValueStore does not implement the pickle protocol, or where more complex logic is required to initialize it, kartothek also accepts _factories_, which must be callables returning a KeyValueStore (see also kartothek.core.typing.StoreFactory).
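The factory idea can be sketched without any third-party dependency. In the example below, DirectoryStore and resolve_store are hypothetical names invented for illustration; they are not part of simplekv or kartothek. DirectoryStore only mimics the put/get shape of a key-value store. The point is that the factory, not the store, is passed around, and each consumer builds its own store instance from it.

```python
import os
import tempfile
from functools import partial
from typing import Callable


class DirectoryStore:
    """Hypothetical minimal store that writes each key to a file."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def put(self, key, data):
        with open(os.path.join(self.root, key), "wb") as handle:
            handle.write(data)
        return key

    def get(self, key):
        with open(os.path.join(self.root, key), "rb") as handle:
            return handle.read()


# Illustrative alias: a factory is a zero-argument callable returning a store.
StoreFactory = Callable[[], DirectoryStore]


def resolve_store(store_or_factory):
    """Hypothetical helper: call the argument if it is a factory."""
    if callable(store_or_factory):
        return store_or_factory()
    return store_or_factory


root = tempfile.mkdtemp()

# functools.partial turns the constructor call into a reusable factory.
store_factory = partial(DirectoryStore, root)

# Each consumer builds its own store instance from the same factory.
writer = resolve_store(store_factory)
writer.put("example.bin", b"hello")

reader = resolve_store(store_factory)
result = reader.get("example.bin")
print(result)
```

Because the factory defers construction, any initialization logic (credentials, connections, directory creation) runs where the store is actually used rather than where the factory was defined.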

For convenience, we also offer a storefact integration and accept store URLs, which provides another easy way to access a store and is well suited for ad-hoc investigations.