kartothek.io_components.write module

kartothek.io_components.write.persist_common_metadata(partition_list, update_dataset, store, dataset_uuid)
kartothek.io_components.write.persist_indices(store: Union[str, simplekv.KeyValueStore, Callable[[], simplekv.KeyValueStore]], dataset_uuid: str, indices: Dict[str, kartothek.core.index.IndexBase]) -> Dict[str, str]
kartothek.io_components.write.raise_if_dataset_exists(dataset_uuid, store)
kartothek.io_components.write.store_dataset_from_partitions(partition_list, store: Union[str, simplekv.KeyValueStore, Callable[[], simplekv.KeyValueStore]], dataset_uuid, dataset_metadata=None, metadata_merger=None, update_dataset=None, remove_partitions=None, metadata_storage_format='json')
kartothek.io_components.write.update_indices(dataset_builder, store, add_partitions, remove_partitions)
kartothek.io_components.write.update_metadata(dataset_builder, metadata_merger, add_partitions, dataset_metadata)
kartothek.io_components.write.update_partitions(dataset_builder, add_partitions, remove_partitions)
kartothek.io_components.write.write_partition(partition_df: Union[Dict, pandas.core.frame.DataFrame, Sequence, kartothek.io_components.metapartition.MetaPartition], secondary_indices: Optional[Union[Literal[False], List[str]]], sort_partitions_by: Optional[Union[str, Sequence[str]]], dataset_uuid: str, partition_on: Optional[Union[str, Sequence[str]]], store_factory: Callable[[], simplekv.KeyValueStore], df_serializer: Optional[kartothek.serialization._generic.DataFrameSerializer], metadata_version: int, dataset_table_name: Optional[str] = None) -> kartothek.io_components.metapartition.MetaPartition

Write a dataframe to the store, performing all necessary preprocessing tasks, such as partitioning, bucketing (not implemented), and indexing, in the correct order.
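
To make the idea of "preprocessing in the correct order" concrete, the following is a minimal, standard-library-only sketch. It is not kartothek's implementation and does not call kartothek: `write_partition_sketch`, the list-of-dicts row format, the `column=value` partition labels, and the plain `dict` standing in for a key-value store are all hypothetical. The step order shown (sort, partition, index, store) is an assumption made for illustration; only the parameter names are borrowed from the signature above.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Optional, Sequence

Row = Dict[str, Any]


def write_partition_sketch(
    rows: List[Row],
    secondary_indices: Optional[List[str]],
    sort_partitions_by: Optional[Sequence[str]],
    dataset_uuid: str,
    partition_on: Optional[Sequence[str]],
    store_factory: Callable[[], Dict[str, Any]],
) -> Dict[str, Any]:
    """Toy model of a write pipeline; NOT kartothek's implementation."""
    # The factory is called here, so the store is created only when needed.
    store = store_factory()

    # 1. Sort first so that every partition created later inherits the order.
    if sort_partitions_by:
        rows = sorted(rows, key=lambda r: tuple(r[c] for c in sort_partitions_by))

    # 2. Split rows into partitions labelled by their partition_on values.
    partitions: Dict[str, List[Row]] = defaultdict(list)
    for row in rows:
        label = "/".join(f"{c}={row[c]}" for c in (partition_on or [])) or "part"
        partitions[label].append(row)

    # 3. Build secondary indices: column -> value -> sorted partition labels.
    indices: Dict[str, Dict[Any, List[str]]] = {}
    for column in secondary_indices or []:
        index: Dict[Any, set] = defaultdict(set)
        for label, part_rows in partitions.items():
            for row in part_rows:
                index[row[column]].add(label)
        indices[column] = {value: sorted(labels) for value, labels in index.items()}

    # 4. Persist each partition under a key derived from the dataset UUID.
    for label, part_rows in partitions.items():
        store[f"{dataset_uuid}/{label}"] = part_rows

    return {"partitions": sorted(partitions), "indices": indices}


# Usage: a plain dict stands in for the key-value store (an assumption).
backing: Dict[str, Any] = {}
result = write_partition_sketch(
    rows=[
        {"country": "DE", "city": "Kiel", "value": 2},
        {"country": "FR", "city": "Lyon", "value": 1},
        {"country": "DE", "city": "Bonn", "value": 3},
    ],
    secondary_indices=["city"],
    sort_partitions_by=["value"],
    dataset_uuid="demo",
    partition_on=["country"],
    store_factory=lambda: backing,
)
```

Sorting before partitioning means the order carries over into each partition without a second pass, and building the indices after partitioning lets each index entry point at the final partition labels. Passing a factory rather than a store object mirrors the `store_factory: Callable[[], simplekv.KeyValueStore]` parameter in the signature above.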