kartothek.io_components.cube.stats module
-
kartothek.io_components.cube.stats.collect_stats_block(metapartitions, store)
Gather statistics data for multiple metapartitions.
- Parameters
metapartitions (Tuple[Tuple[str, Tuple[kartothek.io_components.metapartition.MetaPartition, ..]], ..]) – Part of the result of get_metapartitions_for_stats().
store (Union[simplekv.KeyValueStore, Callable[[], simplekv.KeyValueStore]]) – KV store.
- Returns
stats – Statistics per ktk_cube dataset ID.
- Return type
-
kartothek.io_components.cube.stats.get_metapartitions_for_stats(datasets)
Get all metapartitions that need to be scanned to gather cube stats.
- Parameters
datasets (Dict[str, kartothek.core.dataset.DatasetMetadata]) – Datasets that are present.
- Returns
metapartitions – Pre-aligned metapartitions (by primary index / physical partitions) and the ktk_cube dataset ID belonging to them.
- Return type
Tuple[Tuple[str, Tuple[kartothek.io_components.metapartition.MetaPartition, ..]], ..]
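Because collect_stats_block() accepts only part of this result, the tuple can be cut into blocks before it is handed on. The following is a minimal sketch of that slicing, not kartothek code: `iter_blocks` and `blocksize` are hypothetical names, and plain strings stand in for the MetaPartition objects.

```python
from typing import Iterator, Tuple

# One entry: (ktk_cube dataset ID, aligned metapartitions).
# Strings stand in for MetaPartition objects (assumption for illustration).
Entry = Tuple[str, Tuple[str, ...]]


def iter_blocks(
    metapartitions: Tuple[Entry, ...], blocksize: int
) -> Iterator[Tuple[Entry, ...]]:
    """Hypothetical helper: yield consecutive slices of at most `blocksize` entries."""
    for start in range(0, len(metapartitions), blocksize):
        yield metapartitions[start:start + blocksize]


# Made-up data in the documented shape.
metapartitions = (
    ("seed", ("mp0",)),
    ("seed", ("mp1",)),
    ("enrich", ("mp2", "mp3")),
)
blocks = list(iter_blocks(metapartitions, blocksize=2))
```

Each element of `blocks` keeps the documented nested-tuple shape, so under these assumptions it could be passed as the `metapartitions` argument of collect_stats_block().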
-
kartothek.io_components.cube.stats.reduce_stats(stats_iter)
Sum up stats data.
- Parameters
stats_iter (Iterable[Dict[str, Dict[str, int]]]) – Iterable of stats objects, either resulting from collect_stats_block() or previous reduce_stats() calls.
- Returns
stats – Statistics per ktk_cube dataset ID.
- Return type
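The summing behaviour described for reduce_stats() can be pictured with a small stand-alone sketch. It only illustrates the idea of adding up nested integer counters per ktk_cube dataset ID; it is not kartothek's implementation. The function name `sum_stats`, the dataset IDs, and the counter names `files` and `rows` are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable

# Shape taken from the documented parameter type: dataset ID -> counter name -> value.
Stats = Dict[str, Dict[str, int]]


def sum_stats(stats_iter: Iterable[Stats]) -> Stats:
    """Illustrative stand-in: add up integer counters per dataset ID."""
    totals = defaultdict(lambda: defaultdict(int))
    for stats in stats_iter:
        for dataset_id, counters in stats.items():
            for name, value in counters.items():
                totals[dataset_id][name] += value
    # Convert back to plain dicts so the result has the same shape as the inputs.
    return {dataset_id: dict(counters) for dataset_id, counters in totals.items()}


# Hypothetical per-block results; the counter names are invented for this sketch.
block_a = {"seed": {"files": 2, "rows": 100}}
block_b = {"seed": {"files": 1, "rows": 40}, "enrich": {"files": 3, "rows": 75}}
combined = sum_stats([block_a, block_b])
```

Since the output has the same shape as each input, a combined result can be fed into a later summing call alongside new block results, which matches the note that the iterable may contain the results of previous reduce_stats() calls.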