kartothek.io.testing.append_cube module
- kartothek.io.testing.append_cube.test_append_partitions(driver, function_store, existing_cube)
- kartothek.io.testing.append_cube.test_compression_is_compatible_on_append_cube(driver, function_store)

  Test that partitions written with different compression algorithms are compatible.

  The compression algorithms are not parametrized because their availability depends on the arrow build. 'SNAPPY' and 'GZIP' are already assumed to be available in parts of the code. A fully parametrized test would also increase runtime and test complexity unnecessarily.
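The combination space a fully parametrized version would have to cover can be sketched in plain Python. The codec names are the two the docstring mentions; the pairing logic (build with one codec, then append with another) is an assumption about the test's shape for illustration, not kartothek's API:

```python
from itertools import product

# Codecs the docstring assumes are available in every arrow build.
# Real availability depends on how pyarrow was compiled.
CODECS = ["SNAPPY", "GZIP"]

# Every (build_codec, append_codec) pair a fully parametrized test
# would cover -- including mixed pairs such as ("SNAPPY", "GZIP"),
# which is where append-time incompatibilities would surface.
matrix = list(product(CODECS, repeat=2))
print(matrix)
```

Even with only two codecs the matrix has four cases; each additional codec grows it quadratically, which is part of why the test avoids full parametrization.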
- kartothek.io.testing.append_cube.test_fails_incompatible_dtypes(driver, function_store, existing_cube)

  Should also cross-check with the seed dataset.
- kartothek.io.testing.append_cube.test_fails_missing_column(driver, function_store, existing_cube)
- kartothek.io.testing.append_cube.test_fails_unknown_dataset(driver, function_store, existing_cube)
- kartothek.io.testing.append_cube.test_metadata(driver, function_store, existing_cube)

  Test auto- and user-generated metadata.
- kartothek.io.testing.append_cube.test_rowgroups_are_applied_when_df_serializer_is_passed_to_append_cube(driver, function_store, chunk_size_build, chunk_size_append)

  Test that the dataset is split into row groups depending on the chunk size.

  Partitions built with chunk_size=None should keep a single row group after the append. Partitions that are newly created with chunk_size > 0 should be split into row groups accordingly.
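The expected row-group count described above can be sketched as a small helper. The `chunk_size=None` case ("one row group per partition") follows the docstring; the helper itself is illustrative and not part of kartothek's API:

```python
import math


def expected_row_groups(num_rows: int, chunk_size) -> int:
    """Illustrative helper: how many row groups a partition should
    end up with for a given serializer chunk size."""
    if num_rows == 0:
        return 0
    if chunk_size is None:
        # No chunking: the whole partition is one row group.
        return 1
    # Chunked: rows are split into groups of at most `chunk_size`.
    return math.ceil(num_rows / chunk_size)


# A partition built with chunk_size=None keeps a single row group,
# while one created with chunk_size=30 for 100 rows splits into 4.
print(expected_row_groups(100, None))
print(expected_row_groups(100, 30))
```

A test parametrized over (chunk_size_build, chunk_size_append) would compare the row-group counts actually written to Parquet against such an expectation for each partition.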