kartothek.io.testing.build_cube module
- kartothek.io.testing.build_cube.test_accept_projected_duplicates(driver, function_store): Otherwise partitioning does not work with projected data.
- kartothek.io.testing.build_cube.test_distinct_branches(driver, function_store): Just check that this actually works.
- kartothek.io.testing.build_cube.test_do_not_modify_df(driver, function_store): Functions should not modify their inputs.
- kartothek.io.testing.build_cube.test_empty_df(driver, function_store, empty_first): Empty DataFrames might happen during DB queries.
- kartothek.io.testing.build_cube.test_fail_all_empty(driver, driver_name, function_store): All-empty inputs might happen due to DB-based filters.
- kartothek.io.testing.build_cube.test_fail_duplicates_global(driver_name, driver, function_store): Global duplicates might happen due to bugs.
- kartothek.io.testing.build_cube.test_fail_duplicates_local(driver, driver_name, function_store): Local duplicates might happen during DB queries.
- kartothek.io.testing.build_cube.test_fail_no_store_factory(driver, function_store, skip_eager)
- kartothek.io.testing.build_cube.test_fail_nondistinc_payload(driver, function_store): Non-distinct payloads would lead to problems during the query phase.
- kartothek.io.testing.build_cube.test_fail_not_a_df(driver, function_store): Pass some weird objects in.
- kartothek.io.testing.build_cube.test_fail_partial_build(driver, function_store): Either overwrite all datasets or none.
- kartothek.io.testing.build_cube.test_fail_partial_overwrite(driver, function_store): Either overwrite all datasets or none.
- kartothek.io.testing.build_cube.test_fail_partition_on_nondistinc_payload(driver, function_store): Non-distinct payloads would lead to problems during the query phase.
- kartothek.io.testing.build_cube.test_fail_sparse(driver, driver_name, function_store): Ensure that sparse DataFrames are rejected.
- kartothek.io.testing.build_cube.test_fail_wrong_dataset_ids(driver, function_store, skip_eager, driver_name)
- kartothek.io.testing.build_cube.test_fail_wrong_types(driver, function_store): Might catch nasty pandas and other type bugs.
- kartothek.io.testing.build_cube.test_fails_duplicate_columns(driver, function_store, driver_name): Catch weird pandas behavior.
- kartothek.io.testing.build_cube.test_fails_metadata_nested_wrong_type(driver, function_store)
- kartothek.io.testing.build_cube.test_fails_missing_dimension_columns(driver, function_store): Ensure that we catch missing dimension columns early.
- kartothek.io.testing.build_cube.test_fails_missing_partition_columns(driver, function_store): Just make the Kartothek error nicer.
- kartothek.io.testing.build_cube.test_fails_missing_seed(driver, function_store): A cube must contain its seed dataset; check this constraint as early as possible.
- kartothek.io.testing.build_cube.test_fails_no_dimension_columns(driver, function_store): Ensure that we catch missing dimension columns early.
- kartothek.io.testing.build_cube.test_fails_null_dimension(driver, function_store): Since we do not allow NULL values in queries, they should be banned from dimension columns in the first place.
- kartothek.io.testing.build_cube.test_fails_null_index(driver, function_store): Since we do not allow NULL values in queries, they should be banned from index columns in the first place.
- kartothek.io.testing.build_cube.test_fails_null_partition(driver, function_store): Since we do not allow NULL values in queries, they should be banned from partition columns in the first place.
- kartothek.io.testing.build_cube.test_fails_projected_duplicates(driver, driver_name, function_store): Test that the duplicate check also works with projected data (this was a regression).
- kartothek.io.testing.build_cube.test_indices(driver, function_store): Test that index structures are created correctly.
- kartothek.io.testing.build_cube.test_metadata(driver, function_store): Test auto- and user-generated metadata.
- kartothek.io.testing.build_cube.test_nones(driver, function_store, none_first, driver_name): Test what happens if the user passes None to ktk_cube.
- kartothek.io.testing.build_cube.test_overwrite(driver, function_store): Test overwrite behavior, i.e. calling the build function when the cube already exists.
- kartothek.io.testing.build_cube.test_overwrite_rollback_ktk(driver, function_store): Checks that require a rollback (like overlapping columns) should recover the former state correctly.
- kartothek.io.testing.build_cube.test_overwrite_rollback_ktk_cube(driver, function_store): Checks that require a rollback (like overlapping columns) should recover the former state correctly.
- kartothek.io.testing.build_cube.test_parquet(driver, function_store): Ensure the Parquet files we generate are properly normalized.
- kartothek.io.testing.build_cube.test_projected_data(driver, function_store): Projected dataset (useful for de-duplication).
- kartothek.io.testing.build_cube.test_regression_pseudo_duplicates(driver, function_store): Pseudo-duplicates might happen due to bugs.
- kartothek.io.testing.build_cube.test_rowgroups_are_applied_when_df_serializer_is_passed_to_build_cube(driver, function_store, chunk_size): Test that the dataset is split into row groups depending on the chunk size.
- kartothek.io.testing.build_cube.test_simple_seed_only(driver, function_store): Simple integration test with a seed dataset only; this is the simplest way to create a cube.
- kartothek.io.testing.build_cube.test_simple_two_datasets(driver, function_store): Simple integration test with two datasets.