kartothek.io.testing.read module

This module is a collection of tests which should be implemented by all kartothek read backends. The tests are not subject to the semantic versioning scheme and may change with minor or even patch releases.

To use the tests of this module, add the following import statement to your test module and ensure that the following fixtures are available in your test environment.

`from kartothek.io.testing.read import *  # noqa`

Fixtures required to be implemented:

  • output_type - One of {dataframe, metapartition, table} to define the output type of the returned result.

  • bound_load_dataframes - A callable which will retrieve the partitions in the format specified by output_type. The callable should accept all keyword arguments expected for a kartothek reader.

Source test data:

  • dataset - A fixture generating test data (TODO: Expose this as a testing function)

  • store_factory - A function-scoped store factory

  • store_session_factory - A session-scoped store factory

Feature toggles (optional):

  • custom_read_parameters - Pass additional backend-specific kwargs to the read function. The fixture should return a dict which can be unpacked into the callable using the double-asterisk (`**`) syntax.
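A minimal sketch of how such a dict could be forwarded to the bound reader. The option name `chunk_size`, the helper `call_reader`, and `fake_reader` are hypothetical and only illustrate the unpacking; they are not kartothek names.

```python
# In conftest.py this would be a fixture: @pytest.fixture
def custom_read_parameters():
    # Hypothetical backend-specific option; return whatever the backend accepts.
    return {"chunk_size": 2}


def call_reader(bound_load_dataframes, custom_read_parameters, **common_kwargs):
    """Mimic how a test might forward the extra keyword arguments."""
    return bound_load_dataframes(**common_kwargs, **custom_read_parameters)


def fake_reader(**kwargs):
    # Stand-in for the bound reader: echo the keyword arguments it received.
    return kwargs
```

Because the dict is unpacked next to the arguments a test already passes, a key that collides with one of them raises a `TypeError`, so the fixture is best limited to options that only the backend understands.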

The following fixtures should be present (see tests.read.conftest):

  • use_categoricals - Whether or not the call retrieves categorical data.

  • dates_as_object - Whether or not the call retrieves date columns as objects.

  • label_filter - A callable to filter partitions by label.
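To illustrate what a label filter callable might look like, here is a small sketch. The labels and the helper `select_labels` are made up for the example; how a backend applies the filter internally is not described here.

```python
def filter_cluster_1(label):
    # Hypothetical filter: keep partitions whose label mentions "cluster_1".
    return "cluster_1" in label


def select_labels(labels, label_filter=None):
    """Return the labels kept by ``label_filter``; ``None`` keeps all of them."""
    if label_filter is None:
        return list(labels)
    return [label for label in labels if label_filter(label)]
```

A parametrized fixture could then yield both `None` and `filter_cluster_1`, so each test runs once without a filter and once with one.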

class kartothek.io.testing.read.NoPickle[source]

Bases: object

kartothek.io.testing.read.custom_read_parameters()[source]
kartothek.io.testing.read.dataset_dispatch_by(metadata_version, store_session_factory, dataset_dispatch_by_uuid)[source]
kartothek.io.testing.read.dataset_dispatch_by_uuid()[source]
kartothek.io.testing.read.dates_as_object(request)[source]
kartothek.io.testing.read.label_filter(request)[source]
kartothek.io.testing.read.load_dataset_metadata(request)[source]
kartothek.io.testing.read.mark_nopickle(obj)[source]
kartothek.io.testing.read.no_pickle_factory(url)[source]
kartothek.io.testing.read.no_pickle_store(url)[source]
kartothek.io.testing.read.store_input_types(request, tmpdir)[source]
kartothek.io.testing.read.test_binary_column_metadata(store_factory, bound_load_dataframes)[source]
kartothek.io.testing.read.test_datetime_predicate_with_dates_as_object(dataset, store_factory, bound_load_dataframes, metadata_version, custom_read_parameters, output_type, partition_on, datetype, comp)[source]
kartothek.io.testing.read.test_empty_predicate_pushdown_empty_col_projection(dataset, store_session_factory, bound_load_dataframes, backend_identifier)[source]
kartothek.io.testing.read.test_extensiondtype_rountrip(store_factory, bound_load_dataframes)[source]
kartothek.io.testing.read.test_read_dataset_alternative_table_name(dataset_alternative_table_name, store_factory, dataset_factory_alternative_table_name, use_dataset_factory, bound_load_dataframes, use_categoricals, output_type, label_filter, dates_as_object, alternative_table_name)[source]
kartothek.io.testing.read.test_read_dataset_as_dataframes(dataset, store_session_factory, dataset_factory, use_dataset_factory, bound_load_dataframes, use_categoricals, output_type, label_filter, dates_as_object)[source]
kartothek.io.testing.read.test_read_dataset_as_dataframes_columns_primary_index_only(store_factory, bound_load_dataframes, metadata_version)[source]
kartothek.io.testing.read.test_read_dataset_as_dataframes_columns_projection(store_factory, bound_load_dataframes, metadata_version)[source]
kartothek.io.testing.read.test_read_dataset_as_dataframes_concat_primary(store_factory, custom_read_parameters, bound_load_dataframes, output_type, metadata_version)[source]
kartothek.io.testing.read.test_read_dataset_as_dataframes_dispatch_by_empty(store_session_factory, dataset_dispatch_by, bound_load_dataframes, backend_identifier, output_type, metadata_version, dataset_dispatch_by_uuid)[source]
kartothek.io.testing.read.test_read_dataset_as_dataframes_dispatch_by_multi_col(store_session_factory, bound_load_dataframes, output_type, dataset_dispatch_by, dataset_dispatch_by_uuid)[source]
kartothek.io.testing.read.test_read_dataset_as_dataframes_dispatch_by_single_col(store_session_factory, dataset_dispatch_by, bound_load_dataframes, backend_identifier, dispatch_by, output_type, metadata_version, dataset_dispatch_by_uuid)[source]
kartothek.io.testing.read.test_read_dataset_as_dataframes_predicate(dataset, store_session_factory, custom_read_parameters, bound_load_dataframes, predicates, output_type, backend_identifier)[source]
kartothek.io.testing.read.test_read_dataset_as_dataframes_predicate_empty(dataset_partition_keys, store_session_factory, custom_read_parameters, output_type, bound_load_dataframes)[source]
kartothek.io.testing.read.test_read_dataset_as_dataframes_predicate_with_partition_keys(dataset_partition_keys, store_session_factory, custom_read_parameters, bound_load_dataframes, predicates, output_type)[source]
kartothek.io.testing.read.test_read_dataset_multi_table_warning(store_factory, metadata_version, bound_load_dataframes)[source]
kartothek.io.testing.read.test_read_dispatch_by_with_predicates(store_session_factory, dataset_dispatch_by_uuid, bound_load_dataframes, dataset_dispatch_by, dispatch_by, output_type, expected_dispatches, predicates)[source]
kartothek.io.testing.read.test_store_input_types(store_input_types, bound_load_dataframes)[source]
kartothek.io.testing.read.use_categoricals(request)[source]
kartothek.io.testing.read.use_dataset_factory(request, dates_as_object)[source]