QA/QC Components

Note

In object-oriented programming, the open–closed principle states software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification. Such an entity can allow its behaviour to be extended without modifying its source code. See also: https://en.wikipedia.org/wiki/Open%E2%80%93closed_principle

The QA/QC components combine base components into more complex components. Each component caches its results and defines its own execution behavior. So that new QA/QC components can be added without modifying existing ones, they all inherit from the abstract base class autom8qc.qaqc.base.QAQCComponent, which declares the abstract method perform and the property results.
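The extension contract described above can be pictured with Python's abc module. This is a minimal sketch and not the library's source: the class names QAQCComponentSketch and ConstantComponent are hypothetical, and only the member names perform and results are taken from the text.

```python
from abc import ABC, abstractmethod


class QAQCComponentSketch(ABC):
    """Hypothetical stand-in for an abstract QA/QC component."""

    def __init__(self):
        self._results = None  # filled by perform()

    @abstractmethod
    def perform(self):
        """Run the component and return its results."""

    @property
    def results(self):
        """Results of the last run, or None if perform() was never called."""
        return self._results


class ConstantComponent(QAQCComponentSketch):
    """Hypothetical subclass: adds behavior without touching the base class."""

    def __init__(self, value, size):
        super().__init__()
        self.value = value
        self.size = size

    def perform(self):
        self._results = [self.value] * self.size
        return self._results


component = ConstantComponent(value=1.0, size=3)
output = component.perform()  # [1.0, 1.0, 1.0]
```

Because perform is abstract, the base class in this sketch cannot be instantiated directly. Only subclasses that implement perform can be created.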

TestManager

[Figure: TestManager design (../_images/manager_design.svg)]
class autom8qc.qaqc.base.TestManager(data, test, mapper=None, pre_func=None, post_func=None, prob_rule=None, mapped_rule=None, filter_options=None)

Bases: autom8qc.qaqc.base.QAQCComponent

A TestManager simplifies the execution of a test and caches the results. Each manager expects an instance of the class BaseStructure so that the data is handled in a standardized way. You also have to define the test that should be executed (e.g., Global Minimum Test). Optionally, you can specify a mapper that maps the probabilities to other values (e.g., validities). You can also define a pre-processing function that is applied before the test is executed (e.g., linear interpolation) and a post-processing function that is applied to the results (e.g., filling gaps). In addition, you can define rules that are applied to the probabilities and to the mapped values.

Execution Pipeline:
  1. Preprocessing (optional)

  2. Perform test (required)

  3. Apply rule on the probabilities (optional)

  4. Map the probabilities to another domain (optional)

  5. Apply rule on the mapped values (optional)

  6. Postprocessing (optional)
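The six steps above can be sketched as a plain function in which every step except the test is skipped when it is not supplied. This is only an illustration under stated assumptions: run_pipeline, min_test and to_validity are hypothetical names, plain lists stand in for the library's data structures, and applying the mapped-value rule only when a mapper is present is an assumption of this sketch.

```python
def run_pipeline(values, test, pre_func=None, prob_rule=None,
                 mapper=None, mapped_rule=None, post_func=None):
    """Apply the six steps in order; only the test itself is required."""
    if pre_func is not None:            # 1. preprocessing
        values = pre_func(values)
    results = test(values)              # 2. perform test
    if prob_rule is not None:           # 3. rule on the probabilities
        results = prob_rule(results)
    if mapper is not None:              # 4. map to another domain
        results = mapper(results)
        if mapped_rule is not None:     # 5. rule on the mapped values (assumed to need a mapper)
            results = mapped_rule(results)
    if post_func is not None:           # 6. postprocessing
        results = post_func(results)
    return results


def min_test(values):
    """Toy test: probability 1.0 for non-negative values, else 0.0."""
    return [1.0 if v >= 0.0 else 0.0 for v in values]


def to_validity(probabilities):
    """Toy mapper: probabilities to validity labels."""
    return ["valid" if p >= 0.5 else "invalid" for p in probabilities]


values = [3.0, -1.0, 7.0]
probabilities = run_pipeline(values, test=min_test)
# [1.0, 0.0, 1.0]
validities = run_pipeline(values, test=min_test, mapper=to_validity)
# ['valid', 'invalid', 'valid']
```

Without a mapper the sketch returns probabilities. With a mapper it returns mapped values, which matches the "Probabilities or mapped values" return description of perform.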

Parameters
  • data (BaseStructure) – Data (e.g., time series)

  • test (QAQCTest) – Test

  • mapper (BaseMapper) – Mapper to map the probabilities

  • probabilities (pd.Series) – Probabilities

  • mapped_values (pd.Series) – Mapped values

  • pre_func (BaseFunction) – Function that will be applied before the test

  • post_func (BaseFunction) – Function that will be applied after the test

  • prob_rule (BaseRule) – Rule that will be applied on the probabilities

  • mapped_rule (BaseRule) – Rule that will be applied on the mapped values

  • filter_options (dict) – Options to filter the data

autom8qc.qaqc.base.TestManager.perform(self)

Performs the pipeline of the manager and returns the results. If the probabilities are mapped, the mapped values are returned.

Important

The results of the execution are cached. The manager checks whether cached results already exist. If so, they are returned without performing the pipeline again.
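The caching behavior can be illustrated with a small stand-alone class. This is a sketch of the general technique, not the library's implementation: CachingComponent and expensive_pipeline are hypothetical names, and using None as the "not computed yet" marker is an assumption.

```python
class CachingComponent:
    """Hypothetical component that runs its pipeline at most once."""

    def __init__(self, pipeline):
        self._pipeline = pipeline  # zero-argument callable
        self._results = None       # cache; None means "not computed yet"

    def perform(self):
        if self._results is None:  # run the pipeline only on the first call
            self._results = self._pipeline()
        return self._results


calls = []  # records how often the pipeline really ran


def expensive_pipeline():
    calls.append("run")
    return [0.9, 0.1, 0.8]


component = CachingComponent(expensive_pipeline)
first = component.perform()
second = component.perform()  # served from the cache
# len(calls) == 1 and first is second
```

In this sketch, a second call to perform returns the very same object, so later changes to the inputs are not reflected unless a new component is created.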

Returns

Probabilities or mapped values

Return type

pd.Series

TestSequence

[Figure: TestSequence design (../_images/sequence_design.svg)]
class autom8qc.qaqc.base.TestSequence(data, mapper=None, pre_func=None, post_func=None, prob_rule=None, mapped_rule=None, measure=None, filter_options=None)

Bases: autom8qc.qaqc.base.QAQCComponent

A TestSequence allows you to create a sequence of several tests that build on each other. If a data point fails a test, it is not passed to the next test. A data point fails a test if its probability is lower than the threshold defined for that test. With this approach, you can ensure that invalid data points don’t affect the next test. Using this data structure is highly recommended in particular for tests that consider the global range of the data (e.g., autom8qc.qaqc.outlier.LOFTest).
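The following sketch shows why threshold filtering matters for tests that depend on global statistics. It is a toy model under stated assumptions: run_sequence, range_test and mean_test are hypothetical, plain lists replace the library's data structures, and the sketch only records the per-test probabilities instead of combining them with a measure.

```python
def range_test(values):
    """Toy test: 1.0 inside [0, 100], else 0.0."""
    return [1.0 if 0.0 <= v <= 100.0 else 0.0 for v in values]


def mean_test(values):
    """Toy test that depends on a global statistic (the mean)."""
    mean = sum(values) / len(values)
    return [1.0 if abs(v - mean) <= 5.0 else 0.0 for v in values]


def run_sequence(values, stages):
    """stages is a list of (test, threshold) pairs, applied in order."""
    active = list(range(len(values)))  # indices still in the sequence
    history = [[] for _ in values]     # probabilities each point received
    for test, threshold in stages:
        if not active:
            break
        probabilities = test([values[i] for i in active])
        survivors = []
        for i, p in zip(active, probabilities):
            history[i].append(p)
            if p >= threshold:         # lower than the threshold means failed
                survivors.append(i)
        active = survivors
    return history


values = [10.0, 11.0, 9.0, 1000.0]
unfiltered = mean_test(values)
# [0.0, 0.0, 0.0, 0.0] because the outlier shifts the mean to 257.5
history = run_sequence(values, [(range_test, 0.5), (mean_test, 0.5)])
# [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [0.0]]
```

Run on its own, the mean-based test rejects every point because of the single outlier. In the sequence, the range test removes the outlier first, so the three plausible points pass the second test.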

Important

If you don’t pass a measure to the constructor, the autom8qc.measures.probabilities.WorstProbabilityMeasure will be used.

Warning

A test group handles the QA/QC tests in isolation and combines their results. If your tests do not depend on each other, use a TestGroup instead.

Execution Pipeline:
  1. Preprocessing (optional)

  2. Perform tests (required)

  3. Apply rule on the probabilities (optional)

  4. Map the probabilities to another domain (optional)

  5. Apply rule on the mapped values (optional)

  6. Postprocessing (optional)

Parameters
  • data (BaseStructure) – Data that shall be tested

  • mapper (BaseMapper) – Mapper to map the final results

  • pre_func (BaseFunction) – Function that will be applied before the sequence

  • post_func (BaseFunction) – Function that will be applied after the sequence

  • prob_rule (BaseRule) – Rule that will be applied on the probabilities

  • mapped_rule (BaseRule) – Rule that will be applied on the mapped values

  • measure (BaseMeasure) – Measure for the total result

  • filter_options (dict) – Options to filter the data

autom8qc.qaqc.base.TestSequence.perform(self)

Performs the sequence and returns the results. If the probabilities are mapped, the mapped values are returned.

Important

The results of the execution are cached. The sequence checks whether cached results already exist. If so, they are returned without performing the pipeline again.

Raises

NoItemsExist – Sequence does not contain any test

Returns

Probabilities or mapped values

Return type

pd.Series

autom8qc.qaqc.base.TestSequence.add_test(self, test, threshold, name, stage=None, weight=1)

Adds the test to the sequence. Each test needs a threshold (between 0 and 1) to filter the good values. If a probability is lower than the defined threshold, the data point is not passed to the next test. Optionally, you can use the parameter stage to set the position of the test. If the stage is not set, the test is appended.

Warning

The sequence is zero-based. If you want to add a new test at the first position, you have to pass 0 for the stage.
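The zero-based stage semantics correspond to inserting into a Python list. The helper below is a hypothetical sketch (add_stage is not part of the library) that only models the positioning rule: append when no stage is given, insert at the given index otherwise.

```python
def add_stage(stages, name, stage=None):
    """Append by default; otherwise insert at the zero-based position."""
    if stage is None:
        stages.append(name)
    else:
        stages.insert(stage, name)


stages = []
add_stage(stages, "range")
add_stage(stages, "spike")
add_stage(stages, "missing", stage=0)  # 0 puts the test first
# stages == ['missing', 'range', 'spike']
```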

Parameters
  • test (QAQCTest) – Test that should be performed

  • threshold (float) – Threshold to filter the good points (between 0 and 1)

  • name (str) – Name of the test

  • stage (int) – Stage (position) of the test (optional)

  • weight (int) – Weight of the test

Returns

None

Return type

None

TestGroup

[Figure: TestGroup design (../_images/group_design.svg)]
class autom8qc.qaqc.base.TestGroup(measure, mapper=None, post_func=None, prob_rule=None, mapped_rule=None)

Bases: autom8qc.qaqc.base.QAQCComponent

A TestGroup lets you perform several tests in isolation and merge the results with a measure. In contrast to a sequence or a manager, a test group isn’t bound to any data. It only performs its items (which are bound to data) and merges the results.
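To make "merging with a measure" concrete, the sketch below combines the per-point results of two tests in two different ways. Both functions are hypothetical illustrations and not the library's measures. In particular, taking the element-wise minimum as the "worst probability" and using the weights for a weighted mean are assumptions.

```python
def worst_measure(results, weights):
    """Element-wise minimum over all tests; weights are ignored here."""
    return [min(column) for column in zip(*results)]


def weighted_mean_measure(results, weights):
    """Element-wise mean, with each test weighted by its weight."""
    total = sum(weights)
    return [sum(w * p for w, p in zip(weights, column)) / total
            for column in zip(*results)]


# One list of probabilities per test, all aligned to the same data points.
results = [
    [1.0, 0.5, 0.0],  # test A
    [0.5, 0.5, 1.0],  # test B
]
weights = [1, 3]

worst = worst_measure(results, weights)
# [0.5, 0.5, 0.0]
weighted = weighted_mean_measure(results, weights)
# [0.625, 0.5, 0.75]
```

The choice of measure decides how strict the group is: the minimum lets a single failing test dominate, while a weighted mean lets a heavily weighted test outvote a lighter one.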

Warning

A test group handles the QA/QC tests in isolation and combines their results. If you want to create a sequence in which QA/QC tests depend on each other, you have to use a TestSequence.

Execution Pipeline:
  1. Perform tests (required)

  2. Apply rule on the probabilities (optional)

  3. Map the probabilities to another domain (optional)

  4. Apply rule on the mapped values (optional)

  5. Postprocessing (optional)

Parameters
  • mapper (BaseMapper) – Mapper that maps the total results

  • measure (BaseMeasure) – Measure to combine the results

  • post_func (BaseFunction) – Function that will be applied after the test

  • prob_rule (BaseRule) – Rule that will be applied on the probabilities

  • mapped_rule (BaseRule) – Rule that will be applied on the mapped values

autom8qc.qaqc.base.TestGroup.perform(self)

Performs all tests of the test group and merges the results.

Important

The results of the execution are cached. The group checks whether cached results already exist. If so, they are returned without performing the pipeline again.

Raises

NoItemsExist – If test group doesn’t contain any test.

Returns

Total results of the test group

Return type

pd.Series

autom8qc.qaqc.base.TestGroup.add(self, name, weight=1, test=None)

Adds the given test to the test group. Note that test must be an instance of TestManager or TestGroup.
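Because a group accepts both managers and other groups, groups can be nested. The sketch below models this with hypothetical classes (LeafSketch stands in for a manager, GroupSketch for a group). The NoItemsExist class is a local stand-in for the exception named in the text, raising TypeError for unsupported items is an assumption, and the element-wise minimum is an assumed measure.

```python
class NoItemsExist(Exception):
    """Local stand-in for the exception named in the text."""


class LeafSketch:
    """Hypothetical manager-like item that is already bound to results."""

    def __init__(self, probabilities):
        self._probabilities = list(probabilities)

    def perform(self):
        return list(self._probabilities)


class GroupSketch:
    """Hypothetical group: performs its items and merges the results."""

    def __init__(self, measure):
        self._measure = measure
        self._items = []  # (name, weight, item) tuples

    def add(self, name, weight=1, test=None):
        if not isinstance(test, (LeafSketch, GroupSketch)):
            raise TypeError("test must be a LeafSketch or a GroupSketch")
        self._items.append((name, weight, test))

    def perform(self):
        if not self._items:
            raise NoItemsExist("group does not contain any test")
        results = [item.perform() for _, _, item in self._items]
        weights = [weight for _, weight, _ in self._items]
        return self._measure(results, weights)


def worst(results, weights):
    """Assumed measure: element-wise minimum, weights ignored."""
    return [min(column) for column in zip(*results)]


inner = GroupSketch(worst)
inner.add("a", test=LeafSketch([0.9, 0.4]))
inner.add("b", test=LeafSketch([0.7, 0.8]))

outer = GroupSketch(worst)
outer.add("inner", test=inner)  # a group nested inside a group
outer.add("c", test=LeafSketch([0.6, 0.9]))

merged = outer.perform()
# [0.6, 0.4]
```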

Parameters
  • name (str) – Name of the test

  • weight (float) – Weight of the test

  • test (TestManager or TestGroup) – Manager that handles the test, or another test group

Returns

None

Return type

None