autom8qc.qaqc.base
autom8qc.qaqc.base.QAQCContainer
- class autom8qc.qaqc.base.QAQCContainer
This class implements a container that collects independent test groups, test sequences, and test managers. Its perform method runs every element in the container. Internally, the class stores the elements in a dictionary.
Important
This class is primarily designed to integrate the framework into an existing system. You can collect all tests that should be performed in the container and run them with one call. It also gives you efficient and easy access to the cached results.
See also
autom8qc.core.structures.QAQCComponent
- add_element(elem, name)
Adds the given element to the container.
- Raises
InvalidType – If name already exists
InvalidType – If name is not a string
EmptyString – If name is empty
- Parameters
elem (QAQCComponent) – Element
name (str) – Name of the component
- Returns
None
- Return type
None
- perform()
Performs all elements in the container.
- Returns
None
- Return type
None
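The container behaviour described above (named elements kept in a dictionary, one perform call that runs all of them) can be sketched as follows. This is a minimal illustration only: SketchContainer and CountingElement are hypothetical stand-ins, not autom8qc classes, and built-in exceptions replace the package's InvalidType and EmptyString.

```python
# Hypothetical stand-ins; this is NOT the autom8qc implementation.
class SketchContainer:
    """Collects named elements in a dict and performs them with one call."""

    def __init__(self):
        self._elements = {}  # name -> element

    def add_element(self, elem, name):
        # Built-in exceptions stand in for the package's own error classes.
        if not isinstance(name, str):
            raise TypeError("name must be a string")
        if not name:
            raise ValueError("name must not be empty")
        if name in self._elements:
            raise KeyError(f"name already exists: {name}")
        self._elements[name] = elem

    def perform(self):
        # Run every element in insertion order.
        for elem in self._elements.values():
            elem.perform()


class CountingElement:
    """Toy element that records how often it was performed."""

    def __init__(self):
        self.calls = 0

    def perform(self):
        self.calls += 1


container = SketchContainer()
limits = CountingElement()
outliers = CountingElement()
container.add_element(limits, "limits")
container.add_element(outliers, "outliers")
container.perform()  # one call runs both elements
```

Keeping the elements in a dictionary makes it cheap to reject duplicate names and to look up an element by name later.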
autom8qc.qaqc.base.QAQCTest
- class autom8qc.qaqc.base.QAQCTest
This class defines the interface for each QA/QC test. Each test has to provide its information (NAME, DESCRIPTION, CATEGORY, parameters and SUPPORTED_STRUCTURES). Moreover, each test has to implement the abstract method perform which expects an instance of the class BaseStructure and returns a probability (between 0 and 1) for each data point.
See also
autom8qc.core.components.BaseComponent
autom8qc.core.parameters.ParameterList
autom8qc.core.structures.BaseStructure
Warning
If you inherit from this class, make sure that you call the super constructor and implement the abstract method perform.
- Parameters
NAME (str) – Name of the test
DESCRIPTION (str) – Description of the test
CATEGORY (str) – Category of the test
SUPPORTED_STRUCTURES (tuple or BaseStructure) – Supported data structures (e.g., Series)
parameters (ParameterList) – Supported parameters (default: None)
- check_category()
Checks if the CATEGORY is set and is supported by the package.
- Raises
EmptyString – If CATEGORY is not set or empty
InvalidType – If CATEGORY is not a string
InvalidValue – If CATEGORY is not supported
- Returns
None
- Return type
None
- check_metadata()
Checks if the defined metadata for the test is valid.
- Raises
EmptyString – If NAME is not set or empty
InvalidType – If NAME is not a string
EmptyString – If DESCRIPTION is not set or empty
InvalidType – If DESCRIPTION is not a string
InvalidType – If PARAMETERS is not an instance of ParameterList
InvalidValue – If an item in PARAMETERS is not a Parameter
InvalidType – Structure is not correctly defined
EmptyString – If CATEGORY is not set or empty
InvalidType – If CATEGORY is not a string
InvalidValue – If CATEGORY is not supported
- Returns
None
- Return type
None
- check_structures()
Checks if the supported structures are correctly set.
- Raises
InvalidType – Structure is not correctly defined
- Returns
None
- Return type
None
- get_data(structure)
Returns the data of the structure. For example, if your test supports Series, the method returns a pd.Series. If the given data is not supported, the method raises an error.
- Raises
InvalidType – If structure is not supported
- Parameters
structure (BaseStructure) – Data structure
- Returns
Data of the structure
- Return type
object
- property metadata
Returns the metadata of the test in a dictionary.
- Returns
Metadata of the test
- Return type
dict
- abstract perform(data)
Performs the test and returns the probabilities.
Warning
Make sure that you don’t overwrite data points of the data. For efficiency reasons, the data won’t be copied.
Important
Use the method get_data to access the data. The method checks if the given data is supported by the test. Moreover, it ensures that your test supports instances of the class BaseStructure and the dtype of the BaseStructure. For example, if your test supports Series, you can pass the types pd.Series and autom8qc.core.structures.Series.
- Raises
NotImplementedError – This is an abstract method
- Parameters
data (BaseStructure, pd.Series, pd.DataFrame) – Data points
- Returns
Probabilities (1=Valid, 0=Invalid)
- Return type
pd.Series
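The contract described above (implement perform, fetch the data through get_data, return one probability per data point, and never write into the input) can be sketched like this. Everything here is a hypothetical stand-in: SketchStructure plays the role of BaseStructure, plain lists replace pd.Series, and TypeError replaces InvalidType.

```python
from abc import ABC, abstractmethod


class SketchStructure:
    """Hypothetical wrapper around raw data (stand-in for BaseStructure)."""

    def __init__(self, values):
        self.values = values


class SketchTest(ABC):
    """Hypothetical base class; not the real QAQCTest."""

    SUPPORTED_RAW_TYPES = (list,)

    def get_data(self, structure):
        # Accept either the wrapper or the raw type it wraps.
        if isinstance(structure, SketchStructure):
            structure = structure.values
        if not isinstance(structure, self.SUPPORTED_RAW_TYPES):
            raise TypeError("structure is not supported")
        return structure

    @abstractmethod
    def perform(self, data):
        """Return one probability (1 = valid, 0 = invalid) per data point."""


class UpperLimitSketch(SketchTest):
    """Toy test: a value is valid if it does not exceed the limit."""

    def __init__(self, limit):
        self.limit = limit

    def perform(self, data):
        values = self.get_data(data)
        # Build a new list; the input is not copied, so never write into it.
        return [1.0 if v <= self.limit else 0.0 for v in values]


raw = [1.0, 5.0, 12.0]
test = UpperLimitSketch(limit=10.0)
probs_raw = test.perform(raw)                       # raw type is accepted
probs_wrapped = test.perform(SketchStructure(raw))  # wrapper is accepted too
```

Routing every access through get_data keeps the type check in one place, so a concrete test only has to deal with the raw data type it declared as supported.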
- plot(series, probabilities=None, series_name=None)
This method plots the results of the test. To do so, it performs the test and plots two axes: the left axis shows the series, and the right axis shows the marked values. Data points that failed the test (i.e., 0% valid) are marked red. Data points that passed the test (i.e., 100% valid) are marked green. Optionally, you can pass the parameter probabilities to avoid running the test a second time.
- Parameters
series (pd.Series) – Series
probabilities (pd.Series) – Results of the test (optional)
series_name (str) – Name of the series (optional)
- Returns
None
- Return type
None
- savefig(filename, series, probabilities=None, series_name=None)
Saves the plot.
Important
The extension of the filename has to be the format. For example, if you want to store an SVG figure, the filename has to be ./example.svg.
- Parameters
filename (str) – Name of the file
series (pd.Series) – Series
probabilities (pd.Series) – Results of the test (optional)
series_name (str) – Name of the series (optional)
- Returns
None
- Return type
None
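Because the file extension determines the output format, it can be useful to check a filename before passing it to savefig. The helper below is a hypothetical convenience function, not part of autom8qc; it only shows how the format can be read from the extension.

```python
from pathlib import Path


def format_from_filename(filename):
    """Return the lowercase extension without the dot (hypothetical helper)."""
    suffix = Path(filename).suffix
    if not suffix:
        raise ValueError("filename needs an extension such as .svg or .png")
    return suffix[1:].lower()


fmt = format_from_filename("./example.svg")
```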
autom8qc.qaqc.base.TestCategory
- class autom8qc.qaqc.base.TestCategory
This class provides all categories that are supported by the framework. With this approach, you can make sure that tests of the same category always have the same category name. Currently, the following categories are supported:
TestCategory.GENERAL
TestCategory.LIMIT_TEST
TestCategory.OUTLIER_TEST
TestCategory.FLATLINE_DETECTION
TestCategory.PEAK_DETECTION
See also
autom8qc.qaqc.base.TestGroup
- class autom8qc.qaqc.base.TestGroup(measure, mapper=None, post_func=None, prob_rule=None, mapped_rule=None)
A TestGroup lets you perform several tests in isolation and merge the results with a measure. Unlike a sequence or a manager, a test group isn’t bound to any data. It only performs the items (which are bound to data) and merges the results.
Warning
A test group handles the QA/QC tests in isolation and combines their results. If you want to create a sequence in which QA/QC tests depend on each other, you have to use TestSequence.
See also
- Execution Pipeline:
Perform tests (required)
Apply rule on the probabilities (optional)
Map the probabilities to another domain (optional)
Apply rule on the mapped values (optional)
Postprocessing (optional)
- Parameters
mapper (BaseMapper) – Mapper that maps the total results
measure (BaseMeasure) – Measure to combine the results
post_func (BaseFunction) – Function that will be applied after the test
prob_rule (BaseRule) – Rule that will be applied on the probabilities
mapped_rule (BaseRule) – Rule that will be applied on the mapped values
- add(name, weight=1, test=None)
Adds the given test to the test group. Note that test must be an instance of TestManager or TestGroup.
- Raises
KeyAlreadyExists – If a test with the name already exists
EmptyString – If name is not set or empty
InvalidType – If name is not a string
InvalidValue – If weight is not positive
InvalidType – If weight is not an integer
InvalidType – If test is not an instance of TestManager or TestGroup
- Parameters
name (str) – Name used to handle the test
weight (int) – Weight of the test
test (TestManager or TestGroup) – Manager to handle the test or other test group
- Returns
None
- Return type
None
- add_item(item)
Adds a group item to the group.
- Parameters
item (GroupItem) – Group item
- property mapped_rule
Returns the rule that will be applied on the mapped values.
- Returns
Rule that will be applied on the mapped values.
- Return type
- property mapper
Returns the mapper.
- Returns
Mapper that will be applied on the probabilities.
- Return type
- property measure
Returns the measure.
- Returns
Measure to combine the results
- Return type
- perform()
Performs all tests of the test group and merges the results.
Important
The results of the execution will be cached. The group checks whether cached results already exist. If so, they are returned without performing the pipeline.
- Raises
NoItemsExist – If test group doesn’t contain any test.
- Returns
Total results of the test group
- Return type
pd.Series
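How a group performs its items in isolation, merges the results with a measure, and caches the total can be sketched as follows. The classes and the weighted-mean measure are hypothetical illustrations, not autom8qc code; plain lists replace pd.Series, and built-in exceptions replace KeyAlreadyExists, InvalidValue, and NoItemsExist.

```python
# Hypothetical stand-ins; this is NOT the autom8qc implementation.
class FixedResultItem:
    """Toy item with a perform method that returns fixed probabilities."""

    def __init__(self, probabilities):
        self._probabilities = probabilities
        self.calls = 0

    def perform(self):
        self.calls += 1
        return list(self._probabilities)


def weighted_mean(results, weights):
    """Assumed example measure: weighted mean per data point."""
    total_weight = sum(weights.values())
    length = len(next(iter(results.values())))
    return [
        sum(results[name][i] * weights[name] for name in results) / total_weight
        for i in range(length)
    ]


class SketchGroup:
    def __init__(self, measure):
        self._measure = measure
        self._items = {}    # name -> (weight, item)
        self._cache = None  # cached total results

    def add(self, name, weight, test):
        if name in self._items:
            raise KeyError(f"a test named {name!r} already exists")
        if weight <= 0:
            raise ValueError("weight must be positive")
        self._items[name] = (weight, test)
        self._cache = None  # adding an item invalidates the cached total

    def perform(self):
        if not self._items:
            raise RuntimeError("group does not contain any test")
        if self._cache is None:
            # Each item runs on its own; only the results are merged.
            results = {n: item.perform() for n, (_, item) in self._items.items()}
            weights = {n: w for n, (w, _) in self._items.items()}
            self._cache = self._measure(results, weights)
        return self._cache


limits = FixedResultItem([1.0, 0.0, 1.0])
outliers = FixedResultItem([1.0, 1.0, 0.0])
group = SketchGroup(weighted_mean)
group.add("limits", 1, limits)
group.add("outliers", 3, outliers)
total = group.perform()  # [1.0, 0.75, 0.25]
again = group.perform()  # served from the cache, items are not re-run
```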
- property post_function
Returns the function that will be applied after the test.
- Returns
Function that will be applied after the test.
- Return type
- property prob_rule
Returns the rule that will be applied on the probabilities.
- Returns
Rule that will be applied on the probabilities.
- Return type
- property results
Returns the results of the tests as a pd.DataFrame. If the results for a test don’t exist, the test will be performed first. Otherwise the cached results will be used.
Warning
The results are the results of each test and not the total results. If you want to access the results that are combined with the measure, you need to use the property total_results.
- Returns
Results of the tests
- Return type
pd.DataFrame
- property tests
Returns the names of the tests.
- Returns
Names of the tests
- Return type
List<str>
- property total_results
Returns the combined results of the test group.
Note
If you want to access the results of each test, you have to use the property results.
- Returns
Total results
- Return type
pd.Series
autom8qc.qaqc.base.TestManager
- class autom8qc.qaqc.base.TestManager(data, test, mapper=None, pre_func=None, post_func=None, prob_rule=None, mapped_rule=None, filter_options=None)
A TestManager simplifies the execution of a test and caches the results. Each manager expects an instance of the class BaseStructure to handle the data in a standardized way. Moreover, you have to define the test that should be executed (e.g., Global Minimum Test). Optionally, you can specify a mapper that maps the probabilities to other values (e.g., validities). You can also define a pre-processing function that is applied before the execution of the test (e.g., linear interpolation) and a post-processing function that is applied to the results (e.g., filling gaps). In addition, you can define rules that are applied to the probabilities and the mapped values.
See also
autom8qc.core.structures.BaseStructure
- Execution Pipeline:
Preprocessing (optional)
Perform test (required)
Apply rule on the probabilities (optional)
Map the probabilities to another domain (optional)
Apply rule on the mapped values (optional)
Postprocessing (optional)
- Parameters
data (BaseStructure) – Data (e.g., time series)
test (QAQCTest) – Test
mapper (BaseMapper) – Mapper to map the probabilities
probabilities (pd.Series) – Probabilities
mapped_values (pd.Series) – Mapped values
pre_function (BaseFunction) – Function that will be applied before the test
post_function (BaseFunction) – Function that will be applied after the test
prob_rule (BaseRule) – Rule that will be applied on the probabilities
mapped_rule (BaseRule) – Rule that will be applied on the mapped values
filter_options (dict) – Options to filter the data
- clear()
Clears the cache.
- Returns
None
- Return type
None
- property mapped_rule
Returns the rule that will be applied on the mapped values.
- Returns
Rule that will be applied on the mapped values.
- Return type
- property mapper
Returns the mapper.
- Returns
Mapper that will be applied on the probabilities.
- Return type
- perform()
Performs the execution pipeline and returns the results.
Important
The results of the execution will be cached. The manager checks whether cached results already exist. If so, they are returned without performing the pipeline.
- Returns
Probabilities or mapped values
- Return type
pd.Series
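The execution pipeline and the caching behaviour of a manager can be sketched as a chain of optional steps around one required test. SketchManager and the helper functions are hypothetical stand-ins that use plain callables and lists; applying the mapped-value rule only when a mapper is set is an assumption of this sketch.

```python
# Hypothetical stand-ins; this is NOT the autom8qc implementation.
class SketchManager:
    """Mirrors the documented pipeline order with plain callables."""

    def __init__(self, data, test, mapper=None, pre_func=None,
                 post_func=None, prob_rule=None, mapped_rule=None):
        self._data = data
        self._test = test
        self._mapper = mapper
        self._pre_func = pre_func
        self._post_func = post_func
        self._prob_rule = prob_rule
        self._mapped_rule = mapped_rule
        self._cache = None

    def perform(self):
        if self._cache is not None:
            return self._cache               # cached: skip the pipeline
        values = self._data
        if self._pre_func is not None:
            values = self._pre_func(values)  # 1. preprocessing
        result = self._test(values)          # 2. test (required)
        if self._prob_rule is not None:
            result = self._prob_rule(result)        # 3. rule on probabilities
        if self._mapper is not None:
            result = self._mapper(result)           # 4. map to another domain
            if self._mapped_rule is not None:
                result = self._mapped_rule(result)  # 5. rule on mapped values
        if self._post_func is not None:
            result = self._post_func(result)        # 6. postprocessing
        self._cache = result
        return result

    def clear(self):
        self._cache = None


def forward_fill(values):
    """Toy preprocessing: replace None with the previous value."""
    filled, last = [], None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


calls = {"test": 0}


def upper_limit_test(values):
    calls["test"] += 1
    return [1.0 if v <= 10.0 else 0.0 for v in values]


def to_label(probabilities):
    """Toy mapper from probabilities to validity labels."""
    return ["valid" if p >= 0.5 else "invalid" for p in probabilities]


manager = SketchManager([1.0, None, 12.0], upper_limit_test,
                        mapper=to_label, pre_func=forward_fill)
first = manager.perform()   # runs the pipeline
second = manager.perform()  # returns the cached result
```

Calling clear drops the cached result, so the next perform runs the pipeline again.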
- property post_function
Returns the function that will be applied after the test.
- Returns
Function that will be applied after the test.
- Return type
- property pre_function
Returns the function that will be applied before the test.
- Returns
Function that will be applied before the test.
- Return type
- property prob_rule
Returns the rule that will be applied on the probabilities.
- Returns
Rule that will be applied on the probabilities.
- Return type
- property results
Returns the results of the test.
- Returns
Results of the test
- Return type
pd.Series
autom8qc.qaqc.base.TestSequence
- class autom8qc.qaqc.base.TestSequence(data, mapper=None, pre_func=None, post_func=None, prob_rule=None, mapped_rule=None, measure=None, filter_options=None)
A TestSequence allows you to create a sequence of several tests that build on each other. If a data point fails a test, it isn’t passed to the next test. A data point fails a test if its probability is lower than the defined threshold. With this approach, you can ensure that invalid data points don’t affect the next test. This data structure is highly recommended if you use tests that consider the global range of the data (e.g., autom8qc.qaqc.outlier.LOFTest).
Important
If you don’t pass a measure to the constructor, the autom8qc.measures.probabilities.WorstProbabilityMeasure will be used.
Warning
A test group handles the QA/QC tests in isolation and combines their results. If your tests don’t depend on each other, use a TestGroup instead.
See also
autom8qc.core.structures.BaseStructure
- Execution Pipeline:
Preprocessing (optional)
Perform tests (required)
Apply rule on the probabilities (optional)
Map the probabilities to another domain (optional)
Apply rule on the mapped values (optional)
Postprocessing (optional)
- Parameters
data (BaseStructure) – Data that shall be tested
mapper (BaseMapper) – Mapper to map the final results
pre_function (BaseFunction) – Function that will be applied before the sequence
post_function (BaseFunction) – Function that will be applied after the sequence
prob_rule (BaseRule) – Rule that will be applied on the probabilities
mapped_rule (BaseRule) – Rule that will be applied on the mapped values
measure (BaseMeasure) – Measure for the total result
filter_options (dict) – Options to filter the data
- add_item(item, stage=None)
Adds an item to the sequence.
- Parameters
item (SequenceItem) – Item that should be added
- Returns
None
- Return type
None
- add_test(test, threshold, name, stage=None, weight=1)
Adds the test to the sequence. Each test needs a threshold (between 0 and 1) to filter the good values. If a probability is lower than the defined threshold, the data point isn’t passed to the next test. Optionally, you can use the parameter stage to set the position of the test. If the stage is not set, the test is appended.
Warning
The sequence is zero-based. If you want to add a new test at the first position, you have to pass 0 for the stage.
- Raises
InvalidType – Weight must be an integer
InvalidValue – Weight must be positive
- Parameters
test (QAQCTest) – Test that should be performed
threshold (float) – Threshold to filter the good points (0, 1)
name (str) – Name of the test
stage (int) – Stage (position) of the test (optional)
weight (int) – Weight of the test
- Returns
None
- Return type
None
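The threshold filtering and the zero-based stage parameter can be sketched as follows. SketchSequence is a hypothetical stand-in built on plain lists and callables; combining the stages by taking the lowest probability per data point is this sketch's reading of a worst-probability measure, not a confirmed detail of the package.

```python
# Hypothetical stand-ins; this is NOT the autom8qc implementation.
class SketchSequence:
    def __init__(self, data):
        self._data = list(data)
        self._stages = []  # list of (name, test, threshold)

    def add_test(self, test, threshold, name, stage=None):
        item = (name, test, threshold)
        if stage is None:
            self._stages.append(item)         # append at the end
        else:
            self._stages.insert(stage, item)  # zero-based position

    @property
    def stage_names(self):
        return [name for name, _, _ in self._stages]

    def perform(self):
        if not self._stages:
            raise RuntimeError("sequence does not contain any test")
        alive = list(range(len(self._data)))  # indices still in the sequence
        worst = [1.0] * len(self._data)       # lowest probability seen so far
        for _, test, threshold in self._stages:
            if not alive:
                break
            probabilities = test([self._data[i] for i in alive])
            survivors = []
            for index, probability in zip(alive, probabilities):
                worst[index] = min(worst[index], probability)
                if probability >= threshold:
                    survivors.append(index)   # passes on to the next stage
            alive = survivors
        return worst


def upper_limit_test(values):
    return [1.0 if v <= 100.0 else 0.0 for v in values]


def mean_deviation_test(values):
    """Toy global test: valid if within 2.0 of the mean of the given values."""
    mean = sum(values) / len(values)
    return [1.0 if abs(v - mean) <= 2.0 else 0.0 for v in values]


data = [10.0, 11.0, 9.0, 500.0]
sequence = SketchSequence(data)
sequence.add_test(mean_deviation_test, 0.5, "deviation")
sequence.add_test(upper_limit_test, 0.5, "limit", stage=0)  # runs first
filtered = sequence.perform()
unfiltered = mean_deviation_test(data)  # same test without the limit stage
```

In this toy data set the value 500.0 is removed by the limit stage, so the mean used by the second stage is 10.0 and the remaining points pass. Run on the unfiltered data, the mean is 132.5 and every point fails, which is the effect the sequence is meant to prevent.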
- property mapped_rule
Returns the rule that will be applied on the mapped values.
- Returns
Rule that will be applied on the mapped values.
- Return type
- property mapper
Returns the mapper.
- Returns
Mapper that will be applied on the probabilities.
- Return type
- property measure
Returns the measure.
- Returns
Measure to combine the results
- Return type
- perform()
Performs the sequence and returns the results. If the probabilities are mapped, the mapped values are returned.
Important
The results of the execution will be cached. The sequence checks whether cached results already exist. If so, they are returned without performing the pipeline.
- Raises
NoItemsExist – Sequence does not contain any test
- Returns
Probabilities or mapped values
- Return type
pd.Series
- plot(min_val=None, max_val=None)
This method plots the results of the test sequence.
- Returns
None
- Return type
None
- property post_function
Returns the function that will be applied after the test.
- Returns
Function that will be applied after the test.
- Return type
- property pre_function
Returns the function that will be applied before the test.
- Returns
Function that will be applied before the test.
- Return type
- property prob_rule
Returns the rule that will be applied on the probabilities.
- Returns
Rule that will be applied on the probabilities.
- Return type
- property results
Returns the results of the test.
- Returns
Results of the test
- Return type
pd.Series
- property sequence_results
Returns the results of all stages in a pd.DataFrame.
- Returns
Results of all stages
- Return type
pd.DataFrame