Implement a new Test
Purpose of this Chapter
The aim of this chapter is to explain how to implement a new test by working through a simple example. Let’s assume that we want to implement a simple Global Minimum test that checks whether a value is lower than a defined minimum. If so, the data point is invalid; otherwise, it’s valid. As already mentioned, the tests return a probability for each data point. In our case, values lower than the minimum are 0% valid and values greater than or equal to the minimum are 100% valid.
Define the Metadata of the Test
Note
Each test has to inherit from the abstract class QAQCTest and has
to implement the abstract method perform. Moreover, each test has to
provide its metadata. The following metadata has to be provided by each
test: NAME, DESCRIPTION, CATEGORY and SUPPORTED_STRUCTURES.
See also: autom8qc.qaqc.base.QAQCTest
Warning
Make sure that the class name ends with the suffix Test. Other modules will check for the suffix to identify that the class is a test.
import numpy as np
import pandas as pd

from autom8qc.core import exceptions
from autom8qc.core.structures import Series
from autom8qc.core.parameters import Parameter
from autom8qc.core.parameters import ParameterList
from autom8qc.qaqc.base import QAQCTest
from autom8qc.qaqc.categories import LIMIT_TEST


class GlobalMinimumTest(QAQCTest):

    NAME = "Global Minimum Test"
    DESCRIPTION = "Checks if a data point falls below the defined minimum"
    CATEGORY = LIMIT_TEST
    SUPPORTED_STRUCTURES = Series
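To illustrate why the Test suffix matters, here is a minimal sketch of how a discovery routine could select test classes by name. It does not reproduce how autom8qc actually looks up tests: QAQCTestStandIn, GlobalMinimumHelper, and find_test_classes are hypothetical names introduced only for this illustration.

```python
import inspect


class QAQCTestStandIn:
    """Hypothetical stand-in for the real QAQCTest base class."""


class GlobalMinimumTest(QAQCTestStandIn):
    NAME = "Global Minimum Test"


class GlobalMinimumHelper(QAQCTestStandIn):
    # Inherits from the base class but lacks the "Test" suffix.
    NAME = "Helper"


def find_test_classes(candidates, base):
    """Return the subclasses of `base` whose class name ends with 'Test'."""
    return [
        cls
        for cls in candidates
        if inspect.isclass(cls)
        and issubclass(cls, base)
        and cls is not base
        and cls.__name__.endswith("Test")
    ]


found = find_test_classes(
    [QAQCTestStandIn, GlobalMinimumTest, GlobalMinimumHelper],
    QAQCTestStandIn,
)
print([cls.__name__ for cls in found])  # ['GlobalMinimumTest']
```

Under this assumption, a class without the suffix is silently skipped even though it inherits from the base class.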
Define the Supported Parameters
Each test has to provide its supported parameters. Therefore, you have to implement the static method supported_parameters. The method allows you to access the supported parameters without creating an instance of the class. If your test doesn’t need additional parameters, you don’t have to implement it. In our case, we have the additional parameter min_val which defines the lower limit for the values.
    @staticmethod
    def supported_parameters():
        return ParameterList(
            Parameter(
                name="min_val",
                description="Limit for valid values",
                dtype=float,
                optional=False,
            )
        )
Implement the Constructor
The constructor is the method that is called when an object is created. In our case, we have to pass the parameter min_val to the constructor and store its value in the corresponding Parameter that we defined in the method supported_parameters. Note that you don’t have to check the type of the parameters, since a Parameter checks the type when you set its value. If you want additional checks (e.g., the value must be greater than 0), you have to implement them in the constructor and raise an exception if a constraint is not satisfied. Finally, you have to call the inherited method check_metadata, which checks whether the instance is valid.
Warning
Make sure that you call the super constructor before assigning the parameters.
    def __init__(self, min_val):
        if min_val <= 0:
            raise exceptions.InvalidValue("min_val must be greater than 0!")
        super().__init__()
        self.parameters["min_val"] = min_val
        self.check_metadata()
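The following sketch illustrates one plausible reason for the ordering in the warning above. It assumes that the base constructor is what creates self.parameters from supported_parameters; that is a guess made for illustration, not a statement about autom8qc internals. InvalidValue, Parameter, ParameterList, and BaseTest are simplified, hypothetical stand-ins for the library classes of similar names.

```python
class InvalidValue(Exception):
    """Hypothetical stand-in for exceptions.InvalidValue."""


class Parameter:
    """Hypothetical stand-in that only stores a name, a dtype and a value."""

    def __init__(self, name, dtype):
        self.name = name
        self.dtype = dtype
        self.value = None


class ParameterList:
    """Hypothetical stand-in that checks the type when a value is assigned."""

    def __init__(self, *parameters):
        self._parameters = {p.name: p for p in parameters}

    def __getitem__(self, name):
        return self._parameters[name]

    def __setitem__(self, name, value):
        parameter = self._parameters[name]
        if not isinstance(value, parameter.dtype):
            raise TypeError(f"{name} must be of type {parameter.dtype.__name__}")
        parameter.value = value


class BaseTest:
    """Hypothetical stand-in for QAQCTest."""

    def __init__(self):
        # Assumption: the parameters only exist after the base constructor ran.
        self.parameters = self.supported_parameters()

    @staticmethod
    def supported_parameters():
        return ParameterList()


class GlobalMinimumTest(BaseTest):
    @staticmethod
    def supported_parameters():
        return ParameterList(Parameter(name="min_val", dtype=float))

    def __init__(self, min_val):
        if min_val <= 0:
            raise InvalidValue("min_val must be greater than 0!")
        super().__init__()  # creates self.parameters
        self.parameters["min_val"] = min_val  # type-checked assignment


test = GlobalMinimumTest(min_val=2.5)
print(test.parameters["min_val"].value)  # 2.5
```

In this sketch, assigning to self.parameters before calling super().__init__() would fail with an AttributeError, because the attribute does not exist yet.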
Implement the Abstract Method
Finally, we have to implement the abstract method perform. The method expects the data as a parameter and returns the probabilities (0: Invalid, 1: Valid) for each data point. It’s up to you to define which structures are supported. For our example, we assume that we only support series (Series or pd.Series). If anything other than a series is passed to the method, it will raise an error. To achieve this, we use the inherited method get_data, which returns the values of the series and raises an error if the type is invalid.
Warning
Make sure that you don’t overwrite data points of the input data.
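    def perform(self, data):
        # Validate the input type and extract the underlying series
        series = self.get_data(data)
        min_val = self.parameters["min_val"].value
        # True where the value is greater than or equal to the minimum
        mask = series >= min_val
        # Convert the boolean mask to probabilities (0.0: invalid, 1.0: valid)
        probabilities = pd.Series(mask, index=series.index, dtype=np.float64)
        # Missing values get no probability
        probabilities[series.isna()] = np.nan
        return probabilities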
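The masking logic of perform can be tried without autom8qc. The sketch below is a hypothetical free function, global_minimum_probabilities, that mirrors the steps above on a plain pd.Series; it leaves out get_data and the parameter handling, and it uses astype for the conversion from booleans to floats.

```python
import numpy as np
import pandas as pd


def global_minimum_probabilities(series, min_val):
    """Hypothetical free-function version of the masking logic in perform."""
    # True where the value is greater than or equal to the minimum;
    # comparisons with NaN evaluate to False
    mask = series >= min_val
    # astype returns a new series, so the input is left untouched
    probabilities = mask.astype(np.float64)
    # Missing input values get NaN instead of 0.0
    probabilities[series.isna()] = np.nan
    return probabilities


data = pd.Series([1.0, 5.0, np.nan, 3.0])
result = global_minimum_probabilities(data, min_val=3.0)
print(result.tolist())  # [0.0, 1.0, nan, 1.0]
```

The value equal to the minimum (3.0) counts as valid, and the input series keeps its original values because the probabilities are written to a new series.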