Customize Test Groups
Purpose of this Chapter
This chapter demonstrates how to customize test groups. The steps are explained using a Nitrogen Oxides (NO) time series as an example.
Basic Example
Let’s assume that we want to create a test group whose tests build on each other. The group contains a test that checks for negative values and a test that checks whether the time is monotonically increasing. It also includes a test that checks whether excessively high measurements occur at night. For that, we check whether a measurement was taken at night and whether its value is above the Limit of Detection (LoD). In total, our example group has the following structure:
TestGroup (combined with WorstMeasure)
    Increasing Time Test
    Negative Test
    Nighttime Test (TestGroup, combined with ORMeasure)
        Day Test
        LOD Test
Note
If you want to add a QAQCTest to a TestGroup, you have to create a TestManager for it. Moreover, you can also add TestGroups and TestSequences to a TestGroup.
In our example, we have to create two TestManagers, one for the Increasing Time Test and one for the Negative Test. For the Nighttime Test, we create an instance of the class TestGroup and two instances of the class TestManager for the LOD Test and the Day Test. To combine the results within the TestGroups, we use the measures ORMeasure and WorstMeasure.
Read Sources and Create Series
A TestManager expects an instance of the class autom8qc.core.structures.BaseStructure, which defines the interface for every data structure. At first glance, this looks like unnecessary overhead, but it allows managers to handle data frames and other data structures as well. For our example, we create instances of the class autom8qc.core.structures.Series for the time series Nitrogen oxides and Solar zenith angle.
import pandas as pd
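# The squeeze keyword of read_csv was removed in pandas 2.0;
# squeeze("columns") turns the single-column frame into a Series instead.
no_data = pd.read_csv("no_data.csv", index_col=0, parse_dates=True).squeeze("columns")
sa_data = pd.read_csv("sa_data.csv", index_col=0, parse_dates=True).squeeze("columns")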
from autom8qc.core.structures import Series
nitrogen_oxides = Series(name="Nitrogen oxides", data=no_data)
solar_angle = Series(name="Solar zenith angle", data=sa_data)
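The idea behind a shared structure interface can be illustrated without the library. The following is only a sketch: StructureSketch, SeriesSketch, TableSketch, and count_negative are hypothetical names invented for this illustration and are not part of autom8qc. It shows how code written against one small interface can process a series and a table alike.

```python
from abc import ABC, abstractmethod


class StructureSketch(ABC):
    """Hypothetical stand-in for a common data-structure interface."""

    def __init__(self, name, data):
        self.name = name
        self.data = data

    @abstractmethod
    def values(self):
        """Return all measurements as a flat list."""


class SeriesSketch(StructureSketch):
    """Wraps a single sequence of measurements."""

    def values(self):
        return list(self.data)


class TableSketch(StructureSketch):
    """Wraps a dict of columns and flattens them column by column."""

    def values(self):
        return [v for column in self.data.values() for v in column]


def count_negative(structure):
    # The caller relies only on the shared interface, not on the concrete type.
    return sum(1 for v in structure.values() if v < 0)


series = SeriesSketch("Nitrogen oxides", [0.02, -0.01, 0.04])
table = TableSketch("Two channels", {"a": [1.0, -2.0], "b": [-3.0, 4.0]})
negatives_in_series = count_negative(series)
negatives_in_table = count_negative(table)
```

The extra wrapper costs a few lines, but count_negative never needs to know which concrete structure it receives.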
Increasing Time Test
The Increasing Time Test checks whether the time is monotonically increasing. If more than 5% of the data points fail the test, all data points are marked as Erroneous.
from autom8qc.mappers.validities import StandardValidityMapper
from autom8qc.qaqc.base import TestManager
from autom8qc.qaqc.general import IncreasingTimeTest
from autom8qc.rules.frequency import InvalidFrequencyRule
test = IncreasingTimeTest()
time_test = TestManager(
    data=nitrogen_oxides,
    test=test,
    mapper=StandardValidityMapper(),
    mapped_rule=InvalidFrequencyRule(rel_frequency=0.05)
)
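To make the interplay of the test and the 5% rule concrete, here is a minimal library-independent sketch. The functions increasing_time_flags and apply_frequency_rule are hypothetical helpers written for this illustration. The sketch assumes that "increasing" means strictly later than the previous timestamp and that the rule fires only when the failing share is strictly greater than the limit; the actual library behavior may differ.

```python
from datetime import datetime, timedelta


def increasing_time_flags(timestamps):
    """Return True for every timestamp that is later than its predecessor."""
    if not timestamps:
        return []
    flags = [True]  # the first point has no predecessor to compare with
    for previous, current in zip(timestamps, timestamps[1:]):
        flags.append(current > previous)
    return flags


def apply_frequency_rule(flags, rel_frequency=0.05):
    """Fail every point if the share of failed points exceeds rel_frequency."""
    if not flags:
        return []
    failed_share = flags.count(False) / len(flags)
    if failed_share > rel_frequency:
        return [False] * len(flags)
    return list(flags)


start = datetime(2021, 1, 1)
# 20 points with 1 s spacing; two of them jump back in time (10% of the data).
timestamps = [start + timedelta(seconds=i) for i in range(20)]
timestamps[5] = start
timestamps[12] = start

raw_flags = increasing_time_flags(timestamps)
final_flags = apply_frequency_rule(raw_flags, rel_frequency=0.05)
```

Because two of the 20 points fail, the failing share is above the limit and the rule marks the whole series as failed.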
Negative Test
The Negative Test checks whether values are lower than 0. A ratio cannot physically be lower than 0, but negative readings can still occur because our instrument is noisy and has a limit of detection. Therefore, values between -0.072 and 0 are not rejected outright; they are assigned a probability between 0 and 1. We want to map these probabilities to the standard validities.
from autom8qc.mappers.validities import StandardValidityMapper
from autom8qc.qaqc.base import TestManager
from autom8qc.qaqc.limit import GlobalMinimumTest
test = GlobalMinimumTest(min_val=0, min_lim=-0.072)
mapper = StandardValidityMapper()
negative_test = TestManager(nitrogen_oxides, test, mapper)
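How the probability behaves between the two bounds is not spelled out above. As an illustration only, the sketch below assumes a linear ramp from 0 at min_lim to 1 at min_val, where a probability of 1 means the value passes; minimum_probability is a hypothetical helper, and the real GlobalMinimumTest may use a different curve.

```python
def minimum_probability(value, min_val=0.0, min_lim=-0.072):
    """Probability that a value passes a soft lower bound.

    Assumption for this sketch: values at or above min_val pass (1.0),
    values at or below min_lim fail (0.0), and the probability rises
    linearly in between.
    """
    if value >= min_val:
        return 1.0
    if value <= min_lim:
        return 0.0
    return (value - min_lim) / (min_val - min_lim)


readings = [0.03, -0.036, -0.072, -0.2]
probabilities = [minimum_probability(v) for v in readings]
```

Under this assumption, a reading halfway between the bounds (-0.036) receives a probability of about 0.5, while anything at or below -0.072 receives 0.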
Subgroup - Nighttime Test
The Nighttime Test combines two tests with a logical OR. The test fails if excessively high measurements occur at night. First, we create two TestManagers for the tests. Afterwards, we create a test group and combine the results with the measure ORMeasure.
Day test
For preprocessing and postprocessing, you can use the module autom8qc.functions. The Solar zenith angle series has a 4-second resolution, whereas the NO series has a 1-second resolution. To combine the tests, we first have to resample and interpolate the solar angle values. Afterwards, we perform the GlobalMaximumTest and use the mapper LogicalThresholdMapper to map probabilities higher than 0.5 to True and probabilities of 0.5 or lower to False.
from autom8qc.functions.interpolation import Interpolation
from autom8qc.mappers.logical import LogicalThresholdMapper
from autom8qc.qaqc.base import TestManager
from autom8qc.qaqc.limit import GlobalMaximumTest
test = GlobalMaximumTest(max_val=100)
mapper = LogicalThresholdMapper()
pre_function = Interpolation("1s", "linear")
day_test = TestManager(solar_angle, test, mapper, pre_function)
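The preprocessing step can be pictured as follows. This is a plain-Python sketch of linear interpolation from a 4-second grid onto a 1-second grid, not the implementation of the Interpolation class. The helper interpolate_linear is hypothetical and assumes integer timestamps in seconds that are strictly increasing and aligned with the target step.

```python
def interpolate_linear(times, values, step=1):
    """Linearly interpolate (times, values) onto a regular grid.

    Assumes integer timestamps in seconds, strictly increasing,
    with every gap being a multiple of step.
    """
    if not times:
        return [], []
    grid, result = [], []
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        v0, v1 = values[i], values[i + 1]
        t = t0
        while t < t1:
            weight = (t - t0) / (t1 - t0)  # 0 at t0, approaches 1 near t1
            grid.append(t)
            result.append(v0 + weight * (v1 - v0))
            t += step
    # The last original point closes the grid.
    grid.append(times[-1])
    result.append(values[-1])
    return grid, result


# Solar zenith angle sampled every 4 s (made-up illustration values).
angle_times = [0, 4, 8]
angle_values = [96.0, 100.0, 104.0]
grid, angles = interpolate_linear(angle_times, angle_values, step=1)
```

After this step, the angle series has one value per second and lines up point by point with a 1-second series.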
Limit of Detection Test
Next, we create a manager that detects values higher than the limit of detection. For this, we again use the GlobalMaximumTest.
from autom8qc.mappers.logical import LogicalThresholdMapper
from autom8qc.qaqc.base import TestManager
from autom8qc.qaqc.limit import GlobalMaximumTest
test = GlobalMaximumTest(max_val=0.05)
mapper = LogicalThresholdMapper()
lod_test = TestManager(nitrogen_oxides, test, mapper)
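Both managers of the subgroup rely on a threshold mapping from probabilities to booleans. The sketch below restates that rule in plain Python. threshold_map is a hypothetical helper, and passing None through unchanged for missing values is an assumption of this sketch rather than documented behavior.

```python
def threshold_map(probabilities, threshold=0.5):
    """Map probabilities to booleans: strictly above the threshold is True.

    None stands for a missing probability and is passed through unchanged
    (an assumption of this sketch).
    """
    return [None if p is None else p > threshold for p in probabilities]


flags = threshold_map([0.9, 0.5, 0.1, None])
```

Note that a probability of exactly 0.5 maps to False, in line with the rule described for the Day Test.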
Create Subgroup
Finally, we create the subgroup. For this, we define the measure ORMeasure, which applies a logical OR to the results of the two tests. Moreover, we use the mapper Logical2ValidityMapper to map the boolean results to validities.
from autom8qc.core.validities import StandardValidities
from autom8qc.mappers.logical import Logical2ValidityMapper
from autom8qc.measures.logical import ORMeasure
from autom8qc.qaqc.base import TestGroup
validities = StandardValidities()
l2v_mapper = Logical2ValidityMapper(validities.GOOD, validities.ERRONEOUS, validities.MISSING)
measure = ORMeasure()
sub_group = TestGroup(measure=measure, mapper=l2v_mapper)
sub_group.add(name="Day test", test=day_test)
sub_group.add(name="Limit Of Detection test", test=lod_test)
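What the subgroup computes for each data point can be sketched without the library. The helpers or_measure and logical_to_validity are hypothetical, validities are represented as plain strings, and two assumptions are made: True maps to GOOD and False to ERRONEOUS (mirroring the argument order used above), and a missing input (None) makes the combined result missing.

```python
def or_measure(*results):
    """Combine boolean result lists element-wise with a logical OR.

    Assumption of this sketch: if any input is missing (None),
    the combined result is missing as well.
    """
    combined = []
    for values in zip(*results):
        if any(v is None for v in values):
            combined.append(None)
        else:
            combined.append(any(values))
    return combined


def logical_to_validity(flags, if_true="GOOD", if_false="ERRONEOUS",
                        if_missing="MISSING"):
    """Map True, False, and None to validity labels."""
    mapping = {True: if_true, False: if_false, None: if_missing}
    return [mapping[flag] for flag in flags]


# True means "passed": it is daytime, or the value is below the LoD.
day_flags = [True, False, False, None]
lod_flags = [False, True, False, True]

combined = or_measure(day_flags, lod_flags)
validities = logical_to_validity(combined)
```

Only the third point fails both tests, which corresponds to a measurement above the LoD at night, so it is the only one labeled ERRONEOUS.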
Final Group
Note
The test group is traversed in post-order, which means that the children are performed before their parent. See also: https://en.wikipedia.org/wiki/Tree_traversal
To summarize, we have a subgroup that checks whether excessively high values occur at night, a test that checks for negative values, and a test that checks whether the time is monotonically increasing. All three components return validities, which are combined with the measure WorstMeasure. After building the group, we run it with the method perform.
from autom8qc.measures.validities import WorstMeasure
from autom8qc.qaqc.base import TestGroup
group = TestGroup(measure=WorstMeasure())
group.add(name="Increasing Time Test", test=time_test)
group.add(name="Negative Test", test=negative_test)
group.add(name="Nighttime Test", test=sub_group)
results = group.perform()
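The post-order evaluation and the worst-case combination can be sketched in a few lines of plain Python. Leaf, Group, worst, and the SEVERITY ranking are hypothetical and invented for this illustration. In particular, the ordering GOOD < MISSING < ERRONEOUS is an assumption, and the subgroup here reuses the worst-case measure for brevity instead of a logical OR.

```python
# Assumed severity ranking; the library's actual ordering may differ.
SEVERITY = {"GOOD": 0, "MISSING": 1, "ERRONEOUS": 2}


def worst(*results):
    """Pick the most severe validity for each data point."""
    return [max(values, key=SEVERITY.__getitem__) for values in zip(*results)]


class Leaf:
    """Stands in for a manager that already knows its validities."""

    def __init__(self, name, result, log):
        self.name, self.result, self.log = name, result, log

    def perform(self):
        self.log.append(self.name)
        return self.result


class Group:
    """Stands in for a test group with a measure and children."""

    def __init__(self, name, measure, log):
        self.name, self.measure, self.log = name, measure, log
        self.children = []

    def add(self, child):
        self.children.append(child)

    def perform(self):
        results = [child.perform() for child in self.children]  # children first
        self.log.append(self.name)  # the group itself is evaluated last
        return self.measure(*results)


log = []
night = Group("Nighttime Test", worst, log)
night.add(Leaf("Day Test", ["GOOD", "GOOD", "ERRONEOUS"], log))
night.add(Leaf("LOD Test", ["GOOD", "GOOD", "GOOD"], log))

root = Group("Final Group", worst, log)
root.add(Leaf("Increasing Time Test", ["GOOD", "GOOD", "GOOD"], log))
root.add(Leaf("Negative Test", ["GOOD", "MISSING", "GOOD"], log))
root.add(night)

final_validities = root.perform()
```

The log records every leaf before the group that contains it, which is the post-order traversal described in the note above.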