Customize Test Groups

Purpose of this Chapter

This chapter demonstrates how to customize test groups. The steps are explained using a Nitrogen Oxides (NO) time series as an example.

Basic Example

Let’s assume that we want to create a test group whose tests build on each other. The group contains a test that checks for negative values and a test that checks whether the time is monotonically increasing. It also includes a test that checks whether measurements at night are too high. For that test, we check whether a measurement was taken at night and whether its value is above the Limit of Detection (LoD). In total, our example group has the following structure:

../../_images/group_structure.svg

Note

If you want to add a QAQCTest to a TestGroup, you have to create a TestManager for it. Moreover, you can also add TestGroups and TestSequences to a TestGroup.

In our example, we have to create two TestManagers, one for the Increasing Time Test and one for the Negative Test. For the Nighttime Test, we create an instance of the class TestGroup and two instances of the class TestManager, one for the LOD Test and one for the Day Test. To combine the results, we use the measure ORMeasure inside the Nighttime Test and the measure WorstMeasure in the final group.

Read Sources and Create Series

A TestManager expects an instance of the class autom8qc.core.structures.BaseStructure, which defines the interface for every data structure. At first glance, this looks like unnecessary overhead, but it allows managers to handle data frames and other data structures as well. For our example, we create instances of the class autom8qc.core.structures.Series for the time series Nitrogen oxides and Solar zenith angle.

import pandas as pd
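# Read each file as a pandas Series with a datetime index. The squeeze keyword
# of read_csv was removed in pandas 2.0, so the frame is squeezed afterwards.
no_data = pd.read_csv("no_data.csv", index_col=0, parse_dates=True).squeeze("columns")
sa_data = pd.read_csv("sa_data.csv", index_col=0, parse_dates=True).squeeze("columns")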

from autom8qc.core.structures import Series
nitrogen_oxides = Series(name="Nitrogen oxides", data=no_data)
solar_angle = Series(name="Solar zenith angle", data=sa_data)

Increasing Time Test

The Increasing Time Test checks whether the time is monotonically increasing. If more than 5% of the data points fail the test, all data points are marked as Erroneous.

from autom8qc.mappers.validities import StandardValidityMapper
from autom8qc.qaqc.base import TestManager
from autom8qc.qaqc.general import IncreasingTimeTest
from autom8qc.rules.frequency import InvalidFrequencyRule

test = IncreasingTimeTest()
time_test = TestManager(
    data=nitrogen_oxides,
    test=test,
    mapper=StandardValidityMapper(),
    mapped_rule=InvalidFrequencyRule(rel_frequency=0.05)
)
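
The logic behind this manager can be illustrated without the library. The following is a minimal sketch in plain Python, not the autom8qc implementation: the function names and the string labels are hypothetical, a point is assumed to fail when its timestamp is not strictly later than its predecessor, and points that fail below the threshold are assumed to be marked individually.

```python
from datetime import datetime, timedelta


def increasing_time_flags(timestamps):
    """Return one flag per timestamp: True if it is later than its predecessor."""
    if not timestamps:
        return []
    flags = [True]  # the first point has no predecessor, so it passes
    for previous, current in zip(timestamps, timestamps[1:]):
        flags.append(current > previous)
    return flags


def apply_invalid_frequency_rule(flags, rel_frequency=0.05):
    """Mark all points as erroneous if the failing share exceeds rel_frequency."""
    if not flags:
        return []
    failing_share = flags.count(False) / len(flags)
    if failing_share > rel_frequency:
        return ["erroneous"] * len(flags)
    # Assumption: below the threshold, only the failing points are marked.
    return ["good" if passed else "erroneous" for passed in flags]


start = datetime(2021, 1, 1)
timestamps = [start + timedelta(seconds=i) for i in range(10)]
timestamps[5] = start + timedelta(seconds=3)  # one timestamp jumps backwards

flags = increasing_time_flags(timestamps)
validities = apply_invalid_frequency_rule(flags, rel_frequency=0.05)
print(flags.count(False), "of", len(flags), "points fail")
print(validities)
```

In this example one of ten points fails, which is more than 5%, so the rule marks the whole series as erroneous.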

Negative Test

The Negative Test checks whether values are lower than 0. A ratio cannot physically be lower than 0, but negative readings can still occur because our instrument is noisy and has a limit of detection. Therefore, values between -0.072 and 0 are not rejected outright; the test assigns them a probability between 0 and 1. We want to map these probabilities to the standard validities.

from autom8qc.mappers.validities import StandardValidityMapper
from autom8qc.qaqc.base import TestManager
from autom8qc.qaqc.limit import GlobalMinimumTest

test = GlobalMinimumTest(min_val=0, min_lim=-0.072)
mapper = StandardValidityMapper()
# nitrogen_oxides is the Series created in "Read Sources and Create Series"
negative_test = TestManager(nitrogen_oxides, test, mapper)
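
To show how a value in the transition zone could receive a probability, here is a small sketch in plain Python. It is an assumption, not the library's code: the function name is hypothetical, the probability is taken to mean "the value is plausible", and the ramp between min_lim and min_val is assumed to be linear.

```python
def minimum_probability(value, min_val=0.0, min_lim=-0.072):
    """Probability that a value is plausible with respect to the lower limit.

    Assumption: 1.0 at or above min_val, 0.0 at or below min_lim,
    and a linear ramp in between.
    """
    if value >= min_val:
        return 1.0
    if value <= min_lim:
        return 0.0
    return (value - min_lim) / (min_val - min_lim)


for value in (0.01, -0.018, -0.036, -0.1):
    print(value, round(minimum_probability(value), 3))
```

A mapper can then translate these probabilities into validities, which is the role the StandardValidityMapper plays in the manager above.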

Subgroup - Nighttime Test

The Nighttime Test combines two tests with a logical OR. It fails if measurements at night are too high. First, we create two TestManagers for the tests. Afterwards, we create a test group and combine the results with the measure ORMeasure.

Day test

For preprocessing and postprocessing, you can use the module autom8qc.functions. The solar zenith angle series has a 4-second resolution, while the NO series has a 1-second resolution. To combine the tests, we first have to resample and interpolate the solar angle values to the 1-second grid. Afterwards, we perform the GlobalMaximumTest and use the mapper LogicalThresholdMapper to map probabilities higher than 0.5 to True and probabilities lower than or equal to 0.5 to False.

from autom8qc.functions.interpolation import Interpolation
from autom8qc.mappers.logical import LogicalThresholdMapper
from autom8qc.qaqc.base import TestManager
from autom8qc.qaqc.limit import GlobalMaximumTest

test = GlobalMaximumTest(max_val=100)
mapper = LogicalThresholdMapper()
pre_function = Interpolation("1s", "linear")
# solar_angle is the Series created in "Read Sources and Create Series"
day_test = TestManager(solar_angle, test, mapper, pre_function)
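
The two processing steps around the test can be sketched in plain Python. This is only an illustration under stated assumptions: the function names are hypothetical, timestamps are given as integer seconds, and the code does not reproduce how Interpolation or LogicalThresholdMapper are implemented in autom8qc.

```python
def interpolate_linear(times, values, step=1):
    """Linearly interpolate (times, values) onto a grid with spacing step."""
    if not times:
        return []
    result = []
    points = list(zip(times, values))
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        t = t0
        while t < t1:
            result.append((t, v0 + (v1 - v0) * (t - t0) / (t1 - t0)))
            t += step
    result.append(points[-1])  # keep the last original sample
    return result


def threshold_to_bool(probabilities, threshold=0.5):
    """Map probabilities above the threshold to True, all others to False."""
    return [p > threshold for p in probabilities]


# Solar zenith angle sampled every 4 seconds, brought to a 1-second grid.
angles_1s = interpolate_linear([0, 4, 8], [80.0, 84.0, 92.0], step=1)
print(angles_1s)
print(threshold_to_bool([0.2, 0.5, 0.51]))
```

After this step the solar angle series has one value per second, so its results line up with those of the NO series.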

Limit of Detection Test

Next, we create a manager that detects values higher than the limit of detection. For this, we use the GlobalMaximumTest.

from autom8qc.mappers.logical import LogicalThresholdMapper
from autom8qc.qaqc.base import TestManager
from autom8qc.qaqc.limit import GlobalMaximumTest

test = GlobalMaximumTest(max_val=0.05)
mapper = LogicalThresholdMapper()
lod_test = TestManager(nitrogen_oxides, test, mapper)

Create Subgroup

Finally, we create the subgroup. We define the measure ORMeasure, which applies a logical OR to the results of the two tests. In addition, we use the mapper Logical2ValidityMapper to map the resulting booleans to validities.

from autom8qc.core.validities import StandardValidities
from autom8qc.mappers.logical import Logical2ValidityMapper
from autom8qc.measures.logical import ORMeasure
from autom8qc.qaqc.base import TestGroup

validities = StandardValidities()
l2v_mapper = Logical2ValidityMapper(validities.GOOD, validities.ERRONEOUS, validities.MISSING)
measure = ORMeasure()
sub_group = TestGroup(measure=measure, mapper=l2v_mapper)
sub_group.add(name="Day test", test=day_test)
sub_group.add(name="Limit Of Detection test", test=lod_test)
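
The effect of the OR combination can be sketched in plain Python. The sketch assumes that True means "test passed", that the three mapper arguments stand for the validities of True, False and missing results, and that a missing input makes the combined result missing. The function names and string labels are hypothetical and not part of autom8qc.

```python
def or_measure(*results):
    """Combine boolean results element-wise with a logical OR."""
    combined = []
    for values in zip(*results):
        if any(value is None for value in values):
            combined.append(None)  # assumption: missing input gives missing output
        else:
            combined.append(any(values))
    return combined


def logical_to_validity(flags, if_true="good", if_false="erroneous", if_missing="missing"):
    """Map True, False and None to validity labels."""
    mapping = {True: if_true, False: if_false, None: if_missing}
    return [mapping[flag] for flag in flags]


day_results = [True, False, False, None]  # True: it is daytime
lod_results = [False, True, False, True]  # True: value is not above the LoD

combined = or_measure(day_results, lod_results)
print(combined)
print(logical_to_validity(combined))
```

Under these assumptions, a point is marked as erroneous only if both tests fail, that is, if the measurement was taken at night and its value is above the limit of detection.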

Final Group

Note

The test group is traversed in post-order. That means that the children are performed before their parent. See also: https://en.wikipedia.org/wiki/Tree_traversal

To summarize, we have a subgroup that checks whether values at night are too high, a test that checks for negative values, and a test that checks whether the time is monotonically increasing. All three components return validities, which are combined with the measure WorstMeasure. After building the group, we run it with the method perform.

from autom8qc.measures.validities import WorstMeasure
from autom8qc.qaqc.base import TestGroup

group = TestGroup(measure=WorstMeasure())
group.add(name="Increasing Time Test", test=time_test)
group.add(name="Negative Test", test=negative_test)
group.add(name="Nighttime Test", test=sub_group)
results = group.perform()
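
To illustrate the post-order traversal and the combination by the worst validity, here is a self-contained sketch in plain Python. It is not the autom8qc implementation: the classes Leaf and Group, the validity labels and their severity ranking are hypothetical, and each leaf simply returns precomputed results.

```python
# Hypothetical severity ranking: a higher number means a worse validity.
SEVERITY = {"good": 0, "doubtful": 1, "erroneous": 2}


class Leaf:
    """Stands in for a manager and returns precomputed results."""

    def __init__(self, name, results):
        self.name = name
        self.results = results

    def perform(self, visited):
        visited.append(self.name)
        return self.results


class Group:
    """Performs all children first, then combines their results."""

    def __init__(self, name, measure):
        self.name = name
        self.measure = measure
        self.children = []

    def add(self, child):
        self.children.append(child)

    def perform(self, visited):
        child_results = [child.perform(visited) for child in self.children]
        visited.append(self.name)  # the parent is visited after its children
        return self.measure(child_results)


def worst_measure(child_results):
    """Pick the worst validity per data point."""
    return [max(values, key=SEVERITY.get) for values in zip(*child_results)]


def or_to_validity(child_results):
    """Logical OR per data point, mapped to a validity label."""
    return ["good" if any(values) else "erroneous" for values in zip(*child_results)]


night = Group("Nighttime Test", or_to_validity)
night.add(Leaf("Day Test", [True, False, False]))
night.add(Leaf("LOD Test", [False, True, False]))

root = Group("Final Group", worst_measure)
root.add(Leaf("Increasing Time Test", ["good", "good", "good"]))
root.add(Leaf("Negative Test", ["good", "doubtful", "good"]))
root.add(night)

visited = []
final = root.perform(visited)
print(visited)
print(final)
```

The visiting order shows that the Day Test and the LOD Test run before the Nighttime Test, and that the final group is combined last.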