*********************
Customize Test Groups
*********************

Purpose of this Chapter
=======================

This chapter demonstrates how to customize test groups. The steps are explained
using the example of a *Nitrogen Oxides (NO)* time series.

Basic Example
=============

Let's assume that we want to create a test group whose tests build on each
other. The test group contains a test that checks for negative values and a
test that checks whether the time is monotonically increasing. Moreover, it
includes a test that checks whether measurements at night are too high. For
this, we check whether the measurements were taken at night and whether the
values exceed the *Limit of Detection (LoD)*. In total, our example group has
the following structure:

.. figure:: ../../figures/examples/group_structure.svg
    :width: 100%

.. note::
    If you want to add a **QAQCTest** to a **TestGroup**, you have to create a
    **TestManager** for it. Moreover, you can also add **TestGroups** and
    **TestSequences** to a **TestGroup**.

In our example, we have to create two **TestManagers** for the
*Increasing Time Test* and the *Negative Test*. For the *Nighttime Test*, we
create an instance of the class **TestGroup** and two instances of the class
**TestManager** for the *LOD Test* and the *Day Test*. To combine the results
of the **TestGroups**, we use the measures *ORMeasure* and *WorstMeasure*.

Read Sources and Create Series
==============================

A **TestManager** expects an instance of the class
**autom8qc.core.structures.BaseStructure**, which defines the interface for
every data structure. At first glance, this looks like unnecessary overhead,
but it allows managers to handle data frames or other data structures as well.
For our example, we create instances of the class
**autom8qc.core.structures.Series** for the time series *Nitrogen oxides* and
*Solar zenith angle*.

.. code-block:: python

    import pandas as pd

    from autom8qc.core.structures import Series

    # squeeze("columns") turns the single-column frame into a pandas Series
    no_data = pd.read_csv("no_data.csv", index_col=0, parse_dates=True).squeeze("columns")
    sa_data = pd.read_csv("sa_data.csv", index_col=0, parse_dates=True).squeeze("columns")

    nitrogen_oxides = Series(name="Nitrogen oxides", data=no_data)
    solar_angle = Series(name="Solar zenith angle", data=sa_data)

Increasing Time Test
====================

The **Increasing Time Test** checks whether the time is monotonically
increasing. If more than *5%* of the data points fail the test, all data points
are marked as *Erroneous*.

.. code-block:: python

    from autom8qc.mappers.validities import StandardValidityMapper
    from autom8qc.qaqc.base import TestManager
    from autom8qc.qaqc.general import IncreasingTimeTest
    from autom8qc.rules.frequency import InvalidFrequencyRule

    test = IncreasingTimeTest()
    time_test = TestManager(
        data=nitrogen_oxides,
        test=test,
        mapper=StandardValidityMapper(),
        # Mark the whole series as erroneous if more than 5% of the points fail
        mapped_rule=InvalidFrequencyRule(rel_frequency=0.05),
    )

Negative Test
=============

The **Negative Test** checks whether values are lower than 0. A ratio cannot
actually be lower than 0, but negative readings can occur because our
instrument has a limit of detection and is noisy. Therefore, values between
-0.072 and 0 fall into the limit range and are assigned a probability between
0 and 1. We want to map the probabilities to the *standard validities*.

.. code-block:: python

    from autom8qc.mappers.validities import StandardValidityMapper
    from autom8qc.qaqc.base import TestManager
    from autom8qc.qaqc.limit import GlobalMinimumTest

    # Values between min_lim and min_val get a probability between 0 and 1
    test = GlobalMinimumTest(min_val=0, min_lim=-0.072)
    mapper = StandardValidityMapper()
    negative_test = TestManager(nitrogen_oxides, test, mapper)

Subgroup - Nighttime Test
=========================

The **Nighttime Test** combines two tests with a logical *OR*. The test fails
if measurements at night are too high. First, we create two **TestManagers**
for the tests.
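
The effect of the logical *OR* can be illustrated with plain pandas,
independent of autom8qc. This is only a sketch: it assumes that *True* means a
data point passed a test, and the series names and values are hypothetical.

.. code-block:: python

    import pandas as pd

    # Hypothetical per-point outcomes; True means the data point passed the test
    is_day = pd.Series([True, True, False, False])
    below_lod = pd.Series([True, False, True, False])

    # Logical OR: a point fails only if it fails both tests,
    # i.e. the measurement is above the limit of detection at night
    nighttime_ok = is_day | below_lod

Under this assumption, only the last data point fails, because it is the only
one that was measured at night and lies above the limit of detection.
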
Once both managers exist, we create a test group and combine their results
with the measure *ORMeasure*.

Day test
--------

For preprocessing and postprocessing, you can use the module
**autom8qc.functions**. The *Solar zenith angle* series has a resolution of
4 seconds, whereas the *NO* series has a resolution of 1 second. To combine the
tests, we first have to resample and interpolate the solar angle values to the
1-second resolution. Afterwards, we perform the **GlobalMaximumTest** and use
the mapper **LogicalThresholdMapper** to map probabilities higher than *0.5* to
*True* and probabilities lower than or equal to *0.5* to *False*.

.. code-block:: python

    from autom8qc.functions.interpolation import Interpolation
    from autom8qc.mappers.logical import LogicalThresholdMapper
    from autom8qc.qaqc.base import TestManager
    from autom8qc.qaqc.limit import GlobalMaximumTest

    test = GlobalMaximumTest(max_val=100)
    mapper = LogicalThresholdMapper()
    # Resample the 4-second series to 1 second with linear interpolation
    pre_function = Interpolation("1s", "linear")
    day_test = TestManager(solar_angle, test, mapper, pre_function)

Limit of Detection Test
-----------------------

Next, we create a manager that detects values above the limit of detection.
For this, we again perform the **GlobalMaximumTest**.

.. code-block:: python

    from autom8qc.mappers.logical import LogicalThresholdMapper
    from autom8qc.qaqc.base import TestManager
    from autom8qc.qaqc.limit import GlobalMaximumTest

    test = GlobalMaximumTest(max_val=0.05)
    mapper = LogicalThresholdMapper()
    lod_test = TestManager(nitrogen_oxides, test, mapper)

Create Subgroup
---------------

Finally, we create the subgroup. We define the measure **ORMeasure**, which
applies the logical *OR* to the results. Moreover, we use the mapper
**Logical2ValidityMapper** to map the boolean results to validities.

.. code-block:: python

    from autom8qc.core.validities import StandardValidities
    from autom8qc.mappers.logical import Logical2ValidityMapper
    from autom8qc.measures.logical import ORMeasure
    from autom8qc.qaqc.base import TestGroup

    validities = StandardValidities()
    l2v_mapper = Logical2ValidityMapper(
        validities.GOOD, validities.ERRONEOUS, validities.MISSING
    )
    measure = ORMeasure()

    sub_group = TestGroup(measure=measure, mapper=l2v_mapper)
    sub_group.add(name="Day test", test=day_test)
    sub_group.add(name="Limit Of Detection test", test=lod_test)

Final Group
===========

.. note::
    The test group is traversed in post-order. That means that the children
    are performed before their parent. See also:
    https://en.wikipedia.org/wiki/Tree_traversal

To summarize, we have a subgroup that checks whether measurements at night are
too high, a test that checks for negative values, and a test that checks
whether the time is monotonically increasing. All three components return
validities that are combined with the measure **WorstMeasure**. After building
the group, we run it with the method *perform*.

.. code-block:: python

    from autom8qc.measures.validities import WorstMeasure
    from autom8qc.qaqc.base import TestGroup

    group = TestGroup(measure=WorstMeasure())
    group.add(name="Increasing Time Test", test=time_test)
    group.add(name="Negative Test", test=negative_test)
    group.add(name="Nighttime Test", test=sub_group)

    results = group.perform()
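
The post-order evaluation of such a nested group can be sketched in plain
Python. This is not the autom8qc implementation: the classes ``Leaf`` and
``Group``, the functions ``worst``, ``best`` and ``perform``, and the integer
severities (0 is good, higher is worse) are hypothetical stand-ins chosen for
illustration. Here ``best`` plays the role of the logical *OR* in the subgroup
and ``worst`` the role of the **WorstMeasure**.

.. code-block:: python

    from dataclasses import dataclass, field
    from typing import Callable, List, Union


    @dataclass
    class Leaf:
        name: str
        result: List[int]  # hypothetical severity per data point


    @dataclass
    class Group:
        name: str
        measure: Callable[[List[List[int]]], List[int]]
        children: List[Union["Group", Leaf]] = field(default_factory=list)


    def worst(results):
        # Element-wise maximum: keep the worst severity for every data point
        return [max(values) for values in zip(*results)]


    def best(results):
        # Element-wise minimum: a point is bad only if all children flag it
        return [min(values) for values in zip(*results)]


    def perform(node, order):
        # Post-order: evaluate all children first, then the node itself
        if isinstance(node, Leaf):
            order.append(node.name)
            return node.result
        child_results = [perform(child, order) for child in node.children]
        order.append(node.name)
        return node.measure(child_results)


    nighttime = Group("Nighttime Test", best, [
        Leaf("Day Test", [0, 2, 2]),
        Leaf("LOD Test", [0, 0, 2]),
    ])
    final = Group("Final Group", worst, [
        Leaf("Increasing Time Test", [0, 0, 0]),
        Leaf("Negative Test", [0, 1, 0]),
        nighttime,
    ])

    order = []
    combined = perform(final, order)

In this sketch, ``order`` lists both leaves of the subgroup before
*Nighttime Test*, and *Final Group* comes last, which is the post-order
behaviour described in the note above.
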