autom8qc.rules
autom8qc.rules.base
This module provides the abstract base class BaseRule for implementing rules. Each rule must inherit from this class and implement the abstract method apply. A rule is applied to the results of a test. For example, a rule can set all data points to invalid if more than 20% of them are invalid.
See also
- class autom8qc.rules.base.BaseRule
Abstract base class that defines the interface for rules. Each rule must provide its information (NAME, DESCRIPTION) and implement the abstract method apply. A rule is applied to the results of a test. For example, a rule can set all data points to invalid if more than 20% of them are invalid.
See also
autom8qc.core.components.BaseComponent
autom8qc.core.parameters.ParameterList
Warning
If you inherit from this class, make sure that you call the super constructor and implement the abstract method apply.
- Parameters
NAME (string) – Name of the rule
DESCRIPTION (string) – Description of the rule
parameters (ParameterList) – Supported parameters (default: None)
- abstract apply(data)
Applies the rule to the given data.
Warning
Make sure that you don’t overwrite data points of the data. For efficiency reasons, the data is not copied.
- Raises
NotImplementedError – This is an abstract method
- Parameters
data (BaseStructure) – Series, DataFrame, or mapped values
- Returns
Results of the rule
- Return type
pd.Series
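The subclassing contract described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the autom8qc implementation: SketchBaseRule and SketchInvalidShareRule are hypothetical stand-ins, a dict replaces ParameterList, and a list of strings replaces the pandas structures.

```python
from abc import ABC, abstractmethod


class SketchBaseRule(ABC):
    """Hypothetical stand-in for BaseRule; not the autom8qc class."""

    NAME = "Base rule"
    DESCRIPTION = "Abstract rule"

    def __init__(self, parameters=None):
        # A plain dict stands in for ParameterList (assumption).
        self.parameters = parameters or {}

    @abstractmethod
    def apply(self, data):
        """Return the rule's result for the given data."""


class SketchInvalidShareRule(SketchBaseRule):
    """Marks every point invalid if more than 20% are already invalid."""

    NAME = "Invalid share rule"
    DESCRIPTION = "Invalidates all points above a 20% invalid share"

    def __init__(self, max_share=0.2):
        # Call the super constructor, as the warning requires.
        super().__init__(parameters={"max_share": max_share})

    def apply(self, data):
        if not data:
            return []
        share = sum(1 for v in data if v == "invalid") / len(data)
        if share > self.parameters["max_share"]:
            # Build a new list instead of modifying the input in place.
            return ["invalid"] * len(data)
        return list(data)


flags = ["valid", "invalid", "valid", "invalid", "valid"]
result = SketchInvalidShareRule().apply(flags)
```

Here two of five points (40%) are invalid, so the sketch returns a list in which every point is invalid, while the input list stays untouched.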
- plot(data, mapped_values=None, style='.-', title=None)
Plots the mapped values. If mapped_values is not passed, the given values are mapped first.
- Parameters
data (object) – Values
mapped_values (pd.Series) – Mapped values (optional)
style (string) – Style of the line (default: line with points)
title (string) – Title of the plot (default: NAME)
- Returns
None
- Return type
None
- savefig(filename, data, mapped_values=None, style='.-', title=None)
Saves the plot.
Important
The filename extension determines the output format. For example, to save an SVG figure, the filename has to be ./example.svg
- Parameters
filename (string) – Name of the file
data (object) – Values
mapped_values (pd.Series) – Mapped values (optional)
style (string) – Style of the line (default: line with points)
title (string) – Title of the plot (default: NAME)
- Returns
None
- Return type
None
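Because the format is taken from the filename, it can help to check the extension before calling savefig. The helper below is a hypothetical pre-check written with the standard library only; format_from_filename is not part of autom8qc.

```python
from pathlib import Path


def format_from_filename(filename):
    """Hypothetical helper: return the lowercase extension without the dot."""
    suffix = Path(filename).suffix
    if not suffix:
        raise ValueError(f"{filename!r} has no extension to infer a format from")
    return suffix[1:].lower()


fmt = format_from_filename("./example.svg")  # "svg"
```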
autom8qc.rules.frequency
autom8qc.rules.frequency.InvalidFrequencyRule
Class
- class autom8qc.rules.frequency.InvalidFrequencyRule(validity=None, missing_validity=None, rel_frequency=0.1)
Bases:
autom8qc.rules.base.BaseRule
This class implements a rule that checks the frequency of invalid data points. If the relative frequency of the invalid data points is greater than rel_frequency, the rule sets all validities to Invalid.
- Parameters
NAME (string) – Name of the rule
DESCRIPTION (string) – Description of the rule
parameters (ParameterList) – Supported parameters
- Supported parameters:
validity (Validity): Invalid validity (optional)
rel_frequency (float): Relative frequency (default: 0.1)
- apply(data)
Applies the rule to the given data.
- Parameters
data – Validities
- Returns
data
- Return type
pd.Series
- static supported_parameters()
Returns the supported parameters.
- Returns
Supported parameters
- Return type
ParameterList
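The comparison described for this rule is strict ("greater than"), which matters at the boundary. The following plain-Python sketch mirrors that logic under assumptions: invalidate_if_frequent is a hypothetical function, and strings stand in for Validity objects. It does not call autom8qc.

```python
def invalidate_if_frequent(validities, invalid="invalid", rel_frequency=0.1):
    """Return all-invalid flags if the invalid share exceeds rel_frequency."""
    if not validities:
        return []
    share = sum(1 for v in validities if v == invalid) / len(validities)
    if share > rel_frequency:  # strict comparison, as described above
        return [invalid] * len(validities)
    return list(validities)


# 1 of 10 invalid: the share equals the limit, so nothing changes.
at_limit = ["valid"] * 9 + ["invalid"]
kept = invalidate_if_frequent(at_limit, rel_frequency=0.1)

# 2 of 10 invalid: the share exceeds the limit, so every flag becomes invalid.
above_limit = ["valid"] * 8 + ["invalid"] * 2
flipped = invalidate_if_frequent(above_limit, rel_frequency=0.1)
```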
Example
# Generate sample data
import numpy as np
import pandas as pd
from autom8qc.core.validities import StandardValidities
np.random.seed(42)
validities = StandardValidities()
# Draw 50 random indices and map them to validities
ids = np.random.randint(4, size=50)
values = np.array([validities.ALL_VALIDITIES[i] for i in ids])
# Mark a block of 10 points (20% of the data) as erroneous
values[10:20] = validities.ERRONEOUS
index = pd.date_range(start="1/1/2021", periods=50, freq="1min")
series = pd.Series(values, index=index)
# Create the rule and plot its result
from autom8qc.rules.frequency import InvalidFrequencyRule
rule = InvalidFrequencyRule(rel_frequency=0.1)
rule.plot(series)
Visualization
autom8qc.rules.frequency.LowerFrequencyRule
Class
- class autom8qc.rules.frequency.LowerFrequencyRule(threshold, rel_frequency=0.1)
Bases:
autom8qc.rules.base.BaseRule
This class implements a rule that checks the frequency of invalid data points. A data point is invalid if its probability is less than or equal to the defined threshold. If the relative frequency of the invalid data points is greater than rel_frequency, the rule sets all probabilities to 0.
Warning
NaN values are ignored by the rule. If you have the probabilities [0.6, 0.55, 0.42, np.nan, 1] and your threshold is 0.5, the relative frequency of failed points is 0.25, because one of the four non-NaN values fails.
- Parameters
NAME (string) – Name of the rule
DESCRIPTION (string) – Description of the rule
parameters (ParameterList) – Supported parameters
- Supported parameters:
threshold (float): Threshold value
rel_frequency (float): Relative frequency (default: 0.1)
- apply(data)
Applies the rule to the given data.
- Parameters
data (pd.Series) – Probabilities
- Returns
Probabilities
- Return type
pd.Series
- static supported_parameters()
Returns the supported parameters.
- Returns
Supported parameters
- Return type
ParameterList
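The NaN handling from the warning can be checked with a small plain-Python sketch. The functions failed_share and zero_if_low_frequent are hypothetical and do not call autom8qc. Leaving NaN entries as NaN in the result is an assumption, since the description does not say how they are treated.

```python
import math


def failed_share(probabilities, threshold):
    """Share of non-NaN probabilities that are <= threshold."""
    observed = [p for p in probabilities if not math.isnan(p)]
    if not observed:
        return 0.0
    return sum(1 for p in observed if p <= threshold) / len(observed)


def zero_if_low_frequent(probabilities, threshold, rel_frequency=0.1):
    """Set probabilities to 0 if the failed share exceeds rel_frequency."""
    if failed_share(probabilities, threshold) > rel_frequency:
        # Assumption: NaN entries stay NaN; only observed values become 0.
        return [p if math.isnan(p) else 0.0 for p in probabilities]
    return list(probabilities)


probs = [0.6, 0.55, 0.42, math.nan, 1]
share = failed_share(probs, threshold=0.5)  # 1 of 4 non-NaN values fails
zeroed = zero_if_low_frequent(probs, threshold=0.5, rel_frequency=0.1)
```

With the values from the warning, the share is 0.25, which exceeds the default rel_frequency of 0.1, so the sketch zeroes every observed probability.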
Example
# Generate sample data
import numpy as np
import pandas as pd
np.random.seed(42)
values = np.random.random(50)
# Set a block of 10 probabilities (20% of the data) to 0
values[10:20] = 0
index = pd.date_range(start="1/1/2021", periods=50, freq="1min")
series = pd.Series(values, index=index)
# Create the rule and plot its result
from autom8qc.rules.frequency import LowerFrequencyRule
rule = LowerFrequencyRule(threshold=0.1, rel_frequency=0.1)
rule.plot(series)