autom8qc.rules

autom8qc.rules.base

This module provides the abstract base class BaseRule for implementing rules. Every rule must inherit from this class and implement the abstract method apply. Rules are applied to the results of a test. For example, a rule might set all data points to invalid if more than 20% of them are invalid.

class autom8qc.rules.base.BaseRule

Abstract base class that defines the interface for rules. Each rule must provide its information (NAME, DESCRIPTION) and implement the abstract method apply. Rules are applied to the results of a test. For example, a rule might set all data points to invalid if more than 20% of them are invalid.

See also

  • autom8qc.core.components.BaseComponent

  • autom8qc.core.parameters.ParameterList

Warning

If you inherit from this class, make sure that you call the super constructor and implement the abstract method apply.
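The inheritance contract described above can be sketched without the library installed. In the block below, BaseRule is a simplified stand-in written for illustration (it is not the autom8qc class), a plain dict replaces ParameterList, and AllInvalidRule with its string labels "valid" and "invalid" is a hypothetical rule based on the 20% example.

```python
from abc import ABC, abstractmethod

import pandas as pd


class BaseRule(ABC):
    """Simplified stand-in for autom8qc.rules.base.BaseRule (assumption)."""

    NAME = "Base rule"
    DESCRIPTION = "Stand-in base class"

    def __init__(self, parameters=None):
        # The real class takes a ParameterList; a plain dict is used here.
        self.parameters = parameters or {}

    @abstractmethod
    def apply(self, data):
        raise NotImplementedError("This is an abstract method")


class AllInvalidRule(BaseRule):
    """Hypothetical rule: flag everything if over 20% of points are invalid."""

    NAME = "All invalid"
    DESCRIPTION = "Marks all points invalid if too many are invalid"

    def __init__(self, rel_frequency=0.2):
        # Call the super constructor, as the warning above requires.
        super().__init__(parameters={"rel_frequency": rel_frequency})

    def apply(self, data):
        invalid = data == "invalid"
        if invalid.mean() > self.parameters["rel_frequency"]:
            # Build a new Series instead of writing into the input.
            return pd.Series("invalid", index=data.index)
        return data


flags = pd.Series(["valid", "invalid", "valid", "invalid", "valid"])
result = AllInvalidRule().apply(flags)  # 2 of 5 invalid, so all are flagged
```

Because apply is abstract, instantiating the stand-in BaseRule directly raises a TypeError; only subclasses that implement apply can be created.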

Parameters
  • NAME (string) – Name of the rule

  • DESCRIPTION (string) – Description of the rule

  • parameters (ParameterList) – Supported parameters (default: None)

abstract apply(data)

Applies the rule to the given data.

Warning

Make sure that you don’t overwrite data points of the data. For efficiency reasons, the data won’t be copied.
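Why this matters can be shown with plain pandas. The two functions below are hypothetical and not part of autom8qc; they only contrast writing into the passed Series with working on an explicit copy.

```python
import pandas as pd


def careless_rule(data):
    # Writes into the caller's Series: the input is not a copy.
    data.iloc[0] = 0.0
    return data


def careful_rule(data):
    # Works on an explicit copy, so the caller's Series stays intact.
    result = data.copy()
    result.iloc[0] = 0.0
    return result


original = pd.Series([0.9, 0.8, 0.7])

careful_rule(original)
untouched = original.tolist()  # the caller's data is unchanged

careless_rule(original)
modified = original.tolist()   # the first value has been overwritten
```

If a rule needs to change values, building a new Series (or copying first) keeps the input usable for later tests and plots.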

Raises

NotImplementedError – This is an abstract method

Parameters

data (BaseStructure) – Series, DataFrame, or mapped values

Returns

Results of the rule

Return type

pd.Series

plot(data, mapped_values=None, style='.-', title=None)

Plots the mapped values. If mapped_values is not passed, the given data is mapped first.

Parameters
  • data (object) – Values

  • mapped_values (pd.Series) – Mapped values (optional)

  • style (string) – Style of the line (default: line with points)

  • title (string) – Title of the plot (optional)

Returns

None

Return type

None

savefig(filename, data, mapped_values=None, style='.-', title=None)

Saves the plot.

Important

The file extension determines the output format. For example, to store an SVG figure, the filename has to be ./example.svg.

Parameters
  • filename (string) – Name of the file

  • data (object) – Values

  • mapped_values (pd.Series) – Mapped values (optional)

  • style (string) – Style of the line (default: line with points)

  • title (string) – Title of the plot (default: NAME)

Returns

None

Return type

None

autom8qc.rules.frequency

autom8qc.rules.frequency.InvalidFrequencyRule

Class

class autom8qc.rules.frequency.InvalidFrequencyRule(validity=None, missing_validity=None, rel_frequency=0.1)

Bases: autom8qc.rules.base.BaseRule

This class implements a rule that checks the frequency of invalid data points. If the relative frequency of invalid data points is greater than rel_frequency, the rule sets all validities to Invalid.
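The comparison is described as "greater than", so a share of invalid points exactly equal to rel_frequency should leave the data unchanged. The following is a minimal sketch of that logic in plain pandas, assuming string labels in place of Validity objects; flag_all_if_frequent is a hypothetical helper, not the library's implementation.

```python
import pandas as pd


def flag_all_if_frequent(validities, invalid="invalid", rel_frequency=0.1):
    """Hypothetical helper mirroring the behaviour described above."""
    share = int((validities == invalid).sum()) / len(validities)
    if share > rel_frequency:  # strictly greater, per the description
        return pd.Series(invalid, index=validities.index)
    return validities.copy()


at_limit = pd.Series(["invalid"] * 1 + ["valid"] * 9)     # share is 0.1
above_limit = pd.Series(["invalid"] * 2 + ["valid"] * 8)  # share is 0.2

kept = flag_all_if_frequent(at_limit)        # unchanged: 0.1 is not > 0.1
flagged = flag_all_if_frequent(above_limit)  # every entry becomes "invalid"
```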

Parameters
  • NAME (string) – Name of the rule

  • DESCRIPTION (string) – Description of the rule

  • parameters (ParameterList) – Supported parameters

Supported parameters:
  • validity (Validity): Invalid validity (optional)

  • rel_frequency (float): Relative frequency (default: 0.1)

apply(data)

Applies the rule to the given data.

Parameters

data (pd.Series) – Validities

Returns

Validities

Return type

pd.Series

static supported_parameters()

Returns the supported parameters.

Returns

Supported parameters

Return type

ParameterList

Example

# Generate sample data
import numpy as np
import pandas as pd
from autom8qc.core.validities import StandardValidities

np.random.seed(42)  # Fixed seed for reproducible sample data
validities = StandardValidities()
# Draw 50 random validities from the first four entries of ALL_VALIDITIES
ids = np.random.randint(4, size=50)
values = np.array([validities.ALL_VALIDITIES[i] for i in ids])
# Force a run of ten erroneous points
values[10:20] = validities.ERRONEOUS
index = pd.date_range(start="1/1/2021", periods=50, freq="1min")
series = pd.Series(values, index=index)

# Plot the series with the rule applied
from autom8qc.rules.frequency import InvalidFrequencyRule
rule = InvalidFrequencyRule(rel_frequency=0.1)
rule.plot(series)

Visualization

../_images/InvalidFrequencyRule.svg

autom8qc.rules.frequency.LowerFrequencyRule

Class

class autom8qc.rules.frequency.LowerFrequencyRule(threshold, rel_frequency=0.1)

Bases: autom8qc.rules.base.BaseRule

This class implements a rule that checks the frequency of invalid data points. A data point is invalid if its probability is less than or equal to the defined threshold. If the relative frequency of invalid data points is greater than rel_frequency, the rule sets all probabilities to 0.

Warning

NaN values are ignored by the rule. If you have the probabilities [0.6, 0.55, 0.42, np.nan, 1] and your threshold is 0.5, the relative frequency of failed points is 0.25 (one of the four non-NaN values), not 0.2.
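The arithmetic in this warning can be reproduced with plain pandas. This is a sketch of the described behaviour rather than the library's code; in particular, leaving NaN entries as NaN in the result is an assumption, since the documentation does not say what happens to them.

```python
import numpy as np
import pandas as pd

probabilities = pd.Series([0.6, 0.55, 0.42, np.nan, 1.0])
threshold = 0.5
rel_frequency = 0.1

valid = probabilities.dropna()          # NaN is ignored: 4 points remain
failed = valid <= threshold             # only 0.42 fails
share = int(failed.sum()) / len(valid)  # 1 / 4 = 0.25

if share > rel_frequency:
    # Assumption: NaN entries stay NaN; the source does not say.
    result = probabilities.where(probabilities.isna(), 0.0)
else:
    result = probabilities.copy()
```

With rel_frequency set to 0.1, the share of 0.25 exceeds the limit, so every non-NaN probability in result is 0.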

Parameters
  • NAME (string) – Name of the rule

  • DESCRIPTION (string) – Description of the rule

  • parameters (ParameterList) – Supported parameters

Supported parameters:
  • threshold (float): Threshold value

  • rel_frequency (float): Relative frequency (default: 0.1)

apply(data)

Applies the rule to the given data.

Parameters

data (pd.Series) – Probabilities

Returns

Probabilities

Return type

pd.Series

static supported_parameters()

Returns the supported parameters.

Returns

Supported parameters

Return type

ParameterList

Example

# Generate sample data
import numpy as np
import pandas as pd

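np.random.seed(42)  # Fixed seed for reproducible sample data
values = np.random.random(50)  # 50 random probabilities in [0, 1)
# Force a run of ten points to 0 so that they fail the threshold
values[10:20] = 0
index = pd.date_range(start="1/1/2021", periods=50, freq="1min")
series = pd.Series(values, index=index)

# Plot the series with the rule applied
from autom8qc.rules.frequency import LowerFrequencyRule
rule = LowerFrequencyRule(threshold=0.1, rel_frequency=0.1)
rule.plot(series)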

Visualization

../_images/LowerFrequencyRule.svg