pm4py.algo.filtering.log.attributes package#

PM4Py – A Process Mining Library for Python

Copyright (C) 2024 Process Intelligence Solutions UG (haftungsbeschränkt)

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License along with this program. If not, see this software project’s root or visit <https://www.gnu.org/licenses/>.

Website: https://processintelligence.solutions

Contact: info@processintelligence.solutions

Submodules#

pm4py.algo.filtering.log.attributes.attributes_filter module#

class pm4py.algo.filtering.log.attributes.attributes_filter.Parameters(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]#

Bases: Enum

ATTRIBUTE_KEY = 'pm4py:param:attribute_key'#
ACTIVITY_KEY = 'pm4py:param:activity_key'#
CASE_ID_KEY = 'pm4py:param:case_id_key'#
PARAMETER_KEY_CASE_GLUE = 'case_id_glue'#
DECREASING_FACTOR = 'decreasingFactor'#
POSITIVE = 'positive'#
STREAM_FILTER_KEY1 = 'stream_filter_key1'#
STREAM_FILTER_VALUE1 = 'stream_filter_value1'#
STREAM_FILTER_KEY2 = 'stream_filter_key2'#
STREAM_FILTER_VALUE2 = 'stream_filter_value2'#
KEEP_ONCE_PER_CASE = 'keep_once_per_case'#
pm4py.algo.filtering.log.attributes.attributes_filter.apply_numeric(log: EventLog, int1: float, int2: float, parameters: Dict[str | Parameters, Any] | None = None) EventLog[source]#

Apply a filter on cases (numerical filter)

Parameters#

log

Log

int1

Lower bound of the interval

int2

Upper bound of the interval

parameters

Possible parameters of the algorithm

Returns#

filtered_log

Filtered log
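
To make the case-level semantics concrete, the following is a minimal sketch in plain Python, not the library's implementation. It assumes a log can be modelled as a list of traces, each trace being a list of event dictionaries, and that the interval is inclusive on both ends. The helper name `keep_cases_in_interval` is hypothetical, and only the keeping (positive) mode is modelled.

```python
from typing import Any, Dict, List

Event = Dict[str, Any]
Trace = List[Event]  # hypothetical stand-in for a PM4Py trace


def keep_cases_in_interval(log: List[Trace], key: str,
                           low: float, high: float) -> List[Trace]:
    """Keep whole cases that contain at least one event with low <= event[key] <= high."""
    return [
        trace for trace in log
        if any(key in e and low <= e[key] <= high for e in trace)
    ]


log = [
    [{"concept:name": "A", "amount": 50}, {"concept:name": "B", "amount": 500}],
    [{"concept:name": "A", "amount": 10}],
]
kept = keep_cases_in_interval(log, "amount", 100, 1000)
print(len(kept), "case(s) kept")
```

Note that a matching case is kept in full, including its events that fall outside the interval.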

pm4py.algo.filtering.log.attributes.attributes_filter.apply_numeric_events(log: EventLog, int1: float, int2: float, parameters: Dict[str | Parameters, Any] | None = None) EventLog[source]#

Apply a filter on events (numerical filter)

Parameters#

log

Log

int1

Lower bound of the interval

int2

Upper bound of the interval

parameters
Possible parameters of the algorithm:
  • Parameters.ATTRIBUTE_KEY: the attribute to filter on

  • Parameters.POSITIVE: boolean (keep or remove the events whose value lies in the interval)

Returns#

filtered_log

Filtered log
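
The event-level variant can be illustrated with a similar plain-Python sketch. It is not the library's code: it assumes a log modelled as a list of lists of event dictionaries, an inclusive interval, and that traces left without events are dropped. The helper name `filter_events_numeric` is hypothetical.

```python
from typing import Any, Dict, List

Event = Dict[str, Any]
Trace = List[Event]  # hypothetical stand-in for a PM4Py trace


def filter_events_numeric(log: List[Trace], key: str, low: float, high: float,
                          positive: bool = True) -> List[Trace]:
    """Keep events inside [low, high] (positive) or outside it (not positive)."""
    filtered = []
    for trace in log:
        events = [
            e for e in trace
            if key in e and (low <= e[key] <= high) == positive
        ]
        if events:  # assumption: traces with no remaining events are dropped
            filtered.append(events)
    return filtered


log = [
    [{"concept:name": "A", "amount": 50}, {"concept:name": "B", "amount": 500}],
    [{"concept:name": "A", "amount": 10}],
]
inside = filter_events_numeric(log, "amount", 100, 1000)
outside = filter_events_numeric(log, "amount", 100, 1000, positive=False)
print(inside)
print(outside)
```

Unlike the case-level filter, individual events are removed here, so a surviving trace may be shorter than the original.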

pm4py.algo.filtering.log.attributes.attributes_filter.apply_events(log: EventLog, values: List[str], parameters: Dict[str | Parameters, Any] | None = None) EventLog[source]#

Filter log by keeping only events with an attribute value that belongs to the provided values list

Parameters#

log

Log

values

Allowed attribute values

parameters
Parameters of the algorithm, including:
  • Parameters.ACTIVITY_KEY: attribute identifying the activity in the log

  • Parameters.POSITIVE: boolean (keep/remove the events with these values)

Returns#

filtered_log

Filtered log
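
As an illustration of filtering events by attribute value, here is a minimal plain-Python sketch under stated assumptions: the log is a list of lists of event dictionaries, events lacking the attribute are dropped, and empty traces are removed. The helper name `filter_events_by_value` is hypothetical and this is not the library's implementation.

```python
from typing import Any, Dict, Iterable, List

Event = Dict[str, Any]
Trace = List[Event]  # hypothetical stand-in for a PM4Py trace


def filter_events_by_value(log: List[Trace], values: Iterable[str],
                           key: str = "concept:name",
                           positive: bool = True) -> List[Trace]:
    """Keep events whose value is in `values` (positive) or not in it (not positive)."""
    allowed = set(values)
    filtered = []
    for trace in log:
        events = [e for e in trace if key in e and (e[key] in allowed) == positive]
        if events:  # assumption: traces with no remaining events are dropped
            filtered.append(events)
    return filtered


log = [
    [{"concept:name": "A"}, {"concept:name": "B"}],
    [{"concept:name": "C"}],
]
kept = filter_events_by_value(log, ["A", "C"])
removed = filter_events_by_value(log, ["A", "C"], positive=False)
print(kept)
print(removed)
```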

pm4py.algo.filtering.log.attributes.attributes_filter.apply(log: EventLog, values: List[str], parameters: Dict[str | Parameters, Any] | None = None) EventLog[source]#

Filter the log, keeping only the traces that have (or do not have) events with an attribute value belonging to the provided values list

Parameters#

log

Trace log

values

Allowed attribute values

parameters
Parameters of the algorithm, including:
  • Parameters.ACTIVITY_KEY: attribute identifying the activity in the log

  • Parameters.POSITIVE: boolean (keep/remove the traces containing such events)

Returns#

filtered_log

Filtered log
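
The difference from the event-level filter is that whole traces are kept or discarded. A minimal plain-Python sketch of that idea follows; the list-of-dictionaries log model and the helper name `filter_traces_by_value` are assumptions made for illustration, not the library's implementation.

```python
from typing import Any, Dict, Iterable, List

Event = Dict[str, Any]
Trace = List[Event]  # hypothetical stand-in for a PM4Py trace


def filter_traces_by_value(log: List[Trace], values: Iterable[str],
                           key: str = "concept:name",
                           positive: bool = True) -> List[Trace]:
    """Keep traces that contain (positive) or do not contain (not positive) a listed value."""
    allowed = set(values)

    def has_match(trace: Trace) -> bool:
        return any(key in e and e[key] in allowed for e in trace)

    return [trace for trace in log if has_match(trace) == positive]


log = [
    [{"concept:name": "A"}, {"concept:name": "B"}],
    [{"concept:name": "C"}],
]
with_b = filter_traces_by_value(log, ["B"])
without_b = filter_traces_by_value(log, ["B"], positive=False)
print(with_b)
print(without_b)
```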

pm4py.algo.filtering.log.attributes.attributes_filter.apply_trace_attribute(log: EventLog, values: List[str], parameters: Dict[str | Parameters, Any] | None = None) EventLog[source]#

Filter a log on the trace attribute values

Parameters#

log

Event log

values

Allowed/forbidden values

parameters
Parameters of the algorithm, including:
  • Parameters.ATTRIBUTE_KEY: the attribute at the trace level to filter

  • Parameters.POSITIVE: boolean (keep/discard values)

Returns#

filtered_log

Filtered log
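
Trace-level attributes live on the case rather than on its events. The sketch below models that with a small hypothetical `SimpleTrace` dataclass; both the class and the helper `filter_on_trace_attribute` are illustrative assumptions, and traces lacking the attribute are assumed to be dropped.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List


@dataclass
class SimpleTrace:
    """Hypothetical stand-in for a trace: case-level attributes plus events."""
    attributes: Dict[str, Any]
    events: List[Dict[str, Any]] = field(default_factory=list)


def filter_on_trace_attribute(log: List[SimpleTrace], values: Iterable[str],
                              key: str, positive: bool = True) -> List[SimpleTrace]:
    """Keep traces whose attribute value is in `values` (positive) or not (not positive)."""
    allowed = set(values)
    return [
        t for t in log
        if key in t.attributes and (t.attributes[key] in allowed) == positive
    ]


log = [
    SimpleTrace({"concept:name": "case-1", "channel": "web"}),
    SimpleTrace({"concept:name": "case-2", "channel": "phone"}),
]
web_only = filter_on_trace_attribute(log, ["web"], key="channel")
not_web = filter_on_trace_attribute(log, ["web"], key="channel", positive=False)
print([t.attributes["concept:name"] for t in web_only])
```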

pm4py.algo.filtering.log.attributes.attributes_filter.filter_log_on_max_no_activities(log: EventLog, max_no_activities: int = 25, parameters: Dict[str | Parameters, Any] | None = None) EventLog[source]#

Filter a log on a maximum number of activities

Parameters#

log

Log

max_no_activities

Maximum number of activities

parameters

Parameters of the algorithm

Returns#

filtered_log

Filtered version of the event log
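
One plausible reading of this filter is "keep only the events of the N most frequent activities". The sketch below implements that reading in plain Python; the interpretation, the list-of-dictionaries log model, and the helper name `keep_top_activities` are all assumptions rather than the library's confirmed behaviour.

```python
from collections import Counter
from typing import Any, Dict, List

Event = Dict[str, Any]
Trace = List[Event]  # hypothetical stand-in for a PM4Py trace


def keep_top_activities(log: List[Trace], max_no_activities: int = 25,
                        key: str = "concept:name") -> List[Trace]:
    """Keep only events belonging to the `max_no_activities` most frequent activities."""
    counts = Counter(e[key] for trace in log for e in trace if key in e)
    top = {name for name, _ in counts.most_common(max_no_activities)}
    filtered = []
    for trace in log:
        events = [e for e in trace if key in e and e[key] in top]
        if events:  # assumption: traces with no remaining events are dropped
            filtered.append(events)
    return filtered


log = [
    [{"concept:name": "A"}, {"concept:name": "B"}, {"concept:name": "C"}],
    [{"concept:name": "A"}, {"concept:name": "B"}],
    [{"concept:name": "A"}],
]
top_two = keep_top_activities(log, max_no_activities=2)
print([[e["concept:name"] for e in t] for t in top_two])
```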

pm4py.algo.filtering.log.attributes.attributes_filter.filter_log_by_attributes_threshold(log, attributes, variants, vc, threshold, attribute_key='concept:name')[source]#

Keep only the attributes whose number of occurrences is above the threshold (or that belong to the first variant)

Parameters#

log

Log

attributes

Dictionary of attributes associated with their count

variants

(If specified) Dictionary with variant as the key and the list of traces as the value

vc

List of variant names along with their count

threshold

Cutting threshold (remove the attributes whose number of occurrences is below the threshold)

attribute_key

(If specified) Attribute key to use as the activity key in the log (default: concept:name)

Returns#

filtered_log

Filtered log
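
The thresholding idea can be sketched in plain Python as follows. This is a simplification, not the library's code: the `variants` and `vc` arguments are replaced by a hypothetical `protected` set standing for the values of the first variant, the comparison is assumed to be `>=`, and the helper name `filter_by_threshold` is made up for illustration.

```python
from typing import AbstractSet, Any, Dict, List

Event = Dict[str, Any]
Trace = List[Event]  # hypothetical stand-in for a PM4Py trace


def filter_by_threshold(log: List[Trace], counts: Dict[str, int], threshold: int,
                        protected: AbstractSet[str] = frozenset(),
                        key: str = "concept:name") -> List[Trace]:
    """Keep events whose value count reaches `threshold` or whose value is protected."""
    filtered = []
    for trace in log:
        events = [
            e for e in trace
            if key in e and e[key] in counts
            and (e[key] in protected or counts[e[key]] >= threshold)
        ]
        if events:  # assumption: traces with no remaining events are dropped
            filtered.append(events)
    return filtered


log = [
    [{"concept:name": "A"}, {"concept:name": "B"}, {"concept:name": "C"}],
    [{"concept:name": "A"}],
    [{"concept:name": "A"}, {"concept:name": "C"}],
]
counts = {"A": 3, "B": 1, "C": 2}
result = filter_by_threshold(log, counts, threshold=3, protected={"B"})
print([[e["concept:name"] for e in t] for t in result])
```

Here "B" survives despite its low count because it is in the protected set, while "C" is removed.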

pm4py.algo.filtering.log.attributes.attributes_filter.filter_log_relative_occurrence_event_attribute(log: EventLog, min_relative_stake: float, parameters: Dict[Any, Any] | None = None) EventLog[source]#

Filters the event log keeping only the events having an attribute value which occurs:
  • in at least the specified (min_relative_stake) percentage of events, when Parameters.KEEP_ONCE_PER_CASE = False

  • in at least the specified (min_relative_stake) percentage of cases, when Parameters.KEEP_ONCE_PER_CASE = True

Parameters#

log

Event log

min_relative_stake

Minimum share of events or cases, depending on Parameters.KEEP_ONCE_PER_CASE (expressed as a number between 0 and 1), in which the attribute value should occur.

parameters

Parameters of the algorithm, including:
  • Parameters.ATTRIBUTE_KEY: the attribute to use (default: concept:name)

  • Parameters.KEEP_ONCE_PER_CASE: decides the level of the filter to apply (if the filter should be applied on the cases, set it to True)

Returns#

filtered_log

Filtered event log
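
The two counting modes can be contrasted with a small plain-Python sketch. It is written under assumptions and is not the library's implementation: the log is a list of lists of event dictionaries, the comparison is `count >= min_relative_stake * total`, and the helper name `filter_relative_occurrence` is hypothetical.

```python
from collections import Counter
from typing import Any, Dict, List

Event = Dict[str, Any]
Trace = List[Event]  # hypothetical stand-in for a PM4Py trace


def filter_relative_occurrence(log: List[Trace], min_relative_stake: float, *,
                               keep_once_per_case: bool,
                               key: str = "concept:name") -> List[Trace]:
    """Keep events whose value is frequent enough, counted per case or per event."""
    counts: Counter = Counter()
    for trace in log:
        values = [e[key] for e in trace if key in e]
        # Per-case counting: each value counts at most once per trace.
        counts.update(set(values) if keep_once_per_case else values)
    total = len(log) if keep_once_per_case else sum(len(t) for t in log)
    frequent = {v for v, c in counts.items() if c >= min_relative_stake * total}
    filtered = []
    for trace in log:
        events = [e for e in trace if key in e and e[key] in frequent]
        if events:  # assumption: traces with no remaining events are dropped
            filtered.append(events)
    return filtered


log = [
    [{"concept:name": "A"}, {"concept:name": "B"}, {"concept:name": "A"}],
    [{"concept:name": "A"}, {"concept:name": "C"}],
    [{"concept:name": "A"}],
]
per_case = filter_relative_occurrence(log, 0.3, keep_once_per_case=True)
per_event = filter_relative_occurrence(log, 0.3, keep_once_per_case=False)
print([[e["concept:name"] for e in t] for t in per_case])
print([[e["concept:name"] for e in t] for t in per_event])
```

With a stake of 0.3, "B" and "C" each appear in one of three cases and pass the per-case check, but each accounts for only one of six events and fails the per-event check.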