pm4py.algo.filtering.pandas.paths package#

PM4Py – A Process Mining Library for Python

Copyright (C) 2024 Process Intelligence Solutions UG (haftungsbeschränkt)

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License along with this program. If not, see this software project’s root or visit <https://www.gnu.org/licenses/>.

Website: https://processintelligence.solutions Contact: info@processintelligence.solutions

Submodules#

pm4py.algo.filtering.pandas.paths.paths_filter module#

class pm4py.algo.filtering.pandas.paths.paths_filter.Parameters(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]#

Bases: Enum

CASE_ID_KEY = 'pm4py:param:case_id_key'#
ATTRIBUTE_KEY = 'pm4py:param:attribute_key'#
TIMESTAMP_KEY = 'pm4py:param:timestamp_key'#
TARGET_ATTRIBUTE_KEY = 'target_attribute_key'#
DECREASING_FACTOR = 'decreasingFactor'#
POSITIVE = 'positive'#
MIN_PERFORMANCE = 'min_performance'#
MAX_PERFORMANCE = 'max_performance'#
pm4py.algo.filtering.pandas.paths.paths_filter.apply(df: DataFrame, paths: List[Tuple[str, str]], parameters: Dict[str | Parameters, Any] | None = None) DataFrame[source]#

Filters the dataframe to the cases that contain (positive=True) or do not contain (positive=False) at least one of the given paths.

Parameters#

df

Dataframe

paths

Paths to filter on, each given as a (source, target) tuple of attribute values

parameters
Possible parameters of the algorithm, including:

Parameters.CASE_ID_KEY -> Case ID column in the dataframe
Parameters.ATTRIBUTE_KEY -> Attribute we want to filter
Parameters.POSITIVE -> Specifies if the filter should be applied including traces (positive=True) or excluding traces (positive=False)

Returns#

df

Filtered dataframe
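To make the semantics concrete, here is a minimal pandas-only sketch of the idea behind this filter. It is not the library's implementation. The helper name `filter_cases_by_paths`, the sample data, and the column names `case:concept:name`, `concept:name` and `time:timestamp` are assumptions made for illustration, and a path is assumed to mean two attribute values on consecutive events of the same case.

```python
import pandas as pd


def filter_cases_by_paths(df, paths, case_id_key="case:concept:name",
                          attribute_key="concept:name",
                          timestamp_key="time:timestamp", positive=True):
    """Hypothetical helper: keep (or drop) the cases containing any of `paths`."""
    # Order the events of each case chronologically.
    df = df.sort_values([case_id_key, timestamp_key], kind="stable")
    # Attribute value of the next event in the same case (NaN for the last event).
    next_value = df.groupby(case_id_key)[attribute_key].shift(-1)
    wanted = set(paths)
    # True on every event that starts one of the requested paths.
    hits = pd.Series(
        [(a, b) in wanted for a, b in zip(df[attribute_key], next_value)],
        index=df.index,
    )
    matching_cases = set(df.loc[hits, case_id_key])
    keep = df[case_id_key].isin(matching_cases)
    # positive=True keeps the matching cases, positive=False keeps the others.
    return df[keep] if positive else df[~keep]


# Illustrative event log (made-up data): c1 = A,B,C; c2 = A,C; c3 = B,A,B.
log = pd.DataFrame({
    "case:concept:name": ["c1", "c1", "c1", "c2", "c2", "c3", "c3", "c3"],
    "concept:name": ["A", "B", "C", "A", "C", "B", "A", "B"],
    "time:timestamp": pd.date_range("2024-01-01", periods=8, freq="D"),
})

with_a_b = filter_cases_by_paths(log, [("A", "B")])
without_a_b = filter_cases_by_paths(log, [("A", "B")], positive=False)
```

With the library itself, the equivalent call would follow the signature documented above: `apply(df, paths, parameters={...})`, with the keys taken from `Parameters`.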

pm4py.algo.filtering.pandas.paths.paths_filter.apply_performance(df: DataFrame, provided_path: Tuple[str, str], parameters: Dict[str | Parameters, Any] | None = None) DataFrame[source]#

Filters the cases of a dataframe, keeping those in which the provided path occurs at least once with a duration inside the defined timedelta range.

Parameters#

df

Dataframe

provided_path

Path to filter on, given as a (source, target) tuple

parameters
Possible parameters of the algorithm, including:

Parameters.CASE_ID_KEY -> Case ID column in the dataframe
Parameters.ATTRIBUTE_KEY -> Attribute we want to filter
Parameters.TIMESTAMP_KEY -> Attribute identifying the timestamp in the log
Parameters.POSITIVE -> Specifies if the filter should be applied including traces (positive=True) or excluding traces (positive=False)
Parameters.MIN_PERFORMANCE -> Minimal allowed performance of the provided path
Parameters.MAX_PERFORMANCE -> Maximal allowed performance of the provided path

Returns#

df

Filtered dataframe
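The performance variant can be sketched in the same way. The code below is a pandas-only illustration, not the library's implementation. The helper name `filter_cases_by_path_duration`, the sample data, and the column names are hypothetical. It also assumes that the performance of a path is the time between the two consecutive events, measured in seconds, with both bounds inclusive.

```python
import pandas as pd


def filter_cases_by_path_duration(df, path, min_seconds, max_seconds,
                                  case_id_key="case:concept:name",
                                  attribute_key="concept:name",
                                  timestamp_key="time:timestamp",
                                  positive=True):
    """Hypothetical helper: keep (or drop) cases where `path` occurs within the range."""
    df = df.sort_values([case_id_key, timestamp_key], kind="stable")
    grouped = df.groupby(case_id_key)
    # Attribute value and timestamp of the next event in the same case.
    next_value = grouped[attribute_key].shift(-1)
    next_time = grouped[timestamp_key].shift(-1)
    # Elapsed seconds to the next event (NaN on the last event of a case).
    seconds = (next_time - df[timestamp_key]).dt.total_seconds()
    on_path = (df[attribute_key] == path[0]) & (next_value == path[1])
    # Assumption: both bounds are inclusive; NaN durations never match.
    in_range = seconds.between(min_seconds, max_seconds)
    matching_cases = set(df.loc[on_path & in_range, case_id_key])
    keep = df[case_id_key].isin(matching_cases)
    return df[keep] if positive else df[~keep]


# Illustrative event log (made-up data): A->B takes 30 s in c1 and 600 s in c2.
timed_log = pd.DataFrame({
    "case:concept:name": ["c1", "c1", "c2", "c2", "c3", "c3"],
    "concept:name": ["A", "B", "A", "B", "A", "C"],
    "time:timestamp": pd.to_datetime([
        "2024-01-01 10:00:00", "2024-01-01 10:00:30",
        "2024-01-01 10:00:00", "2024-01-01 10:10:00",
        "2024-01-01 10:00:00", "2024-01-01 10:05:00",
    ]),
})

fast_a_b = filter_cases_by_path_duration(timed_log, ("A", "B"), 0, 60)
not_fast_a_b = filter_cases_by_path_duration(timed_log, ("A", "B"), 0, 60,
                                             positive=False)
```

In this sketch a case is kept as soon as one occurrence of the path falls inside the range, which mirrors the "at least one occurrence" wording in the description above.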