pm4py.filtering.filter_paths_performance

pm4py.filtering.filter_paths_performance(log: EventLog | DataFrame, path: Tuple[str, str], min_performance: float, max_performance: float, keep=True, activity_key: str = 'concept:name', timestamp_key: str = 'time:timestamp', case_id_key: str = 'case:concept:name') → EventLog | DataFrame

Filters the event log by the duration of a path (a tuple of two activities), either:
  • (keep=True) keeping the cases that contain the specified path with a duration between min_performance and max_performance, or
  • (keep=False) discarding the cases that contain the specified path with a duration between min_performance and max_performance.

Parameters:
  • log – event log / Pandas dataframe

  • path – tuple of two activities (source_activity, target_activity)

  • min_performance (float) – minimum allowed performance (duration) of the path

  • max_performance (float) – maximum allowed performance (duration) of the path

  • keep (bool) – if True, keep the cases having the specified path with a duration between min_performance and max_performance; if False, discard those cases

  • activity_key (str) – attribute to be used for the activity

  • timestamp_key (str) – attribute to be used for the timestamp

  • case_id_key (str) – attribute to be used as case identifier

Return type:

Union[EventLog, pd.DataFrame]

import pm4py

filtered_dataframe = pm4py.filter_paths_performance(dataframe, ('A', 'D'), 3600.0, 86400.0, activity_key='concept:name', timestamp_key='time:timestamp', case_id_key='case:concept:name')
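
The keep/discard behaviour described above can be illustrated with a small pure-Python sketch. It is not pm4py's implementation: the function name filter_paths_performance_sketch and the tuple-based event format are hypothetical, and the sketch assumes that a path is matched on directly consecutive events within a case, that the duration is measured in seconds, and that both bounds are inclusive.

```python
from datetime import datetime, timedelta


def filter_paths_performance_sketch(events, path, min_performance,
                                    max_performance, keep=True):
    """Hypothetical helper, not part of pm4py.

    events: list of (case_id, activity, timestamp) tuples.
    Assumes a path is a pair of directly consecutive events in a case
    and that durations are expressed in seconds (inclusive bounds).
    """
    source, target = path

    # Group events per case, ordered by timestamp within each case.
    cases = {}
    for case_id, activity, ts in sorted(events, key=lambda e: (e[0], e[2])):
        cases.setdefault(case_id, []).append((activity, ts))

    # Collect the cases containing the path with a duration in range.
    matching = set()
    for case_id, trace in cases.items():
        for (a1, t1), (a2, t2) in zip(trace, trace[1:]):
            if (a1, a2) == (source, target):
                duration = (t2 - t1).total_seconds()
                if min_performance <= duration <= max_performance:
                    matching.add(case_id)
                    break

    # keep=True retains the matching cases, keep=False discards them.
    return [e for e in events if (e[0] in matching) == keep]


t0 = datetime(2024, 1, 1, 8, 0)
events = [
    ("c1", "A", t0), ("c1", "D", t0 + timedelta(hours=2)),     # A->D in 7200 s
    ("c2", "A", t0), ("c2", "D", t0 + timedelta(minutes=10)),  # A->D in 600 s
    ("c3", "A", t0), ("c3", "B", t0 + timedelta(hours=1)),
    ("c3", "D", t0 + timedelta(hours=3)),                      # A is followed by B, not D
]

kept = filter_paths_performance_sketch(events, ("A", "D"), 3600.0, 86400.0, keep=True)
dropped = filter_paths_performance_sketch(events, ("A", "D"), 3600.0, 86400.0, keep=False)
```

Under these assumptions, kept contains only the events of case c1, while dropped contains the events of cases c2 and c3. Whole cases are retained or removed, never individual events.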