pm4py.filtering.filter_trace_segments

pm4py.filtering.filter_trace_segments(log: EventLog | DataFrame, admitted_traces: List[List[str]], positive: bool = True, activity_key: str = 'concept:name', timestamp_key: str = 'time:timestamp', case_id_key: str = 'case:concept:name') → EventLog | DataFrame

Filters an event log on a set of admitted traces. Each admitted trace is a sequence of activities and "..." placeholders, in which:

  • a "..." before an activity means that other activities can precede the given activity

  • a "..." after an activity means that other activities can follow the given activity

For example:

  • pm4py.filter_trace_segments(log, [["A", "B"]]) <- keeps only the cases of the event log having exactly the process variant A,B

  • pm4py.filter_trace_segments(log, [["...", "A", "B"]]) <- keeps only the cases of the event log ending with the activities A,B

  • pm4py.filter_trace_segments(log, [["A", "B", "..."]]) <- keeps only the cases of the event log starting with the activities A,B

  • pm4py.filter_trace_segments(log, [["...", "A", "B", "C", "..."], ["...", "D", "E", "F", "..."]]) <- keeps only the cases of the event log in which at any point there is A followed by B followed by C, and in which at any other point there is D followed by E followed by F
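The placeholder rules above can be illustrated for a single admitted trace with a small, library-independent sketch. This is not pm4py's implementation: the helper name matches_segment is hypothetical, and the sketch assumes that "..." stands for zero or more arbitrary activities and that activities listed next to each other must occur consecutively.

```python
from typing import List

WILDCARD = "..."  # placeholder token used in admitted traces


def matches_segment(trace: List[str], pattern: List[str]) -> bool:
    """Hypothetical helper: check one trace against one admitted trace.

    Assumption: "..." matches any run of zero or more activities, and
    adjacent activities in the pattern must be adjacent in the trace.
    """
    # reachable[j] is True when the pattern tokens processed so far
    # can account for exactly the first j activities of the trace.
    reachable = [True] + [False] * len(trace)
    for token in pattern:
        if token == WILDCARD:
            # A wildcard can absorb any number of further activities,
            # so every position at or after a reachable one is reachable.
            seen = False
            nxt = []
            for flag in reachable:
                seen = seen or flag
                nxt.append(seen)
        else:
            # A concrete activity consumes exactly one matching event.
            nxt = [False] * (len(trace) + 1)
            for j, flag in enumerate(reachable[:-1]):
                if flag and trace[j] == token:
                    nxt[j + 1] = True
        reachable = nxt
    # The whole trace must be consumed by the whole pattern.
    return reachable[-1]


if __name__ == "__main__":
    print(matches_segment(["A", "B"], ["A", "B"]))            # exact variant
    print(matches_segment(["X", "A", "B"], ["...", "A", "B"]))  # ends with A,B
    print(matches_segment(["A", "B", "X"], ["A", "B", "..."]))  # starts with A,B
```

Under these assumptions, a pattern without any "..." only matches a trace that is identical to it, which corresponds to the "exactly the process variant" case in the first example.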

Parameters:
  • log – event log or pandas DataFrame

  • admitted_traces – collection of traces admitted by the filter (following the criteria described above)

  • positive (bool) – if True, keeps the cases satisfying the filter; if False, discards them

  • activity_key (str) – attribute to be used for the activity

  • timestamp_key (str) – attribute to be used for the timestamp

  • case_id_key (str) – attribute to be used as case identifier

Return type:

Union[EventLog, pd.DataFrame]

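import pm4py

# Read the example XES log
log = pm4py.read_xes("tests/input_data/running-example.xes")

# Keep only the cases in which, at any point, "check ticket" is followed by
# "decide" and then by "reinitiate request"
filtered_log = pm4py.filter_trace_segments(log, [["...", "check ticket", "decide", "reinitiate request", "..."]])
print(filtered_log)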