pm4py.filtering.filter_trace_segments#

pm4py.filtering.filter_trace_segments(log: EventLog | DataFrame, admitted_traces: List[List[str]], positive: bool = True, activity_key: str = 'concept:name', timestamp_key: str = 'time:timestamp', case_id_key: str = 'case:concept:name') EventLog | DataFrame[source]#

Filters an event log based on a set of trace segments. A trace is a sequence of activities and “…” where: - “…” before an activity indicates that other activities can precede the given activity. - “…” after an activity indicates that other activities can follow the given activity.

Examples: - pm4py.filter_trace_segments(log, [[“A”, “B”]]) retains only cases with the exact process variant A,B. - pm4py.filter_trace_segments(log, [[”…”, “A”, “B”]]) retains only cases ending with activities A,B. - pm4py.filter_trace_segments(log, [[“A”, “B”, “…”]]) retains only cases starting with activities A,B. - pm4py.filter_trace_segments(log, [[”…”, “A”, “B”, “C”, “…”], [”…”, “D”, “E”, “F”, “…”]]) retains cases where:

  • At any point, there is A followed by B followed by C,

  • And at any other point, there is D followed by E followed by F.

Parameters:
  • log – Event log or Pandas DataFrame.

  • admitted_traces – Collection of trace segments to admit based on the criteria above.

  • positive (bool) – Boolean indicating whether to keep (if True) or discard (if False) the cases satisfying the filter.

  • activity_key (str) – Attribute to be used for the activity.

  • timestamp_key (str) – Attribute to be used for the timestamp.

  • case_id_key (str) – Attribute to be used as case identifier.

Returns:

Filtered event log or Pandas DataFrame.

import pm4py

log = pm4py.read_xes("tests/input_data/running-example.xes")

filtered_log = pm4py.filter_trace_segments(
    log,
    [["...", "check ticket", "decide", "reinitiate request", "..."]]
)
print(filtered_log)