pm4py.filtering.filter_variants

pm4py.filtering.filter_variants(log: EventLog | DataFrame, variants: Set[str] | List[str] | List[Tuple[str]], retain: bool = True, activity_key: str = 'concept:name', timestamp_key: str = 'time:timestamp', case_id_key: str = 'case:concept:name') -> EventLog | DataFrame

Filters a log on a specified set of variants, either keeping or removing the traces that match them.

Parameters:
  • log – event log or pandas DataFrame

  • variants – collection of variants to filter on; each variant is a tuple of activity names, so the collection is typically passed as a list of tuples, e.g., [('a', 'b', 'c')]

  • retain (bool) – if True, all traces conforming to the specified variants are retained and the rest are dropped; if False, those traces are removed and the rest are kept

  • activity_key (str) – attribute to be used for the activity

  • timestamp_key (str) – attribute to be used for the timestamp

  • case_id_key (str) – attribute to be used as case identifier

Return type:

Union[EventLog, pd.DataFrame]

import pm4py

filtered_dataframe = pm4py.filter_variants(dataframe, [('Act. A', 'Act. B', 'Act. Z'), ('Act. A', 'Act. C', 'Act. Z')], activity_key='concept:name', case_id_key='case:concept:name', timestamp_key='time:timestamp')
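
To make the effect of retain concrete, the following is a minimal pure-Python sketch of the filtering semantics described above. It is not pm4py's implementation: filter_variants_sketch is a hypothetical helper, the events are made-up (case id, activity, timestamp) tuples rather than an EventLog or DataFrame, and it assumes every variant is given as a tuple of activity names.

```python
from collections import defaultdict
from datetime import datetime


def filter_variants_sketch(events, variants, retain=True):
    """Hypothetical stand-in illustrating the semantics; not pm4py code.

    events:   iterable of (case_id, activity, timestamp) tuples
    variants: iterable of tuples of activity names
    """
    wanted = {tuple(v) for v in variants}

    # Group the events of each case together.
    cases = defaultdict(list)
    for case_id, activity, timestamp in events:
        cases[case_id].append((timestamp, activity))

    kept = []
    for case_id, case_events in cases.items():
        # A case's variant is its activity sequence in timestamp order.
        case_events.sort(key=lambda e: e[0])
        variant = tuple(activity for _, activity in case_events)
        # retain=True keeps matching cases, retain=False keeps the others.
        if (variant in wanted) == retain:
            kept.extend((case_id, a, t) for t, a in case_events)
    return kept


# Made-up example data: three cases with three different variants.
events = [
    ("c1", "Act. A", datetime(2024, 1, 1, 9)),
    ("c1", "Act. Z", datetime(2024, 1, 1, 11)),
    ("c1", "Act. B", datetime(2024, 1, 1, 10)),
    ("c2", "Act. A", datetime(2024, 1, 2, 9)),
    ("c2", "Act. C", datetime(2024, 1, 2, 10)),
    ("c2", "Act. Z", datetime(2024, 1, 2, 11)),
    ("c3", "Act. A", datetime(2024, 1, 3, 9)),
    ("c3", "Act. Z", datetime(2024, 1, 3, 10)),
]
variants = [("Act. A", "Act. B", "Act. Z")]

kept = filter_variants_sketch(events, variants, retain=True)
removed = filter_variants_sketch(events, variants, retain=False)

print(sorted({case_id for case_id, _, _ in kept}))     # ['c1']
print(sorted({case_id for case_id, _, _ in removed}))  # ['c2', 'c3']
```

Whole cases are kept or dropped together: a case whose activity sequence only partly overlaps a listed variant (such as c3 here) does not match it.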