pm4py.filtering.filter_log_relative_occurrence_event_attribute

pm4py.filtering.filter_log_relative_occurrence_event_attribute(log: EventLog | DataFrame, min_relative_stake: float, attribute_key: str = 'concept:name', level: str = 'cases', timestamp_key: str = 'time:timestamp', case_id_key: str = 'case:concept:name') → EventLog | DataFrame

Filters the event log, keeping only the events whose value for the given attribute occurs:
  • in at least the specified fraction (min_relative_stake) of events, when level="events";
  • in at least the specified fraction (min_relative_stake) of cases, when level="cases".

Parameters:
  • log – Event log or Pandas DataFrame.

  • min_relative_stake (float) – Minimum fraction of events or cases, depending on level (expressed as a number between 0 and 1), in which the attribute value should occur.

  • attribute_key (str) – The event attribute whose values are filtered on.

  • level (str) – The level at which occurrences are counted (if level="events", occurrences are counted over events; if level="cases", over cases).

  • timestamp_key (str) – Attribute to be used for the timestamp.

  • case_id_key (str) – Attribute to be used as case identifier.

Returns:

Filtered event log or Pandas DataFrame.

import pm4py

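# `dataframe` is assumed to be an event log already loaded as a Pandas DataFrame.
# Keep only the events whose activity occurs in at least 50% of the cases.
filtered_dataframe = pm4py.filter_log_relative_occurrence_event_attribute(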
    dataframe,
    0.5,
    attribute_key='concept:name',
    level='cases',
    case_id_key='case:concept:name',
    timestamp_key='time:timestamp'
)
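
The filtering rule described above can be illustrated with a plain-pandas sketch. This is not pm4py's implementation: the helper name `relative_occurrence_filter` and the toy log are hypothetical, and the sketch only mirrors the documented semantics of the two `level` settings, assuming a DataFrame that uses the default column names.

```python
import pandas as pd


def relative_occurrence_filter(df, min_relative_stake, attribute_key="concept:name",
                               level="cases", case_id_key="case:concept:name"):
    """Illustrative only: keep events whose attribute value is frequent enough."""
    if level == "events":
        # Share of all events that carry each attribute value.
        stake = df[attribute_key].value_counts() / len(df)
    elif level == "cases":
        # Share of distinct cases in which each attribute value appears at least once.
        cases_per_value = df.groupby(attribute_key)[case_id_key].nunique()
        stake = cases_per_value / df[case_id_key].nunique()
    else:
        raise ValueError("level must be 'events' or 'cases'")
    kept_values = stake[stake >= min_relative_stake].index
    return df[df[attribute_key].isin(kept_values)]


# Hypothetical toy log: three cases; "A" occurs in all of them, "B" in two, "C" in one.
toy_log = pd.DataFrame({
    "case:concept:name": ["1", "1", "2", "2", "3", "3"],
    "concept:name": ["A", "B", "A", "B", "A", "C"],
})

# Case level: A is in 3/3 cases, B in 2/3, C in 1/3, so a 0.5 threshold drops C.
by_cases = relative_occurrence_filter(toy_log, 0.5, level="cases")

# Event level: A covers 3/6 events, B 2/6, C 1/6, so a 0.4 threshold keeps only A.
by_events = relative_occurrence_filter(toy_log, 0.4, level="events")
```

The two levels can disagree on the same log: a value that appears once in many cases scores high at the case level but may fall below the threshold at the event level, so the choice of `level` should match whether case coverage or raw event volume matters for the analysis.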