pm4py.filtering.filter_variants_by_coverage_percentage
- pm4py.filtering.filter_variants_by_coverage_percentage(log: EventLog | DataFrame, min_coverage_percentage: float, activity_key: str = 'concept:name', timestamp_key: str = 'time:timestamp', case_id_key: str = 'case:concept:name') → EventLog | DataFrame
Filters the variants of the log by coverage percentage: a variant is kept only if its share of all cases is at least min_coverage_percentage, given as a fraction between 0 and 1. For example, if min_coverage_percentage=0.4 and the log has 1000 cases with:
- 500 cases of variant 1 (50%),
- 400 cases of variant 2 (40%),
- 100 cases of variant 3 (10%),
the filter keeps only the traces of variant 1 and variant 2.
- Parameters:
log – Event log or Pandas DataFrame.
min_coverage_percentage (float) – Minimum allowed percentage of coverage.
activity_key (str) – Attribute to be used for the activity.
timestamp_key (str) – Attribute to be used for the timestamp.
case_id_key (str) – Attribute to be used as case identifier.
- Returns:
Filtered event log or Pandas DataFrame.
import pm4py

# Keep only the variants that cover at least 10% of the cases.
filtered_dataframe = pm4py.filter_variants_by_coverage_percentage(
    dataframe,
    0.1,
    activity_key='concept:name',
    timestamp_key='time:timestamp',
    case_id_key='case:concept:name'
)
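The coverage rule from the description can be illustrated with a small plain-Python sketch. This is not pm4py's implementation: the helper `variants_meeting_coverage` and the `case_variants` mapping are hypothetical names introduced for illustration, and the inclusive comparison (`>=`) is an assumption inferred from the documented example, where variant 2 at exactly 40% is kept.

```python
from collections import Counter


def variants_meeting_coverage(case_variants, min_coverage_percentage):
    """Return the variants whose share of cases reaches the threshold.

    case_variants: hypothetical mapping of case id -> variant,
    where a variant is a tuple of activity names.
    """
    counts = Counter(case_variants.values())
    total_cases = len(case_variants)
    # Assumption: the threshold is inclusive, as the documented example suggests.
    return {
        variant
        for variant, n_cases in counts.items()
        if n_cases / total_cases >= min_coverage_percentage
    }


# Rebuild the documented example: 1000 cases split 500 / 400 / 100.
case_variants = {}
for case_id in range(1000):
    if case_id < 500:
        case_variants[case_id] = ("A", "B", "C")  # variant 1
    elif case_id < 900:
        case_variants[case_id] = ("A", "C", "B")  # variant 2
    else:
        case_variants[case_id] = ("A", "D")       # variant 3

kept = variants_meeting_coverage(case_variants, 0.4)
print(sorted(kept))
```

With a threshold of 0.4, the sketch keeps variants 1 and 2 and drops variant 3, matching the example in the description. Raising the threshold to 0.5 would leave only variant 1.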