pm4py.stats.split_by_process_variant
- pm4py.stats.split_by_process_variant(log: EventLog | DataFrame, activity_key: str = 'concept:name', timestamp_key: str = 'time:timestamp', case_id_key: str = 'case:concept:name', variant_column: str = '@@variant_column', index_in_trace_column: str = '@@index_in_trace') → Iterator[Tuple[Collection[str], DataFrame]]
Splits an event log into one sub-dataframe per process variant. The result is an iterator of (variant, sub-dataframe) pairs.
- Parameters:
log – event log
activity_key (str) – attribute to be used for the activity
timestamp_key (str) – attribute to be used for the timestamp
case_id_key (str) – attribute to be used as case identifier
variant_column (str) – name of the utility column that stores the variant’s tuple
index_in_trace_column (str) – name of the utility column that stores the index of the event in the case
- Return type:
Iterator[Tuple[Collection[str], pd.DataFrame]]
import pandas as pd
import pm4py

dataframe = pd.read_csv('tests/input_data/receipt.csv')
dataframe = pm4py.format_dataframe(dataframe)
for variant, subdf in pm4py.split_by_process_variant(dataframe):
    print(variant)
    print(subdf)
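To make the splitting concrete, the following is a rough pandas-only sketch of the idea: order the events of each case by timestamp, take the tuple of activities as the case's variant, and collect the events of all cases that share a variant. It is not pm4py's implementation. The function name `split_by_variant_sketch` and the toy data are hypothetical, and the sketch ignores the utility columns (`variant_column`, `index_in_trace_column`).

```python
import pandas as pd


def split_by_variant_sketch(df, activity_key="concept:name",
                            timestamp_key="time:timestamp",
                            case_id_key="case:concept:name"):
    """Yield (variant, sub-dataframe) pairs. Illustrative only, not pm4py's code."""
    # Order events by case and, within each case, by timestamp.
    ordered = df.sort_values([case_id_key, timestamp_key])

    # A variant is the ordered tuple of activities of a case;
    # remember which cases follow each variant.
    cases_by_variant = {}
    for case_id, events in ordered.groupby(case_id_key, sort=False):
        variant = tuple(events[activity_key])
        cases_by_variant.setdefault(variant, []).append(case_id)

    # Emit one sub-dataframe per variant, holding all events of its cases.
    for variant, case_ids in cases_by_variant.items():
        yield variant, ordered[ordered[case_id_key].isin(case_ids)]


# Hypothetical toy log: c1 and c2 follow A -> B, c3 follows B -> A.
toy = pd.DataFrame({
    "case:concept:name": ["c1", "c1", "c2", "c2", "c3", "c3"],
    "concept:name": ["A", "B", "A", "B", "B", "A"],
    "time:timestamp": pd.to_datetime([
        "2024-01-01", "2024-01-02", "2024-01-01",
        "2024-01-03", "2024-01-01", "2024-01-02",
    ]),
})

for variant, subdf in split_by_variant_sketch(toy):
    print(variant, len(subdf))
```

Because the result is an iterator, each sub-dataframe can be processed as it is produced, without keeping all of them in memory at once.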