pm4py.ml.split_train_test

pm4py.ml.split_train_test(log: EventLog | DataFrame, train_percentage: float = 0.8, case_id_key='case:concept:name') → Tuple[EventLog, EventLog] | Tuple[DataFrame, DataFrame]

Split an event log into a training log and a test log (for machine learning purposes). Returns the training log and the test log, in that order.

Parameters:
  • log (EventLog | DataFrame) – event log or pandas DataFrame to split

  • train_percentage (float) – fraction of traces to be included in the training log (from 0.0 to 1.0)

  • case_id_key (str) – attribute to be used as case identifier

Return type:

Union[Tuple[EventLog, EventLog], Tuple[pd.DataFrame, pd.DataFrame]]

import pm4py

# `dataframe` is assumed to be an event log already loaded as a pandas DataFrame
train_df, test_df = pm4py.split_train_test(dataframe, train_percentage=0.75)
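
To illustrate what splitting by case means, here is a minimal sketch in plain Python rather than pm4py's actual implementation. The helper `split_cases`, its `seed` parameter, and the sample events are hypothetical and exist only for this example. It assigns whole cases to one side or the other, so every event of a given case ends up in the same log. pm4py may select cases differently, and the proportion it produces may be only approximate.

```python
import random


def split_cases(events, train_percentage=0.8,
                case_id_key="case:concept:name", seed=None):
    """Hypothetical helper (not part of pm4py): split events by whole cases."""
    rng = random.Random(seed)
    # Collect the unique case identifiers in first-seen order.
    case_ids = list(dict.fromkeys(e[case_id_key] for e in events))
    rng.shuffle(case_ids)
    # Assumption: take an exact fraction of the shuffled cases for training.
    n_train = round(len(case_ids) * train_percentage)
    train_cases = set(case_ids[:n_train])
    # Every event follows its case, so no case is split across the two logs.
    train = [e for e in events if e[case_id_key] in train_cases]
    test = [e for e in events if e[case_id_key] not in train_cases]
    return train, test


# Illustrative data: four cases, six events.
events = [
    {"case:concept:name": "c1", "concept:name": "register"},
    {"case:concept:name": "c1", "concept:name": "approve"},
    {"case:concept:name": "c2", "concept:name": "register"},
    {"case:concept:name": "c3", "concept:name": "register"},
    {"case:concept:name": "c3", "concept:name": "reject"},
    {"case:concept:name": "c4", "concept:name": "register"},
]

train, test = split_cases(events, train_percentage=0.75, seed=42)
```

With four cases and `train_percentage=0.75`, three cases go to the training side and one to the test side. Which cases land where depends on the seed.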