pm4py.analysis.cluster_log

pm4py.analysis.cluster_log(log: EventLog | EventStream | DataFrame, sklearn_clusterer=None, activity_key: str = 'concept:name', timestamp_key: str = 'time:timestamp', case_id_key: str = 'case:concept:name') → Generator[EventLog, None, None]

Clusters the traces of the provided event log. A profile is extracted for each trace, and the profiles are then grouped with a Scikit-Learn clusterer (default: K-means with two clusters).

Parameters:
  • log – event log to cluster (EventLog, EventStream, or pandas DataFrame)

  • sklearn_clusterer – the Scikit-Learn clusterer to be used (default: KMeans(n_clusters=2, random_state=0, n_init="auto"))

  • activity_key (str) – attribute to be used for the activity

  • timestamp_key (str) – attribute to be used for the timestamp

  • case_id_key (str) – attribute to be used as case identifier

Return type:

Generator[pd.DataFrame, None, None]

import pm4py

# df: a pandas DataFrame event log that uses the default column names
for clust_log in pm4py.cluster_log(df):
    print(clust_log)