pm4py.analysis.cluster_log
- pm4py.analysis.cluster_log(log: EventLog | EventStream | DataFrame, sklearn_clusterer=None, activity_key: str = 'concept:name', timestamp_key: str = 'time:timestamp', case_id_key: str = 'case:concept:name') → Generator[EventLog, None, None]
Applies clustering to the provided event log. The method extracts a profile for each trace of the log and groups the traces with a Scikit-Learn clusterer (default: K-means with two clusters).
- Parameters:
log – log object
sklearn_clusterer – the Scikit-Learn clusterer to be used (default: KMeans(n_clusters=2, random_state=0, n_init="auto"))
activity_key (str) – attribute to be used for the activity
timestamp_key (str) – attribute to be used for the timestamp
case_id_key (str) – attribute to be used as case identifier
- Return type:
Generator[pd.DataFrame, None, None]
import pm4py

for clust_log in pm4py.cluster_log(df):
    print(clust_log)