pm4py.ml.extract_ocel_features

pm4py.ml.extract_ocel_features(ocel: OCEL, obj_type: str, enable_object_lifecycle_paths: bool = True, enable_object_work_in_progress: bool = False, object_str_attributes: Collection[str] | None = None, object_num_attributes: Collection[str] | None = None, include_obj_id: bool = False, debug: bool = False) → DataFrame

Extracts a set of features from an object-centric event log (OCEL) for objects of a specified type.

This function computes various features based on the lifecycle paths and work-in-progress metrics of objects within the OCEL. It also supports encoding of string and numeric object attributes.
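To give an intuition for the two feature families named above, here is a minimal pure-Python sketch. It is not pm4py's implementation: the toy `lifecycles` data, the helper names `lifecycle_path_counts` and `work_in_progress`, and the exact definitions (directly-follows pairs within one object's lifecycle, and the number of other objects whose lifecycle overlaps in time) are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy data: object id -> time-ordered list of (timestamp, activity).
lifecycles = {
    "item1": [(1, "Create"), (4, "Pack"), (9, "Ship")],
    "item2": [(2, "Create"), (3, "Pack"), (5, "Pack"), (6, "Ship")],
    "item3": [(10, "Create"), (12, "Ship")],
}


def lifecycle_path_counts(events):
    """Count each directly-follows pair of activities in one object's lifecycle."""
    activities = [activity for _, activity in events]
    return Counter(zip(activities, activities[1:]))


def work_in_progress(obj_id, all_lifecycles):
    """Count the other objects whose lifecycle interval overlaps this object's.

    Assumption: an object is "open" from its first to its last event.
    """
    start = all_lifecycles[obj_id][0][0]
    end = all_lifecycles[obj_id][-1][0]
    return sum(
        1
        for other_id, events in all_lifecycles.items()
        if other_id != obj_id and events[0][0] <= end and events[-1][0] >= start
    )


path_features = {oid: lifecycle_path_counts(ev) for oid, ev in lifecycles.items()}
wip_features = {oid: work_in_progress(oid, lifecycles) for oid in lifecycles}
```

Under this reading, the work-in-progress computation compares every object against every other object, which suggests why such a feature can be expensive on large logs.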

The approach is based on: Berti, A., Herforth, J., Qafari, M.S. et al. Graph-based feature extraction on object-centric event logs. Int J Data Sci Anal (2023). https://doi.org/10.1007/s41060-023-00428-2

Parameters:
  • ocel (OCEL) – The object-centric event log from which to extract features.

  • obj_type (str) – The object type to consider for feature extraction.

  • enable_object_lifecycle_paths (bool) – Whether to enable the “lifecycle paths” feature.

  • enable_object_work_in_progress (bool) – Whether to enable the “work in progress” feature, which has a high computational cost.

  • object_str_attributes (Collection[str] | None) – Optional collection of object-level string attributes to one-hot encode.

  • object_num_attributes (Collection[str] | None) – Optional collection of object-level numeric attributes to encode.

  • include_obj_id (bool) – Whether to include the object identifier as a column in the features DataFrame.

  • debug (bool) – Whether to enable debugging mode to track the feature extraction process.
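The attribute parameters above distinguish one-hot encoding of string attributes from encoding of numeric attributes. The following pure-Python sketch shows what that distinction could look like on a toy object table. It does not reproduce pm4py's code: the `objects` data, the helper `encode_attributes`, and the `attribute=value` column naming are all hypothetical.

```python
# Hypothetical object table: object id -> attribute values.
objects = {
    "item1": {"color": "red", "weight": 1.5},
    "item2": {"color": "blue", "weight": 0.7},
    "item3": {"color": "red", "weight": 2.0},
}


def encode_attributes(objects, str_attributes=(), num_attributes=()):
    """Return one feature row per object: strings one-hot encoded, numerics copied."""
    # Collect the distinct values of each string attribute to fix the columns.
    values = {
        attr: sorted({o[attr] for o in objects.values() if attr in o})
        for attr in str_attributes
    }
    rows = {}
    for obj_id, attrs in objects.items():
        row = {}
        for attr, seen in values.items():
            for value in seen:
                # Column name format is illustrative only.
                row[f"{attr}={value}"] = 1 if attrs.get(attr) == value else 0
        for attr in num_attributes:
            # A missing numeric attribute is left as None in this sketch.
            row[attr] = attrs.get(attr)
        rows[obj_id] = row
    return rows


features = encode_attributes(
    objects, str_attributes=["color"], num_attributes=["weight"]
)
```

Each distinct string value becomes its own 0/1 column, while a numeric attribute stays a single column, so string attributes with many distinct values widen the feature table considerably.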

Returns:

A pandas DataFrame containing the extracted features for the specified object type.

Return type:

pd.DataFrame

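import pm4py

# Read an object-centric event log from a JSON-OCEL file.
ocel = pm4py.read_ocel('log.jsonocel')
# Extract features for the objects of type "item".
fea_df = pm4py.extract_ocel_features(ocel, "item")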