pm4py.ml.extract_ocel_features
- pm4py.ml.extract_ocel_features(ocel: OCEL, obj_type: str, enable_object_lifecycle_paths: bool = True, enable_object_work_in_progress: bool = False, object_str_attributes: Collection[str] | None = None, object_num_attributes: Collection[str] | None = None, include_obj_id: bool = False, debug: bool = False) → DataFrame
Extracts a set of features from an object-centric event log (OCEL) for objects of a specified type.
This function computes various features based on the lifecycle paths and work-in-progress metrics of objects within the OCEL. It also supports encoding of string and numeric object attributes.
The approach is based on: Berti, A., Herforth, J., Qafari, M.S. et al. Graph-based feature extraction on object-centric event logs. Int J Data Sci Anal (2023). https://doi.org/10.1007/s41060-023-00428-2
- Parameters:
ocel (OCEL) – The object-centric event log from which to extract features.
obj_type (str) – The object type to consider for feature extraction.
enable_object_lifecycle_paths (bool) – Whether to enable the “lifecycle paths” feature.
enable_object_work_in_progress (bool) – Whether to enable the “work in progress” feature, which has a high computational cost.
object_str_attributes – (Optional) Collection of string attributes at the object level to one-hot encode.
object_num_attributes – (Optional) Collection of numeric attributes at the object level to encode.
include_obj_id (bool) – Whether to include the object identifier as a column in the features DataFrame.
debug (bool) – Whether to enable debugging mode to track the feature extraction process.
- Returns:
A Pandas DataFrame containing the extracted features for the specified object type.
- Return type:
pd.DataFrame
import pm4py
ocel = pm4py.read_ocel('log.jsonocel')
fea_df = pm4py.extract_ocel_features(ocel, "item")
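To make the parameters above more concrete, the following is a minimal plain-Python sketch of the kind of per-object feature row they describe. It is not pm4py's implementation. It assumes that the “lifecycle paths” feature counts directly-follows pairs of activities in an object's lifecycle, and it shows one-hot encoding of a single string attribute. The helper `build_feature_rows`, the toy data, and the column names are hypothetical and do not reflect the column names pm4py actually produces.

```python
from collections import Counter


def build_feature_rows(lifecycles, str_attr_name, str_attr_values):
    """Build one feature dict per object (illustrative only, not pm4py code).

    lifecycles: dict mapping object id -> ordered list of activity names.
    str_attr_name: name of a string attribute to one-hot encode.
    str_attr_values: dict mapping object id -> value of that attribute.
    """
    # All attribute values and all directly-follows pairs seen in the data,
    # so that every object gets the same set of columns.
    categories = sorted(set(str_attr_values.values()))
    all_paths = sorted({p for lc in lifecycles.values() for p in zip(lc, lc[1:])})

    rows = {}
    for obj_id, lc in lifecycles.items():
        # Count how often each activity pair occurs in this object's lifecycle.
        counts = Counter(zip(lc, lc[1:]))
        row = {f"path:{a}->{b}": counts.get((a, b), 0) for a, b in all_paths}
        # One-hot encode the string attribute: 1 for the object's value, else 0.
        for c in categories:
            row[f"{str_attr_name}={c}"] = int(str_attr_values[obj_id] == c)
        rows[obj_id] = row
    return rows


# Hypothetical toy data for two objects of type "item".
lifecycles = {
    "item1": ["Create Order", "Pick Item", "Ship Item"],
    "item2": ["Create Order", "Pick Item", "Pick Item", "Ship Item"],
}
colors = {"item1": "red", "item2": "blue"}

rows = build_feature_rows(lifecycles, "color", colors)
for obj_id, row in rows.items():
    print(obj_id, row)
```

Each object ends up with the same columns, which is what makes the result usable as a feature table. In pm4py itself, the attributes to encode are selected through `object_str_attributes` and `object_num_attributes`, and the object identifier column is added only when `include_obj_id` is set to True.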