pm4py.ml.extract_target_vector

pm4py.ml.extract_target_vector(log: EventLog | DataFrame, variant: str, activity_key: str = 'concept:name', timestamp_key: str = 'time:timestamp', case_id_key: str = 'case:concept:name') → Tuple[Any, List[str]]

Extracts the target vector from a log object for a specific machine learning use case.

Supported variants include:

  • ‘next_activity’: Predicts the next activity in a case.

  • ‘next_time’: Predicts the timestamp of the next activity.

  • ‘remaining_time’: Predicts the remaining time for the case.
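To make the three variants concrete, the following is a minimal, standard-library sketch of what each target means for a single event. It is not pm4py's implementation, and the encoding pm4py actually returns (for example, how activities are encoded or how the last event of a case is handled) may differ. The toy `events` list and the `conceptual_targets` helper are hypothetical names introduced only for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical toy log: one dict per event, keyed by the default attribute names.
events = [
    {"case:concept:name": "c1", "concept:name": "register", "time:timestamp": datetime(2024, 1, 1, 9, 0)},
    {"case:concept:name": "c1", "concept:name": "check", "time:timestamp": datetime(2024, 1, 1, 10, 0)},
    {"case:concept:name": "c1", "concept:name": "pay", "time:timestamp": datetime(2024, 1, 1, 12, 0)},
    {"case:concept:name": "c2", "concept:name": "register", "time:timestamp": datetime(2024, 1, 2, 9, 0)},
    {"case:concept:name": "c2", "concept:name": "pay", "time:timestamp": datetime(2024, 1, 2, 9, 30)},
]


def conceptual_targets(events, activity_key="concept:name",
                       timestamp_key="time:timestamp",
                       case_id_key="case:concept:name"):
    """Return, per case, the three conceptual targets for every event (illustrative only)."""
    # Group events by case identifier.
    cases = defaultdict(list)
    for event in events:
        cases[event[case_id_key]].append(event)

    targets = {}
    for case_id, trace in cases.items():
        # Order each case chronologically before looking ahead.
        trace = sorted(trace, key=lambda e: e[timestamp_key])
        case_end = trace[-1][timestamp_key]
        rows = []
        for i, event in enumerate(trace):
            nxt = trace[i + 1] if i + 1 < len(trace) else None
            rows.append({
                # Activity of the following event, or None for the last event (an assumption of this sketch).
                "next_activity": nxt[activity_key] if nxt else None,
                # Timestamp of the following event, or None for the last event.
                "next_time": nxt[timestamp_key] if nxt else None,
                # Seconds between this event and the end of its case.
                "remaining_time": (case_end - event[timestamp_key]).total_seconds(),
            })
        targets[case_id] = rows
    return targets


targets = conceptual_targets(events)
print(targets["c1"][0])
```

In this sketch each case maps to a list with one entry per event, so the targets line up with the events they describe.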

Parameters:
  • log – The event log or Pandas DataFrame from which to extract the target vector.

  • variant (str) – The variant of the algorithm to use. Must be one of: ‘next_activity’, ‘next_time’, ‘remaining_time’.

  • activity_key (str) – Attribute to be used as the activity identifier.

  • timestamp_key (str) – Attribute to be used for timestamps.

  • case_id_key (str) – Attribute to be used as the case identifier.

Returns:

A tuple containing the target vector and a list of class labels (if applicable).

Return type:

Tuple[Any, List[str]]

Raises:

Exception – If an unsupported variant is provided.
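Because an unsupported variant surfaces only as a generic Exception, callers may prefer to validate the variant name before calling the function. The sketch below shows one way to do that; `SUPPORTED_VARIANTS` and `check_variant` are hypothetical names that are not part of pm4py, and the list of variants is copied from the parameter description above.

```python
# Variant names as listed in the parameter description (hypothetical constant, not a pm4py attribute).
SUPPORTED_VARIANTS = ("next_activity", "next_time", "remaining_time")


def check_variant(variant):
    """Return the variant unchanged if it is supported, otherwise raise ValueError."""
    if variant not in SUPPORTED_VARIANTS:
        raise ValueError(
            f"Unsupported variant {variant!r}; expected one of {', '.join(SUPPORTED_VARIANTS)}"
        )
    return variant


# A supported name passes through unchanged.
checked = check_variant("next_time")

# An unsupported name fails early with a descriptive message.
try:
    check_variant("next_resource")
    rejected = False
except ValueError:
    rejected = True
```

The validated name can then be passed as the `variant` argument of `extract_target_vector`.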

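import pm4py

# `log` must already hold an EventLog or pandas DataFrame (for example, one loaded with pm4py.read_xes).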

vector_next_act, class_next_act = pm4py.extract_target_vector(
    log,
    'next_activity',
    activity_key='concept:name',
    timestamp_key='time:timestamp',
    case_id_key='case:concept:name'
)
vector_next_time, class_next_time = pm4py.extract_target_vector(
    log,
    'next_time',
    activity_key='concept:name',
    timestamp_key='time:timestamp',
    case_id_key='case:concept:name'
)
vector_rem_time, class_rem_time = pm4py.extract_target_vector(
    log,
    'remaining_time',
    activity_key='concept:name',
    timestamp_key='time:timestamp',
    case_id_key='case:concept:name'
)