pm4py.discovery.discover_performance_dfg

pm4py.discovery.discover_performance_dfg(log: EventLog | DataFrame, business_hours: bool = False, business_hour_slots=[(25200, 61200), (111600, 147600), (198000, 234000), (284400, 320400), (370800, 406800)], workcalendar=None, activity_key: str = 'concept:name', timestamp_key: str = 'time:timestamp', case_id_key: str = 'case:concept:name') → Tuple[dict, dict, dict]

Discovers a Performance Directly-Follows Graph from an event log.

This method returns a tuple containing:

  • A dictionary with pairs of directly-following activities as keys and the performance metrics of the relationship as values.

  • A dictionary of start activities with their respective frequencies.

  • A dictionary of end activities with their respective frequencies.

Parameters:
  • log – Event log or Pandas DataFrame.

  • business_hours (bool) – Enables or disables computation based on business hours (default: False).

  • business_hour_slots – Work schedule of the company, provided as a list of tuples where each tuple represents one time slot of business hours. Each slot consists of a start and an end time, given in seconds since the start of the week (Monday 00:00). The default value in the signature corresponds to Monday through Friday, 07:00 - 17:00. Example:

    ```python
    [
        (7 * 60 * 60, 17 * 60 * 60),                 # Monday 07:00 - 17:00
        ((24 + 7) * 60 * 60, (24 + 12) * 60 * 60),   # Tuesday 07:00 - 12:00
        ((24 + 13) * 60 * 60, (24 + 17) * 60 * 60),  # Tuesday 13:00 - 17:00
    ]
    ```

  • activity_key (str) – Attribute to be used for the activity (default: “concept:name”).

  • timestamp_key (str) – Attribute to be used for the timestamp (default: “time:timestamp”).

  • case_id_key (str) – Attribute to be used as case identifier (default: “case:concept:name”).
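
The slot format used by business_hour_slots can be generated rather than typed by hand. The following is a minimal sketch, not pm4py code: `weekday_slots` and `business_seconds` are hypothetical helpers introduced here for illustration. `business_seconds` only approximates the idea of counting time that falls inside the slots, and pm4py's own business-hours computation may differ in its details. The sketch assumes naive datetimes and a week that starts on Monday 00:00.

```python
from datetime import datetime, timedelta

HOUR = 60 * 60
DAY = 24 * HOUR


def weekday_slots(start_hour, end_hour, days=range(5)):
    """Hypothetical helper: one (start, end) slot per day, in seconds since Monday 00:00."""
    return [(d * DAY + start_hour * HOUR, d * DAY + end_hour * HOUR) for d in days]


def business_seconds(start, end, slots):
    """Hypothetical helper: seconds between two datetimes that fall inside the weekly slots."""
    # Monday 00:00 of the week that contains `start`.
    monday = start.date() - timedelta(days=start.weekday())
    week_start = datetime.combine(monday, datetime.min.time())
    total = 0.0
    while week_start < end:
        for slot_start, slot_end in slots:
            # Clip the slot to the [start, end] interval and add the overlap.
            lo = max(start, week_start + timedelta(seconds=slot_start))
            hi = min(end, week_start + timedelta(seconds=slot_end))
            if hi > lo:
                total += (hi - lo).total_seconds()
        week_start += timedelta(weeks=1)
    return total


# Monday to Friday, 07:00 - 17:00: the same values as the default in the signature.
slots = weekday_slots(7, 17)
print(slots)

# Friday 16:00 to the following Monday 08:00: one business hour on each side of the weekend.
elapsed = business_seconds(datetime(2024, 1, 5, 16, 0), datetime(2024, 1, 8, 8, 0), slots)
print(elapsed)  # 7200.0
```

A list built this way can be passed as business_hour_slots together with business_hours=True.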

Returns:

A tuple of three dictionaries: (performance_dfg, start_activities, end_activities).

Return type:

Tuple[dict, dict, dict]

import pm4py

performance_dfg, start_activities, end_activities = pm4py.discover_performance_dfg(
    dataframe,
    case_id_key='case:concept:name',
    activity_key='concept:name',
    timestamp_key='time:timestamp'
)
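
To make the three returned dictionaries concrete, here is a plain-Python sketch that derives the same kind of structures from a tiny hand-made log. It is not pm4py's implementation: the toy `events` list is invented for illustration, and the aggregate names (`mean`, `median`, `min`, `max`) are an assumption about what the "performance metrics" might contain, so inspect the actual return value in your pm4py version before relying on specific keys.

```python
from collections import Counter, defaultdict
from datetime import datetime
from statistics import mean, median

# Hypothetical toy log: (case id, activity, timestamp).
events = [
    ("c1", "register", datetime(2024, 1, 1, 9, 0)),
    ("c1", "check", datetime(2024, 1, 1, 10, 0)),
    ("c1", "pay", datetime(2024, 1, 1, 12, 0)),
    ("c2", "register", datetime(2024, 1, 2, 9, 0)),
    ("c2", "check", datetime(2024, 1, 2, 11, 0)),
]

# Group events into one trace per case.
traces = defaultdict(list)
for case_id, activity, ts in events:
    traces[case_id].append((activity, ts))

durations = defaultdict(list)  # (a, b) -> elapsed seconds for each occurrence
start_activities = Counter()   # first activity of each case
end_activities = Counter()     # last activity of each case

for trace in traces.values():
    trace.sort(key=lambda event: event[1])  # order events by timestamp
    start_activities[trace[0][0]] += 1
    end_activities[trace[-1][0]] += 1
    # Each pair of consecutive events is one directly-follows occurrence.
    for (a, ta), (b, tb) in zip(trace, trace[1:]):
        durations[(a, b)].append((tb - ta).total_seconds())

# Aggregate the raw durations per pair (illustrative metric names).
performance_dfg = {
    pair: {"mean": mean(v), "median": median(v), "min": min(v), "max": max(v)}
    for pair, v in durations.items()
}

# Example query: the pair with the highest mean elapsed time.
slowest = max(performance_dfg, key=lambda pair: performance_dfg[pair]["mean"])
print(slowest, performance_dfg[slowest]["mean"])
```

In this toy log, both cases start with "register", while one ends with "pay" and the other with "check", which is the kind of information the start and end activity dictionaries carry.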