pm4py.discovery.discover_performance_dfg
- pm4py.discovery.discover_performance_dfg(log: EventLog | DataFrame, business_hours: bool = False, business_hour_slots=[(25200, 61200), (111600, 147600), (198000, 234000), (284400, 320400), (370800, 406800)], workcalendar=None, activity_key: str = 'concept:name', timestamp_key: str = 'time:timestamp', case_id_key: str = 'case:concept:name') → Tuple[dict, dict, dict]
Discovers a Performance Directly-Follows Graph from an event log.
This method returns a tuple containing:
  - A dictionary with pairs of directly-following activities as keys and the performance metrics of the relationship as values.
  - A dictionary of start activities with their respective frequencies.
  - A dictionary of end activities with their respective frequencies.
- Parameters:
log – Event log or Pandas DataFrame.
business_hours (bool) – Enables or disables computation based on business hours (default: False).
business_hour_slots – Work schedule of the company, provided as a list of tuples where each tuple represents one time slot of business hours. Each slot consists of a start and end time given in seconds since the week start. Example:
```python
[
    (7 * 60 * 60, 17 * 60 * 60),                # Monday 07:00 - 17:00
    ((24 + 7) * 60 * 60, (24 + 12) * 60 * 60),  # Tuesday 07:00 - 12:00
    ((24 + 13) * 60 * 60, (24 + 17) * 60 * 60), # Tuesday 13:00 - 17:00
]
```
activity_key (str) – Attribute to be used for the activity (default: “concept:name”).
timestamp_key (str) – Attribute to be used for the timestamp (default: “time:timestamp”).
case_id_key (str) – Attribute to be used as case identifier (default: “case:concept:name”).
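The "seconds since the week start" encoding used by business_hour_slots can be easier to read when built from a weekday and clock hours. The sketch below is a minimal illustration, assuming the week starts on Monday at 00:00 (as the Monday/Tuesday example suggests). The helper name `week_slot` is hypothetical and not part of pm4py.

```python
SECONDS_PER_HOUR = 60 * 60
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR


def week_slot(weekday, start_hour, end_hour):
    """Hypothetical helper: build one (start, end) slot in seconds since week start.

    weekday is 0 for Monday, 1 for Tuesday, and so on (an assumption of this sketch).
    """
    offset = weekday * SECONDS_PER_DAY
    return (
        offset + start_hour * SECONDS_PER_HOUR,
        offset + end_hour * SECONDS_PER_HOUR,
    )


# Monday to Friday, 07:00 - 17:00
weekday_slots = [week_slot(day, 7, 17) for day in range(5)]
print(weekday_slots)
```

Built this way, the list has the same values as the default shown in the signature, so the default appears to correspond to a Monday-to-Friday, 07:00 to 17:00 schedule.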
- Returns:
A tuple of three dictionaries: (performance_dfg, start_activities, end_activities).
- Return type:
Tuple[dict, dict, dict]
```python
import pm4py

performance_dfg, start_activities, end_activities = pm4py.discover_performance_dfg(
    dataframe,
    case_id_key='case:concept:name',
    activity_key='concept:name',
    timestamp_key='time:timestamp'
)
```
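To make the three returned dictionaries more concrete, the following standard-library sketch computes a comparable structure for a toy log. It is an illustration of the idea only, not pm4py's implementation: the function name `sketch_performance_dfg`, the toy events, and the metric keys (`mean`, `min`, `max`) are assumptions chosen for this example, and it ignores business hours entirely.

```python
from collections import Counter, defaultdict
from datetime import datetime
from statistics import mean

# Toy event log as (case id, activity, timestamp); illustrative data only.
events = [
    ("c1", "register", datetime(2024, 1, 1, 9, 0)),
    ("c1", "check", datetime(2024, 1, 1, 10, 0)),
    ("c1", "pay", datetime(2024, 1, 1, 12, 0)),
    ("c2", "register", datetime(2024, 1, 2, 9, 0)),
    ("c2", "check", datetime(2024, 1, 2, 12, 0)),
]


def sketch_performance_dfg(events):
    """Hypothetical sketch: elapsed seconds between directly-following activities."""
    # Group events by case.
    cases = defaultdict(list)
    for case_id, activity, timestamp in events:
        cases[case_id].append((timestamp, activity))

    durations = defaultdict(list)
    start_activities = Counter()
    end_activities = Counter()
    for trace in cases.values():
        trace.sort(key=lambda event: event[0])  # order each case by timestamp
        start_activities[trace[0][1]] += 1
        end_activities[trace[-1][1]] += 1
        # Pair each event with its direct successor in the same case.
        for (t1, a1), (t2, a2) in zip(trace, trace[1:]):
            durations[(a1, a2)].append((t2 - t1).total_seconds())

    performance = {
        pair: {"mean": mean(values), "min": min(values), "max": max(values)}
        for pair, values in durations.items()
    }
    return performance, dict(start_activities), dict(end_activities)


perf, starts, ends = sketch_performance_dfg(events)
print(perf[("register", "check")])
print(starts, ends)
```

In this toy log, "register" is directly followed by "check" after one hour in case c1 and after three hours in case c2, so the pair is associated with both durations; both cases start with "register", while one ends with "pay" and the other with "check".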