pm4py.discovery.discover_log_skeleton
- pm4py.discovery.discover_log_skeleton(log: EventLog | DataFrame, noise_threshold: float = 0.0, activity_key: str = 'concept:name', timestamp_key: str = 'time:timestamp', case_id_key: str = 'case:concept:name') → Dict[str, Any]
Discovers a Log Skeleton from an event log.
A Log Skeleton is a declarative model consisting of six different constraints:
- directly_follows: Specifies strict bounds on activities that directly follow each other. Example: ‘A should be directly followed by B’ and ‘B should be directly followed by C’.
- always_before: Specifies that some activities may only be executed if certain other activities have been executed earlier in the case. Example: ‘C should always be preceded by A’.
- always_after: Specifies that certain activities should always trigger the execution of some other activities later in the case. Example: ‘A should always be followed by C’.
- equivalence: Specifies that a given pair of activities should occur the same number of times within a case. Example: ‘B and C should always occur the same number of times’.
- never_together: Specifies that a given pair of activities should never occur together in a case. Example: ‘There should be no case containing both C and D’.
- activ_occurrences: Specifies allowed numbers of occurrences per activity. Example: ‘Activity A can occur 1 or 2 times, and Activity B can occur 1 to 4 times’.
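The constraint types listed above can be checked by hand on a small example. The following is a minimal pure-Python sketch, not pm4py's implementation: the toy `traces` list and the helper names (`always_before`, `always_after`, `equivalence`, `never_together`, `occurrences`) are hypothetical and exist only for illustration. It covers five of the six constraint types and treats a constraint as holding only if no case violates it.

```python
# Toy log (hypothetical data): each trace is the ordered activity list of one case.
traces = [
    ["A", "B", "C"],
    ["A", "B", "B", "C", "C"],
    ["A", "D"],
]


def always_before(traces, later, earlier):
    """Every occurrence of `later` is preceded by `earlier` in the same case."""
    for t in traces:
        # Checking the first occurrence of `later` is sufficient.
        if later in t and earlier not in t[: t.index(later)]:
            return False
    return True


def always_after(traces, first, later):
    """Every occurrence of `first` is eventually followed by `later`."""
    for t in traces:
        if first in t:
            # Checking the last occurrence of `first` is sufficient.
            last = len(t) - 1 - t[::-1].index(first)
            if later not in t[last + 1:]:
                return False
    return True


def equivalence(traces, a, b):
    """`a` and `b` occur the same number of times in every case."""
    return all(t.count(a) == t.count(b) for t in traces)


def never_together(traces, a, b):
    """No case contains both `a` and `b`."""
    return not any(a in t and b in t for t in traces)


def occurrences(traces, a):
    """Set of per-case occurrence counts observed for `a` (zero included)."""
    return {t.count(a) for t in traces}


print(always_before(traces, "C", "A"))
print(always_after(traces, "A", "C"))
print(equivalence(traces, "B", "C"))
print(never_together(traces, "C", "D"))
print(occurrences(traces, "B"))
```

On this toy log, ‘C should always be preceded by A’ holds, while ‘A should always be followed by C’ does not, because the third case contains A but no C.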
Reference paper: Verbeek, H. M. W., and R. Medeiros de Carvalho. “Log skeletons: A classification approach to process discovery.” arXiv preprint arXiv:1806.08247 (2018).
- Parameters:
log – Event log or Pandas DataFrame.
noise_threshold (float) – Noise threshold influencing the strictness of constraints (default: 0.0).
activity_key (str) – Attribute to be used for the activity (default: “concept:name”).
timestamp_key (str) – Attribute to be used for the timestamp (default: “time:timestamp”).
case_id_key (str) – Attribute to be used as case identifier (default: “case:concept:name”).
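The documentation only states that noise_threshold influences how strict the constraints are. One plausible reading, shown here purely as an assumption and not as pm4py's actual rule, is that a constraint is kept when the fraction of relevant cases satisfying it is at least `1 - noise_threshold`. The helper `holds_with_noise` and the counts below are hypothetical.

```python
def holds_with_noise(satisfied, relevant, noise_threshold=0.0):
    """Assumed rule: keep a constraint if enough relevant cases satisfy it.

    satisfied: number of relevant cases in which the constraint holds
    relevant:  number of cases in which the constraint is applicable
    """
    if relevant == 0:
        # No applicable case: treated as vacuously satisfied in this sketch.
        return True
    return satisfied / relevant >= 1.0 - noise_threshold


# Hypothetical counts: the constraint holds in 19 of 20 applicable cases.
strict = holds_with_noise(19, 20, noise_threshold=0.0)
relaxed = holds_with_noise(19, 20, noise_threshold=0.1)
print(strict, relaxed)
```

Under this assumed rule, the default of 0.0 rejects a constraint as soon as a single case violates it, whereas a threshold of 0.1 tolerates the one deviating case.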
- Returns:
A dictionary representing the Log Skeleton with various constraints.
- Return type:
Dict[str, Any]
import pm4py

log_skeleton = pm4py.discover_log_skeleton(
    dataframe,
    noise_threshold=0.1,
    activity_key='concept:name',
    case_id_key='case:concept:name',
    timestamp_key='time:timestamp'
)