pm4py.discovery.discover_log_skeleton
- pm4py.discovery.discover_log_skeleton(log: EventLog | DataFrame, noise_threshold: float = 0.0, activity_key: str = 'concept:name', timestamp_key: str = 'time:timestamp', case_id_key: str = 'case:concept:name') → Dict[str, Any]
Discovers a log skeleton from an event log.
A log skeleton is a declarative model that consists of six kinds of constraints:
- “directly_follows”: specifies, for some activities, strict bounds on which activities directly follow them. For example, ‘A should be directly followed by B’ and ‘B should be directly followed by C’.
- “always_before”: specifies that some activities may be executed only if some other activities were executed earlier in the history of the case. For example, ‘C should always be preceded by A’.
- “always_after”: specifies that some activities should always trigger the execution of some other activities later in the case. For example, ‘A should always be followed by C’.
- “equivalence”: specifies that a given pair of activities should occur the same number of times inside a case. For example, ‘B and C should always happen the same number of times’.
- “never_together”: specifies that a given pair of activities should never occur together in the same case. For example, ‘there should be no case containing both C and D’.
- “activ_occurrences”: specifies the allowed number of occurrences per activity. For example, A is allowed to be executed 1 or 2 times, and B is allowed to be executed 1, 2, 3, or 4 times.
Reference paper: Verbeek, H. M. W., and R. Medeiros de Carvalho. “Log skeletons: A classification approach to process discovery.” arXiv preprint arXiv:1806.08247 (2018).
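To make the six constraints concrete, here is a minimal pure-Python sketch that derives each relation from a toy log with no noise tolerance. It only illustrates the definitions above and is not pm4py's implementation: the function name `sketch_log_skeleton`, the toy traces, and the orientation of the tuples are assumptions, and the dictionary returned by `discover_log_skeleton` may be structured differently.

```python
from itertools import permutations


def sketch_log_skeleton(traces):
    """Derive simplified log-skeleton relations from a list of traces.

    Hypothetical helper for illustration only (zero noise tolerance).
    A pair (a, b) is read as "constraint on a with respect to b".
    """
    activities = sorted({act for trace in traces for act in trace})
    pairs = list(permutations(activities, 2))

    def every_occurrence(a, check):
        # True if check(trace, i) holds at every position i where a occurs.
        return all(
            check(trace, i)
            for trace in traces
            for i, act in enumerate(trace)
            if act == a
        )

    return {
        # every a is immediately followed by b
        "directly_follows": {
            (a, b) for a, b in pairs
            if every_occurrence(
                a, lambda t, i, b=b: i + 1 < len(t) and t[i + 1] == b)
        },
        # every a has some b earlier in the same trace
        "always_before": {
            (a, b) for a, b in pairs
            if every_occurrence(a, lambda t, i, b=b: b in t[:i])
        },
        # every a has some b later in the same trace
        "always_after": {
            (a, b) for a, b in pairs
            if every_occurrence(a, lambda t, i, b=b: b in t[i + 1:])
        },
        # a and b occur equally often in every trace
        "equivalence": {
            (a, b) for a, b in pairs
            if all(t.count(a) == t.count(b) for t in traces)
        },
        # no trace contains both a and b
        "never_together": {
            (a, b) for a, b in pairs
            if not any(a in t and b in t for t in traces)
        },
        # occurrence counts observed per trace, for each activity
        "activ_occurrences": {
            a: sorted({t.count(a) for t in traces}) for a in activities
        },
    }


# Toy log: each inner list is one case, ordered by time.
traces = [
    ["A", "B", "C"],
    ["A", "B", "C", "B", "C"],
    ["D"],
]
skeleton = sketch_log_skeleton(traces)
for name, relation in skeleton.items():
    print(name, relation)
```

On this toy log, B and C are equivalent (they occur 1, 2, and 0 times in the three cases), D never occurs together with A, B, or C, and B is observed 0, 1, or 2 times per case.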
- Parameters:
log – event log / Pandas dataframe
noise_threshold (float) – noise threshold, acting as described in the paper.
activity_key (str) – attribute to be used for the activity
timestamp_key (str) – attribute to be used for the timestamp
case_id_key (str) – attribute to be used as case identifier
- Return type:
Dict[str, Any]
import pm4py
log_skeleton = pm4py.discover_log_skeleton(dataframe, noise_threshold=0.1, activity_key='concept:name', case_id_key='case:concept:name', timestamp_key='time:timestamp')