pm4py.discovery.discover_log_skeleton

pm4py.discovery.discover_log_skeleton(log: EventLog | DataFrame, noise_threshold: float = 0.0, activity_key: str = 'concept:name', timestamp_key: str = 'time:timestamp', case_id_key: str = 'case:concept:name') → Dict[str, Any]

Discovers a log skeleton from an event log.

A log skeleton is a declarative model that consists of six kinds of constraints:

  • “directly_follows”: specifies, for some activities, strict bounds on the activities that directly follow them. For example, ‘A should be directly followed by B’ and ‘B should be directly followed by C’.

  • “always_before”: specifies that some activities may be executed only if some other activities have been executed earlier in the history of the case. For example, ‘C should always be preceded by A’.

  • “always_after”: specifies that some activities should always trigger the execution of some other activities later in the case. For example, ‘A should always be followed by C’.

  • “equivalence”: specifies that a given pair of activities should occur the same number of times within a case. For example, ‘B and C should always happen the same number of times’.

  • “never_together”: specifies that a given pair of activities should never occur together in the same case. For example, ‘there should be no case containing both C and D’.

  • “activ_occurrences”: specifies the allowed number of occurrences per activity. For example, A is allowed to be executed 1 or 2 times, and B is allowed to be executed 1, 2, 3 or 4 times.
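The constraint kinds listed above can be illustrated on a toy log without pm4py. The following is a minimal sketch in plain Python, assuming a strict reading of each constraint (no noise threshold) and leaving out “directly_follows”. The toy traces and the helper name `holds_for_every_occurrence` are hypothetical, and pm4py's own implementation and output format may differ.

```python
from collections import Counter
from itertools import combinations, permutations

# Hypothetical toy log: each case is the ordered list of its activity names.
traces = [
    ["A", "B", "C"],
    ["A", "B", "B", "C", "C"],
    ["A", "D"],
]
activities = sorted({act for trace in traces for act in trace})
counts = [Counter(trace) for trace in traces]  # occurrences per activity, per case


def holds_for_every_occurrence(x, y, look_before):
    """True if each occurrence of x has some y before it (or after it) in its case."""
    for trace in traces:
        for i, act in enumerate(trace):
            if act != x:
                continue
            window = trace[:i] if look_before else trace[i + 1:]
            if y not in window:
                return False
    return True


# (x, y) in always_before: x occurs only if y occurred earlier in the case.
always_before = {
    (x, y) for x, y in permutations(activities, 2)
    if holds_for_every_occurrence(x, y, look_before=True)
}

# (x, y) in always_after: every x is followed, at some later point, by y.
always_after = {
    (x, y) for x, y in permutations(activities, 2)
    if holds_for_every_occurrence(x, y, look_before=False)
}

# Unordered pairs that occur the same number of times in every case.
equivalence = {
    (x, y) for x, y in combinations(activities, 2)
    if all(c[x] == c[y] for c in counts)
}

# Unordered pairs that never appear in the same case.
never_together = {
    (x, y) for x, y in combinations(activities, 2)
    if not any(c[x] and c[y] for c in counts)
}

# Observed numbers of occurrences per activity across all cases.
activ_occurrences = {a: {c[a] for c in counts} for a in activities}

print(always_before, always_after, equivalence, never_together, activ_occurrences)
```

On this toy log, B and C are equivalent, D never occurs together with B or C, and every B is eventually followed by C.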

Reference paper: Verbeek, H. M. W., and R. Medeiros de Carvalho. “Log skeletons: A classification approach to process discovery.” arXiv preprint arXiv:1806.08247 (2018).

Parameters:
  • log (EventLog | DataFrame) – event log or pandas DataFrame

  • noise_threshold (float) – noise threshold, acting as described in the paper.

  • activity_key (str) – attribute to be used for the activity

  • timestamp_key (str) – attribute to be used for the timestamp

  • case_id_key (str) – attribute to be used as case identifier

Return type:

Dict[str, Any]

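import pm4py

# `dataframe` is assumed to be a pandas DataFrame, loaded beforehand, that contains the columns named below
log_skeleton = pm4py.discover_log_skeleton(dataframe, noise_threshold=0.1, activity_key='concept:name', case_id_key='case:concept:name', timestamp_key='time:timestamp')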