pm4py.statistics.chaotic_activities.variants.niek_sidorova module
PM4Py – A Process Mining Library for Python
Copyright (C) 2024 Process Intelligence Solutions UG (haftungsbeschränkt)
This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License along with this program. If not, see this software project’s root or visit <https://www.gnu.org/licenses/>.
Website: https://processintelligence.solutions Contact: info@processintelligence.solutions
- class pm4py.statistics.chaotic_activities.variants.niek_sidorova.Parameters(*values)
Bases: Enum
- ACTIVITY_KEY = 'pm4py:param:activity_key'
- ALPHA = 'alpha'
- pm4py.statistics.chaotic_activities.variants.niek_sidorova.apply(log: DataFrame | EventLog, parameters: Dict[Any, Any] | None = None) -> List[Dict[str, Any]]
Compute information-theoretic metrics used to detect chaotic activities in an event log, as defined in:
Tax, Niek, Natalia Sidorova, and Wil MP van der Aalst. “Discovering more precise process models from event logs by filtering out chaotic activities.” Journal of Intelligent Information Systems 52.1 (2019): 107-139.
The result maps each activity to:
freq – absolute frequency #(a,L)
entropy – H(a,L) (direct entropy)
entropy_smooth – Hₛ(a,L) (Laplace-smoothed entropy)
entropy_gain – ΔH (drop in total log entropy if a is removed)
chaotic_score – simple aggregate, (entropy_smooth + entropy_gain) / 2
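The entropy metrics listed above can be illustrated with a minimal pure-Python sketch. It is not the library's implementation. It assumes that the entropy of an activity is the sum of the Shannon entropies of its directly-follows and directly-precedes distributions, that entropy is measured in bits, that traces are padded with artificial start and end markers, and that Lidstone smoothing adds α to every possible outcome. The names `activity_entropy`, `_entropy`, `START` and `END` are hypothetical.

```python
import math
from collections import Counter

START, END = "<start>", "<end>"  # hypothetical artificial boundary markers


def _entropy(counts, num_outcomes, alpha=None):
    """Shannon entropy in bits of a count vector, optionally Lidstone-smoothed."""
    total = sum(counts.values())
    if alpha is None:
        if total == 0:
            return 0.0
        probs = [c / total for c in counts.values()]
    else:
        # Assumption: every one of the num_outcomes outcomes gets alpha pseudo-counts.
        denom = total + alpha * num_outcomes
        probs = [(c + alpha) / denom for c in counts.values()]
        probs += [alpha / denom] * (num_outcomes - len(counts))
    return -sum(p * math.log2(p) for p in probs if p > 0)


def activity_entropy(traces, activity, alpha=None):
    """Entropy of what directly follows plus what directly precedes `activity`."""
    activities = {a for trace in traces for a in trace}
    followers, predecessors = Counter(), Counter()
    for trace in traces:
        padded = [START, *trace, END]
        for prev, cur, nxt in zip(padded, padded[1:], padded[2:]):
            if cur == activity:
                predecessors[prev] += 1
                followers[nxt] += 1
    # Possible neighbours: every activity plus one boundary marker per direction.
    n = len(activities) + 1
    return _entropy(followers, n, alpha) + _entropy(predecessors, n, alpha)


traces = [["a", "x", "b"], ["a", "b", "x"], ["x", "a", "b"]]
alpha = 1 / 3  # alpha = 1/|A| with three distinct activities

raw_x = activity_entropy(traces, "x")
raw_a = activity_entropy(traces, "a")
smooth_x = activity_entropy(traces, "x", alpha=alpha)
```

In this toy log, `x` occurs at a different position in every trace, so its neighbours are spread evenly and `raw_x` is higher than `raw_a`, whose neighbours are more predictable.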
- Parameters:
log – Event log or Pandas dataframe
parameters – Variant-specific parameters, including:
Parameters.ALPHA: Laplace/Lidstone smoothing parameter α. None reproduces the raw entropy H(a,L); a typical choice following the paper is α = 1/|A|.
Parameters.ACTIVITY_KEY: the attribute to be used as activity. Default: “concept:name”
- Returns:
List of dictionaries, one per activity, sorted in descending order of chaotic score.
- Return type:
chaotic_activities
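As a rough illustration of the aggregation and ordering described above, the following sketch recomputes the score as (entropy_smooth + entropy_gain) / 2 and sorts in descending order. It does not call pm4py. The `activity` key, the helper name `rank_by_chaotic_score` and all numeric values are hypothetical placeholders, not results from a real log.

```python
def rank_by_chaotic_score(rows):
    """Attach chaotic_score to each row and sort from most to least chaotic."""
    scored = [
        {**row, "chaotic_score": (row["entropy_smooth"] + row["entropy_gain"]) / 2}
        for row in rows
    ]
    return sorted(scored, key=lambda row: row["chaotic_score"], reverse=True)


# Made-up values, for illustration only.
rows = [
    {"activity": "register", "entropy_smooth": 1.0, "entropy_gain": 0.5},
    {"activity": "send reminder", "entropy_smooth": 3.0, "entropy_gain": 2.0},
    {"activity": "archive", "entropy_smooth": 0.5, "entropy_gain": 0.0},
]

ranked = rank_by_chaotic_score(rows)
for row in ranked:
    print(row["activity"], row["chaotic_score"])
```

Activities at the top of such a ranking are the first candidates for filtering before process discovery.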
- pm4py.statistics.chaotic_activities.variants.niek_sidorova.chaotic_metrics(traces, alpha=None)
- Parameters:
traces (list[list[str]]) – The event log where each inner list is a trace (ordered events).
alpha (float | None) – Laplace/Lidstone smoothing parameter α. None reproduces the raw entropy H(a,L); a typical choice following the paper is α = 1/|A|.
- Return type: