pm4py.algo.decision_mining package#

PM4Py – A Process Mining Library for Python

Copyright (C) 2024 Process Intelligence Solutions UG (haftungsbeschränkt)

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License along with this program. If not, see this software project’s root or visit <https://www.gnu.org/licenses/>.

Website: https://processintelligence.solutions Contact: info@processintelligence.solutions

Submodules#

pm4py.algo.decision_mining.algorithm module#

class pm4py.algo.decision_mining.algorithm.Parameters(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]#

Bases: Enum

ACTIVITY_KEY = 'pm4py:param:activity_key'#
LABELS = 'labels'#
pm4py.algo.decision_mining.algorithm.create_data_petri_nets_with_decisions(log: EventLog | DataFrame, net: PetriNet, initial_marking: Marking, final_marking: Marking) -> Tuple[PetriNet, Marking, Marking][source]#

Given a Petri net, create a data Petri net with the decisions given for each place by the decision mining algorithm.

Parameters#

log

Event log (EventLog or DataFrame).

net

Petri net.

initial_marking

Initial marking of the Petri net.

final_marking

Final marking of the Petri net.

Returns#

data_petri_net

Petri net enriched with guards (conditions).

initial_marking

Initial marking (unchanged).

final_marking

Final marking (unchanged).
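
A minimal usage sketch, assuming an XES log on disk and a model discovered with the inductive miner. The helper name mine_guards and the path argument are hypothetical, and reading the conditions from the "guard" key of each transition's properties is an assumption about where the enriched net stores them.

```python
def mine_guards(xes_path):
    """Discover a Petri net from a log and return the guards found on its transitions."""
    # Imports are local so this sketch can be loaded without pm4py installed.
    import pm4py
    from pm4py.algo.decision_mining import algorithm as decision_mining

    log = pm4py.read_xes(xes_path)
    net, im, fm = pm4py.discover_petri_net_inductive(log)

    # Enrich the net with the decisions mined for each decision point.
    net, im, fm = decision_mining.create_data_petri_nets_with_decisions(log, net, im, fm)

    # Assumption: guards are stored under the "guard" key of the transition properties.
    return {
        t.name: t.properties["guard"]
        for t in net.transitions
        if "guard" in t.properties
    }
```

Calling mine_guards with the path of a log would return a dictionary from transition name to guard, which is empty when no guard could be learned.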

pm4py.algo.decision_mining.algorithm.get_decision_tree(log: EventLog | DataFrame, net: PetriNet, initial_marking: Marking, final_marking: Marking, decision_point=None, attributes=None, parameters: Dict[str | Parameters, Any] | None = None) -> Any[source]#

Gets a decision tree classifier on a specific point of the model.

Parameters#

log

Event log (EventLog or DataFrame).

net

Petri net.

initial_marking

Initial marking.

final_marking

Final marking.

decision_point

Name of the place in which the decision happens. If not specified, the method raises an Exception that lists the possible decision points.

attributes

Attributes of the log. If not specified, an automatic attribute selection is performed.

parameters

Parameters of the algorithm.

Returns#

clf

Fitted decision tree classifier.

feature_names

The names of the features used to fit the classifier.

classes

The classes (i.e., transitions) the classifier distinguishes.
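
Since decision_point must name a place of the net, one way to obtain a valid value is to list the decision points first. The sketch below is hedged: the helper name fit_tree_at_first_decision_point is hypothetical, picking the alphabetically first place is an arbitrary choice made for illustration, and the log and accepting Petri net are assumed to be already loaded.

```python
def fit_tree_at_first_decision_point(log, net, initial_marking, final_marking):
    """Fit a decision tree at one decision point of the net, or return None if there is none."""
    # Import is local so this sketch can be loaded without pm4py installed.
    from pm4py.algo.decision_mining import algorithm as decision_mining

    # Dictionary mapping place name -> outgoing transitions.
    points = decision_mining.get_decision_points(net)
    if not points:
        return None

    # Arbitrary choice for illustration: take the alphabetically first place name.
    place_name = sorted(points)[0]

    clf, feature_names, classes = decision_mining.get_decision_tree(
        log, net, initial_marking, final_marking, decision_point=place_name
    )
    return place_name, clf, feature_names, classes
```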

pm4py.algo.decision_mining.algorithm.apply(log: EventLog | DataFrame, net: PetriNet, initial_marking: Marking, final_marking: Marking, decision_point=None, attributes=None, parameters: Dict[str | Parameters, Any] | None = None) -> Any[source]#

Gets the information needed to learn a classifier: the features, the target classes, and the names of the target classes.

Parameters#

log

Event log (EventLog or DataFrame).

net

Petri net.

initial_marking

Initial marking.

final_marking

Final marking.

decision_point

Name of the place in which the decision happens. If not specified, the function raises an Exception that lists the possible decision points.

attributes

Attributes of the log. If not specified, an automatic attribute selection is performed.

parameters

Parameters of the algorithm.

Returns#

X

DataFrame of features.

y

Series of encoded target classes (integer).

class_name

Mapping of integer class -> actual transition name.

pm4py.algo.decision_mining.algorithm.get_decisions_table(log0, net, initial_marking, final_marking, attributes=None, use_trace_attributes=False, k=1, pre_decision_points=None, trace_attributes=None, parameters=None)[source]#

Builds a decision table out of a log and an accepting Petri net.

For each place that has multiple outgoing arcs (a “decision point”), we record the attributes that preceded the choice of a particular transition.

Parameters#

log0

Event log (EventLog or DataFrame).

net

Petri net.

initial_marking

Initial marking.

final_marking

Final marking.

attributes

List of event attributes to consider (if not provided, all are considered).

use_trace_attributes

Whether to include trace attributes (e.g., case-level data) in the decision table.

k

Number of most recent events to look back at for each decision (default: 1).

pre_decision_points

List of place names that should be considered. If None, the code infers them automatically.

trace_attributes

List of trace attribute names to consider. If None, all trace attributes are considered (only relevant when use_trace_attributes=True).

parameters

Additional parameters (e.g., {Parameters.LABELS: True/False}).

Returns#

I

A dictionary keyed by place name. Values are lists of tuples (dict_of_attributes, chosen_transition).

decision_points

The dictionary of decision points (places with multiple outgoing arcs), possibly filtered by pre_decision_points.
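
To make the shape of I and the role of k concrete, here is a plain-Python sketch that does not use pm4py. The input format (a flat list of event dictionaries plus a list of already-known decisions) and the function name build_decision_table are hypothetical; in the library the chosen transitions come from replaying the log on the net. Letting newer events overwrite older values inside the window is an assumption of this sketch.

```python
def build_decision_table(events, decisions, k=1):
    """Collect, per place, (attributes of the last k events, chosen transition).

    events: list of attribute dicts in trace order.
    decisions: list of (event_index, place_name, chosen_transition); the choice
        at place_name is taken right after events[event_index].
    """
    table = {}
    for idx, place, chosen in decisions:
        # Window of the k most recent events, ending at the deciding event.
        window = events[max(0, idx - k + 1): idx + 1]
        attrs = {}
        for event in window:
            # Oldest first, so newer values overwrite older ones (sketch assumption).
            attrs.update(event)
        table.setdefault(place, []).append((attrs, chosen))
    return table


events = [{"amount": 120}, {"risk": "low"}, {"risk": "high"}]
decisions = [(1, "p1", "approve"), (2, "p1", "reject")]

table_k1 = build_decision_table(events, decisions, k=1)
table_k2 = build_decision_table(events, decisions, k=2)
```

With k=1 only the deciding event contributes attributes; with k=2 the first decision also sees the amount recorded one event earlier.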

pm4py.algo.decision_mining.algorithm.prepare_event_log(log)[source]#

If trace attributes are considered, we want to differentiate them from event attributes. For trace attributes, we prepend “t_”. For event attributes, we prepend “e_”.

This helps avoid collisions when both trace and event attributes share the same name.

Parameters#

log : EventLog

The original log.

Returns#

EventLog

The modified log with attribute names prefixed.
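
The prefixing idea can be sketched on a toy log made of plain dictionaries. This is an illustration of the naming scheme only, not the library's implementation: the nested "attributes"/"events" layout and the function name prefix_attributes are hypothetical stand-ins for an EventLog.

```python
def prefix_attributes(traces):
    """Return a copy of the toy log with "t_" on trace attributes and "e_" on event attributes."""
    prefixed = []
    for trace in traces:
        prefixed.append({
            "attributes": {"t_" + key: value for key, value in trace["attributes"].items()},
            "events": [
                {"e_" + key: value for key, value in event.items()}
                for event in trace["events"]
            ],
        })
    return prefixed


# "amount" exists at both levels, which is exactly the collision the prefixes avoid.
traces = [{
    "attributes": {"amount": 500},
    "events": [{"amount": 120, "resource": "Ann"}],
}]
prefixed = prefix_attributes(traces)
```

After prefixing, the trace-level value is found under t_amount and the event-level value under e_amount, so both can appear as separate features.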

pm4py.algo.decision_mining.algorithm.prepare_attributes(attributes)[source]#

If trace attributes are considered, we assume all the user-provided attributes refer to event attributes and prepend “e_” to them.

Parameters#

attributes : list

List of original attribute names.

Returns#

list

List of attribute names, each prefixed by “e_”.

pm4py.algo.decision_mining.algorithm.get_decision_points(net, labels=False, pre_decision_points=None, parameters=None)[source]#

Identifies “decision points” in the net, i.e., places with >= 2 outgoing arcs.

Parameters#

net : PetriNet

The Petri net under analysis.

labels : bool

Whether to list the labels of transitions as values rather than the raw transition names.

pre_decision_points : list or None

If provided, only return decision points that appear in this list (filter).

parameters : dict

(Unused in this function except for consistency.)

Returns#

dict

A dictionary mapping place_name -> list of outgoing transition names or labels.
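
The criterion (a place with at least two outgoing arcs) can be illustrated on a toy net given as a set of place names and a list of (source, target) arcs. This representation and the function name find_decision_points are hypothetical and are not pm4py's PetriNet objects.

```python
from collections import defaultdict


def find_decision_points(arcs, places):
    """Return {place: sorted outgoing transitions} for places with >= 2 outgoing arcs."""
    outgoing = defaultdict(list)
    for source, target in arcs:
        # Only arcs that leave a place matter; arcs leaving transitions are skipped.
        if source in places:
            outgoing[source].append(target)
    return {place: sorted(targets) for place, targets in outgoing.items() if len(targets) >= 2}


places = {"source", "p1", "p2", "sink"}
arcs = [
    ("source", "register"), ("register", "p1"),
    ("p1", "approve"), ("p1", "reject"),
    ("approve", "p2"), ("reject", "p2"),
    ("p2", "archive"), ("archive", "sink"),
]
decision_points = find_decision_points(arcs, places)
print(decision_points)  # {'p1': ['approve', 'reject']}
```

Only p1 qualifies, because it is the single place from which two different transitions can fire.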

pm4py.algo.decision_mining.algorithm.get_attributes(log, decision_points, attributes, use_trace_attributes, trace_attributes, k, net, initial_marking, final_marking, decision_points_names, parameters=None)[source]#

For each decision place, collects the table of (attributes -> chosen transition) pairs, with one entry per occurrence of a decision.

This function internally uses token-based replay (or alignment for non-fitting traces) to discover the actual transitions that were used from the log. Then, for each place with multiple outgoing arcs, we store the attributes that led to a certain chosen transition.

Parameters#

log : EventLog

The event log.

decision_points : dict

Dictionary mapping place_name -> list of possible transitions (IDs/names).

attributes : list

Attributes to consider from events.

use_trace_attributes : bool

Whether to consider trace-level attributes as well.

trace_attributes : list

List of trace-level attributes to consider.

k : int

Number of events to look back at each decision (the “window size”).

net : PetriNet

The Petri net.

initial_marking : Marking

Initial marking.

final_marking : Marking

Final marking.

decision_points_names : dict

Dictionary mapping place_name -> list of transition labels (if labels=True).

parameters : dict

Additional parameters (e.g. {Parameters.LABELS: True/False}).

Returns#

dict

A dictionary keyed by place name, with each value a list of tuples: (attributes_dict, chosen_transition).

pm4py.algo.decision_mining.algorithm.encode_target(df, target_column)[source]#

Adds a ‘Target’ column to df with integer-encoded classes derived from an existing column (target_column).

Method adapted from: http://chrisstrelioff.ws/sandbox/2015/06/08/decision_trees_in_python_with_scikit_learn_and_pandas.html

Parameters#

df : pd.DataFrame

The DataFrame containing the target_column.

target_column : str

The name of the column to map to integer classes.

Returns#

(df_mod, targets)

df_mod is the modified DataFrame with a ‘Target’ column. targets is the list of unique target names in their mapped order.
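
The encoding step can be sketched without pandas on a list of row dictionaries. This is a stand-in for the DataFrame version, not the library code: the function name encode_target_rows is hypothetical, and numbering classes in order of first appearance is an assumption of this sketch.

```python
def encode_target_rows(rows, target_column):
    """Return (rows with an integer 'Target' field, list of class names in mapped order)."""
    targets = []
    for row in rows:
        # Record each class name the first time it is seen.
        if row[target_column] not in targets:
            targets.append(row[target_column])
    mapping = {name: index for index, name in enumerate(targets)}
    # Build new dicts so the input rows are left unmodified.
    encoded = [dict(row, Target=mapping[row[target_column]]) for row in rows]
    return encoded, targets


rows = [
    {"amount": 120, "next": "approve"},
    {"amount": 900, "next": "reject"},
    {"amount": 80, "next": "approve"},
]
encoded, targets = encode_target_rows(rows, "next")
```

The position of a name in targets is its integer code, so targets[code] recovers the original class name from a prediction.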