pm4py.analysis module

pm4py.analysis.construct_synchronous_product_net(trace: Trace, petri_net: PetriNet, initial_marking: Marking, final_marking: Marking) → Tuple[PetriNet, Marking, Marking]

Constructs the synchronous product net between a trace and a Petri net process model.

Parameters:
  • trace – A trace from an event log.

  • petri_net – The Petri net process model.

  • initial_marking – The initial marking of the Petri net.

  • final_marking – The final marking of the Petri net.

Returns:

A tuple containing the synchronous product net, its initial marking, and its final marking.

Return type:

Tuple[PetriNet, Marking, Marking]

import pm4py

net, im, fm = pm4py.read_pnml('model.pnml')
# read_xes may return a DataFrame; convert so that log[0] is a Trace
log = pm4py.convert_to_event_log(pm4py.read_xes('log.xes'))
sync_net, sync_im, sync_fm = pm4py.construct_synchronous_product_net(log[0], net, im, fm)
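
To make the construction more concrete, the following is a minimal pure-Python sketch of which moves a synchronous product contains. It is not pm4py's implementation: it ignores places and arcs and only enumerates move pairs. The `>>` skip symbol, the dictionary layout, and the function name `product_moves` are assumptions made for illustration.

```python
SKIP = ">>"  # assumed symbol for "no move" on one side


def product_moves(trace, model_labels):
    """Enumerate the (log, model) move pairs of a synchronous product.

    trace: list of activity names, one per event.
    model_labels: dict mapping model transition name -> label (None if invisible).
    """
    moves = []
    # Log moves: each event can be executed on its own.
    for i, activity in enumerate(trace):
        moves.append(((i, activity), SKIP))
    # Model moves: each model transition can fire on its own.
    for name in model_labels:
        moves.append((SKIP, name))
    # Synchronous moves: an event paired with a transition carrying the same label.
    for i, activity in enumerate(trace):
        for name, label in model_labels.items():
            if label == activity:
                moves.append(((i, activity), name))
    return moves


moves = product_moves(["a", "b"], {"t1": "a", "t2": None, "t3": "b"})
sync = [m for m in moves if m[0] != SKIP and m[1] != SKIP]
print(sync)
```

The invisible transition `t2` only ever appears as a model move, because no event can carry its (missing) label.
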
pm4py.analysis.compute_emd(language1: Dict[List[str], float], language2: Dict[List[str], float]) → float

Computes the Earth Mover Distance (EMD) between two stochastic languages. For example, one language may be extracted from a log, and the other from a process model.

Parameters:
  • language1 – The first stochastic language.

  • language2 – The second stochastic language.

Returns:

The computed Earth Mover Distance.

Return type:

float

import pm4py

log = pm4py.read_xes('tests/input_data/running-example.xes')
language_log = pm4py.get_stochastic_language(log)
print(language_log)
net, im, fm = pm4py.read_pnml('tests/input_data/running-example.pnml')
language_model = pm4py.get_stochastic_language(net, im, fm)
print(language_model)
emd_distance = pm4py.compute_emd(language_log, language_model)
print(emd_distance)
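
As a rough illustration of the input format, the sketch below builds a stochastic language by hand, assuming it is a dictionary that maps each distinct trace to its relative frequency, with the values summing to 1. Tuples are used as keys because Python lists are not hashable. The helper name `stochastic_language` is hypothetical and is not a pm4py function.

```python
from collections import Counter


def stochastic_language(traces):
    """Map each distinct trace (as a tuple) to its relative frequency."""
    counts = Counter(tuple(t) for t in traces)
    total = sum(counts.values())
    return {variant: n / total for variant, n in counts.items()}


traces = [["a", "b"], ["a", "b"], ["a", "c"], ["a", "b"]]
language = stochastic_language(traces)
print(language)
```
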
pm4py.analysis.solve_marking_equation(petri_net: PetriNet, initial_marking: Marking, final_marking: Marking, cost_function: Dict[Transition, float] = None) → float

Solves the marking equation of a Petri net using an Integer Linear Programming (ILP) approach. An optional transition-based cost function can be provided; the solver then minimizes the total cost of the solution.

Parameters:
  • petri_net – The Petri net.

  • initial_marking – The initial marking of the Petri net.

  • final_marking – The final marking of the Petri net.

  • cost_function – (Optional) A dictionary mapping transitions to their associated costs. If not provided, a default cost of 1 is assigned to each transition.

Returns:

The heuristic value obtained by solving the marking equation.

Return type:

float

import pm4py

net, im, fm = pm4py.read_pnml('model.pnml')
heuristic = pm4py.solve_marking_equation(net, im, fm)
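
The marking equation states that the final marking equals the initial marking plus the incidence matrix multiplied by a non-negative vector of transition firing counts. The sketch below does not solve the ILP; it only checks whether a candidate firing-count vector satisfies the equation and computes its cost. The list-based matrix layout and the function name are assumptions for illustration.

```python
def satisfies_marking_equation(incidence, initial, final, firing_counts):
    """Check final == initial + incidence * firing_counts, place by place.

    incidence[p][t] is the net token change in place p when transition t fires.
    """
    if any(x < 0 for x in firing_counts):
        return False
    for row, m0, mf in zip(incidence, initial, final):
        if m0 + sum(c * x for c, x in zip(row, firing_counts)) != mf:
            return False
    return True


# Sequential net: source -(t1)-> p -(t2)-> sink
incidence = [
    [-1, 0],  # source
    [1, -1],  # p
    [0, 1],   # sink
]
initial, final = [1, 0, 0], [0, 0, 1]
costs = [1, 1]      # unit cost per transition firing
solution = [1, 1]   # fire t1 once and t2 once

valid = satisfies_marking_equation(incidence, initial, final, solution)
total_cost = sum(c * x for c, x in zip(costs, solution))
print(valid, total_cost)
```
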
pm4py.analysis.solve_extended_marking_equation(trace: Trace, sync_net: PetriNet, sync_im: Marking, sync_fm: Marking, split_points: List[int] | None = None) → float

Computes a heuristic value (an underestimate of the cost of an alignment) between a trace and a synchronous product net using the extended marking equation with the standard cost function. Under the standard cost function, synchronous moves have a cost of 0, invisible moves have a cost of 1, and all other moves on the model or on the log have a cost of 10,000. The split points are provisioned optimally.

Parameters:
  • trace – The trace to evaluate.

  • sync_net – The synchronous product net.

  • sync_im – The initial marking of the synchronous net.

  • sync_fm – The final marking of the synchronous net.

  • split_points – (Optional) The indices of the events in the trace to be used as split points. If not specified, the split points are identified automatically.

Returns:

The heuristic value representing the cost underestimation.

Return type:

float

import pm4py

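net, im, fm = pm4py.read_pnml('model.pnml')
# read_xes may return a DataFrame; convert so that log[0] is a Trace
log = pm4py.convert_to_event_log(pm4py.read_xes('log.xes'))
# The function expects the synchronous product net, not the model itself
sync_net, sync_im, sync_fm = pm4py.construct_synchronous_product_net(log[0], net, im, fm)
ext_mark_eq_heu = pm4py.solve_extended_marking_equation(log[0], sync_net, sync_im, sync_fm)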
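
The standard cost function described above can be sketched as follows. The cost values come from the description; the encoding of a move as a `(log_label, model_label)` pair, the `>>` skip symbol, and the use of `None` for an invisible transition are assumptions made for illustration.

```python
SKIP = ">>"  # assumed symbol for "no move" on one side
SYNC_COST, INVISIBLE_COST, DEVIATION_COST = 0, 1, 10_000


def move_cost(log_label, model_label):
    """Cost of one alignment move under the standard cost function."""
    if log_label == SKIP and model_label is None:
        return INVISIBLE_COST  # model move on an invisible transition
    if log_label != SKIP and model_label != SKIP:
        return SYNC_COST       # synchronous move
    return DEVIATION_COST      # move on log only or move on model only


alignment = [("a", "a"), (SKIP, None), ("x", SKIP), (SKIP, "b")]
total = sum(move_cost(log, model) for log, model in alignment)
print(total)
```

The large gap between 1 and 10,000 means that the number of real deviations dominates the total, while invisible moves only break ties.
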
pm4py.analysis.check_soundness(petri_net: PetriNet, initial_marking: Marking, final_marking: Marking, print_diagnostics: bool = False) → Tuple[bool, Dict[str, Any]]

Checks if a given Petri net is a sound Workflow net (WF-net).

A Petri net is a WF-net if and only if:
  • It has a unique source place.

  • It has a unique sink place.

  • Every element in the WF-net is on a path from the source to the sink place.

A WF-net is sound if and only if:
  • It contains no live-locks.

  • It contains no deadlocks.

  • It is always possible to reach the final marking from any reachable marking.

For a formal definition of a sound WF-net, refer to: http://www.padsweb.rwth-aachen.de/wvdaalst/publications/p628.pdf

The returned tuple consists of:
  • A boolean indicating whether the Petri net is a sound WF-net.

  • A dictionary containing diagnostics collected while running WOFLAN, associating diagnostic names with their corresponding details.

Parameters:
  • petri_net – The Petri net to check.

  • initial_marking – The initial marking of the Petri net.

  • final_marking – The final marking of the Petri net.

  • print_diagnostics – If True, additional diagnostics will be printed during the execution of WOFLAN.

Returns:

A tuple containing a boolean indicating soundness and a dictionary of diagnostics.

Return type:

Tuple[bool, Dict[str, Any]]

import pm4py

net, im, fm = pm4py.read_pnml('model.pnml')
# The function returns a (bool, diagnostics) tuple, so unpack both values
is_sound, diagnostics = pm4py.check_soundness(net, im, fm)
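
The third soundness condition (the final marking is reachable from every reachable marking) can be illustrated with a brute-force sketch for small, bounded nets. This is not the WOFLAN procedure and covers only that one condition. The net representation (a dictionary of consume/produce maps), the function names, and the state limit are assumptions for illustration.

```python
from collections import deque


def fire(marking, consume, produce):
    """Return the marking after firing, or None if the transition is not enabled."""
    m = dict(marking)
    for place, n in consume.items():
        if m.get(place, 0) < n:
            return None
        m[place] -= n
    for place, n in produce.items():
        m[place] = m.get(place, 0) + n
    return frozenset((p, n) for p, n in m.items() if n > 0)


def reachable(transitions, start, limit=10_000):
    """Breadth-first exploration of the markings reachable from start."""
    seen, queue = {start}, deque([start])
    while queue:
        marking = queue.popleft()
        for consume, produce in transitions.values():
            nxt = fire(marking, consume, produce)
            if nxt is not None and nxt not in seen:
                if len(seen) >= limit:
                    raise RuntimeError("state space too large (net may be unbounded)")
                seen.add(nxt)
                queue.append(nxt)
    return seen


def can_always_complete(transitions, initial, final):
    """True if the final marking is reachable from every reachable marking."""
    return all(final in reachable(transitions, m)
               for m in reachable(transitions, initial))


ok_net = {"t1": ({"source": 1}, {"p": 1}), "t2": ({"p": 1}, {"sink": 1})}
# t3 moves the token to a place with no way out, which creates a deadlock.
bad_net = dict(ok_net, t3=({"p": 1}, {"dead": 1}))
initial = frozenset({("source", 1)})
final = frozenset({("sink", 1)})
print(can_always_complete(ok_net, initial, final))
print(can_always_complete(bad_net, initial, final))
```
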
pm4py.analysis.cluster_log(log: EventLog | EventStream | DataFrame, sklearn_clusterer=None, activity_key: str = 'concept:name', timestamp_key: str = 'time:timestamp', case_id_key: str = 'case:concept:name') → Generator[EventLog, None, None]

Applies clustering to the provided event log by extracting profiles for the log’s traces and clustering them using a Scikit-Learn clusterer (default is K-Means with two clusters).

Parameters:
  • log – The event log to cluster.

  • sklearn_clusterer – (Optional) The Scikit-Learn clusterer to use. Default is KMeans with n_clusters=2, random_state=0, and n_init="auto".

  • activity_key – The key used to identify activities in the log.

  • timestamp_key – The key used to identify timestamps in the log.

  • case_id_key – The key used to identify case IDs in the log.

Returns:

A generator that yields clustered event logs as pandas DataFrames.

Return type:

Generator[pd.DataFrame, None, None]

import pm4py

for clust_log in pm4py.cluster_log(df):
    print(clust_log)
pm4py.analysis.insert_artificial_start_end(log: EventLog | DataFrame, activity_key: str = 'concept:name', timestamp_key: str = 'time:timestamp', case_id_key: str = 'case:concept:name', artificial_start='▶', artificial_end='■') → EventLog | DataFrame

Inserts artificial start and end activities into an event log or a Pandas DataFrame.

Parameters:
  • log – The event log or Pandas DataFrame to modify.

  • activity_key – The attribute key used for activities.

  • timestamp_key – The attribute key used for timestamps.

  • case_id_key – The attribute key used to identify cases.

  • artificial_start – The symbol to use for the artificial start activity.

  • artificial_end – The symbol to use for the artificial end activity.

Returns:

The event log or Pandas DataFrame with artificial start and end activities inserted.

Return type:

Union[EventLog, pd.DataFrame]

import pm4py

dataframe = pm4py.insert_artificial_start_end(
    dataframe,
    activity_key='concept:name',
    case_id_key='case:concept:name',
    timestamp_key='time:timestamp'
)
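
The effect on each case can be pictured with a minimal sketch on plain lists of activity names. It only illustrates the idea and does not handle timestamps or DataFrame columns; the function name `add_artificial_start_end` is hypothetical.

```python
def add_artificial_start_end(traces, artificial_start="▶", artificial_end="■"):
    """Return new traces wrapped in artificial start and end activities."""
    return [[artificial_start, *trace, artificial_end] for trace in traces]


traces = [["register", "pay"], ["register", "cancel"]]
wrapped = add_artificial_start_end(traces)
print(wrapped[0])
```
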
pm4py.analysis.insert_case_service_waiting_time(log: EventLog | DataFrame, service_time_column: str = '@@service_time', sojourn_time_column: str = '@@sojourn_time', waiting_time_column: str = '@@waiting_time', activity_key: str = 'concept:name', timestamp_key: str = 'time:timestamp', case_id_key: str = 'case:concept:name', start_timestamp_key: str = 'time:timestamp') → DataFrame

Inserts service time, waiting time, and sojourn time information for each case into a Pandas DataFrame.

Parameters:
  • log – The event log or Pandas DataFrame to modify.

  • service_time_column – The name of the column to store service times.

  • sojourn_time_column – The name of the column to store sojourn times.

  • waiting_time_column – The name of the column to store waiting times.

  • activity_key – The attribute key used for activities.

  • timestamp_key – The attribute key used for timestamps.

  • case_id_key – The attribute key used to identify cases.

  • start_timestamp_key – The attribute key used for the start timestamp of cases.

Returns:

A Pandas DataFrame with the inserted service, waiting, and sojourn time columns.

Return type:

pd.DataFrame

import pm4py

dataframe = pm4py.insert_case_service_waiting_time(
    dataframe,
    activity_key='concept:name',
    timestamp_key='time:timestamp',
    case_id_key='case:concept:name',
    start_timestamp_key='time:timestamp'
)
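
The three quantities can be sketched for a single case as follows. The definitions used here are assumptions for illustration and may differ from pm4py's exact computation: service time is the sum of the event durations, sojourn time is the span from the first start to the last completion, and waiting time is their difference.

```python
def case_times(events):
    """Compute (service, sojourn, waiting) times in seconds for one case.

    events: list of (start_seconds, complete_seconds) pairs, one per event.
    """
    service = sum(complete - start for start, complete in events)
    sojourn = max(c for _, c in events) - min(s for s, _ in events)
    waiting = sojourn - service
    return service, sojourn, waiting


# Work 0-10, idle 10-30, work 30-50, idle 50-60, work 60-65.
events = [(0, 10), (30, 50), (60, 65)]
service, sojourn, waiting = case_times(events)
print(service, sojourn, waiting)
```

When the start and completion timestamps use the same column, every event has zero duration under these definitions, so the service time is 0 and the waiting time equals the sojourn time.
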
pm4py.analysis.insert_case_arrival_finish_rate(log: EventLog | DataFrame, arrival_rate_column: str = '@@arrival_rate', finish_rate_column: str = '@@finish_rate', activity_key: str = 'concept:name', timestamp_key: str = 'time:timestamp', case_id_key: str = 'case:concept:name', start_timestamp_key: str = 'time:timestamp') → DataFrame

Inserts arrival and finish rate information for each case into a Pandas DataFrame.

The arrival rate is computed as the time difference between the start of the current case and the start of the previous case to start. The finish rate is computed as the time difference between the end of the current case and the end of the next case to finish.

Parameters:
  • log – The event log or Pandas DataFrame to modify.

  • arrival_rate_column – The name of the column to store arrival rates.

  • finish_rate_column – The name of the column to store finish rates.

  • activity_key – The attribute key used for activities.

  • timestamp_key – The attribute key used for timestamps.

  • case_id_key – The attribute key used to identify cases.

  • start_timestamp_key – The attribute key used for the start timestamp of cases.

Returns:

A Pandas DataFrame with the inserted arrival and finish rate columns.

Return type:

pd.DataFrame

import pm4py

dataframe = pm4py.insert_case_arrival_finish_rate(
    dataframe,
    activity_key='concept:name',
    timestamp_key='time:timestamp',
    case_id_key='case:concept:name',
    start_timestamp_key='time:timestamp'
)
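
The two definitions given above can be sketched on a plain dictionary of case start and end times. The first case to start has no predecessor and the last case to finish has no successor; this sketch uses 0.0 in both situations, which is an assumption rather than documented behavior. The function name and data layout are also hypothetical.

```python
def arrival_finish_rates(cases):
    """cases: dict case_id -> (start_seconds, end_seconds).

    Returns dict case_id -> (arrival_rate, finish_rate).
    """
    by_start = sorted(cases, key=lambda c: cases[c][0])
    by_end = sorted(cases, key=lambda c: cases[c][1])
    arrival, finish = {}, {}
    for i, case in enumerate(by_start):
        # Difference to the start of the previous case to start (0.0 if none).
        prev_start = cases[by_start[i - 1]][0] if i > 0 else cases[case][0]
        arrival[case] = float(cases[case][0] - prev_start)
    for i, case in enumerate(by_end):
        # Difference to the end of the next case to finish (0.0 if none).
        next_end = cases[by_end[i + 1]][1] if i + 1 < len(by_end) else cases[case][1]
        finish[case] = float(next_end - cases[case][1])
    return {c: (arrival[c], finish[c]) for c in cases}


cases = {"c1": (0, 100), "c2": (30, 80), "c3": (45, 150)}
rates = arrival_finish_rates(cases)
print(rates)
```

Note that the order by start time (c1, c2, c3) differs from the order by end time (c2, c1, c3), which is why the two rates are computed over separately sorted sequences.
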
pm4py.analysis.check_is_workflow_net(net: PetriNet) → bool

Checks if the input Petri net satisfies the WF-net (Workflow net) conditions:
  • It has a unique source place.

  • It has a unique sink place.

  • Every node is on a path from the source to the sink.

Parameters:
  • net – The Petri net to check.

Returns:

True if the Petri net is a WF-net, False otherwise.

Return type:

bool

import pm4py

net, im, fm = pm4py.read_pnml('model.pnml')
is_wfnet = pm4py.check_is_workflow_net(net)
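
The three WF-net conditions can be checked on a plain graph representation as in the sketch below. It is a minimal illustration, not pm4py's implementation; the set-based representation of places, transitions, and arcs and the function name `is_workflow_net` are assumptions.

```python
def is_workflow_net(places, transitions, arcs):
    """places, transitions: sets of node names; arcs: set of (source, target) pairs."""
    nodes = places | transitions
    succ = {n: set() for n in nodes}
    pred = {n: set() for n in nodes}
    for a, b in arcs:
        succ[a].add(b)
        pred[b].add(a)

    # Conditions 1 and 2: exactly one place without inputs, one without outputs.
    sources = [p for p in places if not pred[p]]
    sinks = [p for p in places if not succ[p]]
    if len(sources) != 1 or len(sinks) != 1:
        return False

    def closure(start, edges):
        """All nodes reachable from start by following the given edge map."""
        seen, stack = {start}, [start]
        while stack:
            for nxt in edges[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    # Condition 3: every node is reachable from the source and reaches the sink.
    return closure(sources[0], succ) == nodes and closure(sinks[0], pred) == nodes


places = {"source", "p", "sink"}
transitions = {"t1", "t2"}
arcs = {("source", "t1"), ("t1", "p"), ("p", "t2"), ("t2", "sink")}
print(is_workflow_net(places, transitions, arcs))
print(is_workflow_net(places | {"q"}, transitions, arcs))
```
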
pm4py.analysis.maximal_decomposition(net: PetriNet, im: Marking, fm: Marking) → List[Tuple[PetriNet, Marking, Marking]]

Calculates the maximal decomposition of an accepting Petri net into its maximal components.

Parameters:
  • net – The Petri net to decompose.

  • im – The initial marking of the Petri net.

  • fm – The final marking of the Petri net.

Returns:

A list of tuples, each containing a subnet, its initial marking, and its final marking.

Return type:

List[Tuple[PetriNet, Marking, Marking]]

import pm4py

net, im, fm = pm4py.read_pnml('model.pnml')
list_nets = pm4py.maximal_decomposition(net, im, fm)
for subnet, subim, subfm in list_nets:
    pm4py.view_petri_net(subnet, subim, subfm, format='svg')
pm4py.analysis.simplicity_petri_net(net: PetriNet, im: Marking, fm: Marking, variant: str | None = 'arc_degree') → float

Computes the simplicity metric for a given Petri net model.

Three approaches are supported:
  • Arc Degree Simplicity: Described in the paper “ProDiGen: Mining complete, precise and minimal structure process models with a genetic algorithm.” by Vázquez-Barreiros, Borja, Manuel Mucientes, and Manuel Lama. Information Sciences, 294 (2015): 315-333.

  • Extended Cardoso Metric: Described in the paper “Complexity Metrics for Workflow Nets” by Lassen, Kristian Bisgaard, and Wil MP van der Aalst.

  • Extended Cyclomatic Metric: Also described in the paper “Complexity Metrics for Workflow Nets” by Lassen, Kristian Bisgaard, and Wil MP van der Aalst.

Parameters:
  • net – The Petri net for which to compute simplicity.

  • im – The initial marking of the Petri net.

  • fm – The final marking of the Petri net.

  • variant – The simplicity metric variant to use ('arc_degree', 'extended_cardoso', 'extended_cyclomatic').

Returns:

The computed simplicity value.

Return type:

float

import pm4py

net, im, fm = pm4py.discover_petri_net_inductive(
    dataframe,
    activity_key='concept:name',
    case_id_key='case:concept:name',
    timestamp_key='time:timestamp'
)
simplicity = pm4py.simplicity_petri_net(net, im, fm, variant='arc_degree')
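
For intuition about the arc degree variant, the sketch below assumes the metric is defined as 1 / (1 + max(mean_degree - k, 0)), where mean_degree is the average number of arcs attached to a place or transition and k = 2. Both the formula and the constant are assumptions stated for illustration; consult the cited paper and the pm4py source before relying on exact values.

```python
def arc_degree_simplicity(degrees, k=2):
    """degrees: incoming plus outgoing arc count for every place and transition."""
    mean_degree = sum(degrees) / len(degrees) if degrees else 0.0
    return 1.0 / (1.0 + max(mean_degree - k, 0))


# Sequence net source -> t1 -> p -> t2 -> sink has degrees 1, 2, 2, 2, 1.
simple = arc_degree_simplicity([1, 2, 2, 2, 1])
# A densely connected net in which every node has four arcs.
dense = arc_degree_simplicity([4, 4, 4, 4])
print(simple, dense)
```

Under this assumed formula, any net whose mean degree does not exceed k scores 1.0, and the score decreases as nodes become more densely connected.
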
pm4py.analysis.generate_marking(net: PetriNet, place_or_dct_places: str | Place | Dict[str, int] | Dict[Place, int]) → Marking

Generates a marking for a given Petri net based on specified places and token counts.

Parameters:
  • net – The Petri net for which to generate the marking.

  • place_or_dct_places – Specifies the places and their token counts for the marking. It can be:
      - A single PetriNet.Place object, which will have one token.
      - A string representing the name of a place, which will have one token.
      - A dictionary mapping PetriNet.Place objects to their respective number of tokens.
      - A dictionary mapping place names (strings) to their respective number of tokens.

Returns:

The generated Marking object.

Return type:

Marking

import pm4py

net, im, fm = pm4py.read_pnml('model.pnml')
marking = pm4py.generate_marking(net, {'source': 2})
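
The accepted input forms can be pictured with a sketch that normalizes a place name or a dictionary into a multiset of tokens. It works on place names only and uses a `Counter` in place of a Marking object. How pm4py treats unknown place names is not stated here; raising `ValueError` is an assumption, as is the function name `make_marking`.

```python
from collections import Counter


def make_marking(place_names, spec):
    """Build a marking (Counter of place name -> tokens) from a name or a dict."""
    if isinstance(spec, str):
        spec = {spec: 1}  # a single place name gets one token
    unknown = set(spec) - set(place_names)
    if unknown:
        raise ValueError(f"unknown places: {sorted(unknown)}")
    return Counter({place: tokens for place, tokens in spec.items() if tokens > 0})


places = {"source", "p", "sink"}
single = make_marking(places, "source")
double = make_marking(places, {"source": 2})
print(single, double)
```
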
pm4py.analysis.reduce_petri_net_invisibles(net: PetriNet) → PetriNet

Reduces the number of invisible transitions in the provided Petri net.

Parameters:
  • net – The Petri net to be reduced.

Returns:

The reduced Petri net with fewer invisible transitions.

Return type:

PetriNet

import pm4py

net, im, fm = pm4py.read_pnml('model.pnml')
net = pm4py.reduce_petri_net_invisibles(net)
pm4py.analysis.reduce_petri_net_implicit_places(net: PetriNet, im: Marking, fm: Marking) → Tuple[PetriNet, Marking, Marking]

Reduces the number of implicit places in the provided Petri net.

Parameters:
  • net – The Petri net to be reduced.

  • im – The initial marking of the Petri net.

  • fm – The final marking of the Petri net.

Returns:

A tuple containing the reduced Petri net, its initial marking, and its final marking.

Return type:

Tuple[PetriNet, Marking, Marking]

import pm4py

net, im, fm = pm4py.read_pnml('model.pnml')
net, im, fm = pm4py.reduce_petri_net_implicit_places(net, im, fm)
pm4py.analysis.get_enabled_transitions(net: PetriNet, marking: Marking) → Set[Transition]

Retrieves the set of transitions that are enabled in a given marking of a Petri net.

Parameters:
  • net – The Petri net.

  • marking – The current marking of the Petri net.

Returns:

A set of transitions that are enabled in the provided marking.

Return type:

Set[PetriNet.Transition]

import pm4py

net, im, fm = pm4py.read_pnml('tests/input_data/running-example.pnml')
# Gets the transitions enabled in the initial marking
enabled_transitions = pm4py.get_enabled_transitions(net, im)
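
The enabling rule itself is simple: a transition is enabled when every one of its input places holds at least as many tokens as the weight of the connecting arc. The sketch below applies that rule to plain dictionaries; the data layout and the function name are assumptions for illustration, not pm4py objects.

```python
def enabled_transitions(preset, marking):
    """preset: transition -> {input place: arc weight}; marking: place -> tokens."""
    return {
        t for t, inputs in preset.items()
        if all(marking.get(place, 0) >= weight for place, weight in inputs.items())
    }


preset = {
    "t1": {"source": 1},
    "t2": {"p": 1},
    "t3": {"source": 1, "p": 1},  # needs tokens in both places
}
at_start = enabled_transitions(preset, {"source": 1})
with_both = enabled_transitions(preset, {"source": 1, "p": 1})
print(at_start, with_both)
```
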