pm4py.ocel module

The pm4py.ocel module contains the object-centric process mining features offered in pm4py.

pm4py.ocel.ocel_get_object_types(ocel: OCEL) → List[str]

Returns the list of object types contained in the object-centric event log (e.g., [“order”, “item”, “delivery”]).

Parameters:

ocel (OCEL) – Object-centric event log.

Returns:

List of object types.

Return type:

List[str]

import pm4py

ocel = pm4py.read_ocel('trial.ocel')
object_types = pm4py.ocel_get_object_types(ocel)
pm4py.ocel.ocel_get_attribute_names(ocel: OCEL) → List[str]

Returns the list of attributes at the event and object levels of an object-centric event log (e.g., [“cost”, “amount”, “name”]).

Parameters:

ocel (OCEL) – Object-centric event log.

Returns:

List of attribute names.

Return type:

List[str]

import pm4py

ocel = pm4py.read_ocel('trial.ocel')
attribute_names = pm4py.ocel_get_attribute_names(ocel)
pm4py.ocel.ocel_flattening(ocel: OCEL, object_type: str) → DataFrame

Flattens the object-centric event log to a traditional event log based on a chosen object type. In the flattened log, the objects of the specified type are treated as cases, and each case contains the set of events related to that object. The flattened log follows the XES naming conventions for case identifier, activity, and timestamp:

  • “case:concept:name” is used for the case ID.

  • “concept:name” is used for the activity.

  • “time:timestamp” is used for the timestamp.

Parameters:
  • ocel (OCEL) – Object-centric event log.

  • object_type (str) – The object type to use as cases.

Returns:

Flattened traditional event log.

Return type:

pd.DataFrame

import pm4py

ocel = pm4py.read_ocel('trial.ocel')
event_log = pm4py.ocel_flattening(ocel, 'items')
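To make the flattening idea concrete, here is a minimal pure-Python sketch that does not use pm4py. The toy `relations` list and the `flatten` helper are hypothetical names introduced for illustration; they only show how the objects of one type become cases, and are not how the library implements it.

```python
# Toy event-to-object relations:
# (event id, activity, timestamp, object id, object type).
# Hypothetical data; a real OCEL carries much more information.
relations = [
    ("e1", "Create Order", "2024-01-01T09:00:00", "o1", "order"),
    ("e1", "Create Order", "2024-01-01T09:00:00", "i1", "item"),
    ("e1", "Create Order", "2024-01-01T09:00:00", "i2", "item"),
    ("e2", "Pack Item", "2024-01-01T10:00:00", "i1", "item"),
    ("e3", "Ship Order", "2024-01-02T08:00:00", "o1", "order"),
]


def flatten(relations, object_type):
    """Return one row per (event, object of the chosen type), with XES column names."""
    rows = [
        {"case:concept:name": oid, "concept:name": act, "time:timestamp": ts}
        for (_eid, act, ts, oid, otype) in relations
        if otype == object_type
    ]
    # Order by case, then by timestamp (ISO 8601 strings sort chronologically).
    return sorted(rows, key=lambda r: (r["case:concept:name"], r["time:timestamp"]))


flat_items = flatten(relations, "item")
for row in flat_items:
    print(row)
```

Note that the event "e1" appears in two cases ("i1" and "i2") after flattening, which is why flattened logs can duplicate events.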
pm4py.ocel.ocel_object_type_activities(ocel: OCEL) → Dict[str, Collection[str]]

Returns the set of activities performed for each object type.

Parameters:

ocel (OCEL) – Object-centric event log.

Returns:

Dictionary mapping object types to their associated activities.

Return type:

Dict[str, Collection[str]]

import pm4py

ocel = pm4py.read_ocel('trial.ocel')
ot_activities = pm4py.ocel_object_type_activities(ocel)
pm4py.ocel.ocel_objects_ot_count(ocel: OCEL) → Dict[str, Dict[str, int]]

Returns the count of related objects per type for each event.

Parameters:

ocel (OCEL) – Object-centric event log.

Returns:

Nested dictionary mapping events to object types and their counts.

Return type:

Dict[str, Dict[str, int]]

import pm4py

ocel = pm4py.read_ocel('trial.ocel')
objects_ot_count = pm4py.ocel_objects_ot_count(ocel)
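The shape of the nested dictionary can be illustrated with a small pure-Python sketch. The `relations` data below is hypothetical, and the counting loop is only an illustration of the idea, not pm4py's implementation.

```python
from collections import Counter, defaultdict

# Toy event-to-object relations: (event id, object id, object type).
# Hypothetical data for illustration only.
relations = [
    ("e1", "o1", "order"),
    ("e1", "i1", "item"),
    ("e1", "i2", "item"),
    ("e2", "i1", "item"),
]

# Count, for every event, how many related objects it has per object type.
counts = defaultdict(Counter)
for event_id, _object_id, object_type in relations:
    counts[event_id][object_type] += 1

# Convert to plain dictionaries: event id -> object type -> count.
toy_ot_count = {event_id: dict(c) for event_id, c in counts.items()}
print(toy_ot_count)
```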
pm4py.ocel.ocel_temporal_summary(ocel: OCEL) → DataFrame

Returns the temporal summary of an object-centric event log. The temporal summary aggregates all events that occur at the same timestamp and reports the list of activities and involved objects.

Parameters:

ocel (OCEL) – Object-centric event log.

Returns:

Temporal summary DataFrame.

Return type:

pd.DataFrame

import pm4py

ocel = pm4py.read_ocel('trial.ocel')
temporal_summary = pm4py.ocel_temporal_summary(ocel)
pm4py.ocel.ocel_objects_summary(ocel: OCEL) → DataFrame

Returns the objects summary of an object-centric event log.

Parameters:

ocel (OCEL) – Object-centric event log.

Returns:

Objects summary DataFrame containing lifecycle information and interacting objects.

Return type:

pd.DataFrame

import pm4py

ocel = pm4py.read_ocel('trial.ocel')
objects_summary = pm4py.ocel_objects_summary(ocel)
pm4py.ocel.ocel_objects_interactions_summary(ocel: OCEL) → DataFrame

Returns the objects interactions summary of an object-centric event log. The summary includes a row for every combination of (event, related object, other related object). Properties such as the activity of the event and the object types of the two related objects are included.

Parameters:

ocel (OCEL) – Object-centric event log.

Returns:

Objects interactions summary DataFrame.

Return type:

pd.DataFrame

import pm4py

ocel = pm4py.read_ocel('trial.ocel')
interactions_summary = pm4py.ocel_objects_interactions_summary(ocel)
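A rough pure-Python sketch of the (event, related object, other related object) combinations follows. The `event_objects` mapping is hypothetical, and reporting each pair in both directions is an assumption made for this illustration; the columns and ordering of the actual DataFrame may differ.

```python
from itertools import permutations

# Toy mapping from event id to the objects related to that event (hypothetical data).
event_objects = {
    "e1": ["o1", "i1", "i2"],
    "e2": ["i1"],
}

# One row per event and ordered pair of distinct related objects.
# Assumption: both (a, b) and (b, a) are listed; pm4py may behave differently.
interaction_rows = [
    (event_id, first, second)
    for event_id, objects in event_objects.items()
    for first, second in permutations(objects, 2)
]

for row in interaction_rows:
    print(row)
```

An event with a single related object, such as "e2" here, contributes no rows because there is no other object to pair with.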
pm4py.ocel.discover_ocdfg(ocel: OCEL, business_hours: bool = False, business_hour_slots: List[Tuple[int, int]] | None = [(25200, 61200), (111600, 147600), (198000, 234000), (284400, 320400), (370800, 406800)]) → Dict[str, Any]

Discovers an Object-Centric Directly-Follows Graph (OC-DFG) from an object-centric event log.

Object-centric directly-follows multigraphs are a composition of directly-follows graphs for each object type. These graphs can be annotated with different metrics considering the entities of an object-centric event log (i.e., events, unique objects, total objects).

Reference paper: Berti, Alessandro, and Wil van der Aalst. “Extracting multiple viewpoint models from relational databases.” Data-Driven Process Discovery and Analysis. Springer, Cham, 2018. 24-51.

Parameters:
  • ocel (OCEL) – Object-centric event log.

  • business_hours (bool) – Enable the usage of business hours if set to True.

  • business_hour_slots (Optional[List[Tuple[int, int]]]) – Work schedule of the company, provided as a list of tuples where each tuple represents one time slot of business hours. Each tuple consists of a start and an end time given in seconds since the start of the week (Monday 00:00). For example, [(25200, 61200), (95520, 129600), (219600, 234000)] means that business hours are Mondays 07:00 - 17:00, Tuesdays 02:32 - 12:00, and Wednesdays 13:00 - 17:00.

Returns:

OC-DFG discovery result.

Return type:

Dict[str, Any]

import pm4py

ocel = pm4py.read_ocel('trial.ocel')
ocdfg = pm4py.discover_ocdfg(ocel)
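Because the slots are expressed in seconds counted from Monday 00:00, it can be easier to compute them than to write them by hand. The helpers below (`week_seconds` and `slot`) are hypothetical names introduced for this sketch; they reproduce the default value shown in the signature, which corresponds to Monday through Friday, 07:00 to 17:00.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def week_seconds(weekday, hour, minute=0):
    """Seconds elapsed since Monday 00:00 (weekday 0 is Monday)."""
    return weekday * SECONDS_PER_DAY + hour * 3600 + minute * 60


def slot(weekday, start, end):
    """Build one (start, end) slot from (hour, minute) pairs on the same weekday."""
    return (week_seconds(weekday, *start), week_seconds(weekday, *end))


# Monday to Friday, 07:00 to 17:00.
weekday_slots = [slot(day, (7, 0), (17, 0)) for day in range(5)]
print(weekday_slots)
```

The resulting list could then be passed as `business_hour_slots=weekday_slots` together with `business_hours=True`.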
pm4py.ocel.discover_oc_petri_net(ocel: OCEL, inductive_miner_variant: str = 'im', diagnostics_with_tbr: bool = False) → Dict[str, Any]

Discovers an object-centric Petri net from the provided object-centric event log.

Reference paper: van der Aalst, Wil MP, and Alessandro Berti. “Discovering object-centric Petri nets.” Fundamenta Informaticae 175.1-4 (2020): 1-40.

Parameters:
  • ocel (OCEL) – Object-centric event log.

  • inductive_miner_variant (str) – Variant of the inductive miner to use (“im” for traditional; “imd” for the faster inductive miner directly-follows).

  • diagnostics_with_tbr (bool) – Enable the computation of diagnostics using token-based replay if set to True.

Returns:

Discovered object-centric Petri net.

Return type:

Dict[str, Any]

import pm4py

ocel = pm4py.read_ocel('trial.ocel')
ocpn = pm4py.discover_oc_petri_net(ocel)
pm4py.ocel.discover_objects_graph(ocel: OCEL, graph_type: str = 'object_interaction') → Set[Tuple[str, str]]

Discovers an object graph from the provided object-centric event log.

Available graph types:

  • “object_interaction”

  • “object_descendants”

  • “object_inheritance”

  • “object_cobirth”

  • “object_codeath”

Parameters:
  • ocel (OCEL) – Object-centric event log.

  • graph_type (str) – Type of graph to consider. Options include “object_interaction”, “object_descendants”, “object_inheritance”, “object_cobirth”, “object_codeath”.

Returns:

Discovered object graph as a set of tuples.

Return type:

Set[Tuple[str, str]]

import pm4py

ocel = pm4py.read_ocel('trial.ocel')
obj_graph = pm4py.discover_objects_graph(ocel, graph_type='object_interaction')
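As an intuition for the “object_interaction” graph type, the sketch below links two objects whenever they are related to the same event. It is pure Python on hypothetical data; treating edges as unordered pairs stored in sorted order is an assumption of this sketch and may not match how pm4py orients the tuples.

```python
from itertools import combinations

# Toy mapping from event id to related objects (hypothetical data).
event_objects = {
    "e1": ["o1", "i1", "i2"],
    "e2": ["i1"],
    "e3": ["o1"],
}

# Two objects interact if at least one event is related to both of them.
interaction_graph = set()
for objects in event_objects.values():
    for first, second in combinations(sorted(set(objects)), 2):
        interaction_graph.add((first, second))

print(sorted(interaction_graph))
```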
pm4py.ocel.ocel_o2o_enrichment(ocel: OCEL, included_graphs: Collection[str] | None = None) → OCEL

Enriches the OCEL with information inferred from graph computations by inserting them into the O2O relations.

Parameters:
  • ocel (OCEL) – Object-centric event log.

  • included_graphs (Optional[Collection[str]]) – Types of graphs to include, provided as a list or set of strings. Options include “object_interaction_graph”, “object_descendants_graph”, “object_inheritance_graph”, “object_cobirth_graph”, “object_codeath_graph”.

Returns:

Enriched object-centric event log.

Return type:

OCEL

import pm4py

ocel = pm4py.read_ocel('trial.ocel')
ocel = pm4py.ocel_o2o_enrichment(ocel)
print(ocel.o2o)
pm4py.ocel.ocel_e2o_lifecycle_enrichment(ocel: OCEL) → OCEL

Enriches the OCEL with lifecycle-based information, indicating when an object is created, terminated, or has other types of relations, by updating the E2O relations.

Parameters:

ocel (OCEL) – Object-centric event log.

Returns:

Enriched object-centric event log with lifecycle information.

Return type:

OCEL

import pm4py

ocel = pm4py.read_ocel('trial.ocel')
ocel = pm4py.ocel_e2o_lifecycle_enrichment(ocel)
print(ocel.relations)
pm4py.ocel.sample_ocel_objects(ocel: OCEL, num_objects: int) → OCEL

Returns a sampled object-centric event log containing a random subset of objects. Only events related to at least one of the sampled objects are included in the returned log. Note that this sampling may disrupt the relationships between different objects.

Parameters:
  • ocel (OCEL) – Object-centric event log.

  • num_objects (int) – Number of objects to include in the sampled event log.

Returns:

Sampled object-centric event log.

Return type:

OCEL

import pm4py

ocel = pm4py.read_ocel('trial.ocel')
sampled_ocel = pm4py.sample_ocel_objects(ocel, 50)  # Keeps only 50 random objects
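The effect of object sampling on relationships can be seen in a small pure-Python sketch. The `sample_objects` helper, its `seed` parameter, and the toy data are hypothetical and only illustrate the idea described above.

```python
import random

# Toy event-to-object relations: (event id, object id). Hypothetical data.
relations = [
    ("e1", "o1"), ("e1", "i1"), ("e1", "i2"),
    ("e2", "i1"),
    ("e3", "o1"),
]


def sample_objects(relations, num_objects, seed=None):
    """Pick random objects and keep only the relations that involve them."""
    rng = random.Random(seed)
    objects = sorted({object_id for _event_id, object_id in relations})
    chosen = set(rng.sample(objects, min(num_objects, len(objects))))
    kept = [(e, o) for e, o in relations if o in chosen]
    return chosen, kept


chosen, kept = sample_objects(relations, 2, seed=0)
print(chosen, kept)
```

Relations to objects outside the sample are dropped, so an event that originally connected three objects may connect only one or two afterwards.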
pm4py.ocel.sample_ocel_connected_components(ocel: OCEL, connected_components: int = 1, max_num_events_per_cc: int = 9223372036854775807, max_num_objects_per_cc: int = 9223372036854775807, max_num_e2o_relations_per_cc: int = 9223372036854775807) → OCEL

Returns a sampled object-centric event log containing a specified number of connected components. Users can also set maximum limits on the number of events, objects, and E2O relations per connected component.

Reference paper: Adams, Jan Niklas, et al. “Defining cases and variants for object-centric event data.” 2022 4th International Conference on Process Mining (ICPM). IEEE, 2022.

Parameters:
  • ocel (OCEL) – Object-centric event log.

  • connected_components (int) – Number of connected components to include in the sampled event log.

  • max_num_events_per_cc (int) – Maximum number of events allowed per connected component (default: sys.maxsize).

  • max_num_objects_per_cc (int) – Maximum number of objects allowed per connected component (default: sys.maxsize).

  • max_num_e2o_relations_per_cc (int) – Maximum number of event-to-object relationships allowed per connected component (default: sys.maxsize).

Returns:

Sampled object-centric event log containing the specified connected components.

Return type:

OCEL

import pm4py

ocel = pm4py.read_ocel('trial.ocel')
sampled_ocel = pm4py.sample_ocel_connected_components(ocel, 5)  # Keeps only 5 connected components
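A connected component here can be understood as a set of events and objects that are linked, directly or indirectly, through E2O relations. The sketch below computes such components with a plain depth-first search on hypothetical data; it illustrates the concept and is not the library's algorithm.

```python
from collections import defaultdict

# Toy event-to-object relations: (event id, object id). Hypothetical data.
relations = [
    ("e1", "o1"), ("e1", "i1"), ("e2", "i1"), ("e2", "i2"),
    ("e3", "o2"), ("e4", "o2"), ("e4", "i3"),
]


def connected_components(relations):
    """Group events and objects that are reachable from each other via relations."""
    adjacency = defaultdict(set)
    for event_id, object_id in relations:
        adjacency[("event", event_id)].add(("object", object_id))
        adjacency[("object", object_id)].add(("event", event_id))

    seen, components = set(), []
    for start in list(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


components = connected_components(relations)
print(len(components), [len(c) for c in components])
```

Sampling whole components keeps every relationship inside a component intact, in contrast to sampling individual objects.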
pm4py.ocel.ocel_drop_duplicates(ocel: OCEL) → OCEL

Removes duplicate relations between events and objects that occur at the same time, have the same activity, and are linked to the same object identifier. This effectively cleans the OCEL by eliminating duplicate events.

Parameters:

ocel (OCEL) – Object-centric event log.

Returns:

Cleaned object-centric event log without duplicate relations.

Return type:

OCEL

import pm4py

ocel = pm4py.read_ocel('trial.ocel')
ocel = pm4py.ocel_drop_duplicates(ocel)
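The duplicate criterion (same timestamp, same activity, same object identifier) can be sketched in pure Python as follows. The data is hypothetical, and keeping the first occurrence is an assumption of this illustration.

```python
# Toy relations: (event id, activity, timestamp, object id). Hypothetical data.
relations = [
    ("e1", "Pack Item", "2024-01-01T10:00:00", "i1"),
    ("e2", "Pack Item", "2024-01-01T10:00:00", "i1"),  # duplicate of e1's relation
    ("e3", "Pack Item", "2024-01-01T11:00:00", "i1"),  # different timestamp, kept
]

seen = set()
deduplicated = []
for relation in relations:
    key = relation[1:]  # (activity, timestamp, object id), ignoring the event id
    if key not in seen:
        seen.add(key)
        deduplicated.append(relation)  # assumption: the first occurrence wins

print(deduplicated)
```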
pm4py.ocel.ocel_merge_duplicates(ocel: OCEL, have_common_object: bool | None = False) → OCEL

Merges events in the OCEL that have the same activity and timestamp. Optionally, ensures that the events being merged share a common object.

Parameters:
  • ocel (OCEL) – Object-centric event log.

  • have_common_object (Optional[bool]) – If set to True, only merges events that share a common object. Defaults to False.

Returns:

Object-centric event log with merged duplicate events.

Return type:

OCEL

import pm4py

ocel = pm4py.read_ocel('trial.ocel')
ocel = pm4py.ocel_merge_duplicates(ocel)
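The difference made by have_common_object can be illustrated with a small pure-Python sketch. The `merge_duplicates` function below is a hypothetical stand-in written for this example, operating on toy dictionaries; the greedy merge order is an assumption and pm4py may resolve such cases differently.

```python
def merge_duplicates(events, have_common_object=False):
    """Merge events that share activity and timestamp (illustrative only)."""
    merged = []
    for event in events:
        target = None
        for candidate in merged:
            same_key = (candidate["activity"], candidate["timestamp"]) == (
                event["activity"], event["timestamp"])
            shares_object = bool(candidate["objects"] & event["objects"])
            if same_key and (shares_object or not have_common_object):
                target = candidate
                break
        if target is None:
            # Copy the object set so the input events are left untouched.
            merged.append({"activity": event["activity"],
                           "timestamp": event["timestamp"],
                           "objects": set(event["objects"])})
        else:
            target["objects"] |= event["objects"]
    return merged


# Three toy events with identical activity and timestamp (hypothetical data).
events = [
    {"activity": "Pack Item", "timestamp": "2024-01-01T10:00:00", "objects": {"i1"}},
    {"activity": "Pack Item", "timestamp": "2024-01-01T10:00:00", "objects": {"i2"}},
    {"activity": "Pack Item", "timestamp": "2024-01-01T10:00:00", "objects": {"i1", "i3"}},
]

merged_any = merge_duplicates(events)
merged_common = merge_duplicates(events, have_common_object=True)
print(len(merged_any), len(merged_common))
```

Without the common-object requirement all three events collapse into one; with it, the event on "i2" stays separate because it shares no object with the others.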
pm4py.ocel.ocel_sort_by_additional_column(ocel: OCEL, additional_column: str, primary_column: str = 'ocel:timestamp') → OCEL

Sorts the OCEL based on the primary timestamp column and an additional column to determine the order of events occurring at the same timestamp.

Parameters:
  • ocel (OCEL) – Object-centric event log.

  • additional_column (str) – Additional column to use for sorting.

  • primary_column (str) – Primary column to use for sorting (default: “ocel:timestamp”). Typically the timestamp column.

Returns:

Sorted object-centric event log.

Return type:

OCEL

import pm4py

ocel = pm4py.read_ocel('trial.ocel')
ocel = pm4py.ocel_sort_by_additional_column(ocel, 'ordering')
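The sorting rule amounts to ordering by a pair of keys: the primary column first, then the additional column as a tie-breaker. A minimal pure-Python sketch on hypothetical event dictionaries (the "ordering" column name matches the example above, the rest is made up):

```python
# Toy events; two of them share the same timestamp (hypothetical data).
events = [
    {"id": "e1", "ocel:timestamp": "2024-01-01T10:00:00", "ordering": 2},
    {"id": "e2", "ocel:timestamp": "2024-01-01T09:00:00", "ordering": 5},
    {"id": "e3", "ocel:timestamp": "2024-01-01T10:00:00", "ordering": 1},
]

# Sort by timestamp, then by the additional column to break ties.
sorted_events = sorted(events, key=lambda e: (e["ocel:timestamp"], e["ordering"]))
print([e["id"] for e in sorted_events])
```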
pm4py.ocel.ocel_add_index_based_timedelta(ocel: OCEL) → OCEL

Adds a small time delta to the timestamp column based on the event index to ensure the correct ordering of events within any object-centric process mining solution.

Parameters:

ocel (OCEL) – Object-centric event log.

Returns:

Object-centric event log with index-based time deltas added.

Return type:

OCEL

import pm4py

ocel = pm4py.read_ocel('trial.ocel')
ocel = pm4py.ocel_add_index_based_timedelta(ocel)
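The idea of an index-based delta is that events with identical timestamps receive slightly different ones, so that their original order survives any later sort by time. The sketch below uses one microsecond per index position, which is an assumption made for illustration; the size of the delta that pm4py applies is not stated here.

```python
from datetime import datetime, timedelta

# Toy timestamps in log order; the first three are identical (hypothetical data).
timestamps = [
    datetime(2024, 1, 1, 10, 0, 0),
    datetime(2024, 1, 1, 10, 0, 0),
    datetime(2024, 1, 1, 10, 0, 0),
    datetime(2024, 1, 1, 11, 0, 0),
]

# Assumption: shift each timestamp by (index) microseconds.
shifted = [ts + timedelta(microseconds=i) for i, ts in enumerate(timestamps)]
for original, new in zip(timestamps, shifted):
    print(original.isoformat(), "->", new.isoformat())
```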
pm4py.ocel.cluster_equivalent_ocel(ocel: OCEL, object_type: str, max_objs: int = 9223372036854775807) → Dict[str, Collection[OCEL]]

Clusters the object-centric event log based on the ‘executions’ of a single object type. Equivalent ‘executions’ are grouped together in the output dictionary.

Parameters:
  • ocel (OCEL) – Object-centric event log.

  • object_type (str) – Reference object type for clustering.

  • max_objs (int) – Maximum number of objects (of the specified object type) to include per cluster. Defaults to sys.maxsize.

Returns:

Dictionary mapping cluster descriptions to collections of equivalent OCELs.

Return type:

Dict[str, Collection[OCEL]]

import pm4py

ocel = pm4py.read_ocel('trial.ocel')
clusters = pm4py.cluster_equivalent_ocel(ocel, "order")