pm4py.algo.discovery.correlation_mining.variants package#
PM4Py – A Process Mining Library for Python
Copyright (C) 2024 Process Intelligence Solutions UG (haftungsbeschränkt)
This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License along with this program. If not, see this software project’s root or visit <https://www.gnu.org/licenses/>.
Website: https://processintelligence.solutions Contact: info@processintelligence.solutions
Submodules#
pm4py.algo.discovery.correlation_mining.variants.classic module#
- class pm4py.algo.discovery.correlation_mining.variants.classic.Parameters(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]#
Bases: Enum
- ACTIVITY_KEY = 'pm4py:param:activity_key'#
- TIMESTAMP_KEY = 'pm4py:param:timestamp_key'#
- START_TIMESTAMP_KEY = 'pm4py:param:start_timestamp_key'#
- EXACT_TIME_MATCHING = 'exact_time_matching'#
- INDEX_KEY = 'index_key'#
- pm4py.algo.discovery.correlation_mining.variants.classic.apply(log: EventLog | EventStream | DataFrame, parameters: Dict[str | Parameters, Any] | None = None) Tuple[Dict[Tuple[str, str], int], Dict[Tuple[str, str], float]] [source]#
Applies the correlation miner to an event stream; other log types are first converted to an event stream.
The approach is described in: Pourmirza, Shaya, Remco Dijkman, and Paul Grefen. “Correlation miner: mining business process models and event correlations without case identifiers.” International Journal of Cooperative Information Systems 26.02 (2017): 1742002.
Parameters#
- log
Log object
- parameters
Parameters of the algorithm
Returns#
- dfg
Frequency DFG
- performance_dfg
Performance DFG (containing the estimated performance for the arcs)
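As a minimal sketch of how the parameters dictionary for this variant might be assembled: the keys below are the string values of the Parameters members listed above, which the signature suggests are accepted alongside the enum members themselves. The column names `concept:name` and `time:timestamp` are the standard XES attribute names and are assumptions about the input data. `run_classic` is a hypothetical helper, not part of the library, and PM4Py is imported lazily so the dictionary can be built without it.

```python
from typing import Any, Dict, Tuple

# String keys mirror the values of classic.Parameters listed above.
# The column names are assumptions about the input log (standard XES names).
# Using the same column for start and completion assumes instantaneous events.
classic_parameters: Dict[str, Any] = {
    "pm4py:param:activity_key": "concept:name",
    "pm4py:param:timestamp_key": "time:timestamp",
    "pm4py:param:start_timestamp_key": "time:timestamp",
    "exact_time_matching": False,
}


def run_classic(log: Any, parameters: Dict[str, Any]) -> Tuple[dict, dict]:
    """Hypothetical helper: run the classic variant on a log object."""
    # Imported lazily so that building the dictionary does not require PM4Py.
    from pm4py.algo.discovery.correlation_mining.variants import classic

    dfg, performance_dfg = classic.apply(log, parameters=parameters)
    return dfg, performance_dfg
```

Calling `run_classic(log, classic_parameters)` with an event log, event stream, or DataFrame would then return the two dictionaries described under Returns.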
- pm4py.algo.discovery.correlation_mining.variants.classic.resolve_lp_get_dfg(PS_matrix, duration_matrix, activities, activities_counter)[source]#
Solves a linear programming (LP) problem to obtain a DFG
Parameters#
- PS_matrix
Precede-succeed matrix
- duration_matrix
Duration matrix
- activities
List of activities of the log
- activities_counter
Counter of the activities
Returns#
- dfg
Frequency DFG
- performance_dfg
Performance DFG (containing the estimated performance for the arcs)
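The following sketch illustrates how the two returned dictionaries might be consumed together. It assumes, based on the return types shown in the signatures on this page, that both are keyed by `(source_activity, target_activity)` tuples. The activity names and values are made up for illustration, and `rank_arcs` is a hypothetical helper.

```python
from typing import Dict, List, Optional, Tuple

Arc = Tuple[str, str]


def rank_arcs(
    dfg: Dict[Arc, int], performance_dfg: Dict[Arc, float]
) -> List[Tuple[Arc, int, Optional[float]]]:
    """Return arcs sorted by descending frequency, paired with their performance."""
    ranked = sorted(dfg.items(), key=lambda item: item[1], reverse=True)
    # An arc missing from the performance dictionary is reported as None.
    return [(arc, freq, performance_dfg.get(arc)) for arc, freq in ranked]


# Hypothetical output values, for illustration only.
example_dfg = {("register", "check"): 40, ("check", "pay"): 35, ("register", "pay"): 5}
example_performance = {
    ("register", "check"): 120.0,
    ("check", "pay"): 300.0,
    ("register", "pay"): 450.0,
}
ranked_arcs = rank_arcs(example_dfg, example_performance)
```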
- pm4py.algo.discovery.correlation_mining.variants.classic.get_PS_dur_matrix(activities_grouped, activities, parameters=None)[source]#
Combined method that computes both matrices (precede-succeed and duration)
Parameters#
- activities_grouped
Grouped activities
- activities
List of activities of the log
- parameters
Parameters of the algorithm
Returns#
- PS_matrix
Precede-succeed matrix
- duration_matrix
Duration matrix
- pm4py.algo.discovery.correlation_mining.variants.classic.preprocess_log(log, activities=None, parameters=None)[source]#
Preprocess a log to enable correlation mining
Parameters#
- log
Log object
- activities
List of activities of the log (optional)
- parameters
Parameters of the algorithm
Returns#
- transf_stream
Transformed stream
- activities_grouped
Grouped activities
- activities
List of activities of the log
- pm4py.algo.discovery.correlation_mining.variants.classic.get_precede_succeed_matrix(activities, activities_grouped, timestamp_key, start_timestamp_key)[source]#
Calculates the precede-succeed matrix
Parameters#
- activities
Ordered list of activities of the log
- activities_grouped
Grouped list of activities
- timestamp_key
Timestamp key
- start_timestamp_key
Start timestamp key (events start)
Returns#
- precede_succeed_matrix
Precede-succeed matrix
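To make the idea of a precede-succeed matrix concrete, here is a brute-force sketch in plain Python. It assumes one common reading of the cited paper: entry (i, j) is the fraction of event pairs, one event of activity i and one of activity j, in which the first completes strictly before the second starts. The function name and the dictionary-of-lists input are assumptions chosen for illustration and do not reproduce the library's data structures or its exact computation.

```python
from typing import Dict, List, Sequence


def precede_succeed_matrix(
    activities: Sequence[str],
    end_times: Dict[str, List[float]],
    start_times: Dict[str, List[float]],
) -> List[List[float]]:
    """Entry [i][j] is the share of (i, j) event pairs where i ends before j starts."""
    n = len(activities)
    matrix = [[0.0] * n for _ in range(n)]
    for i, act_i in enumerate(activities):
        for j, act_j in enumerate(activities):
            if i == j:
                continue  # the diagonal is left at zero in this sketch
            ends, starts = end_times[act_i], start_times[act_j]
            if not ends or not starts:
                continue
            preceding = sum(1 for e in ends for s in starts if e < s)
            matrix[i][j] = preceding / (len(ends) * len(starts))
    return matrix


# Hypothetical timestamps (as plain numbers); events are instantaneous here,
# so the same dictionary serves as both start and completion times.
times = {"A": [1.0, 2.0], "B": [3.0, 4.0], "C": [1.5]}
ps = precede_succeed_matrix(["A", "B", "C"], times, times)
```

With these values every A event precedes every B event, so `ps[0][1]` is 1.0, while only one of the two A events precedes the C event, so `ps[0][2]` is 0.5.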
- pm4py.algo.discovery.correlation_mining.variants.classic.get_duration_matrix(activities, activities_grouped, timestamp_key, start_timestamp_key, exact=False)[source]#
Calculates the duration matrix
Parameters#
- activities
Ordered list of activities of the log
- activities_grouped
Grouped list of activities
- timestamp_key
Timestamp key
- start_timestamp_key
Start timestamp key (events start)
- exact
Whether to perform an exact matching of the times (True/False)
Returns#
- duration_matrix
Duration matrix
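The matching strategy behind the duration matrix is not spelled out on this page. The sketch below assumes, purely for illustration, a greedy first-in-first-out pairing for the non-exact case: each completion of activity i is paired with the earliest not-yet-used start of activity j that does not precede it, and the entry is the mean gap over the pairs found. The function names are hypothetical and the library's actual matching may differ.

```python
from typing import Dict, List, Sequence


def fifo_mean_gap(end_times: Sequence[float], start_times: Sequence[float]) -> float:
    """Mean gap between greedily paired completions and later starts (0.0 if none)."""
    ends, starts = sorted(end_times), sorted(start_times)
    gaps: List[float] = []
    j = 0
    for end in ends:
        # Skip starts that happen before this completion.
        while j < len(starts) and starts[j] < end:
            j += 1
        if j == len(starts):
            break  # no start left to pair with
        gaps.append(starts[j] - end)
        j += 1  # each start is used at most once
    return sum(gaps) / len(gaps) if gaps else 0.0


def duration_matrix(
    activities: Sequence[str],
    end_times: Dict[str, List[float]],
    start_times: Dict[str, List[float]],
) -> List[List[float]]:
    """Entry [i][j] is the mean FIFO gap from activity i to activity j."""
    return [
        [
            0.0 if i == j else fifo_mean_gap(end_times[a], start_times[b])
            for j, b in enumerate(activities)
        ]
        for i, a in enumerate(activities)
    ]


# Hypothetical timestamps as plain numbers; events are instantaneous here.
times = {"A": [1.0, 2.0], "B": [3.0, 5.0]}
durations = duration_matrix(["A", "B"], times, times)
```

Here the A events at 1.0 and 2.0 are paired with the B events at 3.0 and 5.0, giving gaps of 2.0 and 3.0 and a mean of 2.5 for `durations[0][1]`.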
pm4py.algo.discovery.correlation_mining.variants.classic_split module#
- class pm4py.algo.discovery.correlation_mining.variants.classic_split.Parameters(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]#
Bases: Enum
- ACTIVITY_KEY = 'pm4py:param:activity_key'#
- TIMESTAMP_KEY = 'pm4py:param:timestamp_key'#
- START_TIMESTAMP_KEY = 'pm4py:param:start_timestamp_key'#
- SAMPLE_SIZE = 'sample_size'#
- pm4py.algo.discovery.correlation_mining.variants.classic_split.apply(log: EventLog | EventStream | DataFrame, parameters: Dict[str | Parameters, Any] | None = None) Tuple[Dict[Tuple[str, str], int], Dict[Tuple[str, str], float]] [source]#
Applies the correlation miner after splitting the log into smaller chunks
Parameters#
- log
Log object
- parameters
Parameters of the algorithm
Returns#
- dfg
Frequency DFG
- performance_dfg
Performance DFG
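The chunking step can be pictured with the short sketch below. It only shows how a sequence of events might be cut into consecutive pieces of at most `sample_size` items; how the per-chunk results are combined afterwards is not described on this page and is deliberately left out. `split_into_chunks` is a hypothetical helper, not a library function.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def split_into_chunks(events: Sequence[T], sample_size: int) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most sample_size events, preserving order."""
    if sample_size <= 0:
        raise ValueError("sample_size must be a positive integer")
    for start in range(0, len(events), sample_size):
        yield list(events[start:start + sample_size])


# Ten placeholder events split with a sample size of four.
chunks = list(split_into_chunks(list(range(10)), 4))
```

The last chunk is simply shorter when the number of events is not a multiple of the sample size.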
pm4py.algo.discovery.correlation_mining.variants.trace_based module#
- class pm4py.algo.discovery.correlation_mining.variants.trace_based.Parameters(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]#
Bases: Enum
- ACTIVITY_KEY = 'pm4py:param:activity_key'#
- TIMESTAMP_KEY = 'pm4py:param:timestamp_key'#
- START_TIMESTAMP_KEY = 'pm4py:param:start_timestamp_key'#
- CASE_ID_KEY = 'pm4py:param:case_id_key'#
- INDEX_KEY = 'index_key'#
- pm4py.algo.discovery.correlation_mining.variants.trace_based.apply(log: EventLog | EventStream | DataFrame, parameters: Dict[str | Parameters, Any] | None = None) Tuple[Dict[Tuple[str, str], int], Dict[Tuple[str, str], float]] [source]#
Novel approach to correlation mining that builds the PS-matrix and the duration matrix from the ordered list of events of each trace of the log
Parameters#
- log
Event log
- parameters
Parameters
Returns#
- dfg
Frequency DFG
- performance_dfg
Performance DFG (containing the estimated performance for the arcs)
- pm4py.algo.discovery.correlation_mining.variants.trace_based.resolve_lp_get_dfg(PS_matrix, duration_matrix, activities, activities_counter)[source]#
Solves a linear programming (LP) problem to obtain a DFG
Parameters#
- PS_matrix
Precede-succeed matrix
- duration_matrix
Duration matrix
- activities
List of activities of the log
- activities_counter
Counter for the activities of the log
Returns#
- dfg
Frequency DFG
- performance_dfg
Performance DFG
- pm4py.algo.discovery.correlation_mining.variants.trace_based.get_PS_duration_matrix(activities, trace_grouped_list, parameters=None)[source]#
Gets the precede-succeed matrix and the duration matrix
Parameters#
- activities
Activities
- trace_grouped_list
Grouped list of simplified traces (per activity)
- parameters
Parameters of the algorithm
Returns#
- PS_matrix
Precede-succeed matrix
- duration_matrix
Duration matrix
- pm4py.algo.discovery.correlation_mining.variants.trace_based.preprocess_log(log, activities=None, activities_counter=None, parameters=None)[source]#
Preprocess the log to get a grouped list of simplified traces (per activity)
Parameters#
- log
Log object
- activities
Activities of the log (optional)
- activities_counter
Counter of the activities of the log (optional)
- parameters
Parameters of the algorithm
Returns#
- traces_list
List of simplified traces of the log
- trace_grouped_list
Grouped list of simplified traces (per activity)
- activities
Activities of the log
- activities_counter
Activities counter
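The "grouped list of simplified traces (per activity)" can be pictured as a nested list: one entry per trace, containing one list per activity, which in turn holds the events of that activity. The sketch below builds such a structure from flat event records. The record layout, the standard XES key names, and the helper `group_traces_by_activity` are assumptions for illustration and do not reproduce the library's internal representation.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple

Event = Dict[str, Any]


def group_traces_by_activity(
    events: List[Event], case_key: str, activity_key: str, timestamp_key: str
) -> Tuple[List[str], List[List[List[Event]]]]:
    """Return the sorted activities and, per trace, one event list per activity."""
    activities = sorted({event[activity_key] for event in events})
    position = {activity: i for i, activity in enumerate(activities)}

    # Collect events per case in timestamp order.
    cases: Dict[Any, List[Event]] = defaultdict(list)
    for event in sorted(events, key=lambda e: e[timestamp_key]):
        cases[event[case_key]].append(event)

    grouped: List[List[List[Event]]] = []
    for case_events in cases.values():
        per_activity: List[List[Event]] = [[] for _ in activities]
        for event in case_events:
            per_activity[position[event[activity_key]]].append(event)
        grouped.append(per_activity)
    return activities, grouped


CASE, ACT, TS = "case:concept:name", "concept:name", "time:timestamp"
# Hypothetical events; timestamps are plain numbers for brevity.
sample_events = [
    {CASE: "c1", ACT: "A", TS: 1.0},
    {CASE: "c1", ACT: "B", TS: 2.0},
    {CASE: "c1", ACT: "A", TS: 3.0},
    {CASE: "c2", ACT: "A", TS: 1.5},
    {CASE: "c2", ACT: "B", TS: 4.0},
]
sorted_activities, trace_grouped = group_traces_by_activity(sample_events, CASE, ACT, TS)
```

In this example `trace_grouped[0][0]` holds the two A events of case c1 and `trace_grouped[1][1]` holds the single B event of case c2.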
- pm4py.algo.discovery.correlation_mining.variants.trace_based.get_precede_succeed_matrix(activities, trace_grouped_list, timestamp_key, start_timestamp_key)[source]#
Calculates the precede-succeed matrix
Parameters#
- activities
Sorted list of activities of the log
- trace_grouped_list
A list of lists of lists containing, for each trace and each activity, the events with that activity
- timestamp_key
The key to be used as timestamp
- start_timestamp_key
The key to be used as start timestamp
Returns#
- mat
The precede-succeed matrix
- pm4py.algo.discovery.correlation_mining.variants.trace_based.get_duration_matrix(activities, trace_grouped_list, timestamp_key, start_timestamp_key)[source]#
Calculates the duration matrix
Parameters#
- activities
Sorted list of activities of the log
- trace_grouped_list
A list of lists of lists containing, for each trace and each activity, the events with that activity
- timestamp_key
The key to be used as timestamp
- start_timestamp_key
The key to be used as start timestamp
Returns#
- mat
The duration matrix