pm4py.algo.filtering.dfg package#
PM4Py – A Process Mining Library for Python
Copyright (C) 2024 Process Intelligence Solutions UG (haftungsbeschränkt)
This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License along with this program. If not, see this software project’s root or visit <https://www.gnu.org/licenses/>.
Website: https://processintelligence.solutions Contact: info@processintelligence.solutions
Submodules#
pm4py.algo.filtering.dfg.dfg_filtering module#
- pm4py.algo.filtering.dfg.dfg_filtering.generate_nx_graph_from_dfg(dfg, start_activities, end_activities, activities_count)#
Generates a NetworkX directed graph from the DFG, for use in reachability checks
Parameters#
- dfg
DFG
- start_activities
Start activities
- end_activities
End activities
- activities_count
Activities of the DFG along with their count
Returns#
- G
NetworkX digraph
- start_node
Identifier of the start node (connected to all the start activities)
- end_node
Identifier of the end node (connected to all the end activities)
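The construction described above can be illustrated without NetworkX. The following is a minimal sketch, not the library's implementation: it assumes the DFG is a dict keyed by `(source, target)` tuples and that the other arguments are dicts keyed by activity name. The sentinel names `START` and `END` and the helpers `build_graph` and `reachable_from` are hypothetical.

```python
from collections import deque

# Hypothetical sentinel names; the identifiers pm4py returns are not documented here.
START, END = "<start>", "<end>"

def build_graph(dfg, start_activities, end_activities, activities_count):
    """Return an adjacency dict with artificial start and end nodes."""
    graph = {START: set(), END: set()}
    for act in activities_count:
        graph.setdefault(act, set())
    for (src, dst) in dfg:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    # The start node points at every start activity.
    for act in start_activities:
        graph[START].add(act)
        graph.setdefault(act, set())
    # Every end activity points at the end node.
    for act in end_activities:
        graph.setdefault(act, set()).add(END)
    return graph

def reachable_from(graph, source):
    """Breadth-first search: every node reachable from source (inclusive)."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

dfg = {("a", "b"): 5, ("b", "c"): 4, ("d", "c"): 1}
graph = build_graph(dfg, {"a": 5}, {"c": 5}, {"a": 5, "b": 5, "c": 5, "d": 1})
unreachable = set(graph) - reachable_from(graph, START)
print(unreachable)  # {'d'}: no path leads from the start node to "d"
```

With a single start node and a single end node, "reachable from the start" and "can reach the end" each reduce to one graph traversal, which is presumably why the filters below rely on this helper.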
- pm4py.algo.filtering.dfg.dfg_filtering.filter_dfg_on_activities_percentage(dfg0, start_activities0, end_activities0, activities_count0, percentage)#
Filters a DFG (complete, and therefore connected) on the specified percentage of activities, while ensuring that every remaining node is still reachable from the start and can still reach the end
Parameters#
- dfg0
(Complete, and so connected) DFG
- start_activities0
Start activities
- end_activities0
End activities
- activities_count0
Activities of the DFG along with their count
- percentage
Percentage of activities
Returns#
- dfg
(Filtered) DFG
- start_activities
(Filtered) start activities
- end_activities
(Filtered) end activities
- activities_count
(Filtered) activities of the DFG along with their count
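As an illustration of the frequency-based selection only, here is a minimal sketch. It assumes `percentage` is a fraction between 0 and 1 and that the most frequent activities are the ones kept; neither is stated on this page. It deliberately omits the reachability guarantee described above, and `keep_top_activities` is a hypothetical helper, not a pm4py function.

```python
import math

def keep_top_activities(dfg, activities_count, percentage):
    """Keep the most frequent share of activities and the edges among them."""
    # Rank by descending count; ties are broken by name for determinism.
    ranked = sorted(activities_count, key=lambda a: (-activities_count[a], a))
    n_keep = max(1, math.ceil(len(ranked) * percentage))
    kept = set(ranked[:n_keep])
    # An edge survives only if both of its endpoints survive.
    new_dfg = {(a, b): n for (a, b), n in dfg.items() if a in kept and b in kept}
    new_counts = {a: activities_count[a] for a in ranked[:n_keep]}
    return new_dfg, new_counts

activities_count = {"a": 10, "b": 8, "c": 2, "d": 1}
dfg = {("a", "b"): 8, ("b", "c"): 2, ("c", "d"): 1}
new_dfg, new_counts = keep_top_activities(dfg, activities_count, 0.5)
print(new_dfg)     # {('a', 'b'): 8}
print(new_counts)  # {'a': 10, 'b': 8}
```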
- pm4py.algo.filtering.dfg.dfg_filtering.filter_dfg_on_paths_percentage(dfg0, start_activities0, end_activities0, activities_count0, percentage, keep_all_activities=False)#
Filters a DFG (complete, and therefore connected) on the specified percentage of paths, while ensuring that every remaining node is still reachable from the start and can still reach the end
Parameters#
- dfg0
(Complete, and so connected) DFG
- start_activities0
Start activities
- end_activities0
End activities
- activities_count0
Activities of the DFG along with their count
- percentage
Percentage of paths
- keep_all_activities
Whether to keep all the activities, including those connected only by low-occurrence edges, or only those appearing in the more frequent edges (default).
Returns#
- dfg
(Filtered) DFG
- start_activities
(Filtered) start activities
- end_activities
(Filtered) end activities
- activities_count
(Filtered) activities of the DFG along with their count
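The interplay between the percentage and the reachability guarantee can be sketched as follows. This is an illustrative reading, not pm4py's code: it assumes `percentage` is a fraction between 0 and 1, always keeps every activity, and removes the rarest edges one at a time, skipping any removal that would strand a node. All function names are hypothetical.

```python
import math

def _reach(adj, sources):
    """Nodes reachable from any of the sources by following adj."""
    seen, stack = set(sources), list(sources)
    while stack:
        node = stack.pop()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def _all_connected(edges, nodes, starts, ends):
    """True if every node is reachable from a start and can reach an end."""
    fwd, bwd = {}, {}
    for a, b in edges:
        fwd.setdefault(a, set()).add(b)
        bwd.setdefault(b, set()).add(a)
    return nodes <= _reach(fwd, starts) and nodes <= _reach(bwd, ends)

def filter_paths_sketch(dfg, start_activities, end_activities, percentage):
    """Drop the rarest edges unless removing one would strand a node."""
    nodes = {x for edge in dfg for x in edge} | set(start_activities) | set(end_activities)
    n_keep = math.ceil(len(dfg) * percentage)
    ranked = sorted(dfg, key=lambda e: (-dfg[e], e))
    kept = dict.fromkeys(ranked)  # insertion-ordered set of edges
    # Try the removal candidates from rarest to most frequent.
    for edge in reversed(ranked[n_keep:]):
        trial = [e for e in kept if e != edge]
        if _all_connected(trial, nodes, start_activities, end_activities):
            del kept[edge]
    return {e: dfg[e] for e in kept}

dfg = {("a", "b"): 10, ("b", "c"): 10, ("b", "d"): 2, ("d", "c"): 2, ("a", "c"): 1}
filtered = filter_paths_sketch(dfg, {"a": 11}, {"c": 13}, 0.5)
print(sorted(filtered))  # [('a', 'b'), ('b', 'c'), ('b', 'd'), ('d', 'c')]
```

In this example the edge `("a", "c")` is dropped, while `("d", "c")` survives even though it falls outside the requested percentage, because without it activity `d` could no longer reach the end.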
- pm4py.algo.filtering.dfg.dfg_filtering.filter_dfg_keep_connected(dfg0, start_activities0, end_activities0, activities_count0, threshold, keep_all_activities=False)#
Filters a DFG (complete, and therefore connected) on the specified dependency threshold (the Heuristics Miner dependency measure), while ensuring that every remaining node is still reachable from the start and can still reach the end
Parameters#
- dfg0
(Complete, and so connected) DFG
- start_activities0
Start activities
- end_activities0
End activities
- activities_count0
Activities of the DFG along with their count
- threshold
Dependency threshold as in the Heuristics Miner
- keep_all_activities
Whether to keep all the activities, or only those appearing in edges that pass the dependency threshold (default).
Returns#
- dfg
(Filtered) DFG
- start_activities
(Filtered) start activities
- end_activities
(Filtered) end activities
- activities_count
(Filtered) activities of the DFG along with their count
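For reference, the dependency measure usually associated with the Heuristics Miner can be computed directly from directly-follows counts. The sketch below uses the standard textbook formula; whether pm4py treats self-loops or missing reverse edges in exactly this way is an assumption, and `dependency` is a hypothetical helper.

```python
def dependency(dfg, a, b):
    """Heuristics Miner dependency of a -> b from directly-follows counts."""
    ab = dfg.get((a, b), 0)
    if a == b:
        # Self-loop variant of the measure.
        return ab / (ab + 1)
    ba = dfg.get((b, a), 0)
    return (ab - ba) / (ab + ba + 1)

dfg = {("a", "b"): 9, ("b", "a"): 0, ("b", "c"): 5, ("c", "b"): 4}
strong = {edge for edge in dfg if dependency(dfg, *edge) >= 0.5}
print(strong)  # {('a', 'b')}
```

The measure lies strictly between -1 and 1. A value near 1 means `a` is almost always followed by `b` and rarely the reverse, whereas pairs that follow each other in both directions, such as `b` and `c` above, score near 0.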
- pm4py.algo.filtering.dfg.dfg_filtering.filter_dfg_to_activity(dfg0, start_activities0, end_activities0, activities_count0, target_activity, parameters=None)#
Filters the DFG so that “target_activity” becomes the only possible end activity of the graph
Parameters#
- dfg0
Directly-follows graph
- start_activities0
Start activities
- end_activities0
End activities
- activities_count0
Activities count
- target_activity
Target activity (only possible end activity after the filtering)
- parameters
Parameters
Returns#
- dfg
Filtered DFG
- start_activities
Filtered start activities
- end_activities
Filtered end activities
- activities_count
Filtered activities count
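One way to picture this filter is as a backward traversal from the target. The sketch below is an assumption-laden illustration, not the library's algorithm: it drops the outgoing edges of the target and keeps only edges between activities that can still reach it. The helper name `filter_to_activity_sketch` is hypothetical.

```python
def filter_to_activity_sketch(dfg, target_activity):
    """Keep only edges lying on some path that ends in target_activity."""
    # Outgoing edges of the target are dropped so that nothing follows it.
    edges = {e: n for e, n in dfg.items() if e[0] != target_activity}
    preds = {}
    for a, b in edges:
        preds.setdefault(b, set()).add(a)
    # Walk backwards from the target to collect every activity that can reach it.
    keep, stack = {target_activity}, [target_activity]
    while stack:
        node = stack.pop()
        for p in preds.get(node, ()):
            if p not in keep:
                keep.add(p)
                stack.append(p)
    return {(a, b): n for (a, b), n in edges.items() if a in keep and b in keep}

dfg = {("a", "b"): 3, ("b", "c"): 3, ("c", "d"): 2, ("b", "e"): 1}
to_c = filter_to_activity_sketch(dfg, "c")
print(to_c)  # {('a', 'b'): 3, ('b', 'c'): 3}
```

Swapping predecessors for successors, and incoming edges for outgoing ones, gives the mirror-image behaviour for a single start activity.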
- pm4py.algo.filtering.dfg.dfg_filtering.filter_dfg_from_activity(dfg0, start_activities0, end_activities0, activities_count0, source_activity, parameters=None)#
Filters the DFG so that “source_activity” becomes the only possible start activity of the graph
Parameters#
- dfg0
Directly-follows graph
- start_activities0
Start activities
- end_activities0
End activities
- activities_count0
Activities count
- source_activity
Source activity (only possible start activity after the filtering)
- parameters
Parameters
Returns#
- dfg
Filtered DFG
- start_activities
Filtered start activities
- end_activities
Filtered end activities
- activities_count
Filtered activities count
- pm4py.algo.filtering.dfg.dfg_filtering.filter_dfg_contain_activity(dfg0, start_activities0, end_activities0, activities_count0, activity, parameters=None)#
Filters the DFG, keeping only the nodes that can reach, or are reachable from, the given activity
Parameters#
- dfg0
Directly-follows graph
- start_activities0
Start activities
- end_activities0
End activities
- activities_count0
Activities count
- activity
Activity that every node of the filtered graph must either reach or be reachable from
- parameters
Parameters
Returns#
- dfg
Filtered DFG
- start_activities
Filtered start activities
- end_activities
Filtered end activities
- activities_count
Filtered activities count
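The node selection described here amounts to the union of the activity's ancestors and descendants. A minimal sketch under that reading follows; it is not the library's implementation, it ignores start and end activities, and the helper names are hypothetical.

```python
def _closure(adj, source):
    """All nodes reachable from source by following adj (source included)."""
    seen, stack = {source}, [source]
    while stack:
        node = stack.pop()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def filter_contain_activity_sketch(dfg, activity):
    """Keep edges whose endpoints both reach, or are reached from, activity."""
    succs, preds = {}, {}
    for a, b in dfg:
        succs.setdefault(a, set()).add(b)
        preds.setdefault(b, set()).add(a)
    # Descendants (forward closure) plus ancestors (backward closure).
    keep = _closure(succs, activity) | _closure(preds, activity)
    return {(a, b): n for (a, b), n in dfg.items() if a in keep and b in keep}

dfg = {("a", "b"): 4, ("b", "c"): 4, ("a", "x"): 1, ("x", "y"): 1}
around_b = filter_contain_activity_sketch(dfg, "b")
print(around_b)  # {('a', 'b'): 4, ('b', 'c'): 4}
```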
- pm4py.algo.filtering.dfg.dfg_filtering.clean_dfg_based_on_noise_thresh(dfg, activities, noise_threshold, parameters=None)#
Cleans the directly-follows graph based on a noise threshold
Parameters#
- dfg
Directly-Follows graph
- activities
Activities in the DFG
- noise_threshold
Noise threshold
Returns#
- newDfg
DFG cleaned according to the noise threshold
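This page does not define how the noise threshold is applied. The sketch below shows one plausible reading, stated purely as an assumption: an edge is dropped when its count falls below `noise_threshold` times the smaller of the two peak edge counts observed at its endpoints. `clean_dfg_sketch` is a hypothetical helper, not the pm4py function.

```python
def clean_dfg_sketch(dfg, noise_threshold):
    """Drop edges that are rare relative to the busiest edge at their endpoints."""
    # Highest count among the edges touching each activity.
    max_count = {}
    for (a, b), n in dfg.items():
        max_count[a] = max(max_count.get(a, 0), n)
        max_count[b] = max(max_count.get(b, 0), n)
    # Assumed criterion: compare each edge with the weaker of its two endpoints.
    return {
        (a, b): n
        for (a, b), n in dfg.items()
        if n >= noise_threshold * min(max_count[a], max_count[b])
    }

dfg = {("a", "b"): 100, ("b", "c"): 90, ("a", "c"): 3}
cleaned = clean_dfg_sketch(dfg, 0.1)
print(cleaned)  # {('a', 'b'): 100, ('b', 'c'): 90}
```

A relative criterion of this kind removes incidental shortcuts such as `("a", "c")` without requiring a single absolute cut-off for the whole graph.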