pm4py.algo.filtering.dfg package#

PM4Py – A Process Mining Library for Python

Copyright (C) 2024 Process Intelligence Solutions UG (haftungsbeschränkt)

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License along with this program. If not, see this software project’s root or visit <https://www.gnu.org/licenses/>.

Website: https://processintelligence.solutions Contact: info@processintelligence.solutions

Submodules#

pm4py.algo.filtering.dfg.dfg_filtering module#

pm4py.algo.filtering.dfg.dfg_filtering.generate_nx_graph_from_dfg(dfg, start_activities, end_activities, activities_count)#

Generates a NetworkX directed graph from the DFG, for use in reachability checks

Parameters#

dfg

Directly-follows graph (DFG)

start_activities

Start activities

end_activities

End activities

activities_count

Activities of the DFG along with their count

Returns#

G

NetworkX digraph

start_node

Identifier of the start node (connected to all the start activities)

end_node

Identifier of the end node (connected to all the end activities)
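
The idea of adding an artificial start node and end node so that reachability can be checked on a single graph can be sketched without any dependency. The snippet is an illustration only, not the library's implementation: it uses a plain successor map in place of a NetworkX digraph, the names `build_graph`, `reachable`, `START` and `END` are hypothetical, and it assumes the DFG is a dictionary mapping `(source, target)` pairs to counts.

```python
from collections import deque

# Hypothetical identifiers for the artificial nodes.
START, END = "__start__", "__end__"

def build_graph(dfg, start_activities, end_activities, activities_count):
    """Return a successor map with artificial start/end nodes added."""
    succ = {act: set() for act in activities_count}
    succ[START] = set(start_activities)  # start node -> every start activity
    succ[END] = set()
    for (a, b) in dfg:                   # one directed edge per DFG entry
        succ.setdefault(a, set()).add(b)
        succ.setdefault(b, set())
    for act in end_activities:           # every end activity -> end node
        succ.setdefault(act, set()).add(END)
    return succ

def reachable(succ, source):
    """Breadth-first search: every node reachable from source (inclusive)."""
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        for nxt in succ.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Assumed data shapes: edge -> count, activity -> count.
dfg = {("register", "check"): 8, ("check", "pay"): 7, ("check", "reject"): 1}
start_activities = {"register": 8}
end_activities = {"pay": 7, "reject": 1}
activities_count = {"register": 8, "check": 8, "pay": 7, "reject": 1}

graph = build_graph(dfg, start_activities, end_activities, activities_count)
nodes_from_start = reachable(graph, START)
```

With a real NetworkX digraph, the same check would typically be expressed with its descendant and ancestor queries rather than a hand-written search.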

pm4py.algo.filtering.dfg.dfg_filtering.filter_dfg_on_activities_percentage(dfg0, start_activities0, end_activities0, activities_count0, percentage)#

Filters a DFG (complete, and so connected) on the specified percentage of activities, while ensuring that every remaining node is still reachable from the start and can still reach the end

Parameters#

dfg0

(Complete, and so connected) DFG

start_activities0

Start activities

end_activities0

End activities

activities_count0

Activities of the DFG along with their count

percentage

Percentage of activities to keep

Returns#

dfg

(Filtered) DFG

start_activities

(Filtered) start activities

end_activities

(Filtered) end activities

activities_count

(Filtered) activities of the DFG along with their count
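
A minimal sketch of the technique described above: rank activities by frequency, protect the top share, and drop a less frequent activity only when the rest of the graph stays connected to the start and to the end. It is not the library's code. The function names, the sentinel nodes, the exact rounding of the share, and the assumption that `percentage` is a fraction between 0 and 1 are all illustrative choices.

```python
import math

START, END = object(), object()  # sentinels standing in for the artificial nodes

def _reach(edges, source):
    """Nodes reachable from source over directed edges (iterative DFS)."""
    seen, stack = {source}, [source]
    while stack:
        node = stack.pop()
        for a, b in edges:
            if a == node and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen

def _connected(dfg, starts, ends, activities):
    """True if every activity is reachable from the start and reaches the end."""
    forward = [(START, a) for a in starts] + list(dfg)
    backward = [(END, a) for a in ends] + [(b, a) for a, b in dfg]
    return activities <= _reach(forward, START) and activities <= _reach(backward, END)

def _restrict(dfg, starts, ends, counts, kept):
    """Project all four structures onto the kept activities."""
    return (
        {e: c for e, c in dfg.items() if e[0] in kept and e[1] in kept},
        {a: c for a, c in starts.items() if a in kept},
        {a: c for a, c in ends.items() if a in kept},
        {a: c for a, c in counts.items() if a in kept},
    )

def filter_activities_sketch(dfg, starts, ends, counts, percentage):
    ranked = sorted(counts, key=lambda act: (counts[act], act), reverse=True)
    n_keep = math.ceil(len(ranked) * percentage)  # assumed rounding rule
    kept = set(ranked)
    # Try to drop the remaining activities, least frequent first.
    for act in reversed(ranked[n_keep:]):
        trial = kept - {act}
        t_dfg, t_starts, t_ends, _ = _restrict(dfg, starts, ends, counts, trial)
        if _connected(t_dfg, t_starts, t_ends, trial):
            kept = trial
    return _restrict(dfg, starts, ends, counts, kept)

# "x" is rare and lies on a side path, so it can be removed safely.
dfg = {("a", "b"): 10, ("b", "c"): 10, ("a", "x"): 1, ("x", "c"): 1}
new_dfg, new_starts, new_ends, new_counts = filter_activities_sketch(
    dfg, {"a": 11}, {"c": 11}, {"a": 11, "b": 10, "c": 11, "x": 1}, 0.75)

# Here "x" is the only bridge between "a" and "c", so it must stay.
bridge = filter_activities_sketch(
    {("a", "x"): 1, ("x", "c"): 1}, {"a": 5}, {"c": 5}, {"a": 5, "c": 5, "x": 1}, 0.5)
```

The second call shows why the reachability check matters: a purely frequency-based cut would remove "x" and leave "c" unreachable.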

pm4py.algo.filtering.dfg.dfg_filtering.filter_dfg_on_paths_percentage(dfg0, start_activities0, end_activities0, activities_count0, percentage, keep_all_activities=False)#

Filters a DFG (complete, and so connected) on the specified percentage of paths, while ensuring that every remaining node is still reachable from the start and can still reach the end

Parameters#

dfg0

(Complete, and so connected) DFG

start_activities0

Start activities

end_activities0

End activities

activities_count0

Activities of the DFG along with their count

percentage

Percentage of paths to keep

keep_all_activities

Whether to keep all activities (including those connected only by low-occurrence edges) or only the activities that appear in the higher-occurrence edges (default).

Returns#

dfg

(Filtered) DFG

start_activities

(Filtered) start activities

end_activities

(Filtered) end activities

activities_count

(Filtered) activities of the DFG along with their count
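
The same idea applied to edges can be sketched as follows. This is an illustrative reimplementation, not the library's code: it ranks only the DFG edges, always retains the links to the artificial start and end nodes, and never disconnects an activity, so it behaves like a run in which all activities are kept. The function names, the sentinels and the rounding rule are assumptions.

```python
import math

START, END = object(), object()  # sentinels for the artificial start/end nodes

def _reach(edges, source):
    """Nodes reachable from source over directed edges (iterative DFS)."""
    seen, stack = {source}, [source]
    while stack:
        node = stack.pop()
        for a, b in edges:
            if a == node and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen

def _connected(edges, activities):
    """True if every activity is reachable from START and can reach END."""
    forward = _reach(edges, START)
    backward = _reach([(b, a) for a, b in edges], END)
    return activities <= forward and activities <= backward

def filter_paths_sketch(dfg, starts, ends, counts, percentage):
    activities = set(counts)
    # Links to the artificial nodes are always retained in this sketch.
    fixed = [(START, a) for a in starts] + [(a, END) for a in ends]
    ranked = sorted(dfg, key=lambda e: (dfg[e], e), reverse=True)
    n_keep = math.ceil(len(ranked) * percentage)  # assumed rounding rule
    kept = list(ranked)
    # Try to drop the remaining edges, least frequent first.
    for edge in reversed(ranked[n_keep:]):
        trial = [e for e in kept if e != edge]
        if _connected(trial + fixed, activities):
            kept = trial
    return {e: dfg[e] for e in kept}

counts = {"a": 12, "b": 11, "c": 12}
dfg = {("a", "b"): 10, ("b", "c"): 9, ("a", "c"): 2, ("c", "b"): 1}
filtered = filter_paths_sketch(dfg, {"a": 12}, {"c": 12}, counts, 0.5)

# The rare edge ("b", "c") is the only way to reach "c", so it survives.
protected = filter_paths_sketch(
    {("a", "b"): 10, ("b", "c"): 1}, {"a": 10}, {"c": 1}, {"a": 10, "b": 10, "c": 1}, 0.5)
```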

pm4py.algo.filtering.dfg.dfg_filtering.filter_dfg_keep_connected(dfg0, start_activities0, end_activities0, activities_count0, threshold, keep_all_activities=False)#

Filters a DFG (complete, and so connected) on the specified dependency threshold (the dependency measure of the Heuristics Miner), while ensuring that every remaining node is still reachable from the start and can still reach the end

Parameters#

dfg0

(Complete, and so connected) DFG

start_activities0

Start activities

end_activities0

End activities

activities_count0

Activities of the DFG along with their count

threshold

Dependency threshold as in the Heuristics Miner

keep_all_activities

Whether to keep all activities or only the activities that appear in edges whose dependency value exceeds the threshold (default).

Returns#

dfg

(Filtered) DFG

start_activities

(Filtered) start activities

end_activities

(Filtered) end activities

activities_count

(Filtered) activities of the DFG along with their count
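
For readers unfamiliar with the dependency measure, the classic Heuristics Miner definition can be sketched as follows. The helper name `dependency` is hypothetical, and whether this function applies exactly this formula (including the self-loop variant) is an assumption, not something stated on this page.

```python
def dependency(dfg, a, b):
    """Classic Heuristics Miner dependency between activities a and b."""
    ab = dfg.get((a, b), 0)
    if a == b:
        # Self-loop variant of the measure.
        return ab / (ab + 1)
    ba = dfg.get((b, a), 0)
    return (ab - ba) / (ab + ba + 1)

dfg = {("a", "b"): 9, ("b", "a"): 1, ("b", "c"): 4}

value = dependency(dfg, "a", "b")  # (9 - 1) / (9 + 1 + 1) = 8/11, roughly 0.73

# Edges whose dependency value reaches an example threshold of 0.5.
strong_edges = {edge for edge in dfg if dependency(dfg, *edge) >= 0.5}
```

The measure is close to 1 when `a` is followed by `b` far more often than the reverse, and negative when the reverse direction dominates, which is why ("b", "a") falls below the threshold here.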

pm4py.algo.filtering.dfg.dfg_filtering.filter_dfg_to_activity(dfg0, start_activities0, end_activities0, activities_count0, target_activity, parameters=None)#

Filters the DFG, making “target_activity” the only possible end activity of the graph

Parameters#

dfg0

Directly-follows graph

start_activities0

Start activities

end_activities0

End activities

activities_count0

Activities count

target_activity

Target activity (only possible end activity after the filtering)

parameters

Parameters

Returns#

dfg

Filtered DFG

start_activities

Filtered start activities

end_activities

Filtered end activities

activities_count

Filtered activities count
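
One way to picture this filter is as a backward reachability search from the target. The sketch below is an illustration under stated assumptions, not the library's implementation: the name `filter_to_activity_sketch` is hypothetical, the original end activities are simply replaced rather than consulted, and dropping the edges that leave the target is an assumed design choice made so that nothing can follow it.

```python
def filter_to_activity_sketch(dfg, starts, counts, target):
    # Assumption: edges leaving the target are dropped so nothing can follow it.
    edges = {e: c for e, c in dfg.items() if e[0] != target}
    # Walk the edges backwards from the target to find every activity
    # that can still reach it.
    can_reach, stack = {target}, [target]
    while stack:
        node = stack.pop()
        for a, b in edges:
            if b == node and a not in can_reach:
                can_reach.add(a)
                stack.append(a)
    new_dfg = {e: c for e, c in edges.items()
               if e[0] in can_reach and e[1] in can_reach}
    new_starts = {a: c for a, c in starts.items() if a in can_reach}
    new_ends = {target: counts[target]}  # the target is the only end activity
    new_counts = {a: c for a, c in counts.items() if a in can_reach}
    return new_dfg, new_starts, new_ends, new_counts

dfg = {("a", "b"): 5, ("b", "c"): 5, ("b", "d"): 2, ("c", "e"): 3}
counts = {"a": 5, "b": 5, "c": 5, "d": 2, "e": 3}
new_dfg, new_starts, new_ends, new_counts = filter_to_activity_sketch(
    dfg, {"a": 5}, counts, "c")
```

Activities "d" and "e" disappear because neither has a path to "c" once the edges leaving "c" are removed.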

pm4py.algo.filtering.dfg.dfg_filtering.filter_dfg_from_activity(dfg0, start_activities0, end_activities0, activities_count0, source_activity, parameters=None)#

Filters the DFG, making “source_activity” the only possible start activity of the graph

Parameters#

dfg0

Directly-follows graph

start_activities0

Start activities

end_activities0

End activities

activities_count0

Activities count

source_activity

Source activity (only possible start activity after the filtering)

parameters

Parameters

Returns#

dfg

Filtered DFG

start_activities

Filtered start activities

end_activities

Filtered end activities

activities_count

Filtered activities count

pm4py.algo.filtering.dfg.dfg_filtering.filter_dfg_contain_activity(dfg0, start_activities0, end_activities0, activities_count0, activity, parameters=None)#

Filters the DFG, keeping only the nodes that can reach, or are reachable from, the given activity

Parameters#

dfg0

Directly-follows graph

start_activities0

Start activities

end_activities0

End activities

activities_count0

Activities count

activity

Activity that every node of the filtered graph must either reach or be reachable from

parameters

Parameters

Returns#

dfg

Filtered DFG

start_activities

Filtered start activities

end_activities

Filtered end activities

activities_count

Filtered activities count
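
The selection rule described above amounts to keeping the ancestors and descendants of the chosen activity. A minimal sketch, assuming the DFG is a dictionary of `(source, target)` pairs to counts; the function names are hypothetical and the handling of edges between kept nodes is a simplification, not the library's confirmed behaviour.

```python
def _closure(edges, source):
    """Nodes reachable from source over directed edges (iterative DFS)."""
    seen, stack = {source}, [source]
    while stack:
        node = stack.pop()
        for a, b in edges:
            if a == node and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen

def filter_contain_activity_sketch(dfg, starts, ends, counts, activity):
    descendants = _closure(list(dfg), activity)               # reachable from activity
    ancestors = _closure([(b, a) for a, b in dfg], activity)  # can reach activity
    keep = ancestors | descendants
    return (
        {e: c for e, c in dfg.items() if e[0] in keep and e[1] in keep},
        {a: c for a, c in starts.items() if a in keep},
        {a: c for a, c in ends.items() if a in keep},
        {a: c for a, c in counts.items() if a in keep},
    )

dfg = {("a", "b"): 4, ("b", "c"): 4, ("x", "y"): 1}
new_dfg, new_starts, new_ends, new_counts = filter_contain_activity_sketch(
    dfg,
    {"a": 4, "x": 1},
    {"c": 4, "y": 1},
    {"a": 4, "b": 4, "c": 4, "x": 1, "y": 1},
    "b",
)
```

The isolated fragment "x" to "y" is dropped because it neither leads to "b" nor follows from it.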

pm4py.algo.filtering.dfg.dfg_filtering.clean_dfg_based_on_noise_thresh(dfg, activities, noise_threshold, parameters=None)#

Cleans a directly-follows graph based on a noise threshold

Parameters#

dfg

Directly-Follows graph

activities

Activities in the DFG

noise_threshold

Noise threshold

Returns#

newDfg

DFG cleaned according to the noise threshold
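
This page does not state the exact rule used to decide which edges count as noise. One plausible reading, sketched here purely as an assumption, drops an edge when its count falls below the threshold multiplied by the smaller of the two strongest edge counts observed at its endpoints. The function name `clean_dfg_sketch` is hypothetical.

```python
def clean_dfg_sketch(dfg, activities, noise_threshold):
    # Highest count among the edges touching each activity.
    max_count = {act: 0 for act in activities}
    for (a, b), count in dfg.items():
        max_count[a] = max(max_count.get(a, 0), count)
        max_count[b] = max(max_count.get(b, 0), count)
    cleaned = {}
    for (a, b), count in dfg.items():
        # Assumed rule: compare the edge with the weaker of its two endpoints.
        cutoff = noise_threshold * min(max_count[a], max_count[b])
        if count >= cutoff:
            cleaned[(a, b)] = count
    return cleaned

dfg = {("a", "b"): 100, ("b", "c"): 90, ("a", "c"): 5}
cleaned = clean_dfg_sketch(dfg, ["a", "b", "c"], 0.2)
```

With a threshold of 0.2, the edge from "a" to "c" (count 5) falls below its cutoff of about 18 and is removed, while the two frequent edges are kept.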