pm4py.algo.organizational_mining.network_analysis.variants package#

PM4Py – A Process Mining Library for Python

Copyright (C) 2024 Process Intelligence Solutions UG (haftungsbeschränkt)

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License along with this program. If not, see this software project’s root or visit <https://www.gnu.org/licenses/>.

Website: https://processintelligence.solutions Contact: info@processintelligence.solutions

Submodules#

pm4py.algo.organizational_mining.network_analysis.variants.dataframe module#

class pm4py.algo.organizational_mining.network_analysis.variants.dataframe.Parameters(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]#

Bases: Enum

SORTING_COLUMN = 'sorting_column'#
INDEX_KEY = 'index_key'#
TIMESTAMP_KEY = 'pm4py:param:timestamp_key'#
IN_COLUMN = 'in_column'#
OUT_COLUMN = 'out_column'#
NODE_COLUMN_SOURCE = 'node_column_source'#
NODE_COLUMN_TARGET = 'node_column_target'#
EDGE_COLUMN = 'edge_column'#
INCLUDE_PERFORMANCE = 'include_performance'#
BUSINESS_HOURS = 'business_hours'#
BUSINESS_HOUR_SLOTS = 'business_hour_slots'#
WORKCALENDAR = 'workcalendar'#
TIMESTAMP_DIFF_COLUMN = 'timestamp_diff_column'#
EDGE_REFERENCE = 'edge_reference'#

Builds the network analysis from the results of the link analysis (internal method)

Parameters#

merged_df

Dataframe obtained from the link analysis

parameters
Parameters of the method, including:
  • Parameters.NODE_COLUMN_SOURCE => the attribute to be used for the node definition of the source event (default: the resource of the log, org:resource)

  • Parameters.NODE_COLUMN_TARGET => the attribute to be used for the node definition of the target event (default: the resource of the log, org:resource)

  • Parameters.EDGE_COLUMN => the attribute to be used for the edge definition (default: the activity of the log, concept:name)

  • Parameters.EDGE_REFERENCE => the event from which the edge attribute should be picked:
    • _out => the source event

    • _in => the target event

  • Parameters.TIMESTAMP_KEY => the timestamp column

  • Parameters.TIMESTAMP_DIFF_COLUMN => the column holding the timestamp difference between the linked events

  • Parameters.INCLUDE_PERFORMANCE => considers the performance of the edge

  • Parameters.BUSINESS_HOURS => boolean value that enables the business hours

  • Parameters.BUSINESS_HOUR_SLOTS => work schedule of the company, provided as a list of tuples where each tuple represents one time slot of business hours. Each slot consists of one start and one end time given in seconds since the start of the week, e.g. [(7 * 60 * 60, 17 * 60 * 60), ((24 + 7) * 60 * 60, (24 + 12) * 60 * 60), ((24 + 13) * 60 * 60, (24 + 17) * 60 * 60)], meaning that business hours are Mondays 07:00 - 17:00 and Tuesdays 07:00 - 12:00 and 13:00 - 17:00
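
The slot arithmetic in the example above can be written out with a small helper. This is only a minimal sketch: `slot`, `HOUR`, `DAY`, `MONDAY` and `TUESDAY` are hypothetical names introduced here for illustration and are not part of pm4py. The only thing taken from the text is the convention of seconds since the start of the week, with Monday as the first day.

```python
HOUR = 60 * 60   # seconds in one hour
DAY = 24 * HOUR  # seconds in one day

# Day offsets from the start of the week (assumption: the week starts on Monday,
# as implied by the example in the text)
MONDAY, TUESDAY = 0, 1


def slot(day, start_hour, end_hour):
    """Return one (start, end) business-hour slot in seconds since week start."""
    return (day * DAY + start_hour * HOUR, day * DAY + end_hour * HOUR)


# Mondays 07:00 - 17:00, Tuesdays 07:00 - 12:00 and 13:00 - 17:00
business_hour_slots = [
    slot(MONDAY, 7, 17),
    slot(TUESDAY, 7, 12),
    slot(TUESDAY, 13, 17),
]

print(business_hour_slots)
```

A list built this way has the same shape as the literal shown in the parameter description, so it could be passed wherever that literal is expected.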

Returns#

network_analysis

Edges of the network analysis (first key: edge; second key: type; value: number of occurrences)
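
The nested structure of the returned dictionary can be traversed as sketched below. The dictionary here is hand-written to match the documented shape (first key: edge, second key: type, value: number of occurrences); the resource and activity names are made up for illustration and are not actual pm4py output.

```python
# Hand-written stand-in with the documented shape (illustrative values only):
# {(source_node, target_node): {edge_type: number_of_occurrences}}
network_analysis = {
    ("Alice", "Bob"): {"Approve": 3, "Review": 1},
    ("Bob", "Carol"): {"Ship": 2},
}

# Total number of occurrences per edge, summed over all edge types
totals = {edge: sum(types.values()) for edge, types in network_analysis.items()}

# Edges ordered from most to least frequent
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)

for (source, target), count in ranked:
    print(f"{source} -> {target}: {count}")
```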

pm4py.algo.organizational_mining.network_analysis.variants.dataframe.apply(dataframe: DataFrame, parameters: Dict[Any, Any] | None = None) Dict[Tuple[str, str], Dict[str, Any]][source]#

Performs the network analysis on the provided dataframe

Parameters#

dataframe

Pandas dataframe containing the events to analyze

parameters

Parameters of the method, including:
  • Parameters.SORTING_COLUMN => the column that should be used to sort the log

  • Parameters.IN_COLUMN => the target column of the link (default: the case identifier; events of the same case are linked)

  • Parameters.OUT_COLUMN => the source column of the link (default: the case identifier; events of the same case are linked)

  • Parameters.INDEX_KEY => the name for the index attribute in the log (inserted during the execution)

  • Parameters.NODE_COLUMN_SOURCE => the attribute to be used for the node definition of the source event (default: the resource of the log, org:resource)

  • Parameters.NODE_COLUMN_TARGET => the attribute to be used for the node definition of the target event (default: the resource of the log, org:resource)

  • Parameters.EDGE_COLUMN => the attribute to be used for the edge definition (default: the activity of the log, concept:name)

  • Parameters.EDGE_REFERENCE => the event from which the edge attribute should be picked:
    • _out => the source event

    • _in => the target event

  • Parameters.TIMESTAMP_KEY => the timestamp column

  • Parameters.TIMESTAMP_DIFF_COLUMN => the column holding the timestamp difference between the linked events

  • Parameters.INCLUDE_PERFORMANCE => considers the performance of the edge

  • Parameters.BUSINESS_HOURS => boolean value that enables the business hours

  • Parameters.BUSINESS_HOUR_SLOTS => work schedule of the company, provided as a list of tuples where each tuple represents one time slot of business hours. Each slot consists of one start and one end time given in seconds since the start of the week, e.g. [(7 * 60 * 60, 17 * 60 * 60), ((24 + 7) * 60 * 60, (24 + 12) * 60 * 60), ((24 + 13) * 60 * 60, (24 + 17) * 60 * 60)], meaning that business hours are Mondays 07:00 - 17:00 and Tuesdays 07:00 - 12:00 and 13:00 - 17:00

Returns#

network_analysis

Edges of the network analysis (first key: edge; second key: type; value: number of occurrences)
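
To make the idea of linking events and counting edges concrete, here is a simplified sketch that uses only the standard library. It is not pm4py's implementation. It assumes the default setting in which events of the same case are linked, and it further assumes that each event is linked only to the next event of its case after sorting by timestamp. The function `sketch_network_analysis`, the dictionary keys and the sample events are hypothetical. Nodes are resources, and the edge label is the activity of the source event, which corresponds to the _out reference.

```python
from collections import defaultdict


def sketch_network_analysis(events):
    """Count (source resource, target resource) edges, split by activity.

    events: list of dicts with the keys 'case', 'activity', 'resource'
    and 'timestamp' (hypothetical layout, chosen for this sketch).
    """
    # Group the events by case identifier
    cases = defaultdict(list)
    for event in events:
        cases[event["case"]].append(event)

    result = defaultdict(lambda: defaultdict(int))
    for case_events in cases.values():
        # Sort each case by timestamp, then link every event to its successor
        case_events.sort(key=lambda e: e["timestamp"])
        for source, target in zip(case_events, case_events[1:]):
            edge = (source["resource"], target["resource"])
            # Edge type taken from the source event (assumed "_out" behavior)
            result[edge][source["activity"]] += 1

    # Convert the nested defaultdicts into plain dictionaries
    return {edge: dict(types) for edge, types in result.items()}


events = [
    {"case": "c1", "activity": "Register", "resource": "Alice", "timestamp": 1},
    {"case": "c1", "activity": "Check", "resource": "Bob", "timestamp": 2},
    {"case": "c1", "activity": "Close", "resource": "Alice", "timestamp": 3},
    {"case": "c2", "activity": "Register", "resource": "Alice", "timestamp": 1},
    {"case": "c2", "activity": "Close", "resource": "Bob", "timestamp": 2},
]

edges = sketch_network_analysis(events)
print(edges)
```

The result has the same outer shape as the documented return value: a dictionary keyed by edge, whose values map an edge type to a number of occurrences.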