Social Network Analysis

In PM4Py, we provide support for various Social Network Analysis (SNA) metrics, as well as tools for the discovery of roles.

Handover of Work

The Handover of Work metric measures how often one individual is directly followed by another individual within the execution of a process instance. PM4Py can calculate this metric from an event log.
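
As a minimal sketch of what this metric counts, the pure-Python snippet below tallies direct handovers on a toy log. It does not call PM4Py: the toy log and the helper name `handover_counts` are assumptions made for illustration, and the library's own implementation may differ (for example in normalization). In recent PM4Py releases the simplified interface is expected to expose the metric as `pm4py.discover_handover_of_work_network(log)`; check the version you have installed.

```python
from collections import Counter

# Toy event log (an assumption for illustration):
# case id -> resources in the order in which they executed events.
toy_log = {
    "c1": ["Pete", "Sue", "Mike", "Sara"],
    "c2": ["Pete", "Mike", "Sara"],
    "c3": ["Sue", "Mike", "Sue"],
}


def handover_counts(log):
    """Count how often resource a is directly followed by resource b in a case."""
    counts = Counter()
    for resources in log.values():
        for a, b in zip(resources, resources[1:]):
            counts[(a, b)] += 1
    return counts


hw = handover_counts(toy_log)
print(hw[("Mike", "Sara")])  # 2
print(hw[("Sue", "Mike")])   # 2
```

Each key of the result is a (from, to) pair of resources, so the counts can be read as weighted directed edges of a social network.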

You can then visualize the result using NetworkX or Pyvis.

Subcontracting

The Subcontracting metric calculates how often the work of one individual is interleaved with the work of another individual, so that the work eventually "returns" to the original individual (for example, A hands over to B, who hands back to A). PM4Py can calculate this metric from an event log.
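
The "return" pattern can be illustrated with a short pure-Python sketch. It only looks at three consecutive events (A, then B, then A again), which is a simplifying assumption; the toy log and the helper name `subcontracting_counts` are hypothetical, and PM4Py's own routine (expected to be `pm4py.discover_subcontracting_network(log)` in recent releases) may use a wider window.

```python
from collections import Counter

# Toy event log (an assumption for illustration):
# case id -> resources in the order in which they executed events.
toy_log = {
    "c1": ["Pete", "Sue", "Pete", "Mike"],
    "c2": ["Sue", "Mike", "Sue", "Mike"],
    "c3": ["Pete", "Pete", "Pete"],
}


def subcontracting_counts(log):
    """Count patterns a -> b -> a (with a != b) over consecutive events of a case."""
    counts = Counter()
    for resources in log.values():
        for a, b, c in zip(resources, resources[1:], resources[2:]):
            if a == c and a != b:
                counts[(a, b)] += 1
    return counts


sub = subcontracting_counts(toy_log)
print(sub[("Pete", "Sue")])  # 1
print(sub[("Sue", "Mike")])  # 1
```

Case `c3` contributes nothing, because the work never leaves Pete.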

Afterward, you can visualize the results using NetworkX or Pyvis.

Working Together

The Working Together metric calculates how often two individuals work together on the same process instance. PM4Py can calculate this metric from an event log.
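
A minimal sketch of this count, assuming that "working together" simply means appearing in the same case: the toy log and the helper name `working_together_counts` are illustrative, and the snippet does not use PM4Py (whose own function is expected to be `pm4py.discover_working_together_network(log)` in recent releases).

```python
from collections import Counter
from itertools import combinations

# Toy event log (an assumption for illustration):
# case id -> resources in the order in which they executed events.
toy_log = {
    "c1": ["Pete", "Sue", "Mike", "Sara"],
    "c2": ["Pete", "Mike", "Sara"],
    "c3": ["Sue", "Mike", "Sue"],
}


def working_together_counts(log):
    """Count, for each unordered pair of resources, the cases in which both appear."""
    counts = Counter()
    for resources in log.values():
        # Sorting makes every pair appear in one canonical order.
        for a, b in combinations(sorted(set(resources)), 2):
            counts[(a, b)] += 1
    return counts


wt = working_together_counts(toy_log)
print(wt[("Mike", "Pete")])  # 2
print(wt[("Pete", "Sue")])   # 1
```

Unlike the handover counts, this relation is symmetric, so each pair is stored only once.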

You can then visualize the results using NetworkX or Pyvis.

Similar Activities

The Similar Activities metric calculates how similar the work patterns of two individuals are, based on the activities each of them performs. PM4Py can calculate this metric from an event log.
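
One way to make "similar work patterns" concrete is to build an activity profile per resource and compare the profiles. The sketch below uses cosine similarity as a simple stand-in; this choice, the toy events, and the helper names are assumptions, and PM4Py (whose function is expected to be `pm4py.discover_activity_based_resource_similarity(log)` in recent releases) may use a different measure.

```python
import math
from collections import Counter, defaultdict

# Toy events (an assumption for illustration): (resource, activity).
events = [
    ("Pete", "register"), ("Pete", "register"), ("Pete", "check"),
    ("Sue", "register"), ("Sue", "check"), ("Sue", "check"),
    ("Sara", "decide"), ("Sara", "decide"),
]


def activity_profiles(events):
    """Map each resource to a Counter of the activities it performed."""
    profiles = defaultdict(Counter)
    for resource, activity in events:
        profiles[resource][activity] += 1
    return profiles


def cosine_similarity(p, q):
    """Cosine similarity between two activity-count profiles."""
    dot = sum(p[a] * q[a] for a in p.keys() & q.keys())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


profiles = activity_profiles(events)
sim_pete_sue = cosine_similarity(profiles["Pete"], profiles["Sue"])
sim_pete_sara = cosine_similarity(profiles["Pete"], profiles["Sara"])
print(round(sim_pete_sue, 2), sim_pete_sara)
```

Pete and Sue share both of their activities and score high, while Pete and Sara have no activity in common and score zero.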

You can then visualize the results using NetworkX or Pyvis.

Roles Discovery

A role is defined as a set of activities in the log that are executed by a similar (multi)set of resources. Essentially, it represents a specific function within an organization. Grouping activities into roles can help:

  • Understand which activities are performed by which roles,
  • Provide better insight into the roles themselves (the number of resources associated with a single activity may not be informative on its own).

Initially, each activity is considered a separate role and is associated with the multiset of its originators. Roles are then merged according to their similarity until no further merges are possible. To use the algorithm, first import a log and then apply role detection to it.

You can print the sets of activities grouped into roles by using the following code:

print([x[0] for x in roles])
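
The merging procedure described above can be sketched in plain Python. The similarity measure (a weighted Jaccard index on originator multisets), the threshold of 0.5, the toy data, and the helper names are all assumptions chosen for illustration; PM4Py's role detection (expected to be `pm4py.discover_organizational_roles(log)` in recent releases) may differ in each of these respects.

```python
from collections import Counter

# Toy data (an assumption): activity -> multiset of its originators.
originators = {
    "register": Counter({"Pete": 3, "Mike": 2}),
    "check": Counter({"Pete": 2, "Mike": 3}),
    "decide": Counter({"Sara": 4}),
    "pay": Counter({"Ellen": 3, "Sara": 1}),
}


def multiset_similarity(p, q):
    """Weighted Jaccard index: shared occurrences over total occurrences."""
    union = sum((p | q).values())
    return sum((p & q).values()) / union if union else 0.0


def discover_roles(originators, threshold=0.5):
    """Start with one role per activity, then greedily merge the most similar pair."""
    roles = [({activity}, Counter(res)) for activity, res in originators.items()]
    while True:
        best = None
        for i in range(len(roles)):
            for j in range(i + 1, len(roles)):
                sim = multiset_similarity(roles[i][1], roles[j][1])
                if sim >= threshold and (best is None or sim > best[0]):
                    best = (sim, i, j)
        if best is None:  # no pair is similar enough: stop merging
            return roles
        _, i, j = best
        merged = (roles[i][0] | roles[j][0], roles[i][1] + roles[j][1])
        roles = [r for k, r in enumerate(roles) if k not in (i, j)] + [merged]


roles = discover_roles(originators)
print([sorted(x[0]) for x in roles])
```

On this toy data, "register" and "check" end up in one role because Pete and Mike perform both, while "decide" and "pay" stay separate.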

Clustering (SNA Results)

After applying an SNA metric, clustering allows you to group resources connected by meaningful relationships within the given metric. For example:

  • Clustering the results of the "Working Together" metric groups individuals who frequently work together into the same cluster.
  • Clustering the results of the "Similar Activities" metric groups individuals who perform similar tasks into the same cluster.
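
To illustrate how a similarity metric can be turned into groups of resources, the sketch below uses a deliberately simple technique instead of a real clustering algorithm: it links every pair whose similarity reaches a threshold and returns the connected components. The technique, the threshold, the toy values, and the helper name `cluster_by_threshold` are assumptions and are not PM4Py's method.

```python
# Toy similarity values (an assumption): unordered resource pair -> similarity.
similarity = {
    ("Pete", "Sue"): 0.8,
    ("Pete", "Sara"): 0.0,
    ("Sue", "Sara"): 0.1,
    ("Sara", "Ellen"): 0.9,
}


def cluster_by_threshold(similarity, threshold=0.5):
    """Return connected components of the graph with edges at or above the threshold."""
    resources = sorted({r for pair in similarity for r in pair})
    parent = {r: r for r in resources}

    def find(r):
        # Follow parent links to the representative, shortening the path on the way.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for (a, b), value in similarity.items():
        if value >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for r in resources:
        groups.setdefault(find(r), []).append(r)
    return sorted(groups.values())


clusters = cluster_by_threshold(similarity)
print(clusters)  # [['Ellen', 'Sara'], ['Pete', 'Sue']]
```

The result is a list of groups, each of which is a list of resources.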

We provide a method to generate a list of groups (where each group consists of a list of resources) from the results of an SNA metric, for example the results of the "Similar Activities" metric computed on the running-example log.

Resource Profiles

Resource profiling in event logs is also possible. We implement the approach described in: Pika, Anastasiia, et al. "Mining resource profiles from event logs." ACM Transactions on Management Information Systems (TMIS) 8.1 (2017): 1-30. Essentially, the behavior of a resource can be measured over a period of time with various metrics described in the paper:

  • RBI 1.1 (Number of distinct activities): The number of distinct activities performed by a resource during a specific time interval [t1, t2),
  • RBI 1.3 (Activity frequency): The fraction of completions of a given activity by a resource during a specific time interval [t1, t2), compared to the total number of completions by the same resource during that interval,
  • RBI 2.1 (Activity completions): The number of activity instances completed by a resource during a given time slot,
  • RBI 2.2 (Case completions): The number of cases completed during a given time slot in which a resource was involved,
  • RBI 2.3 (Fraction case completion): The fraction of cases completed during a given time slot in which a resource was involved, compared to the total number of cases completed in that time slot,
  • RBI 2.4 (Average workload): The average number of activities started by a resource but not yet completed at a given moment in time,
  • RBI 3.1 (Multitasking): The fraction of active time during which a resource is involved in more than one activity,
  • RBI 4.3 (Average duration of an activity): The average duration of completed activity instances during a specific time slot,
  • RBI 4.4 (Average case duration): The average duration of completed cases during a specific time slot in which a given resource was involved,
  • RBI 5.1 (Interaction between two resources): The number of cases completed during a given time slot in which two specific resources were involved,
  • RBI 5.2 (Social position): The fraction of resources involved in the same cases as a given resource during a specific time slot, relative to the total number of resources active during that time slot.
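
As a small worked example of the first two indicators, the following pure-Python sketch computes RBI 1.1 and RBI 1.3 over a half-open interval [t1, t2). The toy events and the helper names are assumptions made for illustration; the snippet does not use PM4Py.

```python
from datetime import datetime

# Toy events (an assumption): (resource, activity, completion timestamp).
events = [
    ("Sara", "decide", datetime(2011, 1, 5, 10, 0)),
    ("Sara", "decide", datetime(2011, 1, 6, 9, 0)),
    ("Sara", "reinitiate request", datetime(2011, 1, 7, 15, 0)),
    ("Sara", "decide", datetime(2011, 2, 1, 8, 0)),  # falls outside the interval
    ("Mike", "check ticket", datetime(2011, 1, 6, 11, 0)),
]


def completed_in_interval(events, t1, t2, resource):
    """Activities completed by the resource with t1 <= timestamp < t2."""
    return [a for r, a, ts in events if r == resource and t1 <= ts < t2]


def distinct_activities(events, t1, t2, resource):
    """RBI 1.1: number of distinct activities performed in [t1, t2)."""
    return len(set(completed_in_interval(events, t1, t2, resource)))


def activity_frequency(events, t1, t2, resource, activity):
    """RBI 1.3: share of the resource's completions that belong to the activity."""
    done = completed_in_interval(events, t1, t2, resource)
    return done.count(activity) / len(done) if done else 0.0


t1, t2 = datetime(2011, 1, 1), datetime(2011, 2, 1)
n_distinct = distinct_activities(events, t1, t2, "Sara")
freq_decide = activity_frequency(events, t1, t2, "Sara", "decide")
print(n_distinct, round(freq_decide, 2))
```

Within January 2011, Sara completes two distinct activities, and two of her three completions are "decide".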

These metrics can be calculated, for example, starting from the running-example XES event log.

Organizational Mining

With event logs, we can identify groups of resources performing similar activities. As we have seen in previous sections, there are different ways to automatically detect these groups:

  • Discovering the "Similar Activities" metric and applying a clustering algorithm to find groups,
  • Applying the roles discovery algorithm (Burattin et al.).

Alternatively, an attribute might be present in the events, specifying the group that performed the task.

"Organizational mining" refers to the discovery of behavior-related information specific to an organizational group, such as identifying which activities are performed by the group.

We provide an implementation of the approach described in: Yang, Jing, et al. "OrgMining 2.0: A Novel Framework for Organizational Model Mining from Event Logs." arXiv preprint arXiv:2011.12445 (2020).

The approach provides descriptions of group-related metrics (local diagnostics), such as:

  • Group Relative Focus: Specifies how much a resource group performed a given type of work compared to the overall workload of the group. It measures the work diversification within the group.
  • Group Relative Stake: Specifies how much a given type of work was performed by a particular resource group among all groups, measuring the participation of different groups in a given task.
  • Group Coverage: Specifies the proportion of group members involved in a given type of work.
  • Group Member Contribution: Specifies the extent to which individual group members contribute to the work performed by the group.
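
The four diagnostics above can be illustrated on a handful of toy events. The sketch below is one plausible reading of the definitions, written in plain Python; the toy data, the helper names, and the choice to report member contribution as raw counts are assumptions, and the snippet does not use PM4Py.

```python
from collections import Counter

# Toy events (an assumption for illustration): (resource, group, activity).
events = [
    ("Ann", "G1", "confirm"),
    ("Ann", "G1", "confirm"),
    ("Bob", "G1", "confirm"),
    ("Bob", "G1", "validate"),
    ("Cid", "G2", "confirm"),
    ("Cid", "G2", "archive"),
    ("Dee", "G2", "confirm"),
]


def ratio(numerator, denominator):
    """Safe division that returns 0.0 for an empty denominator."""
    return numerator / denominator if denominator else 0.0


def group_relative_focus(events, group, activity):
    """Share of the group's events that belong to the activity."""
    in_group = [a for _, g, a in events if g == group]
    return ratio(in_group.count(activity), len(in_group))


def group_relative_stake(events, group, activity):
    """Share of the activity's events that were performed by the group."""
    of_activity = [g for _, g, a in events if a == activity]
    return ratio(of_activity.count(group), len(of_activity))


def group_coverage(events, group, activity):
    """Share of the group's members who performed the activity at least once."""
    members = {r for r, g, _ in events if g == group}
    involved = {r for r, g, a in events if g == group and a == activity}
    return ratio(len(involved), len(members))


def group_member_contribution(events, group, activity):
    """Per-member count of the group's events of the activity."""
    return dict(Counter(r for r, g, a in events if g == group and a == activity))


focus = group_relative_focus(events, "G1", "confirm")
stake = group_relative_stake(events, "G1", "confirm")
coverage = group_coverage(events, "G1", "validate")
contribution = group_member_contribution(events, "G1", "confirm")
print(focus, stake, coverage, contribution)
```

Here group G1 spends three of its four events on "confirm" (focus), performs three of the five "confirm" events overall (stake), and only one of its two members ever performs "validate" (coverage).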

These metrics can be calculated, for instance, on the receipt XES event log by leveraging an event attribute that specifies which group performed each task.

Alternatively, you can use the apply_from_clustering_or_roles method, which takes the log as the first argument and the results of the clustering as the second argument.