pm4py.statistics.traces.generic.log package#

PM4Py – A Process Mining Library for Python

Copyright (C) 2024 Process Intelligence Solutions UG (haftungsbeschränkt)

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License along with this program. If not, see this software project’s root or visit <https://www.gnu.org/licenses/>.

Website: https://processintelligence.solutions Contact: info@processintelligence.solutions

Submodules#

pm4py.statistics.traces.generic.log.case_arrival module#

class pm4py.statistics.traces.generic.log.case_arrival.Parameters(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]#

Bases: Enum

ATTRIBUTE_KEY = 'pm4py:param:attribute_key'#
ACTIVITY_KEY = 'pm4py:param:activity_key'#
TIMESTAMP_KEY = 'pm4py:param:timestamp_key'#
CASE_ID_KEY = 'pm4py:param:case_id_key'#
BUSINESS_HOURS = 'business_hours'#
BUSINESS_HOUR_SLOTS = 'business_hour_slots'#
WORKCALENDAR = 'workcalendar'#
pm4py.statistics.traces.generic.log.case_arrival.get_case_arrival_avg(log: EventLog, parameters: Dict[str | Parameters, Any] | None = None) → float[source]#

Gets the average time elapsed between case starts

Parameters#

log

Trace log

parameters
Parameters of the algorithm, including:

Parameters.TIMESTAMP_KEY -> attribute of the log to be used as timestamp

Returns#

case_arrival_avg

Average time elapsed between case starts
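A minimal sketch of the statistic described above, using only the standard library. It assumes that "time elapsed between case starts" means the mean gap, in seconds, between consecutive start timestamps once they are sorted; the helper name `case_arrival_avg` and the sample data are hypothetical, and this is not the library's implementation.

```python
from datetime import datetime
from statistics import mean


def case_arrival_avg(case_start_times):
    """Mean gap in seconds between consecutive case starts (0.0 if fewer than two cases)."""
    ordered = sorted(case_start_times)
    # Pair each start time with the next one and measure the gap between them.
    gaps = [(later - earlier).total_seconds() for earlier, later in zip(ordered, ordered[1:])]
    return mean(gaps) if gaps else 0.0


# Hypothetical start timestamps of three cases.
starts = [
    datetime(2024, 1, 1, 9, 0),
    datetime(2024, 1, 1, 9, 30),
    datetime(2024, 1, 1, 11, 0),
]
avg_gap = case_arrival_avg(starts)  # gaps of 1800 s and 5400 s
```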

pm4py.statistics.traces.generic.log.case_arrival.get_case_dispersion_avg(log: EventLog, parameters: Dict[str | Parameters, Any] | None = None) → float[source]#

Gets the average time elapsed between case ends

Parameters#

log

Trace log

parameters
Parameters of the algorithm, including:

Parameters.TIMESTAMP_KEY -> attribute of the log to be used as timestamp

Returns#

case_dispersion_avg

Average time elapsed between the completion of cases
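The same idea applied to case ends can be sketched as follows. Traces are modelled here as plain lists of event dictionaries in chronological order, and `time:timestamp` (the standard XES timestamp attribute) is assumed as the timestamp key; `case_dispersion_avg` is a hypothetical stand-in, not the library function.

```python
from datetime import datetime
from statistics import mean

TIMESTAMP_KEY = "time:timestamp"  # assumed timestamp attribute


def case_dispersion_avg(traces, timestamp_key=TIMESTAMP_KEY):
    """Mean gap in seconds between consecutive case completions."""
    # The last event of each non-empty trace marks the completion of the case.
    end_times = sorted(trace[-1][timestamp_key] for trace in traces if trace)
    gaps = [(later - earlier).total_seconds() for earlier, later in zip(end_times, end_times[1:])]
    return mean(gaps) if gaps else 0.0


# Hypothetical log with three cases of two events each.
traces = [
    [{TIMESTAMP_KEY: datetime(2024, 1, 1, 9, 0)}, {TIMESTAMP_KEY: datetime(2024, 1, 1, 10, 0)}],
    [{TIMESTAMP_KEY: datetime(2024, 1, 1, 9, 30)}, {TIMESTAMP_KEY: datetime(2024, 1, 1, 10, 20)}],
    [{TIMESTAMP_KEY: datetime(2024, 1, 1, 9, 45)}, {TIMESTAMP_KEY: datetime(2024, 1, 1, 11, 0)}],
]
avg_end_gap = case_dispersion_avg(traces)  # gaps of 1200 s and 2400 s
```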

pm4py.statistics.traces.generic.log.case_statistics module#

class pm4py.statistics.traces.generic.log.case_statistics.Parameters(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]#

Bases: Enum

ATTRIBUTE_KEY = 'pm4py:param:attribute_key'#
ACTIVITY_KEY = 'pm4py:param:activity_key'#
TIMESTAMP_KEY = 'pm4py:param:timestamp_key'#
CASE_ID_KEY = 'pm4py:param:case_id_key'#
MAX_VARIANTS_TO_RETURN = 'max_variants_to_return'#
VARIANTS = 'variants'#
VAR_DURATIONS = 'var_durations'#
ENABLE_SORT = 'enable_sort'#
SORT_BY_INDEX = 'sort_by_index'#
SORT_ASCENDING = 'sort_ascending'#
MAX_RET_CASES = 'max_ret_cases'#
BUSINESS_HOURS = 'business_hours'#
BUSINESS_HOUR_SLOTS = 'business_hour_slots'#
WORKCALENDAR = 'workcalendar'#
INDEXED_LOG = 'indexed_log'#
pm4py.statistics.traces.generic.log.case_statistics.get_variant_statistics(log: EventLog, parameters: Dict[str | Parameters, Any] | None = None) → List[Dict[str, int]] | List[Dict[List[str], int]][source]#

Gets the variants of the log along with their statistics, as a list with one dictionary per variant

Parameters#

log

Log

parameters
Parameters of the algorithm, including:

Parameters.ACTIVITY_KEY -> Attribute identifying the activity in the log
Parameters.MAX_VARIANTS_TO_RETURN -> Maximum number of variants to return
Parameters.VARIANTS -> If provided, avoid recalculation of the variants

Returns#

variants_list

List of variants along with their statistics
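As an illustration of what a variant statistic can look like, the sketch below groups traces by their activity sequence and counts how many traces share each one. The dictionary keys `"variant"` and `"count"`, the helper name, and the `concept:name` activity key are assumptions made for this example rather than a statement of the library's output format.

```python
from collections import Counter

ACTIVITY_KEY = "concept:name"  # assumed activity attribute


def variant_statistics(traces, activity_key=ACTIVITY_KEY, max_variants_to_return=None):
    """Count traces per variant, most frequent first."""
    # A variant is the ordered tuple of activity names of a trace.
    counts = Counter(tuple(event[activity_key] for event in trace) for trace in traces)
    ranked = counts.most_common(max_variants_to_return)  # None keeps every variant
    return [{"variant": variant, "count": count} for variant, count in ranked]


# Hypothetical log: two traces follow A, B, C and one follows A, C.
traces = [
    [{ACTIVITY_KEY: "A"}, {ACTIVITY_KEY: "B"}, {ACTIVITY_KEY: "C"}],
    [{ACTIVITY_KEY: "A"}, {ACTIVITY_KEY: "C"}],
    [{ACTIVITY_KEY: "A"}, {ACTIVITY_KEY: "B"}, {ACTIVITY_KEY: "C"}],
]
all_variants = variant_statistics(traces)
top_variant = variant_statistics(traces, max_variants_to_return=1)
```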

pm4py.statistics.traces.generic.log.case_statistics.get_cases_description(log: EventLog, parameters: Dict[str | Parameters, Any] | None = None) → Dict[str, Dict[str, Any]][source]#

Get a description of traces present in the log

Parameters#

log

Log

parameters

Parameters of the algorithm, including:

Parameters.CASE_ID_KEY -> Trace attribute in which the case ID is contained
Parameters.TIMESTAMP_KEY -> Column that identifies the timestamp
Parameters.ENABLE_SORT -> Enable sorting of traces
Parameters.SORT_BY_INDEX -> Sort the traces using this index:
0 -> case ID
1 -> start time
2 -> end time
3 -> difference
Parameters.SORT_ASCENDING -> Set sort direction (boolean; if true, the sort direction is ascending, otherwise descending)
Parameters.MAX_RET_CASES -> Set the maximum number of returned traces

Returns#

ret

Dictionary of traces associated with their start timestamp, end timestamp, and duration
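The sorting and truncation options listed above can be sketched in plain Python. In this example each case is reduced to its chronologically ordered event timestamps; the result keys `startTime`, `endTime`, and `caseDuration` and the helper name `describe_cases` are hypothetical, and the sort indices follow the 0 to 3 convention documented for SORT_BY_INDEX.

```python
from datetime import datetime


def describe_cases(cases, sort_by_index=0, sort_ascending=True, max_ret_cases=None):
    """Map each case ID to its start time, end time, and duration in seconds."""
    rows = []
    for case_id, timestamps in cases.items():
        if not timestamps:
            continue  # a case without events has no start or end
        start, end = timestamps[0], timestamps[-1]
        rows.append((case_id, start, end, (end - start).total_seconds()))
    # Tuple positions match the documented indices: 0 ID, 1 start, 2 end, 3 difference.
    rows.sort(key=lambda row: row[sort_by_index], reverse=not sort_ascending)
    if max_ret_cases is not None:
        rows = rows[:max_ret_cases]
    return {
        case_id: {"startTime": start, "endTime": end, "caseDuration": duration}
        for case_id, start, end, duration in rows
    }


# Hypothetical cases, keyed by case ID.
cases = {
    "c1": [datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 10, 0)],
    "c2": [datetime(2024, 1, 1, 8, 0), datetime(2024, 1, 1, 8, 15)],
    "c3": [datetime(2024, 1, 1, 9, 30), datetime(2024, 1, 1, 12, 0)],
}
# The two longest cases, longest first.
longest_two = describe_cases(cases, sort_by_index=3, sort_ascending=False, max_ret_cases=2)
```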

pm4py.statistics.traces.generic.log.case_statistics.index_log_caseid(log, parameters=None)[source]#

Index a log according to case ID

Parameters#

log

Log object

parameters
Possible parameters of the algorithm, including:

Parameters.CASE_ID_KEY -> Trace attribute in which the Case ID is contained

Returns#

dict

Dictionary that has the case IDs as keys and the corresponding case as value

pm4py.statistics.traces.generic.log.case_statistics.get_events(log: EventLog, case_id: str, parameters: Dict[str | Parameters, Any] | None = None) → List[Dict[str, Any]][source]#

Get events belonging to the specified case

Parameters#

log

Log object

case_id

Required case ID

parameters
Possible parameters of the algorithm, including:

Parameters.CASE_ID_KEY -> Trace attribute in which the case ID is contained
Parameters.INDEXED_LOG -> Indexed log (if it has been calculated previously)

Returns#

list_eve

List of events belonging to the case
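The relationship between indexing a log by case ID and fetching the events of one case can be illustrated as below. Traces are modelled as dictionaries with `attributes` and `events` entries, which is a simplification chosen for this sketch and not the library's `EventLog` structure; both helper names are hypothetical. Passing a previously built index avoids rebuilding it on every lookup, which appears to be the purpose of INDEXED_LOG.

```python
CASE_ID_KEY = "concept:name"  # assumed trace attribute holding the case ID


def index_by_case_id(traces, case_id_key=CASE_ID_KEY):
    """Build a dictionary from case ID to the corresponding trace."""
    return {trace["attributes"][case_id_key]: trace for trace in traces}


def events_of_case(traces, case_id, indexed_log=None, case_id_key=CASE_ID_KEY):
    """Return the events of one case as plain dictionaries."""
    # Reuse a precomputed index when the caller supplies one.
    index = indexed_log if indexed_log is not None else index_by_case_id(traces, case_id_key)
    return [dict(event) for event in index[case_id]["events"]]


# Hypothetical two-case log.
traces = [
    {"attributes": {CASE_ID_KEY: "c1"}, "events": [{"concept:name": "A"}, {"concept:name": "B"}]},
    {"attributes": {CASE_ID_KEY: "c2"}, "events": [{"concept:name": "A"}]},
]
index = index_by_case_id(traces)
c1_events = events_of_case(traces, "c1", indexed_log=index)
c2_events = events_of_case(traces, "c2")
```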

pm4py.statistics.traces.generic.log.case_statistics.get_all_case_durations(log: EventLog, parameters: Dict[str | Parameters, Any] | None = None) → List[float][source]#

Gets all the case durations out of the log

Parameters#

log

Log object

parameters

Possible parameters of the algorithm

Returns#

duration_values

List of all duration values

pm4py.statistics.traces.generic.log.case_statistics.get_first_quartile_case_duration(log: EventLog, parameters: Dict[str | Parameters, Any] | None = None) → float[source]#

Gets the first quartile of the case durations in the log

Parameters#

log

Log

parameters

Possible parameters of the algorithm

Returns#

value

First quartile value

pm4py.statistics.traces.generic.log.case_statistics.get_median_case_duration(log: EventLog, parameters: Dict[str | Parameters, Any] | None = None)[source]#

Gets the median case duration out of the log

Parameters#

log

Log

parameters

Possible parameters of the algorithm

Returns#

value

Median duration value
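For orientation, the duration statistics described above (all durations, first quartile, median) can be computed from a list of duration values with the standard `statistics` module. The quartile method used here (`inclusive`, linear interpolation) is a choice made for this sketch; the library may select the quartile element differently, so results on small logs can differ.

```python
from statistics import median, quantiles

# Hypothetical case durations in seconds, as a duration list might look.
durations = sorted([40.0, 10.0, 30.0, 20.0])

# quantiles(..., n=4) returns the three quartile cut points; the first is Q1.
first_quartile = quantiles(durations, n=4, method="inclusive")[0]
median_duration = median(durations)
```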

pm4py.statistics.traces.generic.log.case_statistics.get_kde_caseduration(log, parameters=None)[source]#

Gets the kernel density estimate (KDE) of the case durations calculated on the log

Parameters#

log

Log object

parameters
Possible parameters of the algorithm, including:

Parameters.GRAPH_POINTS -> number of points to include in the graph

Returns#

x

X-axis values to represent

y

Y-axis values to represent
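To show what the x and y values of such a density estimate represent, here is a self-contained Gaussian KDE written with the standard library only. It is a stand-in for illustration: the bandwidth rule (Scott's rule), the evenly spaced evaluation grid, and the helper name `gaussian_kde_points` are assumptions of this sketch, and the library may rely on a numerical package and a different grid.

```python
import math


def gaussian_kde_points(durations, graph_points=200):
    """Evaluate a Gaussian kernel density estimate on an evenly spaced grid."""
    n = len(durations)
    if n < 2 or graph_points < 2:
        raise ValueError("need at least two durations and two graph points")
    mean_value = sum(durations) / n
    std = math.sqrt(sum((d - mean_value) ** 2 for d in durations) / (n - 1))
    if std == 0:
        raise ValueError("durations must not all be identical")
    bandwidth = std * n ** (-1 / 5)  # Scott's rule for one-dimensional data
    low, high = min(durations), max(durations)
    step = (high - low) / (graph_points - 1)
    xs = [low + i * step for i in range(graph_points)]
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))
    # Each y is the average of Gaussian bumps centred on the observed durations.
    ys = [norm * sum(math.exp(-0.5 * ((x - d) / bandwidth) ** 2) for d in durations) for x in xs]
    return xs, ys


# Hypothetical case durations in seconds.
xs, ys = gaussian_kde_points([10.0, 20.0, 20.0, 30.0, 50.0], graph_points=5)
```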

pm4py.statistics.traces.generic.log.case_statistics.get_kde_caseduration_json(log, parameters=None)[source]#

Gets the estimation of KDE density for the case durations calculated on the log (expressed as JSON)

Parameters#

log

Log object

parameters
Possible parameters of the algorithm, including:

Parameters.GRAPH_POINTS -> number of points to include in the graph
Parameters.CASE_ID_KEY -> Column hosting the Case ID

Returns#

json

JSON representing the graph points
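One plausible shape for such a JSON payload is a list of `[x, y]` pairs. The sketch below serialises two parallel lists that way; the pair layout and the helper name `kde_points_to_json` are assumptions of this example, not a documented format.

```python
import json


def kde_points_to_json(xs, ys):
    """Serialise parallel x and y lists as a JSON array of [x, y] pairs."""
    return json.dumps([[x, y] for x, y in zip(xs, ys)])


# Hypothetical graph points.
payload = kde_points_to_json([0.0, 1.0], [0.1, 0.2])
```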