pm4py.objects.log.util package#
PM4Py – A Process Mining Library for Python
Copyright (C) 2024 Process Intelligence Solutions UG (haftungsbeschränkt)
This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License along with this program. If not, see this software project’s root or visit <https://www.gnu.org/licenses/>.
Website: https://processintelligence.solutions Contact: info@processintelligence.solutions
Submodules#
pm4py.objects.log.util.activities_to_alphabet module#
- class pm4py.objects.log.util.activities_to_alphabet.Parameters(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]#
Bases: Enum
- ACTIVITY_KEY = 'activity_key'#
- RETURN_MAPPING = 'return_mapping'#
- pm4py.objects.log.util.activities_to_alphabet.apply(dataframe: DataFrame, parameters: Dict[Any, Any] | None = None) DataFrame | Tuple[DataFrame, Dict[str, str]] [source]#
Remap the activities in a dataframe using an augmented alphabet to minimize the size of the encoding
Running example:
    import pm4py
    from pm4py.objects.log.util import activities_to_alphabet
    from pm4py.util import constants

    dataframe = pm4py.read_xes("tests/input_data/running-example.xes")
    renamed_dataframe = activities_to_alphabet.apply(dataframe, parameters={constants.PARAMETER_CONSTANT_ACTIVITY_KEY: "concept:name"})
    print(renamed_dataframe)
Parameters#
- dataframe
Pandas dataframe
- parameters
Parameters of the method, including:
  - Parameters.ACTIVITY_KEY => attribute to be used as activity
  - Parameters.RETURN_MAPPING => (boolean) enables returning the mapping dictionary (so the original activities can be reconstructed)
Returns#
- ren_dataframe
Pandas dataframe in which the activities have been remapped to the (augmented) alphabet
- inv_mapping
(if required) Dictionary associating to every letter of the (augmented) alphabet the original activity
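The remapping idea can be illustrated without pm4py. The sketch below is not the library's implementation: `remap_activities` is a hypothetical helper that assigns one character per distinct activity (in order of first appearance) and returns the inverse mapping, which plays the role of `inv_mapping` described above.

```python
from typing import Dict, List, Tuple


def remap_activities(
    activities: List[str], first_code_point: int = ord("A")
) -> Tuple[List[str], Dict[str, str]]:
    """Hypothetical helper: replace each activity by a single character."""
    mapping: Dict[str, str] = {}
    for act in activities:
        if act not in mapping:
            # Consecutive code points are used, so the alphabet is not
            # limited to the 26 Latin letters.
            mapping[act] = chr(first_code_point + len(mapping))
    renamed = [mapping[act] for act in activities]
    inv_mapping = {symbol: act for act, symbol in mapping.items()}
    return renamed, inv_mapping


renamed, inv_mapping = remap_activities(
    ["register request", "check ticket", "register request", "decide"]
)
# The original activities can be reconstructed from the inverse mapping.
restored = [inv_mapping[symbol] for symbol in renamed]
```

Short symbols keep the encoded column small, and the inverse mapping makes the transformation reversible.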
pm4py.objects.log.util.artificial module#
- class pm4py.objects.log.util.artificial.Parameters(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]#
Bases: Enum
- ACTIVITY_KEY = 'pm4py:param:activity_key'#
- TIMESTAMP_KEY = 'pm4py:param:timestamp_key'#
- PARAM_ARTIFICIAL_START_ACTIVITY = 'pm4py:param:art_start_act'#
- PARAM_ARTIFICIAL_END_ACTIVITY = 'pm4py:param:art_end_act'#
- pm4py.objects.log.util.artificial.insert_artificial_start_end(log: EventLog, parameters: Dict[Any, Any] | None = None) EventLog [source]#
Inserts the artificial start/end activities in an event log
Parameters#
- log
Event log
- parameters
Parameters of the algorithm, including:
  - Parameters.ACTIVITY_KEY: the activity
  - Parameters.TIMESTAMP_KEY: the timestamp
Returns#
- log
Enriched log
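As a rough illustration of what "inserting artificial start/end activities" means, the following sketch works on plain lists of dictionaries instead of an EventLog. The helper name and the `START`/`END` labels are assumptions made for this example, not the values pm4py uses, and timestamp handling is left out.

```python
from typing import Dict, List

Event = Dict[str, str]
Trace = List[Event]

# Illustrative labels only; pm4py's default symbols are not assumed here.
ART_START = "START"
ART_END = "END"


def add_artificial_start_end(
    log: List[Trace], activity_key: str = "concept:name"
) -> List[Trace]:
    """Hypothetical helper: wrap every trace with a start and an end event."""
    enriched = []
    for trace in log:
        # Build a new list so the input traces are left untouched.
        enriched.append(
            [{activity_key: ART_START}] + list(trace) + [{activity_key: ART_END}]
        )
    return enriched


log = [[{"concept:name": "a"}, {"concept:name": "b"}], [{"concept:name": "c"}]]
enriched_log = add_artificial_start_end(log)
```

A single shared start and end activity gives every trace the same entry and exit point.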
pm4py.objects.log.util.basic_filter module#
- class pm4py.objects.log.util.basic_filter.Parameters(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]#
Bases: Enum
- ATTRIBUTE_KEY = 'pm4py:param:attribute_key'#
- POSITIVE = 'positive'#
- pm4py.objects.log.util.basic_filter.filter_log_events_attr(log, values, parameters=None)[source]#
Filter log by keeping only events with an attribute value that belongs to the provided values list
Parameters#
- log
log
- values
Allowed attributes
- parameters
Parameters of the algorithm, including:
  - activity_key -> Attribute identifying the activity in the log
  - positive -> Indicate if events should be kept/removed
Returns#
- filtered_log
Filtered log
- pm4py.objects.log.util.basic_filter.filter_log_traces_attr(log, values, parameters=None)[source]#
Filter log by keeping only traces that have/do not have events with an attribute value that belongs to the provided values list
Parameters#
- log
Trace log
- values
Allowed attributes
- parameters
Parameters of the algorithm, including:
  - activity_key -> Attribute identifying the activity in the log
  - positive -> Indicate if events should be kept/removed
Returns#
- filtered_log
Filtered log
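The difference between event-level and trace-level filtering, and the effect of the positive flag, can be sketched on plain Python lists. The two helpers below are hypothetical and only mirror the descriptions above; dropping traces that become empty is a choice made for this sketch.

```python
from typing import Any, Dict, Iterable, List

Event = Dict[str, Any]
Trace = List[Event]


def filter_events(
    log: List[Trace],
    values: Iterable[Any],
    attribute_key: str = "concept:name",
    positive: bool = True,
) -> List[Trace]:
    """Keep (positive) or remove (negative) events whose value is in `values`."""
    allowed = set(values)
    filtered = []
    for trace in log:
        new_trace = [e for e in trace if (e.get(attribute_key) in allowed) == positive]
        if new_trace:  # sketch assumption: empty traces are dropped
            filtered.append(new_trace)
    return filtered


def filter_traces(
    log: List[Trace],
    values: Iterable[Any],
    attribute_key: str = "concept:name",
    positive: bool = True,
) -> List[Trace]:
    """Keep traces that have (positive) or lack (negative) a matching event."""
    allowed = set(values)
    return [
        trace
        for trace in log
        if any(e.get(attribute_key) in allowed for e in trace) == positive
    ]


log = [
    [{"concept:name": "a"}, {"concept:name": "b"}],
    [{"concept:name": "b"}, {"concept:name": "c"}],
    [{"concept:name": "c"}],
]
events_kept = filter_events(log, ["a", "b"])
traces_kept = filter_traces(log, ["a"])
```

Event-level filtering shortens traces, whereas trace-level filtering keeps or discards whole traces unchanged.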
pm4py.objects.log.util.dataframe_utils module#
- class pm4py.objects.log.util.dataframe_utils.Parameters(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]#
Bases: Enum
- PARTITION_COLUMN = 'partition_column'#
- CASE_ID_KEY = 'pm4py:param:case_id_key'#
- CASE_PREFIX = 'case:'#
- CASE_ATTRIBUTES = 'case_attributes'#
- MANDATORY_ATTRIBUTES = 'mandatory_attributes'#
- MAX_NO_CASES = 'max_no_cases'#
- MIN_DIFFERENT_OCC_STR_ATTR = 5#
- MAX_DIFFERENT_OCC_STR_ATTR = 50#
- TIMESTAMP_KEY = 'pm4py:param:timestamp_key'#
- ACTIVITY_KEY = 'pm4py:param:activity_key'#
- PARAM_ARTIFICIAL_START_ACTIVITY = 'pm4py:param:art_start_act'#
- PARAM_ARTIFICIAL_END_ACTIVITY = 'pm4py:param:art_end_act'#
- INDEX_KEY = 'index_key'#
- CASE_INDEX_KEY = 'case_index_key'#
- USE_EXTREMES_TIMESTAMP = 'use_extremes_timestamp'#
- ADD_CASE_IDENTIFIER_COLUMN = 'add_case_identifier_column'#
- DETERMINISTIC = 'deterministic'#
- pm4py.objects.log.util.dataframe_utils.insert_partitioning(df, num_partitions, parameters=None)[source]#
Insert the partitioning in the specified dataframe
Parameters#
- df
Dataframe
- num_partitions
Number of partitions
- parameters
Parameters of the algorithm
Returns#
- df
Partitioned dataframe
- pm4py.objects.log.util.dataframe_utils.legacy_parquet_support(df, parameters=None)[source]#
For legacy support: the column names of Parquet files could not contain a “:”, which was therefore replaced by an arbitrary replacer string. This function substitutes the replacer back with “:”.
Parameters#
- df
Dataframe
- parameters
Parameters of the algorithm
- pm4py.objects.log.util.dataframe_utils.table_to_stream(table, parameters=None)[source]#
Converts a Pyarrow table to an event stream
Parameters#
- table
Pyarrow table
- parameters
Possible parameters of the algorithm
- pm4py.objects.log.util.dataframe_utils.table_to_log(table, parameters=None)[source]#
Converts a Pyarrow table to an event log
Parameters#
- table
Pyarrow table
- parameters
Possible parameters of the algorithm
- pm4py.objects.log.util.dataframe_utils.convert_timestamp_columns_in_df(df, timest_format=None, timest_columns=None)[source]#
Convert the timestamp columns of a dataframe
Parameters#
- df
Dataframe
- timest_format
(If provided) Format of the timestamp columns in the CSV file
- timest_columns
Columns of the CSV that shall be converted into timestamp
Returns#
- df
Dataframe with timestamp columns converted
- pm4py.objects.log.util.dataframe_utils.sample_dataframe(df, parameters=None)[source]#
Sample a dataframe on a given number of cases
Parameters#
- df
Dataframe
- parameters
Parameters of the algorithm, including:
  - Parameters.CASE_ID_KEY
  - Parameters.MAX_NO_CASES
Returns#
- sampled_df
Sampled dataframe
- pm4py.objects.log.util.dataframe_utils.automatic_feature_selection_df(df, parameters=None)[source]#
Performs an automatic feature selection on dataframes, keeping the features useful for ML purposes
Parameters#
- df
Dataframe
- parameters
Parameters of the algorithm
Returns#
- featured_df
Dataframe with only the features that have been selected
- pm4py.objects.log.util.dataframe_utils.select_number_column(df: DataFrame, fea_df: DataFrame, col: str, case_id_key='case:concept:name') DataFrame [source]#
Extract a column for the features dataframe for the given numeric attribute
Parameters#
- df
Dataframe
- fea_df
Feature dataframe
- col
Numeric column
- case_id_key
Case ID key
Returns#
- fea_df
Feature dataframe (desired output)
- pm4py.objects.log.util.dataframe_utils.select_string_column(df: DataFrame, fea_df: DataFrame, col: str, case_id_key='case:concept:name') DataFrame [source]#
Extract N columns (for N different attribute values; one-hot encoding) for the features dataframe for the given string attribute
Parameters#
- df
Dataframe
- fea_df
Feature dataframe
- col
String column
- case_id_key
Case ID key
Returns#
- fea_df
Feature dataframe (desired output)
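One-hot encoding of a string attribute at the case level can be sketched in plain Python as follows. This is an illustration of the idea rather than the library's code: the helper name and the `<column>_<value>` naming of the generated features are assumptions.

```python
from typing import Any, Dict, List


def one_hot_by_case(
    events: List[Dict[str, Any]],
    col: str,
    case_id_key: str = "case:concept:name",
) -> Dict[Any, Dict[str, int]]:
    """Hypothetical helper: one 0/1 feature per distinct value of `col`."""
    values = sorted({e[col] for e in events if col in e})
    features: Dict[Any, Dict[str, int]] = {}
    for e in events:
        # Every case starts with all features set to 0.
        row = features.setdefault(e[case_id_key], {f"{col}_{v}": 0 for v in values})
        if col in e:
            # The feature is 1 if the value occurs at least once in the case.
            row[f"{col}_{e[col]}"] = 1
    return features


events = [
    {"case:concept:name": "1", "org:resource": "Ann"},
    {"case:concept:name": "1", "org:resource": "Bob"},
    {"case:concept:name": "2", "org:resource": "Ann"},
]
features = one_hot_by_case(events, "org:resource")
```

N distinct values yield N columns, which is why bounding the number of distinct values of a string attribute matters for feature extraction.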
- pm4py.objects.log.util.dataframe_utils.get_features_df(df: DataFrame, list_columns: List[str], parameters: Dict[Any, Any] | None = None) DataFrame [source]#
Given a dataframe and a list of columns, performs an automatic feature extraction
Parameters#
- df
Dataframe
- list_columns
List of columns to consider in the feature extraction
- parameters
Parameters of the algorithm, including:
  - Parameters.CASE_ID_KEY: the case ID
Returns#
- fea_df
Feature dataframe (desired output)
- pm4py.objects.log.util.dataframe_utils.automatic_feature_extraction_df(df: DataFrame, parameters: Dict[Any, Any] | None = None) DataFrame [source]#
Performs an automatic feature extraction given a dataframe
Parameters#
- df
Dataframe
- parameters
Parameters of the algorithm, including:
  - Parameters.CASE_ID_KEY: the case ID
  - Parameters.MIN_DIFFERENT_OCC_STR_ATTR
  - Parameters.MAX_DIFFERENT_OCC_STR_ATTR
Returns#
- fea_df
Dataframe with the features
- pm4py.objects.log.util.dataframe_utils.insert_artificial_start_end(df0: DataFrame, parameters: Dict[Any, Any] | None = None) DataFrame [source]#
Inserts the artificial start/end activities in a Pandas dataframe
Parameters#
- df0
Dataframe
- parameters
Parameters of the algorithm, including:
  - Parameters.CASE_ID_KEY: the case identifier
  - Parameters.TIMESTAMP_KEY: the timestamp
  - Parameters.ACTIVITY_KEY: the activity
Returns#
- enriched_df
Dataframe with artificial start/end activities
- pm4py.objects.log.util.dataframe_utils.dataframe_to_activity_case_table(df: DataFrame, parameters: Dict[Any, Any] | None = None)[source]#
Transforms a Pandas dataframe into:
  - an “activity” table, containing the events and their attributes
  - a “case” table, containing the cases and their attributes
Parameters#
- df
Dataframe
- parameters
Parameters of the algorithm that should be used, including:
  - Parameters.CASE_ID_KEY => the column to be used as case ID (shall be included both in the activity table and the case table)
  - Parameters.CASE_PREFIX => if a list of attributes at the case level is not provided, then all the columns of the dataframe starting with this prefix are considered
  - Parameters.CASE_ATTRIBUTES => the attributes of the dataframe to be used as case columns
Returns#
- activity_table
Activity table
- case_table
Case table
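The split into an activity table and a case table can be sketched on a list of row dictionaries. The helper below is hypothetical and only covers the prefix-based selection of case attributes. It assumes that case attributes are constant within a case, so the first row of each case is used.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def split_activity_case(
    rows: List[Row],
    case_id_key: str = "case:concept:name",
    case_prefix: str = "case:",
) -> Tuple[List[Row], List[Row]]:
    """Hypothetical helper: split rows into event-level and case-level tables."""
    activity_table: List[Row] = []
    cases: Dict[Any, Row] = {}
    for row in rows:
        # Event-level columns, plus the case ID that links the two tables.
        activity_table.append(
            {k: v for k, v in row.items() if k == case_id_key or not k.startswith(case_prefix)}
        )
        # Case-level columns, stored once per case (first row wins).
        cases.setdefault(
            row[case_id_key], {k: v for k, v in row.items() if k.startswith(case_prefix)}
        )
    return activity_table, list(cases.values())


rows = [
    {"case:concept:name": "1", "case:customer": "X", "concept:name": "a"},
    {"case:concept:name": "1", "case:customer": "X", "concept:name": "b"},
    {"case:concept:name": "2", "case:customer": "Y", "concept:name": "a"},
]
activity_table, case_table = split_activity_case(rows)
```

The case ID appears in both tables, so they can be joined back together when needed.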
pm4py.objects.log.util.filtering_utils module#
pm4py.objects.log.util.get_class_representation module#
- pm4py.objects.log.util.get_class_representation.get_class_representation_by_str_ev_attr_value_presence(log, str_attr_name, str_attr_value)[source]#
Get the representation for the target part of the decision tree learning if the focus is on the presence of a given value of a (string) event attribute
Parameters#
- log
Trace log
- str_attr_name
Attribute name to consider
- str_attr_value
Attribute value to consider
Returns#
- target
Target part for decision tree learning
- classes
Name of the classes, in order
- pm4py.objects.log.util.get_class_representation.get_class_representation_by_str_ev_attr_value_value(log, str_attr_name)[source]#
Get the representation for the target part of the decision tree learning if the focus is on all (string) values of an event attribute
Parameters#
- log
Trace log
- str_attr_name
Attribute name to consider
Returns#
- target
Target part for decision tree learning
- classes
Name of the classes, in order
- pm4py.objects.log.util.get_class_representation.get_class_representation_by_trace_duration(log, target_trace_duration, timestamp_key='time:timestamp', parameters=None)[source]#
Get class representation by splitting traces according to trace duration
Parameters#
- log
Trace log
- target_trace_duration
Target trace duration
- timestamp_key
Timestamp key
Returns#
- target
Target part for decision tree learning
- classes
Name of the classes, in order
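A target vector based on trace duration can be sketched as below. The helper name and the class labels are illustrative assumptions, and the sketch assumes that the events of each trace are ordered by timestamp, so the duration is the difference between the last and the first event.

```python
from datetime import datetime
from typing import Any, Dict, List, Tuple

Trace = List[Dict[str, Any]]


def classes_by_duration(
    log: List[Trace],
    target_trace_duration: float,
    timestamp_key: str = "time:timestamp",
) -> Tuple[List[int], List[str]]:
    """Hypothetical helper: class 0 if duration <= threshold (seconds), else 1."""
    classes = ["LESSEQUAL", "GREATER"]  # illustrative labels
    target = []
    for trace in log:
        duration = 0.0
        if trace:
            delta = trace[-1][timestamp_key] - trace[0][timestamp_key]
            duration = delta.total_seconds()
        target.append(1 if duration > target_trace_duration else 0)
    return target, classes


log = [
    [
        {"time:timestamp": datetime(2024, 1, 1, 10, 0)},
        {"time:timestamp": datetime(2024, 1, 1, 10, 30)},
    ],
    [
        {"time:timestamp": datetime(2024, 1, 1, 10, 0)},
        {"time:timestamp": datetime(2024, 1, 1, 12, 0)},
    ],
]
target, classes = classes_by_duration(log, target_trace_duration=3600)
```

Each entry of `target` is an index into `classes`, which is the usual shape expected by decision tree learners.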
pm4py.objects.log.util.get_log_encoded module#
- pm4py.objects.log.util.get_log_encoded.get_log_encoded(event_log, trace_attributes=[], event_attributes=[], concatenate=False)[source]#
Get event log encoded into matrix.
Parameters#
- event_log
Trace log
- trace_attributes
Attributes of the trace to be encoded
- event_attributes
Attributes of the events to be encoded
- concatenate
Boolean indicating whether to generate all sub-sequences of events in a trace
Returns#
- dataset
A numpy matrix with the event log
- columns
The names of the columns in the dataset
pm4py.objects.log.util.get_prefixes module#
- pm4py.objects.log.util.get_prefixes.get_prefixes_from_log(log: EventLog, length: int) EventLog [source]#
Gets the prefixes of a log of a given length
Parameters#
- log
Event log
- length
Length
Returns#
- prefix_log
Log containing the prefixes:
  - if a trace has lower or identical length, it is included as-is
  - if a trace has greater length, it is cut
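The cutting rule described above amounts to list slicing. A minimal sketch on plain lists, with a hypothetical helper name:

```python
from typing import Any, Dict, List

Trace = List[Dict[str, Any]]


def prefixes_of_length(log: List[Trace], length: int) -> List[Trace]:
    """Hypothetical helper: cut every trace to at most `length` events."""
    # Slicing leaves shorter traces as they are and truncates longer ones.
    return [trace[:length] for trace in log]


log = [
    [{"concept:name": "a"}],
    [{"concept:name": "a"}, {"concept:name": "b"}, {"concept:name": "c"}],
]
prefix_log = prefixes_of_length(log, 2)
```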
- pm4py.objects.log.util.get_prefixes.get_log_with_log_prefixes(log, parameters=None)[source]#
Gets an extended log that contains, in order, all the prefixes for a case of the original log
Parameters#
- log
Original log
- parameters
Possible parameters of the algorithm
Returns#
- all_prefixes_log
Log with all the prefixes
- change_indexes
Indexes of the extended log where there was a change between cases
- pm4py.objects.log.util.get_prefixes.get_log_traces_to_activities(log, activities, parameters=None)[source]#
Get the sublogs leading to each one of the specified activities
Parameters#
- log
Trace log object
- activities
List of activities in the log
- parameters
Possible parameters of the algorithm, including:
  - PARAMETER_CONSTANT_ACTIVITY_KEY -> activity
  - PARAMETER_CONSTANT_TIMESTAMP_KEY -> timestamp
Returns#
- list_logs
List of event logs leading to the first occurrence of each activity
- considered_activities
All activities that have actually been inserted in the list of logs (for some of them, the resulting log may be empty)
- pm4py.objects.log.util.get_prefixes.get_log_traces_until_activity(log, activity, parameters=None)[source]#
Gets a reduced version of the log containing, for each trace, only the events before a specified activity
Parameters#
- log
Trace log
- activity
Activity to reach
- parameters
Possible parameters of the algorithm, including:
  - PARAMETER_CONSTANT_ACTIVITY_KEY -> activity
  - PARAMETER_CONSTANT_TIMESTAMP_KEY -> timestamp
Returns#
- new_log
New log
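The reduction "only the events before a specified activity" can be sketched as follows. The helper is hypothetical; cutting at the first occurrence of the activity and skipping traces that never reach it are assumptions of this sketch, not documented behavior.

```python
from typing import Any, Dict, List

Trace = List[Dict[str, Any]]


def traces_until_activity(
    log: List[Trace], activity: str, activity_key: str = "concept:name"
) -> List[Trace]:
    """Hypothetical helper: keep the events that precede `activity`."""
    new_log = []
    for trace in log:
        labels = [e[activity_key] for e in trace]
        if activity in labels:
            # Cut at the first occurrence; the activity itself is excluded.
            new_log.append(trace[: labels.index(activity)])
    return new_log


log = [
    [{"concept:name": "a"}, {"concept:name": "b"}, {"concept:name": "c"}],
    [{"concept:name": "a"}, {"concept:name": "c"}],
    [{"concept:name": "b"}],
]
new_log = traces_until_activity(log, "c")
```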
pm4py.objects.log.util.index_attribute module#
pm4py.objects.log.util.insert_classifier module#
- pm4py.objects.log.util.insert_classifier.search_act_class_attr(log, force_activity_transition_insertion=False)[source]#
Search among classifiers expressed in the log one that is good for the process model extraction
Parameters#
- log
Trace log
- force_activity_transition_insertion
Optionally force the activity+transition classifier insertion
Returns#
- log
Trace log (plus possibly one additional event attribute as the classifier)
- pm4py.objects.log.util.insert_classifier.insert_activity_classifier_attribute(log, classifier, force_activity_transition_insertion=False)[source]#
Insert the specified classifier as additional event attribute in the log
Parameters#
- log
Trace log
- classifier
Event classifier
- force_activity_transition_insertion
Optionally force the activity+transition classifier insertion
Returns#
- log
Trace log (plus possibly one additional event attribute as the classifier)
- classifier_attr_key
Attribute name of the attribute that contains the classifier value
- pm4py.objects.log.util.insert_classifier.insert_trace_classifier_attribute(log, classifier)[source]#
Insert the specified classifier as additional trace attribute in the log
Parameters#
- log
Trace log
- classifier
Event classifier
Returns#
- log
Trace log (plus possibly one additional trace attribute as the classifier)
- classifier_attr_key
Attribute name of the attribute that contains the classifier value
pm4py.objects.log.util.interval_lifecycle module#
- class pm4py.objects.log.util.interval_lifecycle.Parameters(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]#
Bases: Enum
- TIMESTAMP_KEY = 'pm4py:param:timestamp_key'#
- START_TIMESTAMP_KEY = 'pm4py:param:start_timestamp_key'#
- TRANSITION_KEY = 'pm4py:param:transition_key'#
- ACTIVITY_KEY = 'pm4py:param:activity_key'#
- LIFECYCLE_INSTANCE_KEY = 'pm4py:param:lifecycle:instance:key'#
- BUSINESS_HOURS = 'business_hours'#
- BUSINESS_HOUR_SLOTS = 'business_hour_slots'#
- WORKCALENDAR = 'workcalendar'#
- pm4py.objects.log.util.interval_lifecycle.to_interval(log, parameters=None)[source]#
Converts a log to interval format (i.e. an event has two timestamps) from lifecycle format (an event has a single timestamp and a lifecycle transition)
Parameters#
- log
Log (expressed in the lifecycle format)
- parameters
Possible parameters of the method (activity, timestamp key, start timestamp key, transition …)
Returns#
- log
Interval event log
- pm4py.objects.log.util.interval_lifecycle.to_lifecycle(log, parameters=None)[source]#
Converts a log from interval format (i.e. an event has two timestamps) to lifecycle format (an event has a single timestamp and a lifecycle transition)
Parameters#
- log
Log (expressed in the interval format)
- parameters
Possible parameters of the method (activity, timestamp key, start timestamp key, transition …)
Returns#
- log
Lifecycle event log
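The relation between the two formats can be illustrated on a single trace. The sketch below is not pm4py's algorithm: the helper name, the first-in-first-out matching of start and complete events per activity, and the `start_timestamp` key are assumptions, and plain numbers stand in for timestamps.

```python
from collections import defaultdict, deque
from typing import Any, Dict, List

Event = Dict[str, Any]


def lifecycle_to_interval(
    trace: List[Event],
    activity_key: str = "concept:name",
    timestamp_key: str = "time:timestamp",
    transition_key: str = "lifecycle:transition",
    start_timestamp_key: str = "start_timestamp",
) -> List[Event]:
    """Hypothetical helper: pair start/complete events into interval events."""
    open_starts = defaultdict(deque)  # activity -> pending start timestamps
    intervals = []
    for event in trace:
        act = event[activity_key]
        transition = str(event.get(transition_key, "complete")).lower()
        if transition == "start":
            open_starts[act].append(event[timestamp_key])
        elif transition == "complete":
            # A completion without a pending start is treated as instantaneous.
            pending = open_starts[act]
            start = pending.popleft() if pending else event[timestamp_key]
            intervals.append(
                {
                    activity_key: act,
                    start_timestamp_key: start,
                    timestamp_key: event[timestamp_key],
                }
            )
    return intervals


trace = [
    {"concept:name": "a", "lifecycle:transition": "start", "time:timestamp": 1},
    {"concept:name": "b", "lifecycle:transition": "start", "time:timestamp": 2},
    {"concept:name": "a", "lifecycle:transition": "complete", "time:timestamp": 3},
    {"concept:name": "b", "lifecycle:transition": "complete", "time:timestamp": 5},
    {"concept:name": "c", "lifecycle:transition": "complete", "time:timestamp": 6},
]
intervals = lifecycle_to_interval(trace)
```

Going in the opposite direction would emit one start event and one complete event per interval event.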
pm4py.objects.log.util.log module#
- pm4py.objects.log.util.log.get_event_labels(event_log, key)[source]#
Fetches the labels present in a log, given a key to use within the events.
Parameters#
- event_log
log to use
- key
key to use for event identification, for example “concept:name”
Returns#
- return:
a list of labels
- pm4py.objects.log.util.log.get_event_labels_counted(event_log, key)[source]#
Fetches the labels (and their frequency) present in a log, given a key to use within the events.
Parameters#
- event_log
log to use
- key
key to use for event identification, for example “concept:name”
Returns#
- return:
the labels with their frequency
- pm4py.objects.log.util.log.get_trace_variants(event_log, key='concept:name')[source]#
Returns a pair (variants, dict[index -> traces]), where the index of a variant maps to all traces that exhibit that variant under the given key.
Parameters#
- event_log
log
- key (str)
key to use to identify the label of an event
Returns#
- return:
- pm4py.objects.log.util.log.project_traces(event_log, keys='concept:name')[source]#
Projects traces on a (set of) event attribute key(s). If the key provided is a string, each trace is converted into a list of strings. If the key provided is a collection, each trace is converted into a list of (smaller) dicts of key-value pairs.
- Parameters:
event_log
keys (str)
- Returns:
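Projection and variant grouping, as described for `project_traces` and `get_trace_variants`, can be sketched on plain lists of dictionaries. Both helpers are illustrative re-implementations written for this example, not the library code.

```python
from typing import Any, Dict, Iterable, List, Tuple, Union

Event = Dict[str, Any]
Trace = List[Event]


def project(log: List[Trace], keys: Union[str, Iterable[str]] = "concept:name") -> list:
    """Project each event on one key (string) or several keys (collection)."""
    if isinstance(keys, str):
        return [[event[keys] for event in trace] for trace in log]
    keys = list(keys)
    return [[{k: event[k] for k in keys} for event in trace] for trace in log]


def variants_of(
    log: List[Trace], key: str = "concept:name"
) -> Tuple[List[List[Any]], Dict[int, List[Trace]]]:
    """Group traces by their sequence of labels."""
    variants: List[List[Any]] = []
    traces_by_variant: Dict[int, List[Trace]] = {}
    for trace in log:
        variant = [event[key] for event in trace]
        if variant not in variants:
            variants.append(variant)
        traces_by_variant.setdefault(variants.index(variant), []).append(trace)
    return variants, traces_by_variant


log = [
    [{"concept:name": "a", "org:resource": "Ann"}, {"concept:name": "b", "org:resource": "Bob"}],
    [{"concept:name": "a", "org:resource": "Bob"}, {"concept:name": "b", "org:resource": "Bob"}],
    [{"concept:name": "a", "org:resource": "Ann"}],
]
projected = project(log)
variants, traces_by_variant = variants_of(log)
```

Two traces belong to the same variant when their projections on the chosen key are identical, regardless of the other attributes.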
pm4py.objects.log.util.log_regex module#
- pm4py.objects.log.util.log_regex.get_encoded_trace(trace, mapping, parameters=None)[source]#
Gets the encoding of the provided trace
Parameters#
- trace
Trace of the event log
- mapping
Mapping (activity to symbol)
Returns#
- trace_str
Trace string
- pm4py.objects.log.util.log_regex.get_encoded_log(log, mapping, parameters=None)[source]#
Gets the encoding of the provided log
Parameters#
- log
Event log
- mapping
Mapping (activity to symbol)
Returns#
- list_str
List of encoded strings
- pm4py.objects.log.util.log_regex.form_encoding_dictio_from_log(log, parameters=None)[source]#
Forms the encoding dictionary from the current log
Parameters#
- log
Event log
- parameters
Parameters of the algorithm
Returns#
- encoding_dictio
Encoding dictionary
- pm4py.objects.log.util.log_regex.form_encoding_dictio_from_two_logs(log1: EventLog, log2: EventLog, parameters=None) Dict[str, str] [source]#
Forms the encoding dictionary from a couple of logs
Parameters#
- log1
First log
- log2
Second log
- parameters
Parameters of the algorithm
Returns#
- encoding_dictio
Encoding dictionary
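Encoding traces as strings makes it possible to query them with regular expressions. The sketch below shows the idea with hypothetical helpers. It assumes at most 26 distinct activities so that each one maps to a lowercase letter, which is a simplification made for this example.

```python
import re
from typing import Any, Dict, List

Trace = List[Dict[str, Any]]


def form_encoding(log: List[Trace], activity_key: str = "concept:name") -> Dict[str, str]:
    """Hypothetical helper: map every activity to a single lowercase letter."""
    activities = sorted({e[activity_key] for trace in log for e in trace})
    if len(activities) > 26:
        raise ValueError("this sketch supports at most 26 distinct activities")
    return {act: chr(ord("a") + i) for i, act in enumerate(activities)}


def encode_trace(trace: Trace, mapping: Dict[str, str], activity_key: str = "concept:name") -> str:
    """Concatenate the symbols of the activities of a trace."""
    return "".join(mapping[e[activity_key]] for e in trace)


log = [
    [{"concept:name": "register"}, {"concept:name": "check"}, {"concept:name": "pay"}],
    [{"concept:name": "register"}, {"concept:name": "reject"}],
]
mapping = form_encoding(log)
encoded = [encode_trace(trace, mapping) for trace in log]

# Traces in which "register" is eventually followed by "pay".
pattern = re.compile(mapping["register"] + ".*" + mapping["pay"])
matching = [s for s in encoded if pattern.search(s)]
```

Building the dictionary from two logs at once, as `form_encoding_dictio_from_two_logs` does, keeps the symbols comparable across both logs.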
pm4py.objects.log.util.move_attrs_to_trace module#
- class pm4py.objects.log.util.move_attrs_to_trace.Parameters(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]#
Bases: Enum
- ENABLE_DEEPCOPY = 'enable_deepcopy'#
- pm4py.objects.log.util.move_attrs_to_trace.apply(log: EventLog, parameters: Dict[str | Parameters, Any] | None = None) EventLog [source]#
Moves to the trace level the attributes that are constant across all the events of a trace and do not belong to a standard extension
Parameters#
- log
Event log
- parameters
Parameters of the algorithm, including:
- Parameters.ENABLE_DEEPCOPY => enables the deepcopy of the event log
Returns#
- log
Event log in which some attributes may have been moved from the event level to the trace level
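The behaviour described above can be sketched with plain dictionaries. In this sketch a trace is a dict with `attributes` and `events` entries, and an attribute is treated as belonging to a standard extension when its key contains a colon (as in `concept:name`); both the data layout and that heuristic are assumptions for illustration and may not match how the library decides.

```python
from copy import deepcopy


def move_constant_attrs(log, enable_deepcopy=True):
    """Promote event attributes that are constant within a trace to the trace."""
    if enable_deepcopy:
        log = deepcopy(log)  # leave the caller's log untouched
    for trace in log:
        events = trace["events"]
        if not events:
            continue
        first = events[0]
        for key in list(first):  # copy the keys: events are modified below
            if ":" in key:
                continue  # assumed to belong to a standard extension
            value = first[key]
            if all(key in event and event[key] == value for event in events):
                trace["attributes"][key] = value
                for event in events:
                    del event[key]
    return log


log = [{
    "attributes": {},
    "events": [
        {"concept:name": "A", "customer": "X", "cost": 1},
        {"concept:name": "B", "customer": "X", "cost": 2},
    ],
}]
result = move_constant_attrs(log)
print(result[0]["attributes"])
```

Here `customer` is constant and is moved to the trace, while `cost` varies between events and stays where it is.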
pm4py.objects.log.util.pandas_log_wrapper module#
- class pm4py.objects.log.util.pandas_log_wrapper.Parameters(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]#
Bases: Enum
- CASE_ID_KEY = 'pm4py:param:case_id_key'#
- CASE_ATTRIBUTE_PREFIX = 'case:'#
pm4py.objects.log.util.pandas_numpy_variants module#
- class pm4py.objects.log.util.pandas_numpy_variants.Parameters(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]#
Bases: Enum
- CASE_ID_KEY = 'pm4py:param:case_id_key'#
- ACTIVITY_KEY = 'pm4py:param:activity_key'#
- TIMESTAMP_KEY = 'pm4py:param:timestamp_key'#
- INDEX_KEY = 'index_key'#
- pm4py.objects.log.util.pandas_numpy_variants.apply(dataframe: DataFrame, parameters=None) Tuple[Dict[Collection[str], int], Dict[str, Collection[str]]] [source]#
Efficient method returning the variants from a Pandas dataframe (through Numpy)
Minimum viable example:
    import pandas as pd
    import pm4py
    from pm4py.objects.log.util import pandas_numpy_variants

    dataframe = pd.read_csv('tests/input_data/receipt.csv')
    dataframe = pm4py.format_dataframe(dataframe)
    variants_dict, case_variant = pandas_numpy_variants.apply(dataframe)
Parameters#
- dataframe
Dataframe
- parameters
Parameters of the algorithm, including:
- Parameters.CASE_ID_KEY => the case identifier
- Parameters.ACTIVITY_KEY => the activity
- Parameters.TIMESTAMP_KEY => the timestamp
- Parameters.INDEX_KEY => the index
Returns#
- variants_dict
Dictionary associating to each variant the number of occurrences in the dataframe
- case_variant
Dictionary associating to each case identifier the corresponding variant
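To make the shape of the two returned dictionaries concrete, here is a pure-Python sketch of the same computation on a list of rows instead of a dataframe. The function `compute_variants` and the sample rows are hypothetical; the real function works on Pandas/NumPy structures and is not implemented this way.

```python
from collections import Counter, defaultdict


def compute_variants(rows,
                     case_key="case:concept:name",
                     activity_key="concept:name",
                     timestamp_key="time:timestamp"):
    """Return (variant -> number of cases, case id -> variant)."""
    # Order events by case and then by time; Python's sort is stable,
    # so ties keep their original row order.
    ordered = sorted(rows, key=lambda r: (r[case_key], r[timestamp_key]))
    per_case = defaultdict(list)
    for row in ordered:
        per_case[row[case_key]].append(row[activity_key])
    case_variant = {case: tuple(acts) for case, acts in per_case.items()}
    variants_dict = dict(Counter(case_variant.values()))
    return variants_dict, case_variant


rows = [
    {"case:concept:name": "c1", "concept:name": "A", "time:timestamp": 1},
    {"case:concept:name": "c2", "concept:name": "A", "time:timestamp": 1},
    {"case:concept:name": "c1", "concept:name": "B", "time:timestamp": 2},
    {"case:concept:name": "c2", "concept:name": "B", "time:timestamp": 3},
    {"case:concept:name": "c3", "concept:name": "A", "time:timestamp": 1},
]
variants_dict, case_variant = compute_variants(rows)
print(variants_dict)
print(case_variant)
```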
pm4py.objects.log.util.sampling module#
- pm4py.objects.log.util.sampling.sample_stream(event_log, no_events=100)[source]#
Randomly sample a fixed number of events from the original event log
Parameters#
- event_log
Event log
- no_events
Number of events that the sample should have
Returns#
- newLog
Sampled log
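Sampling a fixed number of events can be sketched with the standard library alone. The helper `sample_events`, its `seed` argument, and the capping of the sample size at the number of available events are assumptions of this sketch rather than documented behaviour of `sample_stream`.

```python
import random


def sample_events(events, no_events=100, seed=None):
    """Pick up to `no_events` events uniformly at random, without replacement."""
    rng = random.Random(seed)          # a seed makes the sample reproducible
    k = min(no_events, len(events))    # assumed: never ask for more than exist
    return rng.sample(events, k)


events = [{"concept:name": f"activity_{i}"} for i in range(10)]
sample = sample_events(events, no_events=3, seed=0)
print(len(sample))
```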
pm4py.objects.log.util.sorting module#
- pm4py.objects.log.util.sorting.sort_timestamp_trace(trace, timestamp_key='time:timestamp', reverse_sort=False)[source]#
Sort a trace based on timestamp key
Parameters#
- trace
Trace
- timestamp_key
Timestamp key
- reverse_sort
If True, sorts in descending order (the default order is ascending)
Returns#
- trace
Sorted trace
- pm4py.objects.log.util.sorting.sort_timestamp_stream(event_log, timestamp_key='time:timestamp', reverse_sort=False)[source]#
Sort an event log based on timestamp key
Parameters#
- event_log
Event log
- timestamp_key
Timestamp key
- reverse_sort
If True, sorts in descending order (the default order is ascending)
Returns#
- event_log
Sorted event log
- pm4py.objects.log.util.sorting.sort_timestamp_log(event_log, timestamp_key='time:timestamp', reverse_sort=False)[source]#
Sort a log based on timestamp key
Parameters#
- event_log
Log
- timestamp_key
Timestamp key
- reverse_sort
If True, sorts in descending order (the default order is ascending)
Returns#
- log
Sorted log
- pm4py.objects.log.util.sorting.sort_timestamp(log, timestamp_key='time:timestamp', reverse_sort=False)[source]#
Sort a log based on timestamp key
Parameters#
- log
Trace/Event log
- timestamp_key
Timestamp key
- reverse_sort
If True, sorts in descending order (the default order is ascending)
Returns#
- log
Sorted Trace/Event log
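The timestamp-based sorting functions above can be approximated as follows, with traces represented as lists of event dicts. The helpers `sort_trace` and `sort_log` are hypothetical, and the log-level rule used here (sort the events inside each trace, then order the traces by the timestamp of their first event) is an assumption of this sketch.

```python
from datetime import datetime

TIMESTAMP_KEY = "time:timestamp"


def sort_trace(trace, timestamp_key=TIMESTAMP_KEY, reverse_sort=False):
    """Sort the events of one trace by timestamp (ascending by default)."""
    return sorted(trace, key=lambda event: event[timestamp_key], reverse=reverse_sort)


def sort_log(log, timestamp_key=TIMESTAMP_KEY, reverse_sort=False):
    """Sort events inside each non-empty trace, then order traces by first event."""
    traces = [sort_trace(trace, timestamp_key) for trace in log if trace]
    return sorted(traces, key=lambda trace: trace[0][timestamp_key], reverse=reverse_sort)


late = [
    {"concept:name": "D", "time:timestamp": datetime(2024, 1, 5)},
    {"concept:name": "C", "time:timestamp": datetime(2024, 1, 4)},
]
early = [
    {"concept:name": "A", "time:timestamp": datetime(2024, 1, 1)},
    {"concept:name": "B", "time:timestamp": datetime(2024, 1, 2)},
]
sorted_log = sort_log([late, early])
print([[event["concept:name"] for event in trace] for trace in sorted_log])
```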
- pm4py.objects.log.util.sorting.sort_lambda_log(event_log, sort_function, reverse=False)[source]#
Sort a log based on a lambda expression
Parameters#
- event_log
Log
- sort_function
Sort function
- reverse
If True, sorts in reverse order
Returns#
- new_log
Sorted log
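A sort driven by a lambda expression can be sketched like this, assuming that `sort_function` plays the role of a key function, as the `key` argument of Python's built-in `sorted` does. The helper `sort_log_by` and the list-based log are illustrative stand-ins only.

```python
def sort_log_by(log, sort_function, reverse=False):
    """Return a new list of traces ordered by the value of `sort_function`."""
    return sorted(log, key=sort_function, reverse=reverse)


log = [["a", "b", "c"], ["a"], ["a", "b"]]

# Shortest traces first, then longest first.
by_length = sort_log_by(log, lambda trace: len(trace))
by_length_desc = sort_log_by(log, lambda trace: len(trace), reverse=True)
print(by_length)
print(by_length_desc)
```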
pm4py.objects.log.util.split_train_test module#
- pm4py.objects.log.util.split_train_test.split(log: EventLog, train_percentage: float = 0.8) Tuple[EventLog, EventLog] [source]#
Split an event log into a training log and a test log (for machine learning purposes)
Parameters#
- log
Event log
- train_percentage
Fraction of traces to be included in the training log (from 0.0 to 1.0)
Returns#
- training_log
Training event log
- test_log
Test event log
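A train/test split over traces can be sketched as below. The helper `split_log`, its `seed` argument, the shuffling of trace indices, and the rounding used to find the cut point are all assumptions of this sketch; the library may choose traces and round differently.

```python
import random


def split_log(log, train_percentage=0.8, seed=None):
    """Randomly assign traces to a training list and a test list."""
    if not 0.0 <= train_percentage <= 1.0:
        raise ValueError("train_percentage must be between 0.0 and 1.0")
    indices = list(range(len(log)))
    random.Random(seed).shuffle(indices)
    cut = round(len(indices) * train_percentage)  # assumed rounding rule
    # Sorting the chosen indices keeps traces in their original relative order.
    training_log = [log[i] for i in sorted(indices[:cut])]
    test_log = [log[i] for i in sorted(indices[cut:])]
    return training_log, test_log


log = [[f"case_{i}"] for i in range(10)]
training_log, test_log = split_log(log, train_percentage=0.8, seed=42)
print(len(training_log), len(test_log))
```

Splitting by whole traces rather than by single events keeps each case intact, which matters when a model is evaluated on complete process executions.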
pm4py.objects.log.util.xes module#