pm4py.algo.filtering.pandas.start_activities package#
PM4Py – A Process Mining Library for Python
Copyright (C) 2024 Process Intelligence Solutions UG (haftungsbeschränkt)
This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License along with this program. If not, see this software project’s root or visit <https://www.gnu.org/licenses/>.
Website: https://processintelligence.solutions Contact: info@processintelligence.solutions
Submodules#
pm4py.algo.filtering.pandas.start_activities.start_activities_filter module#
- class pm4py.algo.filtering.pandas.start_activities.start_activities_filter.Parameters(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)#
Bases: Enum
- CASE_ID_KEY = 'pm4py:param:case_id_key'#
- ACTIVITY_KEY = 'pm4py:param:activity_key'#
- DECREASING_FACTOR = 'decreasingFactor'#
- GROUP_DATAFRAME = 'grouped_dataframe'#
- POSITIVE = 'positive'#
- pm4py.algo.filtering.pandas.start_activities.start_activities_filter.apply(df: DataFrame, values: List[str], parameters: Dict[str | Parameters, Any] | None = None) -> DataFrame#
Filter the dataframe on start activities: keep (or exclude) the cases whose first activity is one of the given values.
Parameters#
- df
Dataframe
- values
Values to filter on
- parameters
Possible parameters of the algorithm, including:
Parameters.CASE_ID_KEY -> Case ID column in the dataframe
Parameters.ACTIVITY_KEY -> Column that represents the activity
Parameters.POSITIVE -> Specifies whether the filter keeps the matching traces (positive=True) or excludes them (positive=False)
Returns#
- df
Filtered dataframe
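The type hint on parameters suggests that keys may be either Parameters members or plain strings. The following is a minimal sketch of how such a dictionary could be resolved against defaults. It runs without pm4py: the Parameters class is a local copy of three documented members, resolve is a hypothetical helper (not a pm4py function), and the defaults are borrowed from the filter_df_on_start_activities signature, so whether apply uses the same defaults is an assumption.

```python
from enum import Enum
from typing import Any, Dict, Optional, Union


class Parameters(Enum):
    # Local copy of documented members, so the sketch runs without pm4py.
    CASE_ID_KEY = "pm4py:param:case_id_key"
    ACTIVITY_KEY = "pm4py:param:activity_key"
    POSITIVE = "positive"


def resolve(
    param: Parameters,
    parameters: Optional[Dict[Union[str, Parameters], Any]],
    default: Any,
) -> Any:
    """Hypothetical lookup: accept the enum member or its string value as key."""
    if not parameters:
        return default
    if param in parameters:
        return parameters[param]
    return parameters.get(param.value, default)


# One key given as an enum member, one as a plain string, one left out.
params = {Parameters.POSITIVE: False, "pm4py:param:activity_key": "task"}

positive = resolve(Parameters.POSITIVE, params, True)
activity_key = resolve(Parameters.ACTIVITY_KEY, params, "concept:name")
case_id_key = resolve(Parameters.CASE_ID_KEY, params, "case:concept:name")
```

With a real installation, the same dictionary shape would be passed as the parameters argument of apply.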
- pm4py.algo.filtering.pandas.start_activities.start_activities_filter.filter_df_on_start_activities(df, values, case_id_glue='case:concept:name', activity_key='concept:name', grouped_df=None, positive=True)#
Filter the dataframe on start activities: keep (or exclude) the cases whose first activity is one of the given values.
Parameters#
- df
Dataframe
- values
Values to filter on
- case_id_glue
Case ID column in the dataframe
- activity_key
Column that represents the activity
- grouped_df
Grouped dataframe
- positive
Specifies whether the filter keeps the matching traces (positive=True) or excludes them (positive=False)
Returns#
- df
Filtered dataframe
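The effect of the positive flag can be illustrated with plain pandas. This is a sketch of the documented behaviour, not pm4py's implementation: filter_on_start is a hypothetical name, the sample log is invented, and rows are assumed to be already in trace order within each case.

```python
import pandas as pd

CASE_COL = "case:concept:name"  # default case_id_glue from the signature
ACT_COL = "concept:name"        # default activity_key from the signature


def filter_on_start(df: pd.DataFrame, values, positive: bool = True) -> pd.DataFrame:
    """Hypothetical stand-in: keep or drop whole cases by their first activity."""
    # First row of each case (assumes events are sorted within the case).
    first_events = df.drop_duplicates(subset=CASE_COL, keep="first")
    # Case IDs whose first activity is one of the requested values.
    matching = first_events.loc[first_events[ACT_COL].isin(values), CASE_COL]
    mask = df[CASE_COL].isin(matching)
    return df[mask] if positive else df[~mask]


log = pd.DataFrame(
    {
        CASE_COL: ["c1", "c1", "c2", "c2", "c3"],
        ACT_COL: ["register", "pay", "pay", "register", "register"],
    }
)

kept = filter_on_start(log, ["register"])                     # cases c1 and c3
dropped = filter_on_start(log, ["register"], positive=False)  # case c2 only
```

Note that case c2 contains a register event but does not start with it, so it is excluded when positive=True.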
- pm4py.algo.filtering.pandas.start_activities.start_activities_filter.filter_df_on_start_activities_nocc(df, nocc, sa_count0=None, case_id_glue='case:concept:name', activity_key='concept:name', grouped_df=None)#
Filter the dataframe on the number of occurrences of the start activities: keep the cases whose start activity occurs at least nocc times.
Parameters#
- df
Dataframe
- nocc
Minimum number of occurrences of the start activity
- sa_count0
Optional dictionary that associates each start activity with its count
- case_id_glue
Column that contains the Case ID
- activity_key
Column that contains the activity
- grouped_df
Grouped dataframe
Returns#
- df
Filtered dataframe
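The occurrence-based variant can be sketched the same way with plain pandas. Again this is an illustration rather than the library's code: filter_on_start_nocc is a hypothetical name, the sample log is invented, rows are assumed to be in trace order, and "minimum number of occurrences" is read as count >= nocc.

```python
import pandas as pd

CASE_COL = "case:concept:name"  # default case_id_glue from the signature
ACT_COL = "concept:name"        # default activity_key from the signature


def filter_on_start_nocc(df: pd.DataFrame, nocc: int) -> pd.DataFrame:
    """Hypothetical stand-in: keep cases whose start activity is frequent enough."""
    # First row of each case (assumes events are sorted within the case).
    first_events = df.drop_duplicates(subset=CASE_COL, keep="first")
    # How many cases start with each activity (the role played by sa_count0).
    start_counts = first_events[ACT_COL].value_counts()
    frequent = start_counts.index[start_counts >= nocc]
    cases = first_events.loc[first_events[ACT_COL].isin(frequent), CASE_COL]
    return df[df[CASE_COL].isin(cases)]


log = pd.DataFrame(
    {
        CASE_COL: ["c1", "c1", "c2", "c2", "c3"],
        ACT_COL: ["register", "pay", "pay", "register", "register"],
    }
)

# "register" starts two cases, "pay" starts one.
frequent_only = filter_on_start_nocc(log, nocc=2)
everything = filter_on_start_nocc(log, nocc=1)
```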