pm4py.objects.log.util.pl_lazy_fea_utils module

PM4Py – A Process Mining Library for Python

Copyright (C) 2024 Process Intelligence Solutions UG (haftungsbeschränkt)

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License along with this program. If not, see this software project’s root or visit <https://www.gnu.org/licenses/>.

Website: https://processintelligence.solutions
Contact: info@processintelligence.solutions

pm4py.objects.log.util.pl_lazy_fea_utils.automatic_feature_selection_df(df: polars.LazyFrame, parameters: Dict[Any, Any] | None = None) → polars.LazyFrame

Selects the features of a Polars LazyFrame that are useful for machine learning.

pm4py.objects.log.util.pl_lazy_fea_utils.select_number_column(df: polars.LazyFrame, fea_df: polars.LazyFrame, col: str, case_id_key: str = 'case:concept:name') → polars.LazyFrame

Adds the numeric column col to the feature LazyFrame fea_df.

Notes on column duplication:
  • If fea_df already contains col (for example after repeated calls or duplicate inputs), Polars would create a suffixed col_right column during the join. Any prior version of col is therefore dropped explicitly first, which keeps the output schema stable.

  • We also ensure the internal row-number column does not collide with user data.
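The two safeguards in the notes above can be pictured with a minimal plain-Python sketch that works on lists of column names only. It is not the pm4py implementation: the helper names unused_name and add_feature_column, the temporary name __row_nr__, and the underscore-appending strategy are all assumptions made for illustration.

```python
def unused_name(existing, base="__row_nr__"):
    """Return base, with underscores appended until it is not in existing."""
    # Hypothetical helper: the real internal column name is not documented here.
    name = base
    while name in existing:
        name += "_"
    return name


def add_feature_column(fea_columns, col):
    """Return the feature schema after adding col exactly once."""
    # Drop any prior version of col, and a stale join-suffixed copy, first.
    kept = [c for c in fea_columns if c not in (col, f"{col}_right")]
    return kept + [col]


schema = ["case:concept:name", "amount"]
# Adding the same column repeatedly leaves the schema unchanged.
schema = add_feature_column(schema, "amount")
schema = add_feature_column(schema, "amount")

# A user column that already uses the temporary name forces a different one.
temp = unused_name(["case:concept:name", "__row_nr__"])
```

Dropping before joining, rather than renaming afterwards, means callers never have to handle a suffixed duplicate in the result.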

pm4py.objects.log.util.pl_lazy_fea_utils.select_string_column(df: polars.LazyFrame, fea_df: polars.LazyFrame, col: str, case_id_key: str = 'case:concept:name', count_occurrences: bool = False) → polars.LazyFrame

Adds one-hot encoded columns for a categorical attribute, or count encoded columns if count_occurrences is True.
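The difference between the two encodings can be shown with a small plain-Python sketch on a made-up event log. This only illustrates the semantics, under assumptions: the function encode_string_column and the col_value naming of the output columns are hypothetical and may differ from what pm4py produces.

```python
from collections import Counter, defaultdict

# Toy event log as (case id, attribute value) pairs; the data is made up.
events = [
    ("c1", "register"),
    ("c1", "pay"),
    ("c1", "pay"),
    ("c2", "register"),
]


def encode_string_column(events, col, count_occurrences=False):
    """Return {case_id: {feature_name: int}} for one categorical attribute."""
    # Count how often each value occurs within each case.
    counts = defaultdict(Counter)
    for case_id, value in events:
        counts[case_id][value] += 1

    # One feature per distinct value (naming scheme is an assumption).
    values = sorted({value for _, value in events})
    features = {}
    for case_id, per_case in counts.items():
        features[case_id] = {
            f"{col}_{v}": per_case[v] if count_occurrences else int(per_case[v] > 0)
            for v in values
        }
    return features


one_hot = encode_string_column(events, "concept:name")
counted = encode_string_column(events, "concept:name", count_occurrences=True)
```

In this sketch case c1 contains "pay" twice, so its pay feature is 1 under one-hot encoding and 2 under count encoding, while case c2 gets 0 in both.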

pm4py.objects.log.util.pl_lazy_fea_utils.select_string_columns(df: polars.LazyFrame, fea_df: polars.LazyFrame, columns: List[str], case_id_key: str = 'case:concept:name', count_occurrences: bool = False) → polars.LazyFrame

Adds one-hot or count encoded columns for the provided categorical attributes.

pm4py.objects.log.util.pl_lazy_fea_utils.get_features_df(df: polars.LazyFrame, list_columns: List[str], parameters: Dict[Any, Any] | None = None) → polars.LazyFrame

Performs automatic feature extraction on a Polars LazyFrame.

pm4py.objects.log.util.pl_lazy_fea_utils.automatic_feature_extraction_df(df: polars.LazyFrame, parameters: Dict[Any, Any] | None = None) → polars.LazyFrame

Wrapper that performs automatic feature extraction on a Polars LazyFrame.