pm4py.utils.format_dataframe
- pm4py.utils.format_dataframe(df: DataFrame, case_id: str = 'case:concept:name', activity_key: str = 'concept:name', timestamp_key: str = 'time:timestamp', start_timestamp_key: str = 'start_timestamp', timest_format: str | None = None) -> DataFrame
Formats the dataframe appropriately for process mining purposes.
- Parameters:
df (DataFrame) – Dataframe.
case_id (str) – Case identifier column.
activity_key (str) – Activity column.
timestamp_key (str) – Timestamp column.
start_timestamp_key (str) – Start timestamp column.
timest_format – Timestamp format provided to Pandas.
- Returns:
A formatted pandas DataFrame.
- Return type:
pd.DataFrame
import pandas as pd
import pm4py

dataframe = pd.read_csv('event_log.csv')
dataframe = pm4py.format_dataframe(
    dataframe,
    case_id='case:concept:name',
    activity_key='concept:name',
    timestamp_key='time:timestamp',
    start_timestamp_key='start_timestamp',
    timest_format='%Y-%m-%d %H:%M:%S'
)
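The timest_format parameter is described only as a timestamp format provided to Pandas. As a rough illustration of what such a format string means, the sketch below parses a string column with pandas.to_datetime directly. It does not call pm4py; that pm4py forwards timest_format to pandas.to_datetime is an assumption, and the DataFrame contents and column names (case_id, activity, timestamp) are hypothetical.

```python
import pandas as pd

# Hypothetical in-memory event log; column names and values are illustrative only.
raw = pd.DataFrame({
    "case_id": ["c1", "c1", "c2"],
    "activity": ["register", "approve", "register"],
    "timestamp": [
        "2024-01-05 09:30:00",
        "2024-01-05 10:15:00",
        "2024-01-06 08:00:00",
    ],
})

# Assumption: a timest_format such as '%Y-%m-%d %H:%M:%S' is interpreted the
# same way as the `format` argument of pandas.to_datetime (strftime-style codes).
parsed = pd.to_datetime(raw["timestamp"], format="%Y-%m-%d %H:%M:%S")

# The string column is now a datetime column that can be sorted and compared.
print(parsed.dtype)
```

If the strings in the timestamp column do not match the format, pandas.to_datetime raises an error by default, so it is worth checking a few raw values before choosing the format string.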