pm4py.utils.format_dataframe
- pm4py.utils.format_dataframe(df: DataFrame, case_id: str = 'case:concept:name', activity_key: str = 'concept:name', timestamp_key: str = 'time:timestamp', start_timestamp_key: str = 'start_timestamp', timest_format: str | None = None) → DataFrame
Gives the dataframe the appropriate format for process mining purposes.
- Parameters:
df (DataFrame) – Dataframe
case_id (str) – Case identifier column
activity_key (str) – Activity column
timestamp_key (str) – Timestamp column
start_timestamp_key (str) – Start timestamp column
timest_format – Timestamp format that is provided to Pandas
- Return type:
pd.DataFrame
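import pandas as pd
import pm4py

# Load the raw event log from a CSV file
dataframe = pd.read_csv('event_log.csv')

# Tell pm4py which columns hold the case identifier, activity and timestamps.
# The keyword is case_id, as in the signature above.
dataframe = pm4py.format_dataframe(dataframe, case_id='case:concept:name', activity_key='concept:name', timestamp_key='time:timestamp', start_timestamp_key='start_timestamp', timest_format='%Y-%m-%d %H:%M:%S')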
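To make the role of the column arguments and of timest_format more concrete, the following pandas-only sketch mimics the kind of preparation this function is meant to perform. It is not pm4py's implementation: the helper name sketch_format_dataframe, the sample columns (order_id, step, when), and the decision to sort events by case and time are all assumptions made for illustration. The standard column names are the defaults listed in the signature above.

```python
import pandas as pd


def sketch_format_dataframe(df, case_id, activity_key, timestamp_key, timest_format=None):
    """Hypothetical helper, NOT pm4py's code: illustrates the idea only."""
    # Rename the user's columns to the default names from the signature above.
    out = df.rename(columns={
        case_id: "case:concept:name",
        activity_key: "concept:name",
        timestamp_key: "time:timestamp",
    })
    # The format string is handed to pandas for timestamp parsing.
    out["time:timestamp"] = pd.to_datetime(out["time:timestamp"], format=timest_format)
    # Assumption: ordering events by case and time is convenient downstream.
    out = out.sort_values(["case:concept:name", "time:timestamp"]).reset_index(drop=True)
    return out


# Hypothetical raw log with non-standard column names and unsorted rows.
raw = pd.DataFrame({
    "order_id": ["A", "B", "A"],
    "step": ["approve", "create", "create"],
    "when": ["2024-01-01 10:30:00", "2024-01-02 08:15:00", "2024-01-01 09:00:00"],
})

formatted = sketch_format_dataframe(
    raw, case_id="order_id", activity_key="step", timestamp_key="when",
    timest_format="%Y-%m-%d %H:%M:%S",
)
print(formatted)
```

With a real log, pm4py.format_dataframe should be called instead of this helper; the sketch only shows why each column argument and the timestamp format are needed.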