I have a Pandas dataframe similar to the following:
valid             Measurement  Room
2014-02-03 12:48  0.50         23
2014-02-03 12:53  0.43         23
2014-02-03 12:59  0.21         23
2014-02-03 13:06  0.23         23
2014-02-03 13:13  0.10         23
...
I am trying to read in these dates. They are currently strings, but I want to parse them as datetimes; however, that isn’t working out so well.
def hourlyDataSet(fp):
    df = pd.read_csv(fp)  # data frame
    df[['day', 'time']] = df['valid'].str.split().apply(pd.Series)
    mat = '%Y-%m-%d %H:%M'
    df['datetime'] = pd.to_datetime(df['time'], format=mat)
    newdf = df.groupby(pd.Grouper(key="datetime", freq="H")).sum()
    return df
Using the above function, I am receiving this error:
ValueError: time data '12:48' does not match format '%Y-%m-%d %H:%M' (match)
How can I fix this?
Answer
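- If you split 'valid', then mat no longer matches the format of 'time': after the split, 'time' holds only the time part (for example '12:48'), while mat describes a full date and time.
- There is no need to split the column at all. The function should be as follows: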
import pandas as pd


def hourlyDataSet(fp):
    # read the data file
    df = pd.read_csv(fp)

    # convert the valid column to a datetime format
    df['valid'] = pd.to_datetime(df['valid'], format='%Y-%m-%d %H:%M')

    # use .Grouper on the datetime column
    newdf = df.groupby(pd.Grouper(key="valid", freq="H")).sum()

    return newdf
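To check the behaviour without a file on disk, here is a minimal sketch that reproduces both the failure and the fix on the five sample rows from the question. It makes a few assumptions that are not in the original post: the data is comma-separated, it is read from an in-memory string (`SAMPLE_CSV` is a hypothetical name), only the `Measurement` column is summed, and the hourly frequency is written as `"60min"` because the one-letter hourly alias is spelled differently in older and newer pandas releases.

```python
import io

import pandas as pd

# Hypothetical sample mirroring the rows in the question.
# Assumption: the real file is comma-separated.
SAMPLE_CSV = """valid,Measurement,Room
2014-02-03 12:48,0.50,23
2014-02-03 12:53,0.43,23
2014-02-03 12:59,0.21,23
2014-02-03 13:06,0.23,23
2014-02-03 13:13,0.10,23
"""

FMT = "%Y-%m-%d %H:%M"

df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Reproduce the problem: the split-off time part ('12:48') cannot match
# a format that also expects a date.
try:
    pd.to_datetime(df["valid"].str.split().str[1], format=FMT)
    split_parse_failed = False
except ValueError:
    split_parse_failed = True

# The fix: parse the unsplit column, which holds both date and time.
df["valid"] = pd.to_datetime(df["valid"], format=FMT)

# Group into hourly bins and sum the measurements.
# "60min" is used here in place of the hourly alias (an assumption made
# for portability across pandas versions).
hourly = df.groupby(pd.Grouper(key="valid", freq="60min"))["Measurement"].sum()
print(hourly)
```

Summing `Room` as well, as the original function does, adds up room numbers, which is probably not meaningful; selecting the columns to aggregate avoids that.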