pandas datetime index unique difference

The following works for getting the unique differences between consecutive values of a datetime index.

# Data
import pandas
d = pandas.DataFrame({"a": [x for x in range(5)]})
d.index = pandas.date_range("2021-01-01 00:00:00", "2021-01-01 01:00:00", freq="15min")

# Get difference
delta = d.index.to_series().diff().astype("timedelta64[m]").unique()
delta
# array([nan, 15.])

But I am not clear where the nan comes from. I am only interested in the 15 minutes. Is delta[1] a reliable way to get it or am I missing something?

Answer

The first row doesn’t have anything to diff against, so it’s NaT.

>>> d.index.to_series().diff()
2021-01-01 00:00:00        NaT
2021-01-01 00:15:00   00:15:00
2021-01-01 00:30:00   00:15:00
2021-01-01 00:45:00   00:15:00
2021-01-01 01:00:00   00:15:00
Freq: 15T, dtype: timedelta64[ns]

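From the pandas.Series.unique documentation: "Uniques are returned in order of appearance." Since that NaT is guaranteed to be the first element of the returned array, it is okay to use delta[1] as you suggest, assuming you have at least 2 rows and no NaT in the index itself. Note that delta[1] is only the first distinct gap; if the index is not evenly spaced, there will be further values after it.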
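
To illustrate the order-of-appearance point, here is a small sketch with an irregular index. The timestamps are made up for illustration and are not from the question. The NaT from the first row still comes first, followed by each distinct gap in the order it first occurs.

```python
import pandas as pd

# Hypothetical irregular index, made up for illustration.
idx = pd.DatetimeIndex(
    ["2021-01-01 00:00", "2021-01-01 00:30", "2021-01-01 00:45", "2021-01-01 01:15"]
)

# Consecutive gaps are NaT, 30 min, 15 min, 30 min.
uniques = idx.to_series().diff().unique()

# The first unique value is the NaT produced by the first row.
first_is_nat = pd.isna(uniques[0])

# The remaining values are the distinct gaps, in order of first appearance.
gaps = [pd.Timedelta(u) for u in uniques[1:]]
```

Here gaps holds the 30-minute gap before the 15-minute one, because that is the order in which they first appear in the diff.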

More generally, if you don’t want that first value in a diff, you can slice it off:

>>> d.index.to_series().diff()[1:]
2021-01-01 00:15:00   00:15:00
2021-01-01 00:30:00   00:15:00
2021-01-01 00:45:00   00:15:00
2021-01-01 01:00:00   00:15:00
Freq: 15T, dtype: timedelta64[ns]
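
As an alternative to slicing, one possible sketch is to drop the NaT with dropna() before calling unique(), so the result no longer depends on position. The conversion to minutes below divides by pd.Timedelta(minutes=1) instead of using astype("timedelta64[m]"), since I believe newer pandas versions no longer accept that minute-resolution cast; treat that as an assumption to check against your version.

```python
import pandas as pd

# Same data as in the question.
d = pd.DataFrame({"a": range(5)})
d.index = pd.date_range("2021-01-01 00:00:00", "2021-01-01 01:00:00", freq="15min")

# Diff consecutive timestamps and drop the leading NaT.
diffs = d.index.to_series().diff().dropna()

# Express each gap in minutes as a float, then collect the distinct values.
minutes = (diffs / pd.Timedelta(minutes=1)).unique()
```

For the evenly spaced index above, minutes should hold a single value, 15.0. If it holds more than one value, the index is not evenly spaced.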