Merge rows based on date range

I have a pandas df with hundreds of columns and thousands of rows. Here are the 3 columns that interest us:

ID startDate endDate
123 2020-01-01 2020-01-25
123 2020-01-26 2020-02-08
123 2020-02-09 2020-03-12

For rows with the same ID, I want to merge the rows when their date ranges follow each other, and keep all other columns intact.

In this example, the output would be a single row because the dates follow one another:

ID startDate endDate
123 2020-01-01 2020-03-12

Do you have any idea how to do this with pandas?

Answer

If the datetimes are not sorted, or you are not sure that they are, aggregate with min and max:

df.groupby('ID', as_index=False).agg({'startDate': 'min', 'endDate': 'max'})
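To make the aggregation above easy to try, here is a minimal, self-contained sketch built on the sample data from the question. It assumes the date columns may arrive as strings, so it converts them with pd.to_datetime first; the variable name merged is only for illustration.

```python
import pandas as pd

# Sample data from the question (dates assumed to arrive as strings)
df = pd.DataFrame({
    "ID": [123, 123, 123],
    "startDate": ["2020-01-01", "2020-01-26", "2020-02-09"],
    "endDate": ["2020-01-25", "2020-02-08", "2020-03-12"],
})

# Convert to real datetimes so min/max compare dates rather than text
df["startDate"] = pd.to_datetime(df["startDate"])
df["endDate"] = pd.to_datetime(df["endDate"])

# One row per ID: earliest start and latest end
merged = df.groupby("ID", as_index=False).agg({"startDate": "min", "endDate": "max"})
print(merged)
```

Note that this returns only the ID and the two date columns; any other columns are dropped from the result.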

If there are many other columns and you need to aggregate only these 2 columns:

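# Broadcast the earliest start and latest end of each ID to all of its rows
df['startDate'] = df.groupby('ID')['startDate'].transform('min')
df['endDate'] = df.groupby('ID')['endDate'].transform('max')

# Keep only the first row per ID; its other columns are left untouched
df = df.drop_duplicates('ID')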
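The snippets above collapse every row of an ID into one, whether or not the ranges are contiguous. If an ID can have gaps and only ranges that truly follow each other should be merged, one possible approach is to start a new block whenever a startDate is not the day after the previous endDate, then aggregate per block. The sketch below is only an illustration under several assumptions: "follow each other" means exactly one day apart, the date columns are already datetimes, the other columns take the value from the first row of each block, and the names merge_consecutive_ranges, _block and other are hypothetical. Overlapping ranges are not handled.

```python
import pandas as pd

def merge_consecutive_ranges(df):
    """Merge rows of the same ID whose ranges are exactly one day apart."""
    df = df.sort_values(["ID", "startDate"])

    # endDate of the previous row within the same ID (NaT for the first row)
    prev_end = df.groupby("ID")["endDate"].shift()

    # A new block starts when the row does not continue the previous range
    new_block = df["startDate"] != prev_end + pd.Timedelta(days=1)
    df = df.assign(_block=new_block.cumsum())

    # Other columns keep the first value of the block (an assumption)
    agg = {col: "first" for col in df.columns if col not in ("ID", "_block")}
    agg["startDate"] = "min"
    agg["endDate"] = "max"

    out = df.groupby(["ID", "_block"], as_index=False).agg(agg)
    return out.drop(columns="_block")

# Hypothetical data: three contiguous rows, one row after a gap, and a second ID
example = pd.DataFrame({
    "ID": [123, 123, 123, 123, 456],
    "startDate": pd.to_datetime(
        ["2020-01-01", "2020-01-26", "2020-02-09", "2020-04-01", "2020-01-01"]),
    "endDate": pd.to_datetime(
        ["2020-01-25", "2020-02-08", "2020-03-12", "2020-04-10", "2020-01-05"]),
    "other": ["a", "b", "c", "d", "e"],
})

result = merge_consecutive_ranges(example)
print(result)
```

With this data, the three contiguous rows of ID 123 become one row, while the row that starts after the gap and the row of ID 456 stay separate.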