I have a dataset of property prices and they are currently listed by ‘SALE_DATE’. I’d like to be able to count them by year. The dataset looks like this:
    SALE_DATE    COUNTY  SALE_PRICE
0  2010-01-01    Dublin    343000.0
1  2010-01-03     Laois    185000.0
2  2010-01-04    Dublin    438500.0
3  2010-01-04     Meath    400000.0
4  2010-01-04  Kilkenny    160000.0
This is the code I’ve tried:
by_year = property_prices['SALE_DATE'] = pd.to_datetime(property_prices['SALE_DATE'])
print(by_year)
I think I’m close but as a biblical noob it’s quite frustrating!
Thank you for any help you can provide; this site has been awesome so far in finding little tips and tricks to make my life easier.
You are close. As you did, you can use pd.to_datetime to convert your SALE_DATE column to a datetime column. Then groupby the year, using dt.year, which extracts the year from each datetime, and call size() on the result. size() returns the number of rows in each group, and here each group is one year.
property_prices['SALE_DATE'] = pd.to_datetime(property_prices['SALE_DATE'])
property_prices.groupby(property_prices.SALE_DATE.dt.year).size()
SALE_DATE
2010    5
dtype: int64
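If you want to try this end to end, here is a minimal self-contained sketch. It rebuilds only the five sample rows shown in the question, so the real dataset will give different counts. The variable names sales_per_year and sales_per_year_alt are just illustrative, and the value_counts variant is an alternative I'm suggesting, not something the groupby approach requires.

```python
import pandas as pd

# The five sample rows from the question (the real dataset is larger)
property_prices = pd.DataFrame({
    "SALE_DATE": ["2010-01-01", "2010-01-03", "2010-01-04",
                  "2010-01-04", "2010-01-04"],
    "COUNTY": ["Dublin", "Laois", "Dublin", "Meath", "Kilkenny"],
    "SALE_PRICE": [343000.0, 185000.0, 438500.0, 400000.0, 160000.0],
})

# Convert the date strings to real datetimes so the .dt accessor works
property_prices["SALE_DATE"] = pd.to_datetime(property_prices["SALE_DATE"])

# Group rows by the year of the sale date and count the rows in each group
sales_per_year = property_prices.groupby(
    property_prices["SALE_DATE"].dt.year
).size()
print(sales_per_year)

# Alternative: count occurrences of each year directly, sorted by year
sales_per_year_alt = (
    property_prices["SALE_DATE"].dt.year.value_counts().sort_index()
)
print(sales_per_year_alt)
```

Both give one row per year with the number of sales in it. groupby is the more flexible of the two if you later want something like the mean SALE_PRICE per year instead of a count.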