How to create a DataFrame directly from groupby

My code below works fine, but I think there is a more efficient way to write it and I can't figure it out. I thought reset_index() would do the job, but it doesn't in this case. All suggestions are welcome. Thanks in advance!

I have a large DataFrame of hospital data from 2017, 2018 and 2019. The column spoedelectief can have two values: S for emergency patients (emergency is "spoed" in Dutch) and E for non-emergency patients.

From this DataFrame I want to build a new DataFrame to visualize the number of emergency and non-emergency patients per year, but I'm stuck. This code:

test = df_new.groupby(df_new['operatiejaar'])['spoedelectief'].value_counts().sort_index()

returns a pandas Series:

operatiejaar  spoedelectief
2017          E                5459
              S                1054
2018          E                6191
              S                1029
2019          E                6160
              S                1159

For visualisation in Seaborn I tried to make this a DataFrame with reset_index() but that gives an error:

ValueError: cannot insert spoedelectief, already exists

Making test a DataFrame works:

test = pd.DataFrame(test)

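The result is a one-column DataFrame that still has operatiejaar and spoedelectief as its index (the screenshot from the original post is not reproduced here).

But test.columns gives only this: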

Index(['spoedelectief'], dtype='object')

Below is the code I used to create the DataFrame I wanted:

test = df_new.groupby(df_new['operatiejaar'])['spoedelectief'].value_counts().sort_index()

jaar_list = []
spel_list = []
totaal = []
for index, value in test.items():
    jaar_list.append(index[0])
    spel_list.append(index[1])
    totaal.append(value)

spel_jaar = pd.DataFrame(
    {'jaar': jaar_list,
     'spoedelectief': spel_list,
     'totaal': totaal
    })

Which gives the desired DataFrame, with the columns jaar, spoedelectief and totaal.

How can I do this more simply, directly from the original DataFrame? Thanks!

Answer

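You need to rename the Series before calling Series.reset_index. The Series is named spoedelectief, the same as one of its index levels, so reset_index() cannot insert both as columns: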

test = (df_new.groupby(df_new['operatiejaar'])['spoedelectief']
              .value_counts()
              .rename('count')
              .sort_index()
              .reset_index())

Or pass the name parameter to Series.reset_index:

test = (df_new.groupby(df_new['operatiejaar'])['spoedelectief']
              .value_counts()
              .sort_index()
              .reset_index(name='count'))
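
To try this without the hospital data, here is a minimal sketch on a toy frame. The rows are made up purely for illustration, and the column name totaal is borrowed from the question. It also shows groupby on both columns followed by size(), which is a different route (not one of the two snippets above) that should produce the same long-format table.

```python
import pandas as pd

# Toy stand-in for df_new; these rows are invented for illustration only.
df_new = pd.DataFrame({
    "operatiejaar": [2017, 2017, 2017, 2018, 2018, 2019],
    "spoedelectief": ["E", "E", "S", "E", "S", "S"],
})

# Count per year and category, then turn the MultiIndex into columns.
# name="totaal" gives the counts column a name that does not clash
# with the "spoedelectief" index level.
counts = (
    df_new.groupby("operatiejaar")["spoedelectief"]
    .value_counts()
    .sort_index()
    .reset_index(name="totaal")
)

# Alternative route: group on both columns and take the group sizes.
alt = (
    df_new.groupby(["operatiejaar", "spoedelectief"])
    .size()
    .reset_index(name="totaal")
)

print(counts)
```

A frame in this shape can be passed to Seaborn directly, for example sns.barplot(data=counts, x="operatiejaar", y="totaal", hue="spoedelectief"). As far as I know, pandas 2.0 and later already name the value_counts result count, so a plain reset_index() may work there without renaming. Passing name explicitly should behave the same on either version.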