How to group and aggregate different dataframes in pandas

df1

A B
a 1
a 1
a 4
b 1 
b 3

df2

A B
a 1
a 2
c 3 
c 5

df1.groupby("A").size()

a 3
b 2

df2.groupby("A").size()

a 2
c 2

I’d like to get the following size aggregation:

   df1 df2
a  3    2
b  2    0 
c  0    2  

Is there any way to achieve this? I’d like to know which aggregation method to use.

If anyone has a suggestion, please let me know. Thanks.

Answer

  1. You can use pd.concat on the two grouped size results and pass axis=1 (this is essentially an outer join with pd.merge, but the syntax is a bit more concise).
  2. Groups that appear in only one dataframe come out as NaN, so clean up with .fillna(0), rename the columns as desired with .rename(), and use .astype(int) to convert the columns back to integers:

df3 = (pd.concat([df1.groupby("A").size(), df2.groupby("A").size()], axis=1)
      .fillna(0).rename({0 : 'df1', 1 : 'df2'}, axis=1).astype(int))
df3
Out[1]: 
   df1  df2
a    3    2
b    2    0
c    0    2
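
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The two frames are reconstructed from the tables in the question, and the names sizes1, sizes2, via_concat and via_merge are only illustrative. It uses the keys= argument of pd.concat to label the columns directly instead of calling .rename(), and it also spells out the pd.merge outer join mentioned in step 1, so the two routes can be compared.

```python
import pandas as pd

# Sample data reconstructed from the question.
df1 = pd.DataFrame({"A": ["a", "a", "a", "b", "b"], "B": [1, 1, 4, 1, 3]})
df2 = pd.DataFrame({"A": ["a", "a", "c", "c"], "B": [1, 2, 3, 5]})

# Group sizes for each frame (each result is a Series indexed by A).
sizes1 = df1.groupby("A").size()
sizes2 = df2.groupby("A").size()

# Route 1: concat side by side; keys= names the columns, so no rename is needed.
via_concat = (
    pd.concat([sizes1, sizes2], axis=1, keys=["df1", "df2"])
    .fillna(0)      # groups missing from one frame become 0
    .astype(int)    # NaN forced the columns to float; convert back
    .sort_index()
)

# Route 2: the same result as an explicit outer join on the group labels.
via_merge = (
    pd.merge(
        sizes1.to_frame("df1"),
        sizes2.to_frame("df2"),
        left_index=True,
        right_index=True,
        how="outer",
    )
    .fillna(0)
    .astype(int)
    .sort_index()
)

print(via_concat)
```

Both routes should give the same table as above. The concat version is shorter, while the merge version makes the join type explicit, which may be easier to read if you are used to SQL-style joins.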