Pandas Weighted Sum / Sumproduct

Creating example data:

import pandas as pd

d = {
    "id" : [1, 1, 1, 1, 1],
    "value" : [12313, 413251, 41241, 12929, 5325],
    "R1" : [0.01, 0.15, 0.155, 0.234, 0.0003],
    "R2" : [0, 0.1, 0.009, 0.02, 0.001]
}

df1 = pd.DataFrame(d)

df2 = df1.copy()
df2['id'] = 2

df = df1.append(df2)

which is:

id  value   R1  R2
0   1   12313   0.0100  0.000
1   1   413251  0.1500  0.100
2   1   41241   0.1550  0.009
3   1   12929   0.2340  0.020
4   1   5325    0.0003  0.001
0   2   12313   0.0100  0.000
1   2   413251  0.1500  0.100
2   2   41241   0.1550  0.009
3   2   12929   0.2340  0.020
4   2   5325    0.0003  0.001

Want to calculate:

For id 1:

sumproduct[value, R1] / sum(value)

For id 2:

sumproduct(value, R1) / sum(value)

(and so on for N ids)

In this case since numbers are the same we get:

id   frac1
1    0.147
2    0.147

but should work for numbers being different too. And in the real problem I have more than just 2 ids.

And I want to do the same for R2 and more columns, but always from “value”, so;

For id 1:

sumproduct[value, R2] / sum(value)

For id 2:

sumproduct(value, R2) / sum(value)

I’ve been trying some kind of df.groupby("id").prod() but can’t make it flexible enough / and to work

Answer

One easy way is group().apply():

cols = ['R1','R2']
(df.groupby('id')
   .apply(lambda x: x[cols].mul(x['value'],axis=0)
                          .sum()/x['value'].sum()
         )
)

Output:

          R1        R2
id                    
1   0.147467  0.086505
2   0.147467  0.086505

Leave a Reply

Your email address will not be published. Required fields are marked *