I have a DataFrame (1) and I am trying to create a new DataFrame (2) that consists of the ratios between the columns of the first one.
I tried the following:
df_new = pd.concat([df[df.columns.difference([col])].div(df[col], axis=0).add_suffix('/R') for col in df.columns], axis=1)
Output is:
    B/R   C/R   D/R   A/R   C/R   D/R   A/R   B/R   D/R   A/R   B/R   C/R
0  0.46  1.16  0.78  2.16  2.50  1.69  0.86  0.40  0.68  1.28  0.59  1.48
1  1.05  1.25  1.64  0.95  1.19  1.55  0.80  0.84  1.30  0.61  0.64  0.77
2  1.56  2.78  2.78  0.64  1.79  1.79  0.36  0.56  1.00  0.36  0.56  1.00
3  0.54  2.23  0.35  1.86  4.14  0.64  0.45  0.24  0.16  2.89  1.56  6.44
However, I am facing two issues. First, I get both A/B and B/A; the second is not needed and it doubles the number of columns. Is there a way to get only A/B and leave out B/A?
Second, naming the columns with add_suffix does not convey which column is divided by which. Is there a way to create column names like A/B for column A divided by column B?
Answer
Use itertools.combinations and divide the columns in a dict comprehension:
import pandas as pd
from itertools import combinations

df = pd.DataFrame({
    'A': [5, 3, 6, 9, 2, 4],
    'B': [4, 5, 4, 5, 5, 4],
    'C': [7, 8, 9, 4, 2, 3],
    'D': [1, 3, 5, 7, 1, 8],
})

# One entry per unordered column pair (a, b); the key is the new column name
L = {f'{a}/{b}': df[a].div(df[b]) for a, b in combinations(df.columns, 2)}
# The dict keys become the column names of the result
df = pd.concat(L, axis=1)
print(df)

    A/B       A/C       A/D       B/C       B/D       C/D
0  1.25  0.714286  5.000000  0.571429  4.000000  7.000000
1  0.60  0.375000  1.000000  0.625000  1.666667  2.666667
2  1.50  0.666667  1.200000  0.444444  0.800000  1.800000
3  1.80  2.250000  1.285714  1.250000  0.714286  0.571429
4  0.40  1.000000  2.000000  2.500000  5.000000  2.000000
5  1.00  1.333333  0.500000  1.333333  0.500000  0.375000
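If this is needed in more than one place, the same idea could be wrapped in a small helper. The sketch below is only an illustration: the function name pairwise_ratios and its sep parameter are hypothetical, not part of pandas, and it assumes the frame has at least two numeric columns.

```python
from itertools import combinations

import pandas as pd


def pairwise_ratios(frame: pd.DataFrame, sep: str = "/") -> pd.DataFrame:
    """Hypothetical helper: one ratio column per unordered pair of columns.

    Each new column is named '<numerator><sep><denominator>'.
    Assumes at least two numeric columns.
    """
    ratios = {
        f"{num}{sep}{den}": frame[num].div(frame[den])
        # combinations yields (A, B) but never (B, A), in column order
        for num, den in combinations(frame.columns, 2)
    }
    # The dict keys become the column names of the concatenated frame
    return pd.concat(ratios, axis=1)


df = pd.DataFrame({
    'A': [5, 3, 6, 9, 2, 4],
    'B': [4, 5, 4, 5, 5, 4],
    'C': [7, 8, 9, 4, 2, 3],
    'D': [1, 3, 5, 7, 1, 8],
})

ratios = pairwise_ratios(df)
print(ratios.columns.tolist())
# ['A/B', 'A/C', 'A/D', 'B/C', 'B/D', 'C/D']
```

For n columns this gives n(n-1)/2 ratio columns, so 6 for the four columns here, instead of the 12 produced by the attempt in the question. It also leaves the original df untouched rather than overwriting it.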