Operating on inner level multi-index columns

Suppose I have a dataframe of multi-index columns,

       TSLA                                MSFT                   
Year   revenues other_revenues expenses    revenues other_revenues   expenses
2019        851             10      110         200             13        213
2018        725             11      111         150             14        214

How can I add the inner columns to obtain

       TSLA                                    MSFT                   
Year   revenues other_revenues expenses  sum    revenues other_revenues   expenses  sum
2019        851             10      110  971        200             13        213   426
2018        725             11      111  847        150             14        214   378

Secondly, in general what are the common functions I should note when working with multi-index columns? Is there a way of thinking when working with multi-index columns? I’m comfortable thinking in normal (single leveled index) but not used to multi-index. Thanks!

Answer

First create new DataFrame filled by sums and MultiIndex to df1:

sub = ['revenues', 'other_revenues', 'expenses']


df1 = df.sum(level=0, axis=1)
df1.columns = pd.MultiIndex.from_product([df1.columns, ['sum']])

Then use concat for join together:

df = pd.concat([df, df1], axis=1)

And lasr for custom order is added reindex:

mux = pd.MultiIndex.from_product([df.columns.levels[0], sub + ['sum']])
df = df.reindex(mux, axis=1)
print (df)
         MSFT                                  TSLA                          
     revenues other_revenues expenses  sum revenues other_revenues expenses   
Year                                                                          
2019      200             13      213  426      851             10      110   
2018      150             14      214  378      725             11      111   

           
      sum  
Year       
2019  971  
2018  847  

EDIT: You can seelct with slicers (but I think here is necessary sorted MultiIndex):

idx = pd.IndexSlice
print (df.loc[:, idx[:, ['revenues','other_revenues']]])
         TSLA     MSFT           TSLA           MSFT
     revenues revenues other_revenues other_revenues
2019      851      200             10             13
2018      725      150             11             14

# df.index.name = 'Year'
sub = ['revenues', 'other_revenues', 'expenses']


df1 = df.loc[:, idx[:, ['revenues','other_revenues']]].sum(level=0, axis=1)
df1.columns = pd.MultiIndex.from_product([df1.columns, ['sum']])
df = pd.concat([df, df1], axis=1)

mux = pd.MultiIndex.from_product([df.columns.levels[0], sub + ['sum']])
df = df.reindex(mux, axis=1)
print (df)
         MSFT                                  TSLA                          
     revenues other_revenues expenses  sum revenues other_revenues expenses   
2019      200             13      213  213      851             10      110   
2018      150             14      214  164      725             11      111   

           
      sum  
2019  861  
2018  736  

Leave a Reply

Your email address will not be published. Required fields are marked *