Pandas transform dataframe to get column names based on row condition

I have a dataframe containing cars and the speed in a certain zone.

I want to transform the dataframe so I know which car had a speed bigger than 100 for every zone, so I obtain a row for every car and two columns, differentiating the zones based on the 100 value.

Input:

car id  speed_t1  speed_t2 speed_t3 speed_t4
1. 90 80 120 34
2. 110 130 140 99
3. 40 110 20 110

Expected output:

car id normal_speed_t high_speed_t
1. speed_t1,speed_t2,speed_t4 speed_t3
2. speed_t4 speed_t1,speed_t2,speed_t3
3. speed_t1,speed_t3 speed_t2,speed_t4

Answer

Just use DataFrame.apply on axis=1, then join the columns by , which has the value less than or greater than the given value.

df.assign(normal_speed_t=df.apply(lambda x:','.join(x[x<100].index), axis=1),
          high_speed_t=df.apply(lambda x:','.join(x[x>=100].index), axis=1)
          )

OUTPUT:

        speed_t1  speed_t2  speed_t3  speed_t4              normal_speed_t                high_speed_t
car id                                                                                                 
1.0           90        80       120        34  speed_t1,speed_t2,speed_t4                    speed_t3 
2.0          110       130       140        99                    speed_t4  speed_t1,speed_t2,speed_t3 
3.0           40       110        20       110           speed_t1,speed_t3           speed_t2,speed_t4

Break-Down:

  • assign just lets you assign a new column with given values
  • .apply allows you to apply some function to the dataframes columnwise for axis=0, and row-wise for axis=1
  • x[x<100].index will filter the values that are less than 100, and get the index i.e. column index/column names
  • ','.join(....) it is joining the columns that are coming from above step