I have a list of lists, and I want to create a new column in my dataframe for each inner list. My input looks like this:
datalist = [[abc1, abc2, abc3, abc4, abc5], [kh1, kh2, kh3, kh4], [jpor1, jpor2, jpor3, jpor4, jpor5]]
Each item in datalist is a column title in my existing dataframe. index is also a column in the dataframe; it contains the number that tells me which column to take the value from for the new column.
I want my output to look something like this:
    index  abc1  abc2  abc3  abc4  abc5  abc_result
    4      87    94    34    28    43    28
    2      87    94    34    28    43    94
    5      87    94    34    28    43    43
    4      87    94    34    28    43    28
    1      87    94    34    28    43    87
Because I have 3 lists inside datalist, I want 3 new columns created and added to the dataframe: abc_result, kh_result and jpor_result (all based on the index column). I am really confused, as I feel I need to make a new list for every list in datalist while string-formatting the new column title?
So basically the new column's value depends on the index column's value. If the value is 1, I want to take the value from abc1 for abc_result; if it is 2, take the value from abc2, and so on. Then there is another new column, kh_result, which also needs the value from kh1/kh2/kh3 based on the index column's value.
Thanks for any answer!
You can try this:
    import pandas as pd
    import numpy as np

    datalist = [
        ["abc1", "abc2", "abc3", "abc4", "abc5"],
        ["kh1", "kh2", "kh3", "kh4"],
        ["jpor1", "jpor2", "jpor3", "jpor4", "jpor5"],
    ]

    # All column names in one flat list
    flat_list = [item for sublist in datalist for item in sublist]

    # Group keys ("abc", "kh", "jpor"): each column name minus its last character
    column_groups = {item[:-1] for sublist in datalist for item in sublist}
To get the desired result row by row, you need to create a function:

    def grabber(row, key: str):
        # Build the column name from the group key and the row's index value
        return row[key + str(row['index'])]

This function takes the column group key, appends the row's index value to it, and returns the value stored in the resulting column.
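As a quick check, here is a small sketch of what grabber does for a single row. The row is the first line of the example table in the question; the variable names `row` and `picked` are only for illustration. Note that the bracket lookup `row["index"]` is needed here, because `row.index` would return the Series' own index rather than the column value.

```python
import pandas as pd

def grabber(row, key: str):
    # Build the column name from the group key and the row's index value
    return row[key + str(row["index"])]

# First row of the question's example table: index is 4, so "abc4" is selected
row = pd.Series({"index": 4, "abc1": 87, "abc2": 94, "abc3": 34, "abc4": 28, "abc5": 43})

picked = grabber(row, "abc")
print(picked)  # 28
```

If the index value points at a column that does not exist (for example 5 for the kh group, which only has kh1 to kh4), this lookup raises a KeyError.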
All that remains is to iterate over the column group keys and generate the results. Let's assume that your data is already loaded in df:

    # df = <load the data>
    for key in column_groups:
        df[key + '_result'] = df.apply(lambda x: grabber(x, key), axis=1)
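To see the whole thing run end to end, here is a minimal sketch using only the abc columns and the values from the table in the question. The kh and jpor groups are left out because the question does not show their data, so `column_groups` is hard-coded to `["abc"]` here as an assumption.

```python
import pandas as pd

# Sample data copied from the table in the question
df = pd.DataFrame({
    "index": [4, 2, 5, 4, 1],
    "abc1": [87] * 5,
    "abc2": [94] * 5,
    "abc3": [34] * 5,
    "abc4": [28] * 5,
    "abc5": [43] * 5,
})

def grabber(row, key: str):
    # Build the column name from the group key and the row's index value
    return row[key + str(row["index"])]

# Only the group present in the sample data (assumption for this sketch)
column_groups = ["abc"]

for key in column_groups:
    # axis=1 passes each row to the function as a Series
    df[key + "_result"] = df.apply(lambda row: grabber(row, key), axis=1)

print(df["abc_result"].tolist())  # [28, 94, 43, 28, 87]
```

The resulting abc_result column matches the expected output in the question.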
However, this code only works if each column name ends in a single-digit number, because the group key is obtained by dropping the last character. Otherwise you need to use a longest-match algorithm for each sublist, which you can find here.
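One possible way to derive the keys without relying on the last character is to take the longest common prefix of each sublist. The sketch below uses `os.path.commonprefix` from the standard library, which compares strings character by character. The helper name `group_key` is made up for this example, and stripping trailing digits afterwards assumes that the group name itself does not end in a digit.

```python
import os

datalist = [
    ["abc1", "abc2", "abc3", "abc4", "abc5"],
    ["kh1", "kh2", "kh3", "kh4"],
    ["jpor1", "jpor2", "jpor3", "jpor4", "jpor5"],
]

def group_key(columns):
    # Longest prefix shared by every column name in the sublist
    prefix = os.path.commonprefix(columns)
    # A sublist such as ["abc10", "abc11"] shares the prefix "abc1",
    # so drop any trailing digits (assumes the group name has none at the end)
    return prefix.rstrip("0123456789")

# A list keeps the groups in the same order as datalist
column_groups = [group_key(sublist) for sublist in datalist]
print(column_groups)  # ['abc', 'kh', 'jpor']

multi_digit = group_key(["abc10", "abc11", "abc12"])
print(multi_digit)  # abc
```

The keys produced this way can be used in the same loop as before, since grabber only needs the key and the row's index value.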