I am in the process of doing some data wrangling and I have come across the following problem.
In my CSV file (that I read through pandas) I have a column called effects. effects look like this:
effects Row 1: Euphoric 52% Creative 42% Happy 40% ... Row 2: Relaxed 95% Happy 90% Euphoric 60% ... Row 3: Happy 64% Uplifted 48% Relaxed 48% ...
and so on and so forth. now there is another column called name that looks like this:
name Row 1: Tommy Row 2: Susan Row 3: Bruce
What I am trying to do is map each key/value in the effects column to a unique column that corresponds to their appropriate name.
So essentially it would look like this:
name euphoric creative happy Row 1: Tommy 52% 42% 40% Row 2: Susan 60% n/a 90% Row 3: Bruce n/a n/a 64%
So you can see each effect with its corresponding percentage becomes a unique column for every name (and if the data is not present it’s n/a or null).
I am assuming the answer may include converting the effects object into a list of dictionaries and perhaps some excel text to column action on top to make the mapping easier but I haven’t been able to do it successfully.
any help here would be immensely appreciated.
Answer
You can use str.extract
like this:
df[['name']].join(df.effects.str.extractall('(w+) (d+%)') .reset_index() .pivot(index='level_0', columns=0, values=1) )
Output:
name Creative Euphoric Happy Relaxed Uplifted Row 1: Tommy 42% 52% 40% NaN NaN Row 2: Susan NaN 60% 90% 95% NaN Row 3: Bruce NaN NaN 64% 48% 48%