I am trying to look for the words that are not in common between two pandas columns that contain lists.
The words are not always in the same order and the length of the list can vary.
As an example:

    column1      column2
    ['a','b']    ['c','a','b']
    ['c','a']    ['a','b','d','c']
the result I want is

    column3
    ['c']
    ['b','d']

Thank you in advance!
Since your goal is to find the words that are not common to the two pandas columns, I assume you also want the uncommon elements when the column1 list is a superset of the column2 list, and vice versa.
Unfortunately, the 2 existing solutions don't handle this case, e.g.

         column1       column2
    0  [c, a, b]        [a, b]
    1     [c, a]  [a, b, d, c]

Both of the other solutions give this result:

         column1       column2 column3
    0  [c, a, b]        [a, b]      []    <== empty list instead of ['c']
    1     [c, a]  [a, b, d, c]  [b, d]

If you want the result above to show ['c'] instead of [] for the first row, you can use the set symmetric_difference() function instead:

df['column3'] = df.apply(lambda x: list(set(x['column1']).symmetric_difference(set(x['column2']))), axis=1)
print(df)

     column1       column2 column3
0  [c, a, b]        [a, b]     [c]
1     [c, a]  [a, b, d, c]  [b, d]
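
For comparison, here is a minimal self-contained sketch that puts a one-directional set difference next to the symmetric difference on the same sample data. The helper name `uncommon_words` and the extra `one_way` column are only illustrative and not part of the question or the other answers. The results are sorted because sets have no guaranteed order, and duplicate words within a list are collapsed.

```python
import pandas as pd

# Sample data from the example above
df = pd.DataFrame({
    "column1": [["c", "a", "b"], ["c", "a"]],
    "column2": [["a", "b"], ["a", "b", "d", "c"]],
})


def uncommon_words(left, right):
    """Return the words found in exactly one of the two lists, sorted."""
    return sorted(set(left) ^ set(right))  # ^ is the symmetric difference operator


rows = list(zip(df["column1"], df["column2"]))

# One-directional difference: only words in column2 that are missing from column1
df["one_way"] = [sorted(set(b) - set(a)) for a, b in rows]

# Symmetric difference: words missing from either side
df["column3"] = [uncommon_words(a, b) for a, b in rows]

print(df)
```

With this data, `one_way` is empty for the first row, while `column3` contains `c`. If the original word order or duplicates matter, a set-based approach like this is not enough and a list comprehension over both lists would be needed instead.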