My DataFrame has two columns:
Name  Status
a     I am Good
b     Goodness!!!
c     Good is what i feel
d     Not Good-at-all
I want to filter rows in which Status has a string ‘Good’ as its exact word, not combined with any other words or characters.
So the output will be:
Name  Status
a     I am Good
c     Good is what i feel
The two other rows also contain the string ‘Good’, but mixed with other characters, so they should not be picked up.
I tried doing:
d = df[df['Status'].str.contains('Good')] # But all rows come up
I believe some regex like (r'\bGood\b', Status) will do that, but I am not able to put it all together. How and where exactly can I fit the regex into a DataFrame filter condition to achieve this? And how can I match a Status that ends with ‘Good’ (exact word search)?
If you’re defining “exact” to mean no other characters (including punctuation, which defines a word boundary \b), you could instead check for a leading and trailing space and/or beginning/end anchors:
>>> df[df['Status'].str.contains(r'(?:\s|^)Good(?:\s|$)')]
  Name               Status
0    a            I am Good
2    c  Good is what i feel
(?:\s|^) is a non-capturing group looking for a whitespace character (\s) or the beginning of the string (^).
Good is the word you’re looking for.
(?:\s|$) is a non-capturing group looking for a whitespace character (\s) or the end of the string ($).
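For reference, here is a minimal, self-contained sketch that rebuilds the DataFrame from the question and compares the \b pattern with the whitespace-anchored one. It also shows one possible way to handle the "ends with ‘Good’" case by dropping the trailing \s alternative and keeping only the $ anchor. The variable names (word_boundary, standalone, ends_with) are illustrative and not from the original post.

```python
import pandas as pd

# Sample data as given in the question.
df = pd.DataFrame({
    "Name": ["a", "b", "c", "d"],
    "Status": ["I am Good", "Goodness!!!", "Good is what i feel", "Not Good-at-all"],
})

# \b treats punctuation as a word boundary, so "Good-at-all" still matches.
word_boundary = df[df["Status"].str.contains(r"\bGood\b")]

# Whitespace or string start/end on both sides: only a standalone "Good" matches.
standalone = df[df["Status"].str.contains(r"(?:\s|^)Good(?:\s|$)")]

# "Ends with" variant (an assumption about what the question wants):
# whitespace or start of string before "Good", end of string right after it.
ends_with = df[df["Status"].str.contains(r"(?:\s|^)Good$")]

print(word_boundary["Name"].tolist())  # ['a', 'c', 'd']
print(standalone["Name"].tolist())     # ['a', 'c']
print(ends_with["Name"].tolist())      # ['a']
```

Because the groups are non-capturing, str.contains should not warn about match groups here.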