pandas.Series.str.contains does not match “[a-zA-Z]” on some rows

I have been trying to remove the rows that contain alphabetic characters.

This is what I tried:

df = df[~df['tag_name'].str.contains("[a-zA-Z]")]

This removed some rows, but others remained. The letters in the remaining rows look different from the ones I type in.

Could something be wrong with the encoding? And does anyone know how I can remove these rows?


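You could try unicodedata.normalize. If the remaining letters are compatibility characters (for example fullwidth Latin letters), NFKD normalization maps them to their ASCII equivalents, so the regex can match them: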

import unicodedata
df = df[~df['tag_name'].apply(lambda x: unicodedata.normalize('NFKD', x)).str.contains("[a-zA-Z]")]
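
To check whether this is what is happening, here is a minimal sketch. It assumes the odd-looking letters are fullwidth Latin letters (U+FF21 to U+FF5A), which the question does not confirm, and the sample DataFrame is made up for illustration. It uses pandas' built-in Series.str.normalize, which does the same job as the apply call above.

```python
import unicodedata

import pandas as pd

# Hypothetical sample data: the second tag is written with fullwidth
# letters, which look like "abc" but are different code points.
df = pd.DataFrame({"tag_name": ["abc", "\uff41\uff42\uff43", "123", "日本語"]})

# Inspect what a suspicious character really is.
first_char_name = unicodedata.name(df["tag_name"][1][0])
print(first_char_name)

# The plain ASCII class does not match the fullwidth row.
ascii_mask = df["tag_name"].str.contains("[a-zA-Z]")

# Normalize first so compatibility characters become ASCII, then match.
norm_mask = df["tag_name"].str.normalize("NFKD").str.contains("[a-zA-Z]")

# Keep only the rows without any Latin letters.
cleaned = df[~norm_mask]
print(cleaned)
```

If unicodedata.name reports something other than a fullwidth or other compatibility form (for example Cyrillic or Greek look-alikes), normalization will not help, and a broader pattern or an explicit character list would be needed instead.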