I have a dataframe of URLs and I want to remove the ‘www.’ and ‘.com’ from each one. The ‘.com’ might also be ‘.org’, ‘.net’, etc. I was thinking something like the code below might work, but I need some help getting a working script.
    for i in x:  # x = single column dataframe of URLs
        if i.endswith('.com'):
            x = i[:-4]
        if i.startswith('www.'):
            x = i[4:]
    x
You might be able to use str.replace with an appropriate regular expression:
    # Dots are escaped so they match literally; regex=True is required in pandas 2.0+
    df["url"] = df["url"].str.replace(r'^www\.|\.(?:com|org|net)$', '', regex=True)
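If the list of suffixes is open-ended (the "etc." in the question), one option is to match any trailing dot followed by letters instead of naming each suffix. The following is only a sketch using the standard `re` module on plain strings. The helper name `strip_www_and_tld`, the sample URLs, and the "two or more letters" rule are my assumptions, not something from the question.

```python
import re

# Assumed generalization: a leading "www." or a trailing dot followed by
# two or more letters (".com", ".org", ".io", ...), case-insensitive.
PATTERN = re.compile(r'^www\.|\.[a-z]{2,}$', flags=re.IGNORECASE)

def strip_www_and_tld(url: str) -> str:
    """Remove a leading 'www.' and a single trailing suffix such as '.com'."""
    return PATTERN.sub('', url)

# Made-up sample data for illustration.
urls = ["www.example.com", "example.org", "www.sub.example.net", "WWW.Example.IO"]
cleaned = [strip_www_and_tld(u) for u in urls]
print(cleaned)
```

On a dataframe column this could be applied with something like `df["url"].map(strip_www_and_tld)`. Note that it strips only the last label, so a two-part suffix such as "example.co.uk" would come out as "example.co".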