Issue with removing duplicates in pandas dataframe

I have this dataframe:

  Ubicacion       lat       lon
0         a  19.28034 -99.17121
1         b  19.28333 -99.17535
2         c  19.28028 -99.16887
3         a  19.28034 -99.17121
4         b  19.28333 -99.17535
5         c  19.28028 -99.16887
6         b  19.28333 -99.17535
7         d  19.29259 -99.17757
8         d  19.29259 -99.17757
9         d  19.29259 -99.17757

And I want to remove all duplicate rows, so I use:

ubicaciones_finales = ubicaciones_finales.drop_duplicates(keep="first")

And I get this:

  Ubicacion       lat       lon
0         a  19.28034 -99.17121
1         b  19.28333 -99.17535
2         c  19.28028 -99.16887
7         d  19.29259 -99.17757

Everything seems fine, except that the row index goes 0, 1, 2 and then 7. So when I run:

for k, row in ubicaciones_finales.iterrows():
    print(k)

I get:
0
1
2
7

How do I solve this? By the way, I already checked the pandas documentation, and its example shows the same behavior:

df.drop_duplicates()
    brand style  rating
0  Yum Yum   cup     4.0
2  Indomie   cup     3.5
3  Indomie  pack    15.0
4  Indomie  pack     5.0

It's the same there: the index goes from 0 to 2 without 1. Thank you for your time.

Answer

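IIUC, drop_duplicates keeps the original index labels of the rows that survive, which is why you see 0, 1, 2, 7. To renumber them, chain reset_index(drop=True) or simply pass ignore_index=True (available since pandas 1.0):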

df = df.drop_duplicates(keep='first').reset_index(drop=True)

# or 

df = df.drop_duplicates(keep='first', ignore_index=True)

Output:

  Ubicacion       lat       lon
0         a  19.28034 -99.17121
1         b  19.28333 -99.17535
2         c  19.28028 -99.16887
3         d  19.29259 -99.17757
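
For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the frame from the question and compares both options. The variable names (ubicaciones, kept, renumbered, renumbered_alt) are only illustrative and are not from the original post; the ignore_index variant assumes pandas 1.0 or newer.

```python
import pandas as pd

# Rebuild the frame from the question (values copied from the post).
ubicaciones = pd.DataFrame({
    "Ubicacion": list("abcabcbddd"),
    "lat": [19.28034, 19.28333, 19.28028, 19.28034, 19.28333,
            19.28028, 19.28333, 19.29259, 19.29259, 19.29259],
    "lon": [-99.17121, -99.17535, -99.16887, -99.17121, -99.17535,
            -99.16887, -99.17535, -99.17757, -99.17757, -99.17757],
})

# Default behaviour: the surviving rows keep their original index labels.
kept = ubicaciones.drop_duplicates(keep="first")
print(list(kept.index))            # [0, 1, 2, 7]

# Option 1: renumber afterwards; drop=True discards the old labels
# instead of adding them back as a column.
renumbered = kept.reset_index(drop=True)

# Option 2 (assumes pandas >= 1.0): renumber in the same call.
renumbered_alt = ubicaciones.drop_duplicates(keep="first", ignore_index=True)

print(list(renumbered.index))      # [0, 1, 2, 3]
print(list(renumbered_alt.index))  # [0, 1, 2, 3]
```

Without drop=True, reset_index would keep the old labels (0, 1, 2, 7) as an extra column named index, which is usually not what you want here.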
