Removing values in two columns of a Pandas dataframe if the row above has the same values

With this sample pandas df:

ColA   ColB   ColC
Apple  Fruit  Food
Apple  Fruit  Pie
Apple  Arrow  Story

I am attempting to iterate through the dataframe and, if the values in ColA and ColB are the same in the current row as in the previous row, delete the current row's values for those two columns only.

The expected result would be:

ColA   ColB   ColC
Apple  Fruit  Food
              Pie
Apple  Arrow  Story

I attempted various loops with iloc, grabbing the current row's values for those two columns, storing them in a variable, and then checking whether the subsequent row is identical. However, on my 5-row test data, I kept getting an error that the list index was out of range. I'm stumped at this point.

Answer

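Use duplicated to build a mask of rows whose (ColA, ColB) pair has already appeared, then use where to replace those two columns with an empty string on the flagged rows: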

df[["ColA", "ColB"]] = df[["ColA", "ColB"]].where(~df.duplicated(["ColA", "ColB"]), "")

>>> df
    ColA   ColB   ColC
0  Apple  Fruit   Food
1                  Pie
2  Apple  Arrow  Story

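Note that duplicated flags every repeat of a (ColA, ColB) pair, even when the matching rows are not adjacent. If your data is not sorted by ColA and ColB and you only want to blank duplicates in consecutive rows, compare each row with the row above it using shift (axis must be passed by keyword in pandas 2.0 and later):

df[["ColA", "ColB"]] = df[["ColA", "ColB"]].where(df[["ColA", "ColB"]].ne(df[["ColA", "ColB"]].shift()).any(axis=1), "")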
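
To see how the two approaches differ, here is a small sketch on unsorted data. The fourth row (Apple, Fruit, Tree) is not in the question; it is added here only to illustrate a repeat that is not adjacent to its first occurrence. The variable names cols, dup_based, changed and consecutive are likewise just for illustration.

```python
import pandas as pd

# Sample data from the question plus one hypothetical extra row
# that repeats (Apple, Fruit) after a different row in between.
df = pd.DataFrame({
    "ColA": ["Apple", "Apple", "Apple", "Apple"],
    "ColB": ["Fruit", "Fruit", "Arrow", "Fruit"],
    "ColC": ["Food", "Pie", "Story", "Tree"],
})
cols = ["ColA", "ColB"]

# Approach 1: blank every row whose (ColA, ColB) pair appeared anywhere earlier.
dup_based = df.copy()
dup_based[cols] = dup_based[cols].where(~dup_based.duplicated(cols), "")

# Approach 2: blank a row only if it matches the row directly above it.
# A row is kept when at least one of the two columns differs from the previous row.
changed = df[cols].ne(df[cols].shift()).any(axis=1)
consecutive = df.copy()
consecutive[cols] = consecutive[cols].where(changed, "")

print(dup_based)
print(consecutive)
```

With this data, the first approach blanks ColA and ColB in rows 1 and 3, while the second blanks them only in row 1, because row 3 differs from the row directly above it. ColC is untouched in both cases.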