Replace values in a slice of columns in a pandas dataframe with a value based on a condition

I have a large Pandas dataframe, and want to replace some values in a subset of the columns based on a condition.

Specifically, I want to replace the values that are greater than one with 1 in every column to the right of the 9th column.

Because the dataframe is so large and growing in both the number of rows and columns over time, I cannot manually specify the names of the columns to change values in. Rather, I just need to specify that column 10 and greater should be inspected for values > 1.

After looking at many different Stack Overflow posts and Pandas documentation, I tried:

df.iloc[df[:,10: ] > 1] = 1

However, this gives me the error “unhashable type: ‘slice’”.

I then tried:

df[df.iloc[:, 10:] > 1] = 1

and

df[df.loc[:, df.columns[10:]] > 1] = 1

as per two suggestions in the comments, but both of those give me the error “Cannot do inplace boolean setting on mixed-types with a non np.nan value”.

Does anyone know why I’m getting these errors and/or what I should change about my code to avoid them?

Thank you!

Answer

1. DataFrame.where

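We can use iloc to select the columns by position, then use where to keep the values for which the condition x.le(1) is True and replace all others with 1. Note that iloc is zero-based: the slice 10: used in the question starts at the 11th column, so use 9: if the 10th column should be included.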

df.iloc[:, 10:] = df.iloc[:, 10:].where(lambda x: x.le(1), 1)
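To see this on something small, here is a minimal sketch. The frame, its column names, and the `start` variable are made up for illustration; `start = 2` stands in for the 10 above. Because both sides of the assignment are restricted to the numeric slice, the non-numeric columns on the left are never part of the comparison, which is likely what triggered the mixed-types error in the question.

```python
import pandas as pd

# Hypothetical data: two non-numeric leading columns, then numeric ones.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "group": ["x", "y", "z"],
    "c1": [0, 2, 1],
    "c2": [5, 0, 1],
    "c3": [1, 1, 3],
})

start = 2  # position of the first column to cap (10 in the answer above)

# Keep values that are <= 1, replace everything else with 1.
capped = df.iloc[:, start:].where(lambda x: x.le(1), 1)

# Write the capped slice back; columns before `start` are untouched.
df.iloc[:, start:] = capped

print(df)
```

Since the slice is positional, columns added to the right of the frame later are picked up without naming them.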

2. DataFrame.clip

Alternatively, we can use clip with an upper limit of 1, which sets every value greater than 1 in the slice to 1.

df.iloc[:, 10:] = df.iloc[:, 10:].clip(upper=1)
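One difference between the two approaches shows up with missing values, so here is a small sketch comparing them. The data and the `start` variable are again invented for illustration. A NaN fails the `x.le(1)` test, so where replaces it with 1, while clip leaves it as NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in the numeric part.
df = pd.DataFrame({
    "label": ["a", "b", "c"],
    "v1": [0.5, 2.0, np.nan],
    "v2": [3.0, 1.0, 0.0],
})

start = 1  # position of the first column to cap (10 in the answer above)

# clip: values above 1 become 1, NaN stays NaN.
clipped = df.iloc[:, start:].clip(upper=1)

# where: anything that is not <= 1 becomes 1, and that includes NaN.
where_capped = df.iloc[:, start:].where(lambda x: x.le(1), 1)

# Write the clipped result back into the original frame.
df.iloc[:, start:] = clipped

print(clipped)
print(where_capped)
```

If the columns can contain NaN and those should stay missing, clip is probably the safer choice of the two.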