I have a large Pandas dataframe and want to replace some values in a subset of the columns based on a condition.
Specifically, I want to replace the values that are greater than one with 1 in every column to the right of the 9th column.
Because the dataframe is so large and growing in both the number of rows and columns over time, I cannot manually specify the names of the columns to change values in. Rather, I just need to specify that column 10 and greater should be inspected for values > 1.
After looking at many different Stack Overflow posts and Pandas documentation, I tried:
df.iloc[df[:,10: ] > 1] = 1
However, this gives me the error “unhashable type: ‘slice’”.
I then tried:
df[df.iloc[:, 10:] > 1] = 1
df[df.loc[:, df.columns[10:]] > 1] = 1
as per 2 suggestions in the comments, but both of those give me the error “Cannot do inplace boolean setting on mixed-types with a non np.nan value”.
Does anyone know why I’m getting these errors and/or what I should change about my code to avoid them?
We can use iloc to select all the columns to the right of the 9th column, then use where to replace the values in that slice of the dataframe where the condition is False (that is, where the value is greater than 1):
df.iloc[:, 10:] = df.iloc[:, 10:].where(lambda x: x.le(1), 1)
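To see what that line does, here is a minimal sketch on a toy frame. The column names, the data, and the `start` variable are made up for illustration; `start = 2` stands in for the 10 above so the example stays small. Because only the positional slice is read and written, the leading non-numeric column is never touched, which is presumably what the mixed-types error in the question was complaining about.

```python
import pandas as pd

# Hypothetical toy frame: two leading columns (one non-numeric) that must stay
# untouched, followed by numeric columns that should be capped at 1.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "id": [10, 20, 30],
    "x": [0, 2, 1],
    "y": [5, 0, 3],
})

start = 2  # stand-in for 10 in the snippet above; iloc positions are 0-based

# where() keeps each value whose condition is True (value <= 1)
# and replaces every other value with 1.
df.iloc[:, start:] = df.iloc[:, start:].where(lambda x: x.le(1), 1)

print(df)
```

Note that iloc positions are 0-based, so `10:` starts at the column whose position is 10; adjust the offset if your count starts at 1.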
Alternatively, we can use clip with the upper limit set to 1, which sets every value greater than 1 in the slice of the dataframe to 1:
df.iloc[:, 10:] = df.iloc[:, 10:].clip(upper=1)
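The two approaches agree on ordinary numbers but can differ on missing values, so here is a small side-by-side sketch. The frame, the column names, and `start = 1` (standing in for 10) are assumptions for illustration only. As far as I know, clip leaves NaN alone, while the where version replaces it, because NaN <= 1 evaluates to False.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with one non-numeric column and one missing value.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "x": [0.0, 2.5, np.nan],
    "y": [5.0, 0.5, 1.0],
})

start = 1  # stand-in for 10 in the snippet above

# Variant 1: cap everything above 1 with clip; NaN stays NaN.
clipped = df.copy()
clipped.iloc[:, start:] = clipped.iloc[:, start:].clip(upper=1)

# Variant 2: the where-based version; NaN fails the <= 1 test and becomes 1.
masked = df.copy()
masked.iloc[:, start:] = masked.iloc[:, start:].where(lambda x: x.le(1), 1)

print(clipped)
print(masked)
```

If the columns can contain NaN and you want to keep it, clip is probably the safer choice of the two.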