I’m trying to take the rows that are classified as outliers and remove them from the original dataset, but I can’t make it work. Do you know what goes wrong? When I run the following code, I get this error: “ValueError: Index data must be 1-dimensional”
#identify outliers
pred = iforest.fit_predict(x)
outlier_index = np.where(pred==-1)
outlier_values = x.iloc[outlier_index]
#remove from dataset (dataset = x)
x_new = x.drop([outlier_values])
The outlier_values you pass to drop is a DataFrame, not a flat list of index labels, which is why the ValueError is raised.
What you need to do is extract the index labels from the outlier_values DataFrame into a list:
index_list = outlier_values.index.values.tolist()
Then drop those labels from x, as in this answer.
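Putting the pieces together, here is a minimal sketch of the fix. The toy DataFrame and the hard-coded pred array are assumptions made for illustration; pred stands in for the output of iforest.fit_predict(x), where -1 marks an outlier. The boolean-mask line at the end is an alternative I am suggesting, not something from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the index is deliberately non-default so that
# row positions and index labels differ.
x = pd.DataFrame(
    {"a": [1.0, 2.0, 50.0, 3.0, -40.0], "b": [0.5, 0.7, 9.0, 0.6, -8.0]},
    index=[10, 11, 12, 13, 14],
)

# Stand-in for pred = iforest.fit_predict(x): -1 = outlier, 1 = inlier.
pred = np.array([1, 1, -1, 1, -1])

# np.where returns a tuple of arrays; element 0 holds the row positions.
outlier_positions = np.where(pred == -1)[0]
outlier_values = x.iloc[outlier_positions]

# drop() expects index labels, so pass the labels rather than the DataFrame.
index_list = outlier_values.index.values.tolist()
x_new = x.drop(index_list)

# Alternative: skip drop() and keep only the inlier rows with a boolean mask.
x_new_mask = x[pred != -1]
```

If you only need the cleaned data and not the outlier rows themselves, the boolean mask is probably the shorter route, since it avoids the detour through positions and labels.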