Function drops invalid values from dataframes but returns the original dataframes with the invalid values still present

I created this simple function:

def cleanup_data(*argv):
    for df in argv:
        df = df.dropna()
    return argv

However, if I call cleanup_data(df1, df2), and later I do:

df1.isnull().values.any()

or

df2.isnull().values.any()

I get True.

What is wrong with my code?

Answer

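You are not returning the updated dataframes, but rather the unchanged argv. df.dropna() returns a new dataframe, and df = df.dropna() only rebinds the local loop variable df to that new object; neither argv nor the original df1 and df2 are modified. Here’s how you could return a list of updated dataframes using a list comprehension (the caller then has to use the returned list, because df1 and df2 themselves are still unchanged):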

def cleanup_data(*argv):
    return [df.dropna() for df in argv]
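
As a minimal sketch of how the returned list might be used, assuming two small made-up dataframes (the sample data and column names are hypothetical, not from the question):

```python
import numpy as np
import pandas as pd

def cleanup_data(*argv):
    # dropna() returns a new dataframe; the inputs are left untouched
    return [df.dropna() for df in argv]

# Hypothetical sample data, each with one missing value
df1 = pd.DataFrame({"a": [1.0, np.nan, 3.0]})
df2 = pd.DataFrame({"b": [np.nan, 2.0, 3.0]})

# Rebind the names to the cleaned copies returned by the function
df1, df2 = cleanup_data(df1, df2)

print(df1.isnull().values.any())  # False
print(df2.isnull().values.any())  # False
```

Without the assignment on the left-hand side, the cleaned dataframes would be discarded and df1 and df2 would still contain the missing values.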

Alternatively, you could make df.dropna operate in-place on the dataframes, in which case the function returns nothing and df1 and df2 themselves are modified:

def cleanup_data(*argv):
    for df in argv:
        df.dropna(inplace=True)
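
A short sketch of the in-place variant in use, again with a made-up dataframe (the name raw and its contents are assumptions for illustration):

```python
import numpy as np
import pandas as pd

def cleanup_data(*argv):
    for df in argv:
        # Mutates the dataframe object the caller passed in
        df.dropna(inplace=True)

# Hypothetical sample data with one missing value
raw = pd.DataFrame({"a": [1.0, np.nan, 3.0]})

result = cleanup_data(raw)

print(result)                     # None: the function has no return statement
print(raw.isnull().values.any())  # False: raw itself was modified
print(len(raw))                   # 2
```

Here no reassignment is needed, but every other reference to the same dataframe object sees the rows disappear as well, which may or may not be what you want.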