Method chaining with pandas function

Why can’t I chain the get_dummies() function?

import pandas as pd

df = (pd
     .read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv')
     .drop(columns=['sepal_length'])
     .get_dummies()
)

This works fine:

df = (pd
     .read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv')
     .drop(columns=['sepal_length'])
)
df = pd.get_dummies(df)

Answer

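get_dummies is a top-level pandas function, not a DataFrame method, so calling .get_dummies() inside the chain raises an AttributeError. DataFrame.pipe helps here: it passes the DataFrame as the first argument to any callable, so functions that are not attached to the DataFrame, like pd.get_dummies, can still be chained: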

df = df.drop(columns=['sepal_length']).pipe(pd.get_dummies)

Or with a lambda:

df = (
    df.drop(columns=['sepal_length'])
        .pipe(lambda current_df: pd.get_dummies(current_df))
)
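
If get_dummies needs options, a lambda is not strictly required: pipe forwards any extra positional and keyword arguments to the function. A minimal sketch, assuming the same sample frame used in this answer (the name `encoded` and the choice of `columns=['a']` and `dtype=int` are only for illustration):

```python
import pandas as pd

# Small illustrative frame, same shape as the sample in this answer.
df = pd.DataFrame({'sepal_length': 1, 'a': list('ABACC'), 'b': list('ACCAB')})

# pipe(func, *args, **kwargs) calls func(df, *args, **kwargs),
# so options for get_dummies can be passed without a lambda.
encoded = (
    df.drop(columns=['sepal_length'])
      .pipe(pd.get_dummies, columns=['a'], dtype=int)
)

# Only column 'a' is encoded; column 'b' is left untouched.
print(encoded)
```

The lambda form remains the more flexible choice when the DataFrame has to go somewhere other than the first argument.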

Sample DataFrame:

df = pd.DataFrame({'sepal_length': 1, 'a': list('ABACC'), 'b': list('ACCAB')})

df:

   sepal_length  a  b
0             1  A  A
1             1  B  C
2             1  A  C
3             1  C  A
4             1  C  B

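Sample Output (0/1 values as printed by pandas versions before 2.0; from 2.0 on, get_dummies returns True/False unless dtype=int is passed):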

df = df.drop(columns=['sepal_length']).pipe(pd.get_dummies)

df:

   a_A  a_B  a_C  b_A  b_B  b_C
0    1    0    0    1    0    0
1    0    1    0    0    0    1
2    1    0    0    0    0    1
3    0    0    1    1    0    0
4    0    0    1    0    1    0
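
To see why the original chain fails, it may help to check where get_dummies actually lives. The snippet below is a small sketch on a made-up one-column frame (the names `error_message` and `piped` are illustrative): the function exists on the pandas module but not on the DataFrame, so the attribute lookup fails, while pipe works.

```python
import pandas as pd

df = pd.DataFrame({'a': list('ABACC')})

# get_dummies is a module-level function, not a DataFrame method.
on_module = hasattr(pd, 'get_dummies')
on_frame = hasattr(df, 'get_dummies')
print(on_module, on_frame)

# Calling it as a method fails at attribute lookup.
error_message = None
try:
    df.get_dummies()
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)

# pipe hands the DataFrame to the function instead.
# dtype=int requests 0/1 columns regardless of the pandas version's default.
piped = df.pipe(pd.get_dummies, dtype=int)
print(piped)
```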