Pandas version discrepancy, pd.concat(sort=False)

I’ve written some code for someone else.

At a certain point in the code, I use:

new_df = pd.concat([df1, df2, df3], sort=False)

However, the person who is actually running the code is using an older version of Pandas that does not support the ‘sort’ parameter. They also do not have admin rights, so they cannot update the version of Pandas (20.0.1) that they are using.

How can I resolve this? Is there a workaround in the code that would stop their concat call from automatically ordering the columns alphabetically, even without the later version of Pandas?

Answer

You can concatenate the dataframes without the sort argument and then restore the column order afterwards by selecting the columns explicitly:

new_df = pd.concat([df1, df2, df3])
# Select the columns in their original order: df1's, then df2's, then df3's.
print(
    new_df[df1.columns.tolist() + df2.columns.tolist() + df3.columns.tolist()]
)

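EDIT: If some column names appear in more than one dataframe, the combined list contains duplicates. To filter them out while keeping the order of first appearance: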

new_df = pd.concat([df1, df2, df3])

# Collect each column name once, in the order it first appears.
out, seen = [], set()
for c in df1.columns.tolist() + df2.columns.tolist() + df3.columns.tolist():
    if c not in seen:
        out.append(c)
        seen.add(c)

print(new_df[out])
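
If there are more than three dataframes, the same loop can be wrapped in a small helper. The following is only a sketch: the name first_seen_order is made up for this example and is not part of Pandas, and plain lists stand in for df1.columns, df2.columns and df3.columns so that it runs on its own.

```python
def first_seen_order(*column_groups):
    """Return every label once, in the order it first appears."""
    ordered, seen = [], set()
    for group in column_groups:
        for label in group:
            if label not in seen:
                ordered.append(label)
                seen.add(label)
    return ordered


# Plain lists stand in for df1.columns, df2.columns, df3.columns here.
cols = first_seen_order(["b", "a"], ["c", "a"], ["d", "b"])
print(cols)  # ['b', 'a', 'c', 'd']
```

With the dataframes above, this would be used as new_df[first_seen_order(df1.columns, df2.columns, df3.columns)]. The explicit seen set is kept on purpose instead of a shortcut such as dict.fromkeys, since an environment with an old Pandas may also be running a Python version where dict ordering is not guaranteed.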