Adding file names to a dataframe adds the same name for every file

I have a lot of CSV files to open, and I need to add an extra column holding the name of the file each row came from. For example, I have x.csv, y.csv, z.csv, and so on. Inside, each CSV file looks like this:

X  Z
1 3
4 5
4 6

After combining the files, the result should look like this:

    X  Z name
    1 3  x
    4 5  x
    4 6  x
    4 5  y
    4 5  y
    1 2  y 

My code is below, but it puts the same single value in the column for every file:

import pandas as pd
import os
import rglob

file_list = rglob.rglob("path", "*")
    
li = []
    
for path in file_list:
    df = pd.read_csv(path, index_col=None, header=0,)
    file_name = os.listdir('path')[0]
    df["file_name"] = file_name
    li.append(df)

Any idea how I could fix it?

Best regards

Answer

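The problem is your os.listdir call. os.listdir returns a list of every entry in the directory, so os.listdir('path')[0] is the same first entry on every iteration, no matter which file you just read. Take the name from the path you are looping over instead, using os.path.basename or pathlib.Path.name.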
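To make the difference concrete, here is a small sketch of what each call returns for a single path. The path data/sub/x.csv is made up for illustration and does not have to exist, since these functions only manipulate the string.

```python
import os
from pathlib import Path

# Hypothetical path, used only to show what each call returns.
path = os.path.join("data", "sub", "x.csv")

base = os.path.basename(path)           # file name with extension
base_no_ext = os.path.splitext(base)[0] # file name without extension
name = Path(path).name                  # pathlib equivalent of basename
stem = Path(path).stem                  # pathlib: name without extension

print(base, base_no_ext, name, stem)    # x.csv x x.csv x
```

If the column should hold x rather than x.csv, as in the desired output, the extension-free variants are probably the ones you want.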

With pathlib:

import pandas as pd
from pathlib import Path

file_list = Path("path").rglob("*.csv")
    
li = []
    
for path in file_list:
    df = pd.read_csv(path, index_col=None, header=0,)
    df["file_name"] = path.name
    li.append(df)
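
The loop leaves you with a list of separate dataframes. To get the single table shown in the question, they still have to be stacked, and the desired column holds x rather than x.csv. A minimal sketch of one way to do both, assuming the files sit under a directory called "path" as in the question; read_with_names is a hypothetical helper name, not part of pandas:

```python
import pandas as pd
from pathlib import Path


def read_with_names(root, pattern="*.csv"):
    """Read every matching CSV under root and tag each row with its file's stem."""
    frames = []
    # sorted() only makes the row order predictable; rglob order is not guaranteed.
    for path in sorted(Path(root).rglob(pattern)):
        df = pd.read_csv(path)
        df["name"] = path.stem  # "x.csv" -> "x"
        frames.append(df)
    if not frames:
        # pd.concat raises on an empty list, so return an empty frame instead.
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)
```

Calling read_with_names("path") would then return one dataframe with the columns from the files plus the name column.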
