Manipulating/filtering CSV files with Python/pandas

I have two CSV files representing the same data: source1.csv and source2.csv.

source1.csv looks like this:

id    name  url   link
1111  Alex  aaaa  eeee
2222  Dan   bbbb  ffff
3333  Jack  cccc  gggg

source2.csv looks like this (the url and link columns are both empty):

location  id    url   name    link
xxxx      1111        Alex
xyxy      9999        George
zyzy      8888        Sam
zzyy      2222        Dan
xxyy      7777        Adam
xzyz      3333        Jack

I want to read the data from source1.csv and use it to populate the url and link columns in source2.csv, matching rows on the unique id field. Rows whose id does not appear in source1.csv should keep those columns empty, so that I get something like this:

location  id    url   name    link
xxxx      1111  aaaa  Alex    eeee
xyxy      9999        George
zyzy      8888        Sam
zzyy      2222  bbbb  Dan     ffff
xxyy      7777        Adam
xzyz      3333  cccc  Jack    gggg

How can I do this?

Answer

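Read each file into a DataFrame, then left-merge the location, id and name columns of source2 with the id, url and link columns of source1, using id as the key. A left merge keeps every row of source2, so ids that do not appear in source1 (9999, 8888, 7777) stay in the result with url and link left blank.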

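import pandas as pd

# Assumes both files are comma-separated with a header row.
df1 = pd.read_csv('source1.csv')
df2 = pd.read_csv('source2.csv')

# Keep every row of source2 (left join) and pull url/link from source1 by id.
df = df2[['location', 'id', 'name']].merge(df1[['id', 'url', 'link']], how='left', on='id')

# Restore source2's column order and leave unmatched ids empty.
df = df[['location', 'id', 'url', 'name', 'link']].fillna('')

print(df)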
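
If pandas is not available, or to see what the left merge amounts to, the same lookup can be sketched with the standard csv module instead. This is only a rough illustration, not the approach from the answer: it assumes the files are comma-separated, recreates the sample rows in memory rather than reading from disk, and the helper name fill_from_source is made up for the example.

```python
import csv
import io

# Sample rows from the question, assumed here to be comma-separated.
SOURCE1 = """id,name,url,link
1111,Alex,aaaa,eeee
2222,Dan,bbbb,ffff
3333,Jack,cccc,gggg
"""

SOURCE2 = """location,id,url,name,link
xxxx,1111,,Alex,
xyxy,9999,,George,
zyzy,8888,,Sam,
zzyy,2222,,Dan,
xxyy,7777,,Adam,
xzyz,3333,,Jack,
"""


def fill_from_source(source_text, target_text):
    """Return (fieldnames, rows) for the target with url/link filled in by id.

    Hypothetical helper for this sketch; it is not part of any library.
    """
    # Build an id -> row lookup from the source data.
    lookup = {row["id"]: row for row in csv.DictReader(io.StringIO(source_text))}

    reader = csv.DictReader(io.StringIO(target_text))
    filled = []
    for row in reader:
        match = lookup.get(row["id"])
        if match is not None:
            # Only rows with a matching id get values; others stay empty.
            row["url"] = match["url"]
            row["link"] = match["link"]
        filled.append(row)
    return reader.fieldnames, filled


fieldnames, rows = fill_from_source(SOURCE1, SOURCE2)

# Write the result as CSV text, keeping the target's column order.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

To work on the real files, the two io.StringIO objects inside the helper would be replaced by open('source1.csv', newline='') and open('source2.csv', newline=''), and the writer would target a new output file rather than an in-memory buffer.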
