lookup within filtered range

I have a dataframe with data from ecommerce panel. It has orders and returns mixed together. Each row has orderID – it’s the same number for normal orders and for corresponding returns that come back from customers.

My data looks like this:

orderID Shop Revenue Note
44 0 -32 Return
45 0 -100 Return
44 1 14
45 3 20 Something else
46 2 50
47 1 80 Something
48 2 222

For each return I want to find a ‘Shop’ column value that corresponds to original order. For example : ‘orderID’ == 44 comes twice: once as return (with ‘Shop’ == 0) and once as normal order (with ‘Shop’ == 1). I want to replace all the 0 values with ‘Shop’ column with values from earlier orders

My desired output looks like this:

orderID Shop Revenue Note
44 1 -32 Return
45 3 -100 Return
44 1 14
45 3 20 Something else
46 2 50
47 1 80 Something
48 2 222

I know how to do it in Google Sheets (first I filter table removing ‘Shop’==0 values and then I vlookup for numbers in this filtered array)

I know how to filter this table using Pandas but I don’t know how to write it.

I assume that I will need to write a temporary column first, where I store both types of values – for normal orders (just copied) and for returns.

Original dataframe is 1 000 000+ rows

My data in .csv is available here: https://docs.google.com/spreadsheets/d/e/2PACX-1vQAJ4tMc_Bcvv-4FsUy3E7sG0m9hm-nLTVLj-LwlSEns-YJ1pbq6gSKp5mj5lZqRI2EgHOsOutwnn1I/pub?gid=0&single=true&output=csv

Thank you for any advice!

Answer

IIUC, using map:

m = df.query('Shop != 0').set_index('orderID')['Shop']
df['Shop'] = df['orderID'].map(m)
print(df)

Output:

   orderID  Shop  Revenue            Note
0       44     1      -32          Return
1       45     3     -100          Return
2       44     1       14             NaN
3       45     3       20  Something else
4       46     2       50             NaN
5       47     1       80       Something
6       48     2      222             NaN

Create a pd.Series using query to filter out zero shops then set_index and map shops to orderID​.

This works if there is a 1-1 shop to order mapping. If you have multiple shops per order, then you’ll need logic to determine which shop valid.

If you have duplicate order to the same shop, then you need to drop_duplicates first.