Vertorize Matching Two dataframe datetimeindex comparison

I have two datetime ordered dataframes using a DatetimeIndex (timestamp) as per below.

df1
timestamp                            price  side  
2021-08-27 12:45:00.475100160+00:00  47.34  
2021-08-27 12:45:00.475100160+00:00  47.02 
2021-08-27 12:45:00.488067957+00:00  47.18 
2021-08-27 12:45:00.779297294+00:00  47.26 
2021-08-27 12:45:00.779297294+00:00  47.27 

df2
timestamp                            bid_price  ask_price   
2021-08-27 12:44:59.740064471+00:00  47.08  47.34
2021-08-27 12:45:00.475100160+00:00  47.02  47.34
2021-08-27 12:45:00.914411789+00:00  47.02  47.26
2021-08-27 12:45:00.915470114+00:00  47.02  47.34

I need to compare the datetimeIndex of each row in the first dataframe (df1) against the datetimeIndex of the second dataframe (df2). The first row in df2 which has a datetime that is equal to or below the datetimeindex of the row in df1 will be used to evaluate the columns of df2.bid_price and df2.ask_price against the column df1.price. If df1.price == df2.bid_price then add ‘Bid’ to the df1.side column. If df1.price == df2.ask_price then add ‘Ask’ to the df1.side column. If df1.price is between dff2.ask_price and df2.bid_price then add ‘Inside’ to df1.side column else add ‘Outside’ to the df1.side column.

My code below is the least efficient way of doing this by iterating through every row of df1 and comparing it to df2. In short, it takes forever when I start looking at anything over 10-20k rows. I was looking for more efficient ways of doing this.

for x in range(len(df1)):
    price = df1.price.iloc[x]
    quote = df2[(df1.index[x] >= df2.index)][['bid_price','ask_price']].iloc[-1]
    if price == quote.bid_price:
        df1.side.iloc[x] = 'Bid'
    elif price == quote.ask_price:
        df1.side.iloc[x] = 'Ask'
    elif (price > quote.bid_price) & (price < quote.ask_price):
        df1.side.iloc[x] = 'Inside'
    else:
        df1.side.iloc[x] = 'Outside' 

Answer

Here is a working solution using pandas.merge_asof to merge the timestamps and numpy.select to match the various conditions:

import numpy as np
df3 = pd.merge_asof(df1, df2, on='timestamp', direction='backward')
df3['side'] = np.select([df3['price']==df3['bid_price'], 
                         df3['price']==df3['ask_price'],
                         df3['price'].between(df3['bid_price'], df3['ask_price'])
                         ],
                        ['Bid', 'Ask', 'Inside'],
                        default='Outside'
                        )

output:

>>> df3
                            timestamp  price    side  bid_price  ask_price
0 2021-08-27 12:45:00.475100160+00:00  47.34     Ask      47.02      47.34
1 2021-08-27 12:45:00.475100160+00:00  47.02     Bid      47.02      47.34
2 2021-08-27 12:45:00.488067957+00:00  47.18  Inside      47.02      47.34
3 2021-08-27 12:45:00.779297294+00:00  47.26  Inside      47.02      47.34
4 2021-08-27 12:45:00.779297294+00:00  47.27  Inside      47.02      47.34

NB. if needed, you can drop the intermediate columns: df3.drop(['bid_price', 'ask_price'], axis=1)