I have two DataFrames. df_A is indexed by end-of-month dates, and df_B is indexed daily but has an end-of-month column for reference. For each row of df_B, the df_A column to pull the value from has to be looked up via df_B['ref'], and the row is matched on ['month'] in each df.
|date|month|ref|result|
|2000-01-01|2000-01-31|colA|val from df_A|
|2000-01-02|2000-01-31|colD|val from df_A|
|2000-01-03|2000-01-31|colB|val from df_A|
|2000-01-04|2000-01-31|colC|val from df_A|
What's the Pythonic way of achieving df_B['result']? Is there a list comprehension or lambda solution that could do this without resorting to a massive, compute-intensive for/if loop?
P.S. I'm asking because I'm already a couple of nested for-loops deep, as the overarching real-life problem is a bit more complicated. Stacking additional loops makes my brain hurt…
df_A needs to be converted from wide format to long format. You can do that here with melt: df_A.melt(id_vars='month', var_name='ref')
The melted data looks something like this:
        month   ref value
0  2000-01-31  colA   val
1  2000-02-29  colA   val
2  2000-03-31  colA   val
3  2000-01-31  colB   val
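To reproduce a melted frame like this one, here is a minimal sketch. The numbers in df_A are made up for illustration (the question does not show the real values), and it assumes month is df_A's index, as the question describes, so it is moved into a column with reset_index() before melting.

```python
import pandas as pd

# Hypothetical sample data: the real df_A values are not shown in the question.
df_A = pd.DataFrame(
    {"colA": [1.0, 2.0], "colB": [3.0, 4.0]},
    index=pd.to_datetime(["2000-01-31", "2000-02-29"]),
)
df_A.index.name = "month"

# melt() needs 'month' as a regular column, so move it out of the index first.
# Each former column name ends up in 'ref', and its values in 'value'.
long_A = df_A.reset_index().melt(id_vars="month", var_name="ref")
print(long_A)
```

If month is already a regular column in your data, the reset_index() call can be dropped.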
As you can see, the column names have been melted into a single column, which makes it easy to join with your other DataFrame:
df_B.merge(df_A.melt(id_vars='month', var_name='ref'), on=['month', 'ref'])
         date       month   ref         result value
0  2000-01-01  2000-01-31  colA  val from df_A   val
1  2000-01-02  2000-01-31  colD  val from df_A   val
2  2000-01-03  2000-01-31  colB  val from df_A   val
3  2000-01-04  2000-01-31  colC  val from df_A   val
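For completeness, here is an end-to-end sketch of the same melt-and-merge idea. The sample values are invented. It assumes both frames carry their dates in the index, as described in the question, and it deviates from the one-liner above in two small ways: value_name="result" names the looked-up column directly, and how="left" keeps every daily row even when no match exists in df_A.

```python
import pandas as pd

# Hypothetical month-end frame (wide format), indexed by 'month'.
df_A = pd.DataFrame(
    {"colA": [10, 11], "colB": [20, 21], "colC": [30, 31], "colD": [40, 41]},
    index=pd.to_datetime(["2000-01-31", "2000-02-29"]),
)
df_A.index.name = "month"

# Hypothetical daily frame with a month-end reference column and a 'ref' column
# naming which df_A column to read.
df_B = pd.DataFrame(
    {
        "month": pd.to_datetime(["2000-01-31"] * 4),
        "ref": ["colA", "colD", "colB", "colC"],
    },
    index=pd.to_datetime(["2000-01-01", "2000-01-02", "2000-01-03", "2000-01-04"]),
)
df_B.index.name = "date"

# Wide -> long: one row per (month, ref) pair, with the value in 'result'.
long_A = df_A.reset_index().melt(id_vars="month", var_name="ref", value_name="result")

# A left merge keeps all daily rows in their original order;
# the daily index is restored afterwards.
out = (
    df_B.reset_index()
    .merge(long_A, on=["month", "ref"], how="left")
    .set_index("date")
)
print(out["result"].tolist())  # [10, 40, 20, 30]
```

No explicit loop is needed: the merge does the per-row lookup in one vectorized step, which tends to scale better than nested for-loops or a row-wise apply.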