Mapping DataFrame values to another DataFrame based on index ‘month’ and column

I have two DataFrames. df_A is indexed by end-of-month dates; df_B is indexed daily but has an end-of-month column for reference. For each row of df_B, the value I want comes from the df_A column named in df_B['ref'], taken from the row whose month matches df_B['month'].

DataFrame: df_A

month       colA  colB  colC  colD  colE
2000-01-31  val   val   val   val   val
2000-02-29  val   val   val   val   val
2000-03-31  val   val   val   val   val

DataFrame: df_B

date        month       ref   result
2000-01-01  2000-01-31  colA  val from df_A
2000-01-02  2000-01-31  colD  val from df_A
2000-01-03  2000-01-31  colB  val from df_A
2000-01-04  2000-01-31  colC  val from df_A

What’s the Pythonic way of producing df_B['result']? Is there a list comprehension or lambda solution that could do this without resorting to a massive, compute-intensive nest of for/if loops?

P.S. I’m asking because I’m already a couple of nested for-loops deep, as the overarching real-life problem is a bit more complicated. Stacking additional loops makes my brain hurt…

Answer

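Your df_A data needs to be converted from wide format to long format, which you can do here with pd.melt. If month is the index of df_A rather than a regular column, call df_A.reset_index() first so that melt can use it in id_vars.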

The melted data looks something like this:

         month    ref   value
0   2000-01-31  colA    val
1   2000-02-29  colA    val
2   2000-03-31  colA    val
3   2000-01-31  colB    val

Here the column names have been melted into a single ref column, which makes it easy to join with your other DataFrame.
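To see the reshaping step in isolation, here is a minimal sketch. The numbers are made up to stand in for the val placeholders, and it assumes month is the index of df_A (as described in the question), so the index is reset before melting.

```python
import pandas as pd

# Hypothetical values standing in for the "val" placeholders.
df_A = pd.DataFrame(
    {"colA": [1, 2, 3], "colB": [4, 5, 6]},
    index=pd.Index(
        pd.to_datetime(["2000-01-31", "2000-02-29", "2000-03-31"]),
        name="month",
    ),
)

# month is the index here, so turn it into a column before melting.
# var_name="ref" names the new column that holds the old column labels.
long_A = df_A.reset_index().melt(id_vars="month", var_name="ref")
print(long_A)
```

Each (month, column) pair of the wide frame becomes one row of long_A, so the result has len(df_A) * number-of-columns rows.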

df_B.merge(df_A.melt(id_vars='month', var_name='ref'), on=['month','ref'] )

Output

         date       month   ref         result value
0  2000-01-01  2000-01-31  colA  val from df_A   val
1  2000-01-02  2000-01-31  colD  val from df_A   val
2  2000-01-03  2000-01-31  colB  val from df_A   val
3  2000-01-04  2000-01-31  colC  val from df_A   val
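For reference, a self-contained sketch of the whole lookup. The numeric values are invented, date is assumed to be a regular column of df_B (a merge does not carry the left frame's index through), and month is assumed to be the index of df_A. Two choices here go beyond the one-liner above and are only suggestions: value_name="result" writes the looked-up value straight into a result column, and how="left" keeps every row of df_B even when no match exists.

```python
import pandas as pd

# Hypothetical data; the numbers stand in for the "val" placeholders.
months = pd.to_datetime(["2000-01-31", "2000-02-29", "2000-03-31"])
df_A = pd.DataFrame(
    {
        "colA": [10, 11, 12],
        "colB": [20, 21, 22],
        "colC": [30, 31, 32],
        "colD": [40, 41, 42],
    },
    index=pd.Index(months, name="month"),
)

df_B = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2000-01-01", "2000-01-02", "2000-01-03", "2000-02-01"]
        ),
        "month": pd.to_datetime(
            ["2000-01-31", "2000-01-31", "2000-01-31", "2000-02-29"]
        ),
        "ref": ["colA", "colD", "colB", "colC"],
    }
)

# Wide -> long: one row per (month, ref) pair, value stored in "result".
long_A = df_A.reset_index().melt(
    id_vars="month", var_name="ref", value_name="result"
)

# Left join keeps all rows of df_B in their original order;
# a (month, ref) pair missing from df_A would yield NaN in "result".
df_B = df_B.merge(long_A, on=["month", "ref"], how="left")
print(df_B)
```

If df_B really is indexed by date, calling df_B.reset_index() before the merge and set_index("date") afterwards is one way to keep that index.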