Cleaning Excel File Using Pandas

I have an excel file which I’ve read using Pandas the output is as follows:

+------------+-----+-----+-----+
| Type       | 1   | 2   | 3   |
| Category   | A   | A   | C   |
| Dates      | NaN | NaN | NaN |
| 01/01/2021 | 12  | 12  | 9   |
| 02/01/2021 | 10  | 10  | 2   |
| 03/01/2021 | 30  | 16  | NaN |
| 04/01/2021 | 15  | 23  | 4   |
| 05/01/2021 | 14  | 20  | 5   |
+------------+-----+-----+-----+

The first two rows gives information for each time series column wise. So for column 1 the Type is 1 and Category is A. I want to melt the time series and not quite sure how to approach the problem given the structure of the sheet.

Expected Output:

+------------+-------+----------+------+
|   Dates    | Price | Category | Type |
+------------+-------+----------+------+
| 01/01/2021 |    12 | A        |    1 |
| 02/01/2021 |    10 | A        |    1 |
| 03/01/2021 |    30 | A        |    1 |
| 04/01/2021 |    15 | A        |    1 |
| 05/01/2021 |    14 | A        |    1 |
| 01/01/2021 |    12 | B        |    2 |
| 02/01/2021 |    10 | B        |    2 |
| 03/01/2021 |    16 | B        |    2 |
| 04/01/2021 |    23 | B        |    2 |
| 05/01/2021 |    20 | B        |    2 |
| 01/01/2021 |     9 | C        |    3 |
| 02/01/2021 |     2 | C        |    3 |
| 04/01/2021 |     4 | C        |    3 |
| 05/01/2021 |     5 | C        |    3 |
+------------+-------+----------+------+

In case of Type 3 and Category C since value is NaN we drop that date. How can achieve the expected output?

Answer

Assuming the following input dataframe:

         col0 col1 col2 col3
0        Type    1    2    3
1    Category    A    A    C
2       Dates  NaN  NaN  NaN
3  01/01/2021   12   12    9
4  02/01/2021   10   10    2
5  03/01/2021   30   16  NaN
6  04/01/2021   15   23    4
7  05/01/2021   14   20    5

Here is a working pipeline:

(df.iloc[3:]
   .set_index('col0').rename_axis('Date') # set first column aside
   # next 3 lines to rename columns index
   .T
   .set_index(pd.MultiIndex.from_arrays(df.iloc[:2, 1:].values, names=df.iloc[:2, 0]))
   .T
   .stack(level=[0,1]) # columns to rows
   .rename('Price')    # rename last unnamed column
   .reset_index()      # all indexes back to columns
)

output:

          Date Type Category Price
0   01/01/2021    1        A    12
1   01/01/2021    2        A    12
2   01/01/2021    3        C     9
3   02/01/2021    1        A    10
4   02/01/2021    2        A    10
5   02/01/2021    3        C     2
6   03/01/2021    1        A    30
7   03/01/2021    2        A    16
8   04/01/2021    1        A    15
9   04/01/2021    2        A    23
10  04/01/2021    3        C     4
11  05/01/2021    1        A    14
12  05/01/2021    2        A    20
13  05/01/2021    3        C     5