How can I flatten JSON in pandas with a wildcard?

Given this JSON:

"World_Regions": {
    "Americas": {
        "0": {
            "Name": "North America",
            "Category_Average": "54.53",
            "Stocks_%": "55.44",
            "Benchmark": "59.02"
        },
        "1": {
            "Name": "Latin America",
            "Category_Average": "0.87",
            "Stocks_%": "1.14",
            "Benchmark": "0.93"
        }
    },
    "Greater Asia": {
        "0": {
            "Name": "Japan",
            "Category_Average": "6.58",
            "Stocks_%": "3.74",
            "Benchmark": "7.76"
        },
        "1": {
            "Name": "Australasia",
            "Category_Average": "1.79",
            "Stocks_%": "7.45",
            "Benchmark": "2.17"
        },
        "2": {
            "Name": "Asia Developed",
            "Category_Average": "5.56",
            "Stocks_%": "7.27",
            "Benchmark": "4.57"
        },
        "3": {
            "Name": "Asia Emerging",
            "Category_Average": "6.63",
            "Stocks_%": "2.96",
            "Benchmark": "6.58"
        }
    },

I want to get this result:

             Name Category_Average Stocks_% Benchmark
0   North America            54.53    55.44     59.02
1   Latin America             0.87     1.14      0.93
2           Japan             6.58     3.74      7.76
3     Australasia             1.79     7.45      2.17
4  Asia Developed             5.56     7.27      4.57
5   Asia Emerging             6.63     2.96      6.58

Unfortunately, the differing region keys (Americas, Greater Asia) are causing a problem. I am trying to do this cleanly in one command. Right now I can get the result for a single region by doing this:

pd.DataFrame.from_dict(jsonFile['World_Regions']['Greater Asia']).transpose()
             Name Category_Average Stocks_% Benchmark
0           Japan             6.58     3.74      7.76
1     Australasia             1.79     7.45      2.17
2  Asia Developed             5.56     7.27      4.57
3   Asia Emerging             6.63     2.96      6.58

I then do the same for Americas and merge the DataFrames. Is there a more direct way to do it (i.e., in one command)?

Answer

You can try concatenating the per-region DataFrames (here a is the parsed JSON dict, called jsonFile in the question):

pd.concat([pd.DataFrame(x).T for x in a['World_Regions'].values()],
          ignore_index=True)

Or flatten the nested records into a single list before passing it to DataFrame:

pd.DataFrame([y for x in a['World_Regions'].values()
                for y in x.values()])

Output:

             Name Category_Average Stocks_% Benchmark
0   North America            54.53    55.44     59.02
1   Latin America             0.87     1.14      0.93
2           Japan             6.58     3.74      7.76
3     Australasia             1.79     7.45      2.17
4  Asia Developed             5.56     7.27      4.57
5   Asia Emerging             6.63     2.96      6.58
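
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the second approach. It uses a trimmed copy of the question's data (three of the six records). Two things in it are not part of the original answer and are only suggestions: a Region column that keeps the "wildcard" key, and a pd.to_numeric step, on the assumption that numeric values are wanted, since the JSON stores every number as a string.

```python
import pandas as pd

# Trimmed copy of the question's JSON, already parsed into a dict.
a = {
    "World_Regions": {
        "Americas": {
            "0": {"Name": "North America", "Category_Average": "54.53",
                  "Stocks_%": "55.44", "Benchmark": "59.02"},
            "1": {"Name": "Latin America", "Category_Average": "0.87",
                  "Stocks_%": "1.14", "Benchmark": "0.93"},
        },
        "Greater Asia": {
            "0": {"Name": "Japan", "Category_Average": "6.58",
                  "Stocks_%": "3.74", "Benchmark": "7.76"},
        },
    }
}

# Walk both nesting levels; the region key is whatever name appears,
# so no region has to be spelled out. Keeping it as a column is optional.
rows = [
    {"Region": region, **record}
    for region, records in a["World_Regions"].items()
    for record in records.values()
]
df = pd.DataFrame(rows)

# Optional: the values arrive as strings, so convert the numeric columns.
numeric_cols = ["Category_Average", "Stocks_%", "Benchmark"]
df[numeric_cols] = df[numeric_cols].apply(pd.to_numeric)

print(df)
```

Dropping the Region entry from the dict comprehension gives the same four columns as the output above.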
