Shifting Levels in Nested Dictionary in Python

I have some data from a remote server in a deeply nested Python dictionary. Because of the data collection process (which I cannot control), several levels of this dictionary are wrapped in an unnecessary dictionary with a single key called "__collections__". For example, the dictionary looks like this:

{"data_level_1":
    {"__collections__":
        {"data_level_2": ...}
    }
}

when really what I would like is

{"data_level_1":
    {"data_level_2": ...}
}

I need a way to recursively iterate through the nested dictionary to “shift” those wrapped dictionaries one level up while getting rid of the "__collections__" wrapper dictionary. Here is my attempt:

from collections.abc import MutableMapping

def remove_repeat_named_level(dictionary, key):
    q = list(dictionary.items())
    for v, d in q:
        if isinstance(d, MutableMapping):
            for nv, nd in d.items():
                if isinstance(nd, MutableMapping):
                    if v==key:
                        nd = remove_repeat_named_level(nd, key)
                        q.append((nv, nd))
                        if (v, d) in q: q.remove((v, d))
                    elif nv==key:
                        nd = remove_repeat_named_level(nd, key)
                        q.append((v, nd))
                        if (v, d) in q: q.remove((v, d))
                elif v==key:
                    q.append((nv, nd))
                    if (v, d) in q: q.remove((v, d))
    return dict(q)

where dictionary is the nested dictionary and key is the name of the single key in the wrapper dictionaries I want to remove (in this case, "__collections__").

This works great on a few simple test cases, for example:

test_key = "A"
test_dict = {"A": 
                {"B": 
                    {"A": 
                        {"i": 
                            {"A": 
                                {"One": 
                                    {"A": 
                                        {"alpha": "a", 
                                         "beta": "b"
                                        }
                                    }, 
                                 "Two": 2
                                 }
                             }, 
                         "ii": 2
                         }
                     }
                 }
             }

remove_repeat_named_level(test_dict, test_key)

returns the expected result:

{'B': {'i': {'One': {'alpha': 'a', 'beta': 'b'}, 'Two': 2}, 'ii': 2}}

However, when I pass the nested dictionary with my data through the function, the recursion just seems to stop at some level:

Dictionary:

test_d2 = {"__collections__":
               {"tasks": 
                  {"task1":
                       {"__collections__":
                          {"subjects":
                              {"subject1":
                                  {"date": 1,
                                   "time": 1,
                                   "__collections__":
                                       {"surveys":
                                           {"survey1":
                                               {"survey_data":
                                                    {"Q1": {"Response": 1},
                                                     "Q2": {"Response": 2}
                                                    }
                                               },
                                                "__collections__": {}
                                           }
                                       }
                                  }
                              }
                          }
                       }
                  }
               }
         }

Expected:

               {"tasks": 
                    {"task1":
                          {"subjects":
                              {"subject1":
                                  {"date": 1,
                                   "time": 1,
                                   "surveys":
                                       {"survey1":
                                           {"survey_data":
                                                {"Q1": {"Response": 1},
                                                 "Q2": {"Response": 2}
                                                }
                                           }
                                       }
                                  }
                              }
                          }
                    }
               }

Result:

{"tasks": 
              {"task1":
                  {"subjects":
                      {"subject1":
                          {"date": 1,
                           "time": 1,
                           "__collections__":
                               {"surveys":
                                   {"survey1":
                                       {"survey_data":
                                            {"Q1": {"Response": 1},
                                             "Q2": {"Response": 2}
                                            }
                                       },
                                        "__collections__": {}
                                   }
                               }
                          }
                      }
                  }
              }
         }

I've been racking my brain for hours trying to figure this out. Why does the recursion just stop at some point? Is there some case I’m not accounting for?

Answer

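The difference between your small example and your actual data is that in the small example the bad key is always alone in its dict, whereas in your actual data it is sometimes mixed in with other keys (for example, "subject1" holds "date" and "time" next to "__collections__"). Your function also recurses only when the current key or one of its child keys equals the bad key, so it never descends past two consecutive unwrapped levels such as "subjects" and "subject1".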
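If it helps to confirm where your real data has this shape, here is a small diagnostic sketch. The helper name wrappers_with_siblings is made up for illustration, and it assumes the data contains only plain dicts and scalar values. It reports the path of every dict in which the bad key appears next to other keys:

```python
def wrappers_with_siblings(d, bad_key, path=()):
    """Yield the path of every dict where bad_key sits next to other keys."""
    if not isinstance(d, dict):
        return
    # A wrapper "has siblings" when its dict holds more than just bad_key.
    if bad_key in d and len(d) > 1:
        yield path
    for k, v in d.items():
        yield from wrappers_with_siblings(v, bad_key, path + (k,))


# Hypothetical sample, shaped like a fragment of the real data.
sample = {
    "__collections__": {
        "subject1": {
            "date": 1,
            "__collections__": {"surveys": {}},
        }
    }
}

found = list(wrappers_with_siblings(sample, "__collections__"))
print(found)  # [('__collections__', 'subject1')]
```

The outer wrapper is not reported because it is the only key in its dict; only "subject1" is, since its wrapper shares the dict with "date".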

You can simplify your recursive function a lot:

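def remove_repeat_named_level(d, bad_key):
    # Leaves (non-dict values) are returned unchanged.
    if not isinstance(d, dict):
        return d
    # Recurse into every value except the wrapper itself.
    new_d = {k: remove_repeat_named_level(v, bad_key) for k, v in d.items() if k != bad_key}
    # Hoist the cleaned contents of the wrapper into this level.
    if bad_key in d:
        new_d.update(remove_repeat_named_level(d[bad_key], bad_key))
    return new_d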

Testing with your data:

>>> remove_repeat_named_level(test_d2, '__collections__')
{'tasks': {'task1': {'subjects': {'subject1': {'date': 1, 'time': 1, 'surveys': {'survey1': {'survey_data': {'Q1': {'Response': 1}, 'Q2': {'Response': 2}}}}}}}}}
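Two caveats about the function above: it does not look inside lists, and update() will raise an error if the value under the bad key is not a dict. If your server data can contain either, a variant along these lines might work. This is only a sketch: the name strip_wrapper is made up here, and keeping a non-dict wrapper value in place (rather than dropping it) is an assumption about what you would want.

```python
from collections.abc import Mapping


def strip_wrapper(obj, bad_key):
    """Return a copy of obj with every bad_key wrapper hoisted one level up."""
    # Assumption: lists may contain wrapped dicts, so descend into them too.
    if isinstance(obj, list):
        return [strip_wrapper(item, bad_key) for item in obj]
    if not isinstance(obj, Mapping):
        return obj

    # Clean every ordinary key first.
    result = {k: strip_wrapper(v, bad_key) for k, v in obj.items() if k != bad_key}

    if bad_key in obj:
        wrapped = obj[bad_key]
        if isinstance(wrapped, Mapping):
            # Hoist the wrapper's contents; on a name clash the hoisted
            # key overwrites the sibling, same as dict.update().
            result.update(strip_wrapper(wrapped, bad_key))
        else:
            # Assumption: a non-dict value under bad_key is real data, keep it.
            result[bad_key] = strip_wrapper(wrapped, bad_key)
    return result


nested = {
    "__collections__": {"tasks": [{"__collections__": {"a": 1}}, 2]},
    "x": {"__collections__": 5},
}
cleaned = strip_wrapper(nested, "__collections__")
print(cleaned)
```

Note that in both versions, if a hoisted key has the same name as an existing sibling key, the hoisted value silently wins. If that can happen in your data, you may want to detect the clash and raise instead.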