How to find the nearest node using BFS?

Let G(V,E) be an undirected unweighted graph and r be a subset of V. Now node root is added to G and edges are added between root and all the nodes of r. Now for each node of V-r I want to find the nearest node of r using BFS. Please help. I have tried the following code.

import networkx as nx
import matplotlib.pyplot as plt

def bfs(g, node):
    dist = 0
    visited = [node]
    queue = [(node, dist)]
    tr = {}
    while queue:
        s, dist = queue.pop(0)
        tr[s] = []
        for nbr in list(g.adj[s]):
            if nbr not in visited:
                visited.append(nbr)
                tr[s].append(nbr, dist+1)
                queue.append((nbr, dist+1))
    return tr

G=nx.erdos_renyi_graph(50,0.1)

r=[5,8,36,43,21]
G.add_node('root')
for i in r:
    G.add_edge(i,'root')

t = bfs(G, 'root')
print(t)

Answer

I did not understand the purpose of the root node in this problem. I think, the easiest way to solve this problem is, you call bfs from all the V-r nodes. Each time, when you able to reach a node that belongs to r, you return it. Because it is the first reachable node that belongs to r. Here is the sample of this process:

import networkx as nx
import matplotlib.pyplot as plt

def bfs(g, node, r):
    dist = 0
    visited = [node]
    queue = [(node, dist)]
    #tr = {}
    while queue:
        s, dist = queue.pop(0)
        #tr[s] = []
        for nbr in list(g.adj[s]):
            if nbr not in visited:
                if nbr in r:
                    return (nbr, dist+1)
                
                visited.append(nbr)
                #tr[s].append((nbr, dist+1))
                queue.append((nbr, dist+1))
    #return tr
    return (NaN, NaN)

G=nx.erdos_renyi_graph(50,0.1)

r=[5,8,36,43,21]
G.add_node('root')
for i in r:
    G.add_edge(i,'root')

for n in list(G.nodes):
    if n not in r:
        t, d = bfs(G, n, r)
        print("Node {}'s nearest node in r: {} with distance: {}".format(n, t, d))

As erdos_renyi is a random graph, so it can give different result in different run. Here is a sample output:

Node 0's nearest node in r: 5 with distance: 2
Node 1's nearest node in r: 8 with distance: 1
Node 2's nearest node in r: 43 with distance: 2
Node 3's nearest node in r: 36 with distance: 2
Node 4's nearest node in r: 36 with distance: 2
Node 6's nearest node in r: 5 with distance: 2
Node 7's nearest node in r: 36 with distance: 2
Node 9's nearest node in r: 36 with distance: 1
Node 10's nearest node in r: 36 with distance: 2
Node 11's nearest node in r: 8 with distance: 2
Node 12's nearest node in r: 8 with distance: 3
Node 13's nearest node in r: 8 with distance: 2
Node 14's nearest node in r: 8 with distance: 2
Node 15's nearest node in r: 36 with distance: 3
Node 16's nearest node in r: 36 with distance: 3
Node 17's nearest node in r: 8 with distance: 1
Node 18's nearest node in r: 8 with distance: 1
Node 19's nearest node in r: 8 with distance: 1
Node 20's nearest node in r: 21 with distance: 1
Node 22's nearest node in r: 36 with distance: 2
Node 23's nearest node in r: 21 with distance: 2
Node 24's nearest node in r: 21 with distance: 1
Node 25's nearest node in r: 5 with distance: 1
Node 26's nearest node in r: 5 with distance: 1
Node 27's nearest node in r: 36 with distance: 3
Node 28's nearest node in r: 36 with distance: 2
Node 29's nearest node in r: 5 with distance: 1
Node 30's nearest node in r: 21 with distance: 1
Node 31's nearest node in r: 43 with distance: 1
Node 32's nearest node in r: 36 with distance: 3
Node 33's nearest node in r: 5 with distance: 2
Node 34's nearest node in r: 8 with distance: 2
Node 35's nearest node in r: 36 with distance: 2
Node 37's nearest node in r: 36 with distance: 3
Node 38's nearest node in r: 43 with distance: 1
Node 39's nearest node in r: 8 with distance: 2
Node 40's nearest node in r: 43 with distance: 1
Node 41's nearest node in r: 43 with distance: 2
Node 42's nearest node in r: 8 with distance: 2
Node 44's nearest node in r: 43 with distance: 2
Node 45's nearest node in r: 43 with distance: 2
Node 46's nearest node in r: 5 with distance: 2
Node 47's nearest node in r: 8 with distance: 1
Node 48's nearest node in r: 36 with distance: 1
Node 49's nearest node in r: 5 with distance: 2
Node root's nearest node in r: 5 with distance: 1

You can even further optimize this solution. First, create a list for the V-r nodes which will store the shortest distance to reach this nodes from r nodes. Initialize this list with some large (i.e., infinite) value. Now, instead of calling bfs for every V-r node, you can call bfs from all the r nodes and update the distance list if possible. By this process, you will make less call to bfs if len(r) << len(V-r). I hope this solves your problem.