Node’s attributes from a different dataset

I need to merge some info from different datasets to build a dataset that includes nodes, edges and nodes’ attributes.

The first dataset (df1) is something like this:

Node     Edge
A         B
A         D
B         N
B         A
B         X
S         C

The second dataset includes unique nodes and their properties:

Node    Attribute
A           -1
B           0
C           -1.5
D           1
...
N           1
... 
X           0
Y           -1.5
W          -1.5
Z           1

I would like to create a graph using the links as in the first dataset (df1), coloring nodes based on the attribute values:

 - if -1.5 then grey    
 - if -1 then red
 - if 0 then orange
 - if 1 then yellow
 - if 1.5 then green    

For building the graph I can use

G = nx.from_pandas_edgelist(df1, source='Node', target='Edge')

Then I need to apply the rules above to assign a colour to each node and add it as a node attribute. My question is how to include these rules as a node attribute.

Answer

The main task is to get the attributes DataFrame into a more usable form. We can build a mapping from node to colour by calling set_index on the Node column and then using map to translate each numerical attribute value into a colour:

import networkx as nx
import pandas as pd

edges_df = pd.DataFrame({
    'Node': ['A', 'A', 'B', 'B', 'B', 'S'],
    'Edge': ['B', 'D', 'N', 'A', 'X', 'C']
})

# Abridged but contains values for all nodes in `edges_df`
attributes_df = pd.DataFrame({
    'Node': ['A', 'B', 'C', 'D', 'N', 'S', 'X'],
    'Attribute': [-1, 0, -1.5, 1, 1, 1.5, 0]
})

mapper = {-1.5: 'grey', -1: 'red', 0: 'orange', 1: 'yellow', 1.5: 'green'}
colour_map = attributes_df.set_index('Node')['Attribute'].map(mapper)

colour_map:

Node
A       red
B    orange
C      grey
D    yellow
N    yellow
S     green
X    orange
Name: Attribute, dtype: object

*Note: Neither the value 1.5 nor node S appears in the attributes dataset shown in the question, so S was set to 1.5 in attributes_df to give every node a colour and to have every colour represented.
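
If the real attributes dataset does leave out some nodes (as S is left out in the question's sample), one possible way to handle it is to reindex the mapped Series on the graph's nodes and fill the gaps with a fallback colour. This is only a sketch: `default_colour`, its value 'lightblue', and the name `safe_colour_map` are assumptions for illustration, not part of the question.

```python
import networkx as nx
import pandas as pd

edges_df = pd.DataFrame({
    'Node': ['A', 'A', 'B', 'B', 'B', 'S'],
    'Edge': ['B', 'D', 'N', 'A', 'X', 'C']
})

# S is deliberately left out here to mimic a node with no attribute row
attributes_df = pd.DataFrame({
    'Node': ['A', 'B', 'C', 'D', 'N', 'X'],
    'Attribute': [-1, 0, -1.5, 1, 1, 0]
})

mapper = {-1.5: 'grey', -1: 'red', 0: 'orange', 1: 'yellow', 1.5: 'green'}
default_colour = 'lightblue'  # hypothetical fallback, pick any valid colour

G = nx.from_pandas_edgelist(edges_df, source='Node', target='Edge')

safe_colour_map = (
    attributes_df.set_index('Node')['Attribute']
    .map(mapper)                # attribute value -> colour name
    .reindex(list(G.nodes()))   # one row per graph node, NaN where missing
    .fillna(default_colour)     # nodes without a colour get the fallback
)
```

Attribute values that are not keys of `mapper` also become NaN after map, so they would receive the same fallback colour.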


colour_map can then either be passed to set_node_attributes:

G = nx.from_pandas_edgelist(edges_df, source='Node', target='Edge')
# Add Attribute to each node
nx.set_node_attributes(G, colour_map, name="colour")

# Then draw with colours based on attribute values:
# Wrap the dict values in a list so matplotlib receives a plain sequence
nx.draw(G,
        node_color=list(nx.get_node_attributes(G, 'colour').values()),
        with_labels=True)

Or we can use the Series directly (without creating node attributes) by reindexing it on the graph's nodes. Reindexing puts the colours in the same order as G.nodes(), so each colour lines up with the correct node:

G = nx.from_pandas_edgelist(edges_df, source='Node', target='Edge')

# Then draw with colours based on the Series:
nx.draw(G, 
        node_color=colour_map.reindex(G.nodes()),
        with_labels=True)

With either approach we get a graph like:

[Image: coloured and labelled plot]
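
As a quick sanity check that the two approaches colour the nodes identically, the colour lists can be compared without drawing anything. This is a minimal sketch using the sample data from this answer. The names `from_attributes` and `from_series` are introduced here for illustration, and `colour_map.to_dict()` is used only to hand networkx a plain dict.

```python
import networkx as nx
import pandas as pd

edges_df = pd.DataFrame({
    'Node': ['A', 'A', 'B', 'B', 'B', 'S'],
    'Edge': ['B', 'D', 'N', 'A', 'X', 'C']
})
attributes_df = pd.DataFrame({
    'Node': ['A', 'B', 'C', 'D', 'N', 'S', 'X'],
    'Attribute': [-1, 0, -1.5, 1, 1, 1.5, 0]
})

mapper = {-1.5: 'grey', -1: 'red', 0: 'orange', 1: 'yellow', 1.5: 'green'}
colour_map = attributes_df.set_index('Node')['Attribute'].map(mapper)

G = nx.from_pandas_edgelist(edges_df, source='Node', target='Edge')
nx.set_node_attributes(G, colour_map.to_dict(), name='colour')

# Approach 1: read the colours back from the node attributes
from_attributes = [G.nodes[n]['colour'] for n in G.nodes()]

# Approach 2: reindex the Series on the graph's node order
from_series = colour_map.reindex(list(G.nodes())).tolist()

print(from_attributes == from_series)
```

If the comparison prints False, some node is probably missing from the attributes data or has a value that is not in mapper.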