Plot a histogram where the bars are coloured based on a second list of values

I can plot a histogram in Python for example with matplotlib:

from matplotlib import pyplot as plt
x = [3,5,12,7,8,6,4,6]
plt.hist(x)

However I have a second array y = [4,6,8,2,4,5,8,7] where each value corresponds to the value at the same position of x. Now I would like to create a histogram where each bar’s height is defined by x, but each bar’s color is defined by the values in y that belong to its x values. You could also say I have tuples as in list(zip(x,y)) where the first value should be used for the histogram itself and the mean value of the second tuple value in each bin should determine the color.

Answer

np.unique(x, return_counts=True) returns an array with the unique values of x and their count.
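As a quick check of what that call returns for the x from the question, here is a minimal sketch (the printed arrays are what the heights and positions of the bars will be based on):

```python
import numpy as np

x = np.array([3, 5, 12, 7, 8, 6, 4, 6])

# values: sorted unique entries of x; counts: how often each one occurs
values, counts = np.unique(x, return_counts=True)
print(values)  # [ 3  4  5  6  7  8 12]
print(counts)  # [1 1 1 2 1 1 1]
```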

With x and y as numpy arrays, y[x == val] selects the entries of y at the positions where x equals val, and y[x == val].mean() is the mean of those entries. Calling cmap(norm(...)) returns the color corresponding to that mean. The same cmap and norm can then be used to create a matching colorbar.
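To make the masking and color lookup concrete, here is a small sketch for a single value of x. The choice of val = 6 is only for illustration; it is the one value that occurs twice in x.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

x = np.array([3, 5, 12, 7, 8, 6, 4, 6])
y = np.array([4, 6, 8, 2, 4, 5, 8, 7])

val = 6                       # illustrative choice: occurs twice in x
mask = x == val               # boolean array, True where x equals val
mean_y = y[mask].mean()       # mean of y over those positions: (5 + 7) / 2

cmap = plt.get_cmap('inferno')
norm = Normalize(0, y.max())  # maps 0..8 linearly onto 0..1
color = cmap(norm(mean_y))    # RGBA tuple for norm(6.0) == 0.75
print(mean_y, color)
```

The resulting RGBA tuple can be passed directly as a bar color.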

Here is some example code, including embellishments to change ticks, margins and spines:

import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator
from matplotlib.cm import ScalarMappable
import numpy as np

x = np.array([3, 5, 12, 7, 8, 6, 4, 6])
y = np.array([4, 6, 8, 2, 4, 5, 8, 7])

values, counts = np.unique(x, return_counts=True)
cmap = plt.get_cmap('inferno')
norm = plt.Normalize(0, y.max())  # or plt.Normalize(y.min(), y.max())
colors = [cmap(norm(y[x == val].mean())) for val in values]
fig, ax = plt.subplots()
ax.bar(values, counts, color=colors, edgecolor='black')
ax.yaxis.set_major_locator(MultipleLocator(1))
ax.xaxis.set_major_locator(MultipleLocator(1))
ax.set_ylabel('Count')
ax.margins(x=0.02, y=0)
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
plt.colorbar(ScalarMappable(cmap=cmap, norm=norm), pad=0.02, ax=ax)
plt.show()

[Figure: histogram with colors from other array]

Here is another example, using the tips dataset from seaborn, with the rounded total_bill on the x-axis, the count on the y-axis, and the bars colored by the mean tip for each rounded value.

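import seaborn as sns

tips = sns.load_dataset('tips')
x = np.round(tips['total_bill'])
y = np.array(tips['tip'])

values, counts = np.unique(x, return_counts=True)
cmap = plt.get_cmap('turbo')
# Remaining steps, assumed to mirror the first example (np, plt and ScalarMappable are imported there)
norm = plt.Normalize(y.min(), y.max())
colors = [cmap(norm(y[x == val].mean())) for val in values]
fig, ax = plt.subplots()
ax.bar(values, counts, color=colors, edgecolor='black')
plt.colorbar(ScalarMappable(cmap=cmap, norm=norm), ax=ax, label='mean tip')
plt.show()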

[Figure: histogram with total_bill on x colored by tip]
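Rounding total_bill, as in the tips example, effectively puts the data into bins of width 1. For arbitrary bin edges, one possible variation (a sketch, not part of the approach above) is to call np.histogram twice, once plain and once with y as weights, and divide the two results to get the mean of y per bin. The bin edges below are made up for illustration.

```python
import numpy as np

x = np.array([3, 5, 12, 7, 8, 6, 4, 6])
y = np.array([4, 6, 8, 2, 4, 5, 8, 7])

bins = np.array([2, 5, 8, 11, 14])  # hypothetical bin edges, chosen for illustration

# number of x values per bin
counts, edges = np.histogram(x, bins=bins)
# sum of the corresponding y values per bin
sums, _ = np.histogram(x, bins=bins, weights=y.astype(float))

# mean of y per bin; bins without any x value are left as NaN
means = np.divide(sums, counts, out=np.full(len(counts), np.nan), where=counts > 0)
print(counts)  # [2 4 1 1]
print(means)   # [6. 5. 4. 8.]
```

The means can then be turned into colors with cmap(norm(means)), and the bars drawn with ax.bar(edges[:-1], counts, width=np.diff(edges), align='edge', color=...).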

PS: As mentioned in @Arne’s answer, seaborn’s hue can replace the manual norm and color assignment. Without embellishments, the code would look like:

import numpy as np
import seaborn as sns

x = np.array([3, 5, 12, 7, 8, 6, 4, 6])
y = np.array([4, 6, 8, 2, 4, 5, 8, 7])

values, counts = np.unique(x, return_counts=True)
sns.set_style('darkgrid')
ax = sns.barplot(x=values, y=counts, hue=[y[x == val].mean() for val in values],
                 palette='inferno', dodge=False)

[Figure: seaborn barplot]