I can plot a histogram in Python, for example with matplotlib:

    from matplotlib import pyplot as plt

    x = [3, 5, 12, 7, 8, 6, 4, 6]
    plt.hist(x)

However, I have a second array y = [4,6,8,2,4,5,8,7], where each value corresponds to the value at the same position of x. Now I would like to create a histogram where each bar's height is defined by x, but each bar's color is defined by the values in y that belong to its x values. You could also say I have tuples as in list(zip(x,y)), where the first value should be used for the histogram itself and the mean of the second tuple value in each bin should determine the color.

np.unique(x, return_counts=True) returns two arrays: the sorted unique values of x and the count of each.
After converting everything to numpy arrays, y[x == val] selects the subset of y at each position where x is equal to val, and y[x == val].mean() gets the mean of those values. Calling cmap(norm(...)) gives the color corresponding to that value. The same norm can be used to create a colorbar.
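
To make the intermediate values concrete, here is a minimal numpy-only sketch of the steps just described, using the arrays from the question. The manual scaling line is an assumption that stands in for plt.Normalize(0, y.max()), so that the snippet runs without matplotlib.

```python
import numpy as np

x = np.array([3, 5, 12, 7, 8, 6, 4, 6])
y = np.array([4, 6, 8, 2, 4, 5, 8, 7])

# Sorted unique values of x and how often each one occurs
values, counts = np.unique(x, return_counts=True)

# For each unique x value, the mean of the y values at the matching positions
means = np.array([y[x == val].mean() for val in values])

# Stand-in for plt.Normalize(0, y.max()): scale the means to the range [0, 1]
scaled = means / y.max()

for val, cnt, mean, s in zip(values, counts, means, scaled):
    print(f"x={val}: count={cnt}, mean y={mean}, normalized={s}")
```

The value 6 occurs twice in x, so its bar has height 2, and its color comes from the mean of the two matching y values (5 and 7).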
Here is some example code, including embellishments to change ticks, margins and spines:

    import matplotlib.pyplot as plt
    from matplotlib.ticker import MultipleLocator
    from matplotlib.cm import ScalarMappable
    import numpy as np

    x = np.array([3, 5, 12, 7, 8, 6, 4, 6])
    y = np.array([4, 6, 8, 2, 4, 5, 8, 7])

    values, counts = np.unique(x, return_counts=True)

    cmap = plt.get_cmap('inferno')
    norm = plt.Normalize(0, y.max())  # or plt.Normalize(y.min(), y.max())
    colors = [cmap(norm(y[x == val].mean())) for val in values]

    fig, ax = plt.subplots()
    ax.bar(values, counts, color=colors, edgecolor='black')
    ax.yaxis.set_major_locator(MultipleLocator(1))
    ax.xaxis.set_major_locator(MultipleLocator(1))
    ax.set_ylabel('Count')
    ax.margins(x=0.02, y=0)
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    plt.colorbar(ScalarMappable(cmap=cmap, norm=norm), pad=0.02, ax=ax)
    plt.show()

Here is another example, using the tips dataset from seaborn, with the rounded total_bill on the x-axis, the count on the y-axis, and the bars colored via the tip amount.

    import seaborn as sns

    tips = sns.load_dataset('tips')
    x = np.round(tips['total_bill'])
    y = np.array(tips['tip'])

    values, counts = np.unique(x, return_counts=True)
    cmap = plt.get_cmap('turbo')

PS: As mentioned in @Arne's answer, seaborn's hue parameter can replace the explicit norm and color assignment. Without embellishments, the code would look like:

    import numpy as np
    import seaborn as sns

    x = np.array([3, 5, 12, 7, 8, 6, 4, 6])
    y = np.array([4, 6, 8, 2, 4, 5, 8, 7])

    values, counts = np.unique(x, return_counts=True)

    sns.set_style('darkgrid')
    ax = sns.barplot(x=values, y=counts,
                     hue=[y[x == val].mean() for val in values],
                     palette='inferno', dodge=False)
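
The question speaks of the mean per bin, while the code above uses one bar per unique value. For data where real bins are wanted, one possible sketch is to call np.histogram twice, once plain and once weighted by y, and divide the two results. The bin edges below are hypothetical, picked only for illustration, and this variant is an assumption on my part rather than something taken from the original code.

```python
import numpy as np

x = np.array([3, 5, 12, 7, 8, 6, 4, 6])
y = np.array([4, 6, 8, 2, 4, 5, 8, 7])

# Hypothetical bin edges: [2, 5), [5, 8), [8, 11), [11, 14]
edges = np.array([2, 5, 8, 11, 14])

# Number of x values per bin (the bar heights)
counts, _ = np.histogram(x, bins=edges)

# Sum of the y values whose x falls into each bin
sums, _ = np.histogram(x, bins=edges, weights=y.astype(float))

# Mean of y per bin; bins without any x value stay NaN
means = np.full(len(counts), np.nan)
np.divide(sums, counts, out=means, where=counts > 0)

print(counts)  # bar heights
print(means)   # values to feed into cmap(norm(...))
```

The bars could then be drawn with something like ax.bar(edges[:-1], counts, width=np.diff(edges), align='edge', color=cmap(norm(means))), reusing cmap and norm from the matplotlib example.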