I’m trying to create a scatter plot in which the points can be distinguished in two ways:
By color. For example, if the value is negative the color is red, and if the value is positive the color is blue.
By marker size. For example, if the value is between -0.20 and 0 the size is 100, if the value is between 0 and 0.1 the size is 200, if the value is between 0.5 and 1 the size is 300, and so on.
Here is some of the data I’m working on (just in case):
0   0.15
1   0.04
2   0.02
3   0.01
4  -0.03
5  -0.07
6  -0.25
7  -0.27
8  -0.30
I have tried the following:
fig = plt.figure(figsize=(15, 8))
ax = fig.add_subplot(1, 1, 1)
res = np.genfromtxt(os.path.join(folder, 'residuals.csv'), delimiter=',', names=True)
for name in res.dtype.names[1:]:
    plt.scatter(res.x, res.y, s=200, c=res.residual, cmap='jet')
That works, but it only distinguishes my data by color. The marker size is the same for every point and I can’t tell which values are negative and which are positive, which is why I’m looking for the two conditions mentioned above.
Any help is much appreciated!
Answer
Seaborn’s scatterplot supports both coloring and sizing the points by a variable. Here is how it could look with your type of data.
import matplotlib.pyplot as plt
from matplotlib.colors import TwoSlopeNorm
import numpy as np
import seaborn as sns

res = np.array([0.15, 0.04, 0.02, 0.01, -0.03, -0.07, -0.25, -0.27, -0.30])
x = np.arange(len(res))
y = np.ones(len(res))
hue_norm = TwoSlopeNorm(vcenter=0)  # to make sure the center color goes to zero
ax = sns.scatterplot(x=x, y=y, hue=res, hue_norm=hue_norm,
                     size=res, sizes=(100, 300), palette='Spectral')
for xi, resi in zip(x, res):
    ax.text(xi, 1.1, f'{resi:.2f}', ha='center', va='bottom')
ax.set_ylim(0.75, 2)
ax.set_yticks([])
plt.show()
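If you would rather have the discrete rules from the question (red/blue by sign, a fixed size per value range) than a continuous scale, one possible approach is to compute a color and a size for each point yourself and pass them to matplotlib's `scatter`. The following is only a sketch: the names `SIZE_EDGES`, `SIZE_VALUES`, `color_for`, `size_for` and `plot_residuals` are made up for illustration, and the bin edges are an assumption, because the ranges listed in the question leave gaps. Zero is treated as positive here, which is also an assumption.

```python
from bisect import bisect_right

# Hypothetical bin edges and marker sizes; adjust them to your own ranges.
# Values below -0.2 get 50, [-0.2, 0) gets 100, [0, 0.1) gets 200, the rest 300.
SIZE_EDGES = [-0.2, 0.0, 0.1]
SIZE_VALUES = [50, 100, 200, 300]  # always one more entry than SIZE_EDGES


def color_for(value):
    """Red for negative values, blue otherwise (zero counts as positive here)."""
    return "red" if value < 0 else "blue"


def size_for(value):
    """Return the marker size of the bin that contains value."""
    return SIZE_VALUES[bisect_right(SIZE_EDGES, value)]


def plot_residuals(residuals):
    """Draw the residuals in one row, colored by sign and sized by bin."""
    # Imported here so the two helpers above can be used without matplotlib.
    import matplotlib.pyplot as plt

    x = list(range(len(residuals)))
    y = [1] * len(residuals)
    fig, ax = plt.subplots(figsize=(15, 8))
    ax.scatter(x, y,
               s=[size_for(r) for r in residuals],
               c=[color_for(r) for r in residuals])
    ax.set_yticks([])
    return ax
```

Calling `plot_residuals(res)` with the list of residuals and then `plt.show()` should draw the points. With your own x and y columns you would pass those to `scatter` instead of the placeholder row used here.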