
Using columns of a Numpy array to create a displot

I’m trying to create a displot that shows a histogram of three different variables, each stored in a different column of a NumPy array. I want each column to appear as a separate subplot in the facet grid, but I can’t find a way to do this without converting my data to a dataframe. Almost all the examples of multi-plot displots I have found online use data that is already structured as a dataframe.

Answer

You could create a dataframe on-the-fly:

from matplotlib import pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd

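# Example data: 3 variables (one per row), 100 samples each
data = np.random.randn(3, 100)
# Long form: one row per value, labelled with the index of the array row it came from
sns.displot(data=pd.DataFrame({"data": data.ravel(),
                               "column": np.repeat(np.arange(data.shape[0]), data.shape[1])}),
            x="data", col="column", kde=True, color='blueviolet', height=3)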
plt.show()

[Figure: displot from numpy array]
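
The snippet above treats each row of the (3, 100) array as one variable. If the variables sit in columns instead, as described in the question, the same on-the-fly idea might look like the sketch below. The helper name `columns_to_long` and the sample shape (100, 3) are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd


def columns_to_long(arr):
    """Hypothetical helper: long-form frame from a 2D array whose columns are variables."""
    arr = np.asarray(arr)
    n_rows, n_cols = arr.shape
    return pd.DataFrame({
        # Column-major ravel keeps the values of each variable contiguous
        "data": arr.ravel(order="F"),
        # Label every value with the index of the array column it came from
        "column": np.repeat(np.arange(n_cols), n_rows),
    })


rng = np.random.default_rng(0)
samples = rng.standard_normal((100, 3))  # assumed layout: 100 samples, 3 variables
long_df = columns_to_long(samples)
```

The resulting frame can be passed to sns.displot with x="data" and col="column", exactly as in the call above.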

Here is an alternative approach which first creates a dataframe directly from the numpy array (pandas will call the columns 0, 1, 2, 3, 4). melt then creates the long form.

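# 200 samples of 5 variables; cumsum turns each column into a random walk
data = np.random.randn(200, 5).cumsum(axis=0)
# Wide frame with columns 0..4, then melt to long form (one row per value)
df = pd.DataFrame(data).melt(var_name='column', value_name='data')
sns.displot(data=df,
            x="data", col="column", kde=True, color='crimson', height=3,
            facet_kws={"sharey": False, "sharex": False})
plt.show()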

[Figure: dataframe from numpy array, then melted for displot]

PS: In case of a shape (5, 200) array, transposing the array (pd.DataFrame(data.T)) would create a similar dataframe.
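
As a rough check of the PS, the sketch below melts a transposed (5, 200) array and confirms that each group of the long frame matches one row of the original array. The random data and the variable names are placeholders, not part of the original answer.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
wide = rng.standard_normal((5, 200))  # assumed layout: 5 variables in rows

# Transpose so that each variable becomes a dataframe column, then melt to long form
df_t = pd.DataFrame(wide.T).melt(var_name="column", value_name="data")

# Each group in the long frame should hold exactly one row of the original array
rows_match = all(
    np.array_equal(df_t.loc[df_t["column"] == i, "data"].to_numpy(), wide[i])
    for i in range(wide.shape[0])
)
```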

User contributions licensed under: CC BY-SA