How to create a boxplot not showing the outliers using Python and Plotly?

I have a list of points that I use to create a box plot. The data has many outliers, and the resulting range is too large for the box plots to be comparable.

I just don’t want to show the outliers in this list on the box plot at all.

  1. Is there a way to not show outliers in the box plot?

If not: I tried removing the outliers from the data before plotting it, but Plotly then marks some of the points I did not remove as outliers.

  2. Is there a way to create a box plot where none of the elements are considered outliers?

Answer

Andrew from Plotly here.

  1. You can’t selectively hide part of the data in the array. What you can do is set boxpoints: "all" to get a jitter of all the points, including the outliers. This leaves the box plot as-is, without outliers sitting on top of it. I’m guessing this isn’t really what you want, though.

  2. To keep any points from being drawn as outliers, set boxpoints: false. Only the box and whiskers are drawn, with no sample points.

So in Python, something like this should work:

# plotly >= 4: the plotly.plotly module used originally moved to the chart_studio package
import plotly.graph_objects as go

y = [1, 2, 3, 2, 1, 10]

fig = go.Figure()
# Default: points beyond the whiskers are drawn as outliers
fig.add_trace(go.Box(y=y, name='default'))
# boxpoints=False: only the box and whiskers, no sample points
fig.add_trace(go.Box(y=y, boxpoints=False, name='no outliers'))
# boxpoints='all': every sample point is drawn, jittered, alongside the box
fig.add_trace(go.Box(y=y, boxpoints='all', name='jitter boxpoints'))

fig.update_layout(title='Comparing boxplot "boxpoints" settings')

# Render locally; the original uploaded the figure with py.iplot(fig, filename='Stack Overflow 31497537')
fig.show()

Here’s the resulting figure for that:

https://plot.ly/~theengineear/4936/comparing-boxplot-boxpoints-settings/

Here’s a link to box plot tutorials in general with Plotly:

http://help.plot.ly/make-a-box-plot/
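
As for why removing the outliers beforehand still leaves some points flagged: outliers are judged relative to the quartiles of whatever data is plotted, so trimming the extremes shifts the quartiles and can push previously "normal" points outside the new whiskers. Here is a minimal sketch of that effect, assuming the common 1.5 × IQR rule and the standard library's "inclusive" quartile method (the exact quartile method Plotly applies may differ). The helper iqr_outliers and the sample data are hypothetical, for illustration only.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical sample: two extreme values and one moderately large value
data = [4, 5, 5, 5, 6, 6, 6, 7, 10, 30, 40]

# First pass: only the two extreme values fall outside the fences
first_pass = iqr_outliers(data)

# Remove them and check again: the quartiles have moved, so the fences
# are tighter and a value that was fine before is now outside them
trimmed = [v for v in data if v not in first_pass]
second_pass = iqr_outliers(trimmed)

print(first_pass, second_pass)
```

This is why setting boxpoints to false is the more reliable route: it does not depend on how many rounds of trimming the data has been through.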

User contributions licensed under: CC BY-SA