Use Bokeh RadioGroup to plot selected subset of Pandas DataFrame within Jupyter

Goal

Plot subsets of rows in a Pandas DataFrame by selecting a specific value of a column.

Ideally, plot it in a Jupyter notebook.

What I did

I have minimal knowledge of JavaScript, so I got the plot working by running a Bokeh server with everything written in Python.

However, I couldn't get it to work in a Jupyter notebook with a JavaScript callback. My approach is probably naive: I split the DataFrame into subsets by the values of a column and store them in a dict, so that a subset can be picked by the active selection of a RadioGroup.

This is my code example:


import pandas as pd
import bokeh
from bokeh.io import output_notebook, show
import bokeh.plotting as bp
import bokeh.models as bm
from bokeh.layouts import column, row

data = {
    'Datetime': ['2020-04-09T10:23:38Z', '2020-04-09T22:23:38Z','2020-04-09T23:23:38Z', '2020-01-09T10:23:38Z', '2020-01-09T22:23:38Z', '2020-01-09T23:23:38Z'],
    'Month': ['Apr', 'Apr', 'Apr', 'Jan', 'Jan', 'Jan'],
    'Values': [1.2, 1.3, 1.5, 1.1, 3, 1.3]
}

df = pd.DataFrame.from_dict(data)
month_list = df['Month'].unique().tolist()

plot_height = 600
plot_width = 1000
col2plot = 'Values'

month_dict = {}
for m in month_list:
    subset = df[df['Month'] == m].reset_index(drop=True)
    month_dict[m] = subset[['Datetime', col2plot]].to_dict()

p1 = bp.figure(
    plot_height=plot_height, 
    plot_width=plot_width, 
    title='Values',
    toolbar_location=None,
    tools="hover",
    tooltips=[("DateTime", "@Datetime")]
)

src = bm.ColumnDataSource(df[df['Month'] == 'Jan'].reset_index(drop=True))
p1.line(x='index', y=col2plot, alpha=0.8, source=src)

month_selector = bm.widgets.RadioGroup(labels=month_list, active=1)

jscode = """
var month = cb_obj.labels[cb_obj.active] //selected month
const new_data = source[month]
src.data = new_data
src.change.omit()
"""
callback = bm.CustomJS(args=dict(src=src, source=month_dict), code=jscode)
month_selector.js_on_change('active', callback)
output_notebook()
show(row(p1, month_selector))

The code runs, but selecting a month doesn't update the plot. This is probably due to a mistake in my JS callback. Any ideas for fixing this? Thanks a lot for your help!

Answer

Issues with your code:

  • In p1.line, you're using the index column as x. But after you call pd.DataFrame.to_dict(), that column is not there, because the index is not exported as a column. This can be fixed by adding yet another .reset_index() before .to_dict().
  • to_dict() returns the data as a dict of dicts by default, but ColumnDataSource needs a dict of lists. Replace the call with to_dict('list').
  • src.change.omit() contains a typo; it should be emit. But since you're replacing the whole data attribute instead of just changing some of the data, you can simply remove the line altogether.
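
Putting the three fixes together, the data preparation and the callback might look like the sketch below. It reuses the sample data from the question. The helper name build_month_dict is made up for illustration, and the JavaScript assumes an older Bokeh release (as the question's plot_height/plot_width arguments suggest), where a Python dict passed through args can be indexed with square brackets.

```python
import pandas as pd

data = {
    'Datetime': ['2020-04-09T10:23:38Z', '2020-04-09T22:23:38Z', '2020-04-09T23:23:38Z',
                 '2020-01-09T10:23:38Z', '2020-01-09T22:23:38Z', '2020-01-09T23:23:38Z'],
    'Month': ['Apr', 'Apr', 'Apr', 'Jan', 'Jan', 'Jan'],
    'Values': [1.2, 1.3, 1.5, 1.1, 3, 1.3],
}
df = pd.DataFrame(data)
col2plot = 'Values'


def build_month_dict(frame, value_col):
    """Hypothetical helper: map each month to a dict of lists for ColumnDataSource."""
    result = {}
    for month in frame['Month'].unique().tolist():
        subset = frame[frame['Month'] == month].reset_index(drop=True)
        # The second reset_index() turns the 0..n-1 index into an 'index' column,
        # which is the column that p1.line(x='index', ...) looks up.
        columns = subset[['Datetime', value_col]].reset_index()
        # orient='list' yields {column: [values]} rather than {column: {row: value}}.
        result[month] = columns.to_dict(orient='list')
    return result


month_dict = build_month_dict(df, col2plot)

# Callback with the emit line dropped: assigning src.data replaces the whole
# data attribute, so no explicit change signal is needed.
jscode = """
const month = cb_obj.labels[cb_obj.active];  // selected month
src.data = source[month];
"""
```

The rest of the question's code, that is, the CustomJS(args=dict(src=src, source=month_dict), code=jscode) call and month_selector.js_on_change('active', callback), can stay as it is.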