Subplotting of Pandas.DataFrameGroupBy[group_name] does not yield expected results

This is a re-opening of my initial question with the same title, which was closed as a duplicate. As none of the suggested duplicates helped me solve my problem, I am posting the question again.

I have a DataFrame with time series for several devices, read from an HDF file:

from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
from pandas import DataFrame


def open_dataset(file_name: str, name: str, combined_frame: DataFrame):

    data_set: DataFrame = pd.read_hdf(file_name, key=name)
    data_set['name'] = name
    combined_frame = pd.concat([combined_frame, data_set], axis=0)
    return combined_frame


if __name__ == '__main__':

    names = ['YRT1IN1E', 'YRT1LE1', 'YRT1MH1', 'YR08DT1ML']

    working_frame = DataFrame()

    for name in names:
        working_frame = open_dataset('data.h5', name, working_frame)

    grouped_frame = working_frame.groupby('name')


    fig, axs = plt.subplots(figsize=(10, 5),
                        nrows=4, ncols=1,  # fix as above
                        gridspec_kw=dict(hspace=0), sharex=True)

    axs = grouped_frame.get_group('YR08DT1ML').rawsum.plot()
    axs = grouped_frame.get_group('YRT1LE1').voltage.plot()
    axs = grouped_frame.get_group('YRT1MH1').current.plot()
    axs = grouped_frame.get_group('YRT1IN1E').current.plot()

    plt.show()

This produces the following output:

What am I doing wrong? I would like to have each of the plots in its own row, not all in one row.

The data file “data.h5” is available at: Google Drive

What I tried from the suggested posts:

The answer by joris (Mar 18, 2014 at 15:45) causes the code to go into an infinite loop; the data is never plotted:

fig, axs = plt.subplots(nrows=2, ncols=2)
grouped_frame.get_group('YR08DT1ML').rawsum.plot(ax=axs[0,0])
grouped_frame.get_group('YR...').rawsum.plot(ax=axs[0,1])
grouped_frame.get_group('YR...').rawsum.plot(ax=axs[1,0])
grouped_frame.get_group('YR...').rawsum.plot(ax=axs[1,1])

A variation leads to the same result as I described above:

axes[0,0] = grouped_frame.get_group('YR08DT1ML').rawsum.plot()
axes[0,1] = grouped_frame.get_group('YR...').rawsum.plot()
...

The infinite loop happens as well with the answer by sedeh (Jun 4, 2015 at 15:26):

grouped_frame.get_group('YR08DT1ML').rawsum.plot(subplots=True, layout=(1,2))
...

The infinite loop happens as well with the answer by Justice_Lords (Mar 15, 2019 at 7:26):

fig=plt.figure()
ax1=fig.add_subplot(4,1,1)
ax2=fig.add_subplot(4,1,2)
ax3=fig.add_subplot(4,1,3)
ax4=fig.add_subplot(4,1,4)

grouped_frame.get_group('YR08DT1ML').rawsum.plot(ax=ax1)
...

It seems to me that the problem is related to the fact that I plot from a pandas.DataFrameGroupBy and not from a pandas.DataFrame.
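A quick way to test that suspicion is to look at what `get_group` actually returns. The sketch below uses a small made-up frame (the values and the two-device layout are assumptions, not the contents of data.h5) and only checks the types involved:

```python
import pandas as pd

# Hypothetical stand-in for the combined HDF data; the values are made up.
frame = pd.DataFrame({
    'name': ['YRT1LE1', 'YRT1LE1', 'YRT1MH1', 'YRT1MH1'],
    'voltage': [1.0, 2.0, 3.0, 4.0],
})
grouped = frame.groupby('name')

# get_group returns the rows of one group as an ordinary DataFrame,
# and attribute access on it yields an ordinary Series.
group = grouped.get_group('YRT1LE1')
print(type(group).__name__)          # DataFrame
print(type(group.voltage).__name__)  # Series
```

If this holds for the real data as well, the `.plot()` calls above run on plain Series objects, so the groupby itself is probably not the cause.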

Answer

It seems that matplotlib was taking a long time to process the DatetimeIndex, which is why the earlier attempts looked like an infinite loop. Converting the index to time-of-day values and cleaning everything up did the trick:

from matplotlib import pyplot as plt
import pandas as pd

names = ['YR08DT1ML', 'YRT1LE1', 'YRT1MH1', 'YRT1IN1E']
df = pd.concat([pd.read_hdf('data.h5', name) for name in names])

df.reset_index(inplace=True)
df.index = df['time'].dt.time
df.sort_index(inplace=True)

fig, axes = plt.subplots(figsize=(10, 5), nrows=4, ncols=1, gridspec_kw=dict(hspace=0), sharex=True)

cols = ['rawsum', 'voltage', 'current', 'current']

for ix, name in enumerate(names):
    # Select this device's rows and its column, then draw on the matching subplot.
    df.loc[df['nomen'].eq(name), cols[ix]].plot(ax=axes[ix])

plt.show()

Hope this helps.
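For anyone without the data file, here is a minimal sketch of the same subplot pattern on made-up data. The random values, the integer index, and the 'name' column (as added in the question's `open_dataset`) are assumptions for illustration. The point is only that each series gets its own `ax=`. As far as I can tell, `Series.plot()` without `ax` draws on the current axes, which would explain why all four curves in the question ended up in one row.

```python
from matplotlib import pyplot as plt
import numpy as np
import pandas as pd

# Made-up data standing in for data.h5: one measured column per device.
names = ['YR08DT1ML', 'YRT1LE1', 'YRT1MH1', 'YRT1IN1E']
cols = ['rawsum', 'voltage', 'current', 'current']
rng = np.random.default_rng(0)
frames = [
    pd.DataFrame({'name': name, col: rng.normal(size=50)})
    for name, col in zip(names, cols)
]
df = pd.concat(frames)  # columns a device does not have are filled with NaN

fig, axes = plt.subplots(figsize=(10, 5), nrows=4, ncols=1,
                         gridspec_kw=dict(hspace=0), sharex=True)

for ax, name, col in zip(axes, names, cols):
    # Passing ax= explicitly puts each device on its own row.
    df.loc[df['name'].eq(name), col].plot(ax=ax)
    ax.set_ylabel(col)
```

Call `plt.show()` afterwards in an interactive session to display the figure.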
