I have a big dataframe containing 30 samples, with one measurement taken every 6 seconds over several days. It looks something like this:
DATE_TIME | SAMPLE | VALUE |
---|---|---|
2020-12-10 10:52:48 | 1 | 3.22 |
2020-12-10 10:52:54 | 2 | 2.93 |
2020-12-10 10:53:00 | 3 | 2.27 |
… | … | … |
2020-12-10 16:27:13 | 1 | 1.66 |
2020-12-10 16:27:19 | 2 | 1.15 |
2020-12-10 16:27:25 | 3 | 1.23 |
I want to plot the time series for each individual sample (multiple line chart). I tried:
```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
all_data = pd.read_csv("data.csv")

time_df=pd.DataFrame({'x':all_data['DATE_TIME'],'y1':all_data['SAMPLE']==1,'y2':all_data['SAMPLE']==2})
plt.plot('x','y1', data=time_df, marker= 'o',markerfacecolor='blue', markersize=1, color='skyblue', linewidth=4)
plt.plot('x','y2', data=time_df, marker= 'o',markerfacecolor='green', markersize=1, color='skyblue', linewidth=4)
plt.show()
```
But it's not working: I get a strange figure.
I also tried making individual dataframes for the samples and it works but I’m sure there must be a more efficient way to do this.
```python
import plotly.graph_objects as go

# One dataframe per sample
SAMPLE1_df = all_data.loc[all_data["SAMPLE"] == 1]
SAMPLE2_df = all_data.loc[all_data["SAMPLE"] == 2]

fig = go.Figure()
fig.add_trace(go.Scatter(x=SAMPLE1_df["DATE_TIME"], y=SAMPLE1_df["VALUE"], mode='lines', name="SAMPLE1"))
fig.add_trace(go.Scatter(x=SAMPLE2_df["DATE_TIME"], y=SAMPLE2_df["VALUE"], mode='lines', name="SAMPLE2"))
fig.show()
```
Answer
If you plot a dataframe that has one column per sample, you get the desired result, because pandas draws one line per column. You can reshape your dataframe into that form with `groupby` or `set_index`:
```python
all_data.groupby(["DATE_TIME", "SAMPLE"])["VALUE"].mean().unstack("SAMPLE").interpolate(method='linear').plot()
```
or, if no (DATE_TIME, SAMPLE) pair is duplicated:
```python
all_data.set_index(["DATE_TIME", "SAMPLE"])["VALUE"].unstack("SAMPLE").interpolate(method='linear').plot()
```
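To see what the reshape does before plotting, here is a minimal sketch on made-up data. The frame `long_df` and the helpers `to_wide` and `plot_wide` are hypothetical names used only for illustration; the values are invented and stand in for `data.csv`. It assumes `DATE_TIME` holds real datetimes; when reading from CSV you would probably want `pd.read_csv("data.csv", parse_dates=["DATE_TIME"])` so the x-axis is a proper time axis rather than strings.

```python
import pandas as pd

# Hypothetical stand-in for data.csv: three samples read in rotation,
# one reading every 6 seconds (values are made up for illustration).
long_df = pd.DataFrame({
    "DATE_TIME": pd.date_range("2020-12-10 10:52:48", periods=9,
                               freq=pd.Timedelta(seconds=6)),
    "SAMPLE": [1, 2, 3] * 3,
    "VALUE": [3.22, 2.93, 2.27, 3.10, 2.80, 2.20, 2.95, 2.71, 2.15],
})


def to_wide(df):
    """Reshape long data into one column per sample, indexed by time."""
    wide = (
        df.groupby(["DATE_TIME", "SAMPLE"])["VALUE"]
        .mean()              # collapses duplicate (time, sample) rows
        .unstack("SAMPLE")   # SAMPLE values become column labels
    )
    # Each timestamp holds a reading for only one sample, so the other
    # columns are NaN there. interpolate fills the gaps after a sample's
    # first reading; NaN before that first reading is left as-is.
    return wide.interpolate(method="linear")


wide = to_wide(long_df)
print(wide)


def plot_wide(wide_df):
    """Draw one line per sample column (requires matplotlib)."""
    import matplotlib.pyplot as plt

    ax = wide_df.plot()
    ax.set_xlabel("DATE_TIME")
    ax.set_ylabel("VALUE")
    plt.show()
```

Calling `plot_wide(wide)` draws the chart. Printing `wide` first makes it clear why `interpolate` is needed: without it, every column is mostly NaN and the lines would be drawn as disconnected points.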