How to plot a trend line in a scatter plot from multiple dataframe data?

I have four data frames: df, df1, df2 and df3. I have plotted their data in a scatter plot, taking x from df and y from df1, df2 and df3.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

plt.figure()
x1 = df['x'].loc['2020-01-01 00:00':'2020-01-01 00:59']
y1 = df1['y']
plt.scatter(x1, y1)

x2 = df['x'].loc['2020-01-01 01:00':'2020-01-01 01:59']
y2 = df2['y']
plt.scatter(x2, y2)

x3 = df['x'].loc['2020-01-01 02:00':'2020-01-01 02:59']
y3 = df3['y']
plt.scatter(x3, y3)

To plot a trend line, I use the code below, which I found on Stack Overflow.

z = np.polyfit(x1, y1, 1)
p = np.poly1d(z)
plt.plot(x1, p(x1))
plt.show()

However, this only gives the trend line for the df1 data. I would like to plot a single trend line that includes the data from df1, df2 and df3.

Answer

The trend line uses only the df1 data because x1 and y1 are the only series you pass to np.polyfit. Combine all three sets of series before fitting, something like:

x_full = pd.concat([x1, x2, x3])
y_full = pd.concat([y1, y2, y3])

z = np.polyfit(x_full, y_full , 1)
p = np.poly1d(z)
plt.plot(x_full, p(x_full))
plt.show()

Please check the documentation on how series such as y1, y2 and y3 are combined. Note that Series.append was removed in pandas 2.0; pd.concat is its replacement: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html
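
As a rough, self-contained sketch of the idea above: the snippet below builds synthetic stand-ins for the three slices (the data, the seed, and the helper name combined_trend are assumptions for illustration, not part of the question), concatenates them, and fits one degree-1 polynomial to everything. It assumes the x values are numeric and that each x series has the same length as its y series.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the three hourly slices (assumed data, not from the question).
rng = np.random.default_rng(0)
x1 = pd.Series(np.arange(0, 10, dtype=float))
x2 = pd.Series(np.arange(10, 20, dtype=float))
x3 = pd.Series(np.arange(20, 30, dtype=float))
y1 = 2.0 * x1 + 1.0 + rng.normal(0, 0.5, len(x1))
y2 = 2.0 * x2 + 1.0 + rng.normal(0, 0.5, len(x2))
y3 = 2.0 * x3 + 1.0 + rng.normal(0, 0.5, len(x3))


def combined_trend(xs, ys, degree=1):
    """Fit one polynomial to all (x, y) pairs from several series.

    Hypothetical helper: xs and ys are lists of pandas Series, paired by position.
    """
    # ignore_index=True discards the original labels so rows pair up by position.
    x_full = pd.concat(xs, ignore_index=True)
    y_full = pd.concat(ys, ignore_index=True)
    coeffs = np.polyfit(x_full.to_numpy(), y_full.to_numpy(), degree)
    return x_full, np.poly1d(coeffs)


x_full, p = combined_trend([x1, x2, x3], [y1, y2, y3])
slope, intercept = p.coeffs

# Two endpoints are enough to draw a straight trend line.
x_line = np.array([x_full.min(), x_full.max()])
y_line = p(x_line)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

To draw the result on top of the three scatter plots, something like plt.plot(x_line, y_line) followed by plt.show() should work, with plt being matplotlib.pyplot as in the question.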
