This code manually selects a column from the y table and then joins it to the X table. The program then performs linear regression. Any idea how to do this for every single column from the y table?
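import pandas as pd
from sklearn import linear_model

# Load the targets and drop the non-numeric date column
yDF = pd.read_csv('ytable.csv')
yDF.drop('Dates', axis = 1, inplace = True)
XDF = pd.read_csv('Xtable.csv')

# Manually pick the first y column and append it to the X table
ycolumnDF = yDF.iloc[:,0].to_frame()
regressionDF = pd.concat([XDF,ycolumnDF], axis=1)

# Columns 1-19 are the predictors, column 20 is the selected target
X = regressionDF.iloc[:,1:20]
y = regressionDF.iloc[:,20:].squeeze()

lm = linear_model.LinearRegression()
lm.fit(X,y)
cf = lm.coef_
print(cf)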
Answer
You can regress multiple y's on the same X's in a single fit: scikit-learn's LinearRegression accepts a two-dimensional y with one column per target. Something like this should work:
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Random example data: 10 rows, 3 predictors, 2 targets
df_X = pd.DataFrame(columns = ['x1','x2','x3'], data = np.random.normal(size = (10,3)))
df_y = pd.DataFrame(columns = ['y1','y2'], data = np.random.normal(size = (10,2)))

X = df_X.iloc[:,:]
y = df_y.iloc[:,:]

# Fit all targets at once; coef_ has one row per y column
lm = LinearRegression().fit(X,y)
print(lm.coef_)
produces
[[ 0.16115884  0.08471495  0.39169592]
 [-0.51929011  0.29160846 -0.62106353]]
The first row here ([ 0.16115884 0.08471495 0.39169592]) holds the regression coefficients of y1 on the xs, and the second row holds the regression coefficients of y2 on the xs.
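To make it easier to see which row belongs to which y column, the coefficient array can be wrapped in a labeled DataFrame. The sketch below is only an illustration on random data: the names rng, coef_table and loop_table are made up here, and a fixed seed stands in for the real tables. It also runs the explicit one-column-at-a-time loop from the question, to check that both approaches give the same coefficients.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Assumption: random stand-in data; in the question these would be the
# predictor columns of XDF and the columns of yDF (after dropping 'Dates').
rng = np.random.default_rng(0)
df_X = pd.DataFrame(rng.normal(size=(10, 3)), columns=["x1", "x2", "x3"])
df_y = pd.DataFrame(rng.normal(size=(10, 2)), columns=["y1", "y2"])

# Single fit for all targets; coef_ has shape (n_targets, n_features).
lm = LinearRegression().fit(df_X, df_y)
coef_table = pd.DataFrame(lm.coef_, index=df_y.columns, columns=df_X.columns)

# Explicit alternative: one separate regression per y column.
loop_coefs = [LinearRegression().fit(df_X, df_y[name]).coef_
              for name in df_y.columns]
loop_table = pd.DataFrame(np.vstack(loop_coefs),
                          index=df_y.columns, columns=df_X.columns)

print(coef_table)
# Both routes solve the same least-squares problem for each target.
print(np.allclose(coef_table.to_numpy(), loop_table.to_numpy()))
```

With the tables from the question, something like X = XDF.iloc[:,1:20] and y = yDF should slot in the same way, assuming both tables have the same rows in the same order and contain no missing values.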