Python: "y should be a 1d array, got an array of shape {} instead.".format(shape)

Question

The above is my code, which I tried in Google Colab. It shows one error, and the error points to one line of the code. Please help me solve this error. I am a beginner, so please explain the answer in detail.

Accepted Answer

Your problem is that the outputs of train_test_split are ordered differently than you think. train_test_split returns the split of the first argument first, then the split of the second argument. So instead you should use it like

    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.5, random_state=0)

You can find more information and a few examples in the documentation.

You can resolve issues like that by inspecting the shapes of the values of your variables. Either use a debugger or print their shapes:

    import numpy as np
    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB

    data = np.random.rand(100, 5)  # some test data
    df = pd.DataFrame(data)
    x = df.values[:, :-1]  # you probably don't want to include the last column here?
    y = df.values[:, -1]  # -1 does the same as df.shape[1]-1

    print(f"x shape: {x.shape}")  # (100, 4)
    print(f"y shape: {y.shape}")  # (100,)  ==> 1d, fine

    # wrong unpacking order, kept on purpose to reproduce the error
    x_train, y_train, x_test, y_test = train_test_split(x, y, test_size=0.5, random_state=0)

    print(f"x_train shape: {x_train.shape}")  # (50, 4)
    print(f"y_train shape: {y_train.shape}")  # (50, 4)  ==> 2d, so something is wrong
    print(f"x_test shape: {x_test.shape}")  # (50,)  ==> 1d, also bad
    print(f"y_test shape: {y_test.shape}")  # (50,)

    gnb = GaussianNB()
    y_pred = gnb.fit(x_train, y_train).predict(x_test)  # error: y should be 1d ...

Now you can see why the error is raised and where things go wrong: y_train actually holds the second half of x, and x_test holds the first half of y. Then you can look up the documentation of the last command that produced unexpected output.
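For comparison, here is a minimal sketch of the same check with the corrected unpacking order. The data is made up, and the integer class labels are my assumption (GaussianNB is a classifier, so it expects discrete labels); your own x and y would take their place. The assert lines are only one possible way to catch a wrong order early; they are not part of scikit-learn.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
x = rng.random((100, 4))           # made-up data: 100 samples, 4 features
y = rng.integers(0, 2, size=100)   # made-up binary class labels, 1d

# Return order: both parts of x first, then both parts of y.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.5, random_state=0
)

# Fail early with a readable message if the unpacking order is wrong.
assert x_train.ndim == 2 and x_test.ndim == 2, "features should be 2d"
assert y_train.ndim == 1 and y_test.ndim == 1, "labels should be 1d"
assert len(x_train) == len(y_train), "train features and labels differ in length"
assert len(x_test) == len(y_test), "test features and labels differ in length"

gnb = GaussianNB()
y_pred = gnb.fit(x_train, y_train).predict(x_test)
print(f"y_pred shape: {y_pred.shape}")  # (50,)
```

With this order, fit receives a 2d feature array and a 1d label array, so the "y should be a 1d array" error should no longer appear.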
