XGBoost Regressor cannot fit the model using string data

Question

I'm trying to use XGBoost to predict a single-target (one attribute) dataframe. Below is my code. I run it on Colab. However, the following error is returned: If I change the last line to I get this error: What am I doing wrong? Any clue?

Accepted Answer

XGBoost does not accept raw string columns, so categorical variables need to be encoded as numbers before they are passed to the model. There are many ways to encode your variables, depending on the nature of the categorical variable. Since I believe your strings have some order, label encoding is suited to your categorical variables.

Full code:

    import xgboost as xgb
    import pandas as pd
    import numpy as np
    from sklearn.metrics import mean_squared_error
    from sklearn.preprocessing import LabelEncoder
    from sklearn.model_selection import train_test_split

    data = [['sp37n1sy1bmjc6yp3m7wqefpz'],
            ['sp36vfqtjv87pvw68zdmhnvxb'],
            ['sp36y965ksqnmq0b0b58y1p00'],
            ['sp36y965ksqnmq0b0b58y1p00'],
            ['sp36y965ksqnmq0b0b58y1p00'],
            ['sp36y965ksqnmq0b0b58y1p00'],
            ['sp36vues2ed9r6s196dmv4p00'],
            ['sp36vvgq6rq9sq1gv0nt19h20'],
            ['sp36ypgx7jmmsuujz2ww81n20'],
            ['sp37n1w451m6wtp6h4eq0wjb0'],
            ['sp36y99s6w9jm3614ugt52bpz'],
            ['sp37n1mywgv57qsg5r7hp7bpz'],
            ['sp36y9fbfz4t9c5znp27z3pbp']]
    df = pd.DataFrame(data)

    # Fit the encoder once on every string so that X and y share one mapping
    le = LabelEncoder()
    codes = le.fit_transform(df[0])

    # Each value is used to predict the next value in the sequence
    X = codes[:-1].reshape(-1, 1)  # convert to 2D
    y = codes[1:]

    X_train, X_test, y_train, y_test = train_test_split(X, y)

    regressor = xgb.XGBRegressor(
        n_estimators=100,
        reg_lambda=1,
        gamma=0,
        max_depth=3
    )
    regressor.fit(X_train, y_train)

    y_pred = regressor.predict(X_test)
    # Round to the nearest label code and clip to the range the encoder knows
    y_predictions = np.clip(np.rint(y_pred), 0, len(le.classes_) - 1).astype(int)

    print("Encoded Predictions", y_predictions)  # encoded predictions
    print("String predictions", le.inverse_transform(y_predictions))  # original string predictions
    print()
    print("Encoded Actual value", y_test)  # encoded
    print("String Actual value", le.inverse_transform(y_test))  # original test values
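To make the encoding step concrete, here is a minimal pure-Python sketch of what label encoding does: each distinct string gets an integer code based on its sorted position, which is also how scikit-learn's LabelEncoder orders its classes. The shortened sample strings and the helper names `encode` and `decode` are assumptions made for illustration; they are not part of the question's data or of any library.

```python
# Hypothetical, shortened sample values (not the question's real data).
values = ["sp36y", "sp37n", "sp36v", "sp36y"]

# Build the mapping once, over every value that can appear in X or y,
# so that features and targets share the same codes.
classes = sorted(set(values))
to_code = {label: code for code, label in enumerate(classes)}


def encode(labels):
    """Map each string to its integer code."""
    return [to_code[label] for label in labels]


def decode(codes):
    """Map (possibly fractional) regression outputs back to strings.

    Each output is rounded to the nearest code and clamped to the valid
    range, because a regressor may predict values outside it.
    """
    last = len(classes) - 1
    return [classes[min(max(int(round(c)), 0), last)] for c in codes]


X = encode(values[:-1])  # each value ...
y = encode(values[1:])   # ... paired with the value that follows it
print(X, y)
print(decode([0.2, 1.7, 5.0]))
```

The clamping in `decode` matters with a regressor: a rounded prediction that falls outside the known codes has no string to map back to, so decoding it without clamping would raise an error.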

Answer