I am doing some Logistic Regression homework. I am wondering whether, in some cases, the evaluation metrics for the test set can be slightly better than for the training set (as in my results below)? And if so, how large a gap is acceptable? Below are my evaluation results for the test set and training set, given that both sets are extracted from the
Tag: logistic-regression
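A minimal sketch of how such a train/test comparison is usually set up (the dataset and split here are illustrative, not the asker's actual data):

```python
# Hypothetical sketch: compare train vs. test metrics for a logistic regression.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

model = LogisticRegression(max_iter=5000).fit(X_train, y_train)

for name, X_, y_ in [("train", X_train, y_train), ("test", X_test, y_test)]:
    pred = model.predict(X_)
    print(name, "accuracy:", accuracy_score(y_, pred), "f1:", f1_score(y_, pred))

# A slightly higher test score can happen by chance with a small test set;
# a large, systematic gap in either direction is worth investigating.
```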
logistic regression and GridSearchCV using python sklearn
I am trying the code from this page. I ran it up to the LR (tf-idf) part and got similar results. After that I decided to try GridSearchCV. My questions are below: 1) I then calculated the f1 score manually; why does it not match? If I try scoring='precision', why does it give the error below? I am not clear mainly because I have
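I cannot reproduce the linked page, but a sketch of the usual pattern is below (toy data, not the tf-idf pipeline from the question). The mismatch typically comes from comparing GridSearchCV's best_score_, which is a mean cross-validation score, with an f1 computed on a separate hold-out set, and the scoring='precision' error usually means the target is multiclass so an averaged scorer is needed:

```python
# Illustrative sketch: GridSearchCV over logistic regression on toy data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import f1_score

X, y = make_classification(n_samples=500, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

grid = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1, 10]},
    scoring="f1",   # for multiclass targets use e.g. "f1_macro" / "precision_macro"
    cv=5,
)
grid.fit(X_train, y_train)

print("mean CV f1 (best_score_):", grid.best_score_)
print("f1 on hold-out test set: ", f1_score(y_test, grid.predict(X_test)))
# These two numbers measure different things, so they rarely match exactly.
```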
Cannot get L1 ratio in LogisticRegressionCV object
I am trying to fit an elastic net model using LogisticRegressionCV. I want to see what L1 ratio LogisticRegressionCV chooses after cross-validation. I read in its documentation that after fitting we can access it via the attribute l1_ratio_, but when I tried this, it failed. The code is: It returns: AttributeError: ‘LogisticRegressionCV’ object has no attribute ‘l1_ratio_’ Sklearn
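A sketch of the setup under which that attribute is populated (toy data): l1_ratio_ is only meaningful when penalty='elasticnet', solver='saga', and a list of candidate ratios is passed via l1_ratios.

```python
# Sketch: fit an elastic-net LogisticRegressionCV so that l1_ratio_ is set.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV

X, y = make_classification(n_samples=300, random_state=0)

clf = LogisticRegressionCV(
    Cs=5,
    penalty="elasticnet",
    solver="saga",                 # the only solver supporting elastic net
    l1_ratios=[0.1, 0.5, 0.9],     # without this list, no ratio is selected
    max_iter=5000,
    cv=5,
)
clf.fit(X, y)

print("chosen C:       ", clf.C_)
print("chosen l1 ratio:", clf.l1_ratio_)
```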
Do Machine Learning Algorithms read data top-down or bottom up?
I’m new to Machine Learning and I’m a bit confused about how data is read during the training/testing process. Assuming my data involves dates and I want the model to read the later dates first before getting to the earlier dates, the data is saved with the earliest date on line 1, and line n has
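For what it is worth, most estimators treat the rows as an unordered set, so row order does not affect the fit; order only matters if you explicitly do something time-based, such as a chronological train/test split. A small sketch with made-up column names:

```python
# Sketch: row order does not matter to a scikit-learn fit, but you can control
# it explicitly when doing a time-based train/test split.
import pandas as pd

df = pd.DataFrame({
    "date":  pd.to_datetime(["2020-01-01", "2020-06-01", "2021-01-01", "2021-06-01"]),
    "x":     [1.0, 2.0, 3.0, 4.0],
    "label": [0, 0, 1, 1],
})

df = df.sort_values("date")            # earliest first
split = int(len(df) * 0.75)            # e.g. train on older rows, test on newer ones
train, test = df.iloc[:split], df.iloc[split:]
print(len(train), "train rows,", len(test), "test rows")
```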
beta coefficients and p-values with Logistic Regression in Python
I would like to perform a simple logistic regression (1 dependent, 1 independent variable) in Python. All of the documentation I see about logistic regression in Python is about using it to develop a predictive model. I would like to use it more from the statistics side. How do I find the odds ratio, p-value, and confidence interval of a
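A sketch of the statsmodels route, which exposes the inferential output sklearn does not (variable names here are illustrative):

```python
# Sketch: logistic regression from the statistics side with statsmodels.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
df = pd.DataFrame({"x": rng.normal(size=200)})
df["y"] = (df["x"] + rng.normal(size=200) > 0).astype(int)

X = sm.add_constant(df[["x"]])         # intercept + single predictor
result = sm.Logit(df["y"], X).fit(disp=0)

print(result.summary())                # coefficients, std errors, p-values
print("odds ratios:\n", np.exp(result.params))
print("95% CI for odds ratios:\n", np.exp(result.conf_int()))
```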
Can I have too many features in a logistic regression?
I’m building a model to predict pedestrian casualties on the streets of New York from a data set of 1.7 million records. I decided to build dummy features out of the ON STREET NAME column, to see what predictive power that might provide. With that, I have approximately 7500 features. I tried running that, and I immediately got an alert
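One common way to keep that many dummies manageable is to keep the encoding sparse and use a regularised solver that handles sparse input. A sketch under those assumptions (the data is synthetic; only the ON STREET NAME column name comes from the question):

```python
# Sketch: sparse one-hot encoding of a high-cardinality column plus an
# L1-penalised logistic regression fitted with the saga solver.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

df = pd.DataFrame({
    "ON STREET NAME": ["BROADWAY", "5 AVENUE", "BROADWAY", "MAIN ST"] * 250,
    "casualty": [1, 0, 0, 1] * 250,
})

pre = ColumnTransformer(
    [("street", OneHotEncoder(handle_unknown="ignore"), ["ON STREET NAME"])]
)
# saga supports L1/L2 penalties on sparse matrices and scales to many features.
model = make_pipeline(
    pre, LogisticRegression(solver="saga", penalty="l1", C=1.0, max_iter=5000)
)
model.fit(df[["ON STREET NAME"]], df["casualty"])
print("non-zero coefficients:", (model[-1].coef_ != 0).sum())
```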
Interpreting logistic regression feature coefficient values in sklearn
I have fit a logistic regression model to my data. Imagine I have four features: 1) which condition the participant received, 2) whether the participant had any prior knowledge/background about the phenomenon tested (binary response in a post-experimental questionnaire), 3) time spent on the experimental task, and 4) participant age. I am trying to predict whether participants ultimately chose option A
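A sketch of how coefficients are usually read in this setting: pair the feature names with coef_ and exponentiate to get odds ratios, standardising continuous features first so the magnitudes are comparable. The feature names follow the question; the data is synthetic.

```python
# Sketch: pair sklearn coefficients with feature names and convert to odds ratios.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
X = pd.DataFrame({
    "condition": rng.integers(0, 2, 300),
    "prior_knowledge": rng.integers(0, 2, 300),
    "time_on_task": rng.normal(300, 60, 300),
    "age": rng.integers(18, 70, 300),
})
y = rng.integers(0, 2, 300)            # 1 = chose option A (synthetic)

# Standardising makes coefficient magnitudes comparable across features.
Xs = StandardScaler().fit_transform(X)
clf = LogisticRegression(max_iter=1000).fit(Xs, y)

coefs = pd.DataFrame({
    "feature": X.columns,
    "coef (log-odds per 1 SD)": clf.coef_[0],
    "odds ratio": np.exp(clf.coef_[0]),
})
print(coefs)
```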
Logistic Regression Gradient Descent [closed]
I have to do logistic regression using batch gradient descent. The way I
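The question body is closed, but a minimal NumPy sketch of batch gradient descent for logistic regression looks roughly like this (synthetic data, plain full-batch updates):

```python
# Sketch: logistic regression trained with full-batch gradient descent.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = (sigmoid(X @ true_w) > rng.random(200)).astype(float)

Xb = np.hstack([np.ones((X.shape[0], 1)), X])   # prepend bias column
w = np.zeros(Xb.shape[1])
lr, n_iter = 0.1, 2000

for _ in range(n_iter):
    p = sigmoid(Xb @ w)                 # predicted probabilities
    grad = Xb.T @ (p - y) / len(y)      # gradient of mean log-loss
    w -= lr * grad                      # batch update

print("learned weights (bias first):", w)
```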
Python and SPSS giving different output for Logistic Regression
Code: Here’s the dataset. Result: Now I added the same data in SPSS: Analyse -> Regression -> Binary Logistic Regression. I set the corresponding Y -> Dependent and XT -> Covariates. The results weren’t even close. Am I missing something in Python or SPSS? Python-Sklearn Answer: SPSS logistic regression does not include parameter regularisation in its cost function; it just does ‘raw’ logistic regression. In
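A sketch of the usual fix implied by that answer: disable sklearn's default L2 penalty so the fit matches an unregularised maximum-likelihood package such as SPSS (or statsmodels). The data here is synthetic; the exact argument depends on the sklearn version.

```python
# Sketch: unregularised logistic regression in sklearn vs. statsmodels MLE.
import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
y = (X[:, 0] + rng.normal(size=200) > 0).astype(int)

# Recent sklearn accepts penalty=None; older releases use penalty='none'
# (a very large C approximates the same thing).
sk = LogisticRegression(penalty=None, max_iter=5000).fit(X, y)
print("sklearn, no penalty:", sk.intercept_, sk.coef_)

sm_res = sm.Logit(y, sm.add_constant(X)).fit(disp=0)
print("statsmodels (unregularised MLE):", sm_res.params)
```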
How to predict new values using statsmodels.formula.api (python)
I trained the logistic model using the following, from breast cancer data and ONLY using one feature, ‘mean_area’. There is a built-in predict method on the trained model; however, that gives the predicted values for all the training samples, as follows. Suppose I want the prediction for a new value, say 30. How do I use the trained model
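A sketch of how this is usually done with the formula API, assuming the model was fit with a formula using ‘mean_area’ (the data and formula below are illustrative): predict() accepts a DataFrame whose columns match the formula.

```python
# Sketch: predicting for a new value with a statsmodels formula model.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
train = pd.DataFrame({"mean_area": rng.normal(500, 200, 300)})
train["target"] = (train["mean_area"] + rng.normal(0, 200, 300) > 500).astype(int)

model = smf.logit("target ~ mean_area", data=train).fit(disp=0)

# predict() takes a DataFrame with the same column names used in the formula
new_obs = pd.DataFrame({"mean_area": [30]})
print(model.predict(new_obs))          # predicted probability for mean_area = 30
```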