
Why do sometimes CNN models predict just one class out of all others?

I am relatively new to the deep learning landscape, so please don’t be as mean as Reddit! This seems like a general question, so I won’t include my code here as it doesn’t seem necessary (if it is, here’s the link to colab).

A bit about the data: You can find the original data here. It is a downsized version of the original dataset of 82 GB.

Once I trained my CNN on this, it predicts ‘No Diabetic Retinopathy’ (No DR) every single time, leading to an accuracy of 73%. Is the reason for this just the vast number of No DR images, or something else? I have no idea! The 5 classes I have for prediction are ["Mild", "Moderate", "No DR", "Proliferative DR", "Severe"].

It’s probably just bad code; I was hoping you could help.


Answer

I was about to comment:

A more rigorous approach would be to start measuring your dataset balance: how many images of each class do you have? This will likely give an answer to your question.

But couldn’t help myself look at the link you gave. Kaggle already gives you an overview of the dataset:

(screenshot of the Kaggle dataset overview showing the per-class image counts, with No DR by far the largest class)

Quick calculation: 25,810 / 35,126 * 100 ≈ 73%. That’s interesting: you said you had an accuracy of 73%. Your model is learning on an imbalanced dataset, with the first class heavily over-represented; 25k out of 35k is enormous. My hypothesis is that your model keeps predicting the first class, which means that on average you’ll end up with an accuracy of about 73%.
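To make the baseline concrete: a model that always predicts the majority class scores exactly the majority-class fraction. A minimal check, using only the two counts visible on the Kaggle page (25,810 No DR images out of 35,126 total):

```python
# Accuracy of a "predict the majority class every time" baseline.
total = 35126      # total images in the dataset
majority = 25810   # "No DR" images (the over-represented class)

baseline_accuracy = majority / total * 100
print(f"{baseline_accuracy:.1f}%")  # → 73.5%
```

If your trained model's accuracy matches this number, it is strong evidence the model has collapsed to predicting No DR for everything.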

What you should do is balance your dataset, for example by only allowing 35,126 - 25,810 = 9,316 examples from the first class to appear during an epoch. Even better, balance your dataset over all classes so that each class appears at most n times per epoch.
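A minimal per-epoch undersampling sketch in plain Python (the helper name `balance_indices` and the toy labels are illustrative, not from the original post). Re-run it each epoch with a different seed so the majority class is not always cut to the same subset; in Keras you could alternatively pass a `class_weight` dict to `model.fit` instead of dropping data:

```python
import random
from collections import defaultdict

def balance_indices(labels, per_class, seed=0):
    """Keep at most `per_class` example indices for each class (undersampling)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    keep = []
    for idx in by_class.values():
        rng.shuffle(idx)            # random subset of the over-represented class
        keep.extend(idx[:per_class])
    return sorted(keep)

# Toy example with an over-represented first class (10 : 3 : 2)
labels = ["No DR"] * 10 + ["Mild"] * 3 + ["Moderate"] * 2
idx = balance_indices(labels, per_class=3)
# each class now contributes at most 3 examples per epoch
```

Undersampling throws away majority-class images, so class weighting (or augmenting the minority classes) is often preferable when the dataset is small.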

User contributions licensed under: CC BY-SA