I have almost finished my time series model: I collected enough data and am now stuck at hyperparameter optimization.
After lots of googling I found a new and promising library called ultraopt, but the problem is how large a fragment of my total data (~150 GB) I should use for hyperparameter tuning. I also want to try lots of algorithms and combinations; is there any faster and easier way?
Or is there any math involved? Something like:
mydata = 100% of the dataset
run hyperparameter optimization on 5% of mydata
reuse the optimized hyperparameters to train on the remaining 95%  # something like this
so that I get a result similar to using the full dataset for optimization. Is there any shortcut for this?
I am using Python 3.7; CPU: AMD Ryzen 5 3400G, GPU: AMD Vega 11, RAM: 16 GB.
Answer
Hyperparameter tuning is typically done on the validation set of a train-validation-test split, where the splits contain roughly 70%, 10%, and 20% of the entire dataset, respectively. Random search is a reasonable baseline, while Bayesian optimization with Gaussian processes has been shown to be more compute-efficient. scikit-optimize is a good package for this.
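Here is a minimal sketch of Bayesian hyperparameter search with scikit-optimize's BayesSearchCV. The estimator, search space, and data are placeholders (I'm assuming a gradient boosting regressor and random data just for illustration); substitute your own time series model, features, and a subsample of your dataset. TimeSeriesSplit is used so the cross-validation folds respect temporal ordering.

```python
# Sketch only: GradientBoostingRegressor and the random X, y are placeholders.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import TimeSeriesSplit
from skopt import BayesSearchCV
from skopt.space import Real, Integer

# Placeholder data: replace with a subsample of your real dataset.
X = np.random.rand(1000, 10)
y = np.random.rand(1000)

# Search space for the placeholder model's hyperparameters.
search_space = {
    "n_estimators": Integer(50, 500),
    "max_depth": Integer(2, 8),
    "learning_rate": Real(1e-3, 0.3, prior="log-uniform"),
}

opt = BayesSearchCV(
    estimator=GradientBoostingRegressor(),
    search_spaces=search_space,
    n_iter=30,                       # number of hyperparameter settings tried
    cv=TimeSeriesSplit(n_splits=3),  # folds that respect temporal order
    n_jobs=-1,
    random_state=0,
)
opt.fit(X, y)

print("Best score:", opt.best_score_)
print("Best hyperparameters:", opt.best_params_)
```

Once the search finishes on the subsample, `opt.best_params_` can be used to refit the model on the rest of your data, which is essentially the "optimize on a small fraction, train on the remainder" workflow you describe in the question.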