
Yes, the difficulty of finding the right hyperparameters is often overlooked, and it is a very frustrating part of building a model. Methods like grid search just don't work, given the number of parameters to tune and the time it takes to train a network.



Actually, random search works a lot better than grid search for hyperparameter optimization. Usually only a small number of hyperparameters actually matter; the trick is figuring out which ones. Grid search wastes most of its budget on the irrelevant dimensions.
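For example, a bare-bones random search is just a loop over independently sampled configurations. A minimal sketch (train_and_evaluate is a hypothetical stand-in for your actual training run, returning e.g. validation accuracy):

    import random

    # Hypothetical search space: log-uniform for the learning
    # rate, uniform/categorical for the rest.
    def sample_config(rng):
        return {
            "lr": 10 ** rng.uniform(-5, -1),
            "batch_size": rng.choice([32, 64, 128, 256]),
            "dropout": rng.uniform(0.0, 0.5),
        }

    def random_search(train_and_evaluate, n_trials=50, seed=0):
        rng = random.Random(seed)
        best_score, best_config = float("-inf"), None
        for _ in range(n_trials):
            config = sample_config(rng)
            score = train_and_evaluate(config)
            if score > best_score:
                best_score, best_config = score, config
        return best_config, best_score

The point is that every trial draws a fresh value in every dimension, so the one or two hyperparameters that actually matter see n_trials distinct values, instead of the handful a grid would allot to each axis.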

That said, any sort of hyperparameter optimization is extremely computationally intensive, so random search is far from a panacea.


So when you search randomly and arrive at a set of optimised parameters, how do you know it can't be optimised any further, given that you haven't checked every possible combination the way you would with a grid?


You generally don't know whether you've reached a good optimum, which is why it's worth running a nondeterministic optimizer a few times (if compute allows) and seeing whether the same parameter values reliably come out on top.
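Concretely, that can be as cheap as repeating the search with different seeds and checking whether the winners agree (a sketch, reusing the random_search and hypothetical train_and_evaluate from upthread):

    # If the same region of the space keeps winning across seeds,
    # that's some evidence you're near a decent optimum.
    for s in range(5):
        config, score = random_search(train_and_evaluate,
                                      n_trials=50, seed=s)
        print(s, score, config)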

There are also somewhat better-than-random strategies such as Bayesian optimization and particle swarm optimization that can help you to search more efficiently.
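For instance, scikit-optimize wraps Gaussian-process-based Bayesian optimization behind gp_minimize. A sketch (again with a hypothetical train_and_evaluate; gp_minimize minimizes, so return the negated score):

    from skopt import gp_minimize
    from skopt.space import Integer, Real

    space = [
        Real(1e-5, 1e-1, prior="log-uniform", name="lr"),
        Integer(32, 256, name="batch_size"),
        Real(0.0, 0.5, name="dropout"),
    ]

    def objective(params):
        lr, batch_size, dropout = params
        score = train_and_evaluate(
            {"lr": lr, "batch_size": batch_size, "dropout": dropout})
        return -score  # gp_minimize minimizes

    res = gp_minimize(objective, space, n_calls=30, random_state=0)
    print(res.x, -res.fun)

Each call fits a surrogate model to the trials so far and picks the next point to balance exploring uncertain regions against exploiting promising ones, which is where the savings over blind random sampling come from.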


Grid search never exhausts the search space either, at least if the dimensions are continuous.



