While many Bayesian optimization methods use GPs (MOE, Spearmint, BayesOpt, etc.), some use TPEs instead (most notably hyperopt [0]), and some, like SigOpt (YC W15) [1], ensemble these and other methods (disclaimer: I'm one of the founders).
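If you want to see the TPE side of this in practice, here's a minimal sketch using hyperopt's fmin/tpe interface; the objective and search space are toy stand-ins for a real train/validate loop:

    from hyperopt import fmin, tpe, hp

    # Toy objective standing in for a real training/validation loop.
    def objective(params):
        return (params["lr"] - 0.01) ** 2 + 0.1 * params["layers"]

    # Illustrative search space: a log-uniform learning rate and a
    # categorical choice over layer counts.
    space = {
        "lr": hp.loguniform("lr", -7, 0),        # roughly exp(-7) to exp(0)
        "layers": hp.choice("layers", [1, 2, 3]),
    }

    best = fmin(objective, space, algo=tpe.suggest, max_evals=50)
    print(best)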
I tried to briefly go over the functional interpretation of GPs in this talk [2], although the book by Rasmussen and Williams does a much more thorough job [3] (free online, check out chapter 2 for this approach).
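As a quick illustration of the "distribution over functions" view from chapter 2, here's a small numpy sketch that draws sample functions from a GP prior with a squared-exponential kernel (the lengthscale and grid are chosen arbitrarily for illustration):

    import numpy as np

    # Squared-exponential (RBF) covariance between two sets of 1-D inputs.
    def rbf(xa, xb, lengthscale=1.0):
        sqdist = (xa[:, None] - xb[None, :]) ** 2
        return np.exp(-0.5 * sqdist / lengthscale ** 2)

    x = np.linspace(-5, 5, 200)
    K = rbf(x, x) + 1e-8 * np.eye(len(x))  # small jitter for numerical stability

    # Each draw from this multivariate normal is one sample function from
    # the GP prior, evaluated on the grid.
    samples = np.random.multivariate_normal(np.zeros(len(x)), K, size=3)
    print(samples.shape)  # 3 sampled functions, 200 grid points each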
I'm happy to answer any questions about the differences. If you're a student/academic SigOpt is also completely free [4].
For graduate students out there who would rather be doing research than "graduate student gradient descent" (i.e., high-dimensional, non-convex optimization in your head), SigOpt (YC W15) is a SaaS optimization platform that is completely free for academic research [0]. Hundreds of researchers around the world use it for their projects.
Disclaimer: I co-founded SigOpt and wasted way too much of my PhD on "graduate student gradient descent."
SigOpt (YC W15) has an academic program that gives full access to the optimization platform for academic use [1]. Hundreds of academics around the world have used it, and dozens of published papers have already benefited from it [2].
These techniques can be orders of magnitude more efficient than a standard grid or random hyperparameter search and substantially lower the barrier to entry for these kinds of results.
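To get a rough feel for that comparison, here's a hedged sketch using scikit-optimize (an open source library, not our product) to run random search and a GP-based optimizer on the same toy objective; the objective, space, and budget are invented purely for illustration:

    from skopt import gp_minimize, dummy_minimize
    from skopt.space import Integer, Real

    # Toy objective standing in for validation loss as a function of
    # two hyperparameters.
    def objective(params):
        lr, depth = params
        return (lr - 0.01) ** 2 + 0.05 * abs(depth - 4)

    space = [Real(1e-5, 1e-1, prior="log-uniform", name="lr"),
             Integer(1, 10, name="depth")]

    # dummy_minimize samples uniformly at random; gp_minimize uses a GP surrogate.
    random_result = dummy_minimize(objective, space, n_calls=30, random_state=0)
    bayes_result = gp_minimize(objective, space, n_calls=30, random_state=0)
    print("random search best:", random_result.fun)
    print("bayesian opt best: ", bayes_result.fun)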
It would be interesting to see what would happen if you also tried to tune the ensemble towards a specific task in the same way that you could tune a single model.
We've definitely seen that tuning the embedding hyperparameters (along with the others) can have a significant impact on performance. [1]
Additionally, whenever you open up the space of tunable parameters to include the embeddings or feature representations themselves, you can usually significantly outperform a well-tuned classifier alone. [2]
This model seems to trade tuning complexity for ensemble complexity, but I wonder what would happen if you tried to have your cake and eat it too and just tuned everything.
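To make the "tune everything" idea concrete, here's a minimal scikit-learn sketch where a PCA step stands in for the embedding/representation and its size is tuned jointly with the classifier. The data, pipeline, and grid are toy stand-ins, and a Bayesian optimizer (hyperopt, SigOpt, etc.) could drive the same joint space more efficiently than an exhaustive grid:

    from sklearn.datasets import make_classification
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline

    # Toy data; PCA stands in for a learned embedding / feature representation.
    X, y = make_classification(n_samples=500, n_features=50, random_state=0)

    pipe = Pipeline([("embed", PCA()), ("clf", LogisticRegression(max_iter=1000))])

    # Tune the representation jointly with the classifier, instead of fixing
    # the representation and tuning only the classifier on top of it.
    grid = {
        "embed__n_components": [5, 10, 20, 40],
        "clf__C": [0.01, 0.1, 1.0, 10.0],
    }
    search = GridSearchCV(pipe, grid, cv=3).fit(X, y)
    print(search.best_params_, search.best_score_)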
The general employee lockup [1] usually ends at 6 months. A price tank right before or at that time could signal what insiders think about the long-term prospects.
Hi, I'm Scott Clark, co-founder of SigOpt (YC W15). We provide hyperparameter optimization as a service.
We have some references to recent papers we've presented at NIPS, ICML, and AISTATS here [1].
We also have a higher-level technical blog here [2] (we recently did a series of posts on how uncertainty influences optimization in single- and multi-metric cases). We've also written some hyperparameter tuning blog posts with our partners AWS [3] and NVIDIA [4].
All of these papers and blog posts also contain references to other work that can help get you started on a literature review. Hopefully this is helpful!
Hi, I'm one of the founders of SigOpt (YC W15). This is part 3 of a 3-part series we've done on uncertainty in modeling and optimization (parts 1 and 2 are here [0] and here [1]).
Let me know if you have any questions about this post or SigOpt in general. Javier is a Research Engineering Intern with us and wrote this post with our research team lead, Michael McCourt. If you're a student looking for internships, please check out our careers page [2]. Our platform is also free for academics [3]. You can find more of our research (including NIPS, ICML, AISTATS, etc. papers) here [4].
SigOpt (YC W15) is a related service that provides a superset of these features as a SaaS API (I am one of the co-founders).
We've been solving this problem for customers around the world for the last 4 years and have extended a lot of the original research I started at Yelp with MOE [1]. We employ an ensemble of optimization techniques and provide seamless integration with any pipeline. I'm happy to answer any questions about the product or technology.
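At a high level the integration is a suggest/observe loop. Here's a rough sketch of what that looks like with the Python client; the exact parameter names may differ from the current docs, and the objective below is a toy stand-in for your own training and evaluation code:

    from sigopt import Connection

    conn = Connection(client_token="YOUR_SIGOPT_API_TOKEN")
    experiment = conn.experiments().create(
        name="toy example",
        parameters=[
            dict(name="lr", type="double", bounds=dict(min=1e-5, max=1e-1)),
            dict(name="depth", type="int", bounds=dict(min=1, max=10)),
        ],
    )

    # Placeholder objective; in practice this trains a model with the
    # suggested assignments and returns a validation metric to maximize.
    def evaluate(assignments):
        return -((assignments["lr"] - 0.01) ** 2) - 0.05 * abs(assignments["depth"] - 4)

    for _ in range(20):
        suggestion = conn.experiments(experiment.id).suggestions().create()
        value = evaluate(suggestion.assignments)
        conn.experiments(experiment.id).observations().create(
            suggestion=suggestion.id, value=value)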
We're completely free for academics [2] and publish research like the above at ICML, NIPS, AISTATS regularly [3].
Hi, I'm one of the co-authors of MOE (it was a fork of my PhD thesis).
We haven't been actively maintaining the open source version of MOE because we built it into a SaaS service 4 years ago as SigOpt (YC W15). Since then we've also had some of the authors of BayesOpt and HyperOpt work with us.
Let me know if you'd like to give it a shot. It is also completely free for academics [1]. If you'd like to see some of the extensions we've made to our original approaches as we've built out our ensemble of optimization algorithms, check out our research page [2].
Thanks! I'm one of the co-founders of SigOpt (YC W15).
You hit the nail on the head. We've been trying to promote more sophisticated optimization approaches since the company formed 4 years ago and are happy to see firms like Google, Amazon, IBM, and SAS enter the space. We definitely feel like the tide of education lifts all boats. Literally everyone doing advanced modeling (ML, AI, simulation, etc.) has this problem, and we're happy to be the enterprise solution for firms around the world, as you mentioned. Our support and product are differentiated from some of these offerings by our hosted ensemble of techniques behind a standardized, robust API.
We're also active contributors to the field via our peer-reviewed research [1], sponsorship of academic conferences like NIPS, ICML, and AISTATS, and our free academic program [2]. We're super happy to see more people interested in the field and are excited to see where it goes!
[0]: http://hyperopt.github.io/hyperopt/
[1]: https://sigopt.com/research/
[2]: https://www.youtube.com/watch?v=J6UcAdH54RE&list=PLbSwfqjMfj...
[3]: http://www.gaussianprocess.org/gpml/chapters/RW.pdf
[4]: https://sigopt.com/solution/for-academia