We built an evaluation suite [1] and published benchmarks [2], presented at ICML workshops in 2016, that compare against standard and open-source methods. Members of our research team built or contributed to many of the popular open-source libraries in Bayesian optimization (MOE, Bayesopt, Hyperopt, etc.). In many of our other examples we benchmark against the techniques we typically see our customers using in practice (grid and random search) [3] [4]. If you have particular problems you are interested in, I'd be happy to connect you with members of the team who can explore a free POC to benchmark against your models [5].