Valohai blog

Insights from the deep learning industry.

Bayesian Hyperparameter Optimization with Valohai

Grid search and random search are the most well-known methods for hyperparameter tuning, and both are first-class citizens inside the Valohai platform. You define your search space, hit go, and Valohai spins up all your machines and searches over the parameter ranges you've defined. It is fully automatic: you never launch or shut down machines by hand, and you can't accidentally leave machines running and costing you money. But we've been missing one central method for hyperparameter tuning: Bayesian optimization. Not anymore!

Bayesian Hyperparameter Optimization in Valohai

Bayesian optimization suits problems where the target function f(x) is a black box that is expensive to evaluate. This means you don't want to call it many times in vain, and you can't use gradient-descent-based approaches. Bayesian optimization starts by guessing a surrogate function g(x) that approximates the unknown f(x). It then uses a probabilistic model (typically a Gaussian process) to pick a promising x value, evaluates f(x) there, and updates g(x). This is repeated until we are happy with g(x). If that was all too mathematical, all you need to know is that it's a smarter way to find submarines than just wild shots in the dark. If you want more details about the concepts behind Bayesian optimization, check out this article by Will Koehrsen!

To conduct a Bayesian search in Valohai, you create a new Task and select Bayesian optimization. Configure the desired number of executions, set the batch size, and choose a target metric (e.g., loss). Valohai will then optimize towards the set target value of that metric.

Valohai’s UI for starting a Bayesian optimization

After the optimization is completed, you’ll find something like this:

Valohai’s Bayesian optimization process uses the Hyperopt library’s Tree-structured Parzen Estimator (TPE) implementation to pick the parameters from previous executions as input for the next batch.

Under the hood, the optimization works in the following way:

  1. Create a batch of startup executions using random search
  2. Based on these executions, create a simplified function to model the relationship between the hyperparameters and the target metric value (for example "loss")
  3. Based on this simplification of their relationship, find the optimal values for the hyperparameter to make the target metric as close to the target value as possible
  4. Run the next batch of executions and repeat the process from step 2.
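The four steps above can be sketched in plain Python. This is a deliberately simplified toy, not Valohai's actual implementation: a nearest-neighbour lookup stands in for the real surrogate model, and a cheap made-up function stands in for a full training execution.

```python
import random

# Hypothetical stand-in for an expensive training run: returns the
# "loss" for a hyperparameter value. In Valohai this is one execution.
def run_execution(lr):
    return (lr - 0.1) ** 2  # toy function, best loss at lr = 0.1

def surrogate(history, x):
    # Step 2: a deliberately crude model of hyperparameter -> loss:
    # predict the loss of the nearest previously evaluated point.
    nearest = min(history, key=lambda h: abs(h[0] - x))
    return nearest[1]

target = 0.0   # optimize the loss towards this target value
history = []   # list of (hyperparameter, loss) pairs

# Step 1: a startup batch chosen by random search.
for _ in range(4):
    lr = random.uniform(0.001, 1.0)
    history.append((lr, run_execution(lr)))

# Steps 2-4: model the relationship, pick promising values, run, repeat.
for _ in range(5):
    candidates = [random.uniform(0.001, 1.0) for _ in range(100)]
    # Step 3: keep the candidates the surrogate predicts are closest
    # to the target metric value.
    batch = sorted(candidates, key=lambda x: abs(surrogate(history, x) - target))[:2]
    # Step 4: run the next batch of executions.
    for lr in batch:
        history.append((lr, run_execution(lr)))

best = min(history, key=lambda h: h[1])
print(best)  # (best hyperparameter value found, its loss)
```

The real system replaces the nearest-neighbour surrogate with Hyperopt's TPE, which also balances exploring new regions against exploiting known good ones; this sketch only exploits.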


Using iterative Bayesian hyperparameter optimization, you can make hyperparameter tuning faster and more efficient than, for example, a random search or an exhaustive grid search.

Get a demo and learn how to use the Valohai platform to do Bayesian hyperparameter optimization for your project. 

P.S. This blog post concentrates on Valohai’s capabilities in the Web UI. You can, however, use any external optimizer by hooking it up to Valohai’s API or CLI. Ask us more by signing up for a private demo!

Magdalena Stenius
Magda is a full-stack developer at Valohai and an organizer of several Python meetups and coding clubs. She is passionate about open source, open data, and MLOps.
