Valohai blog

Insights from the deep learning industry.

Machine Learning Orchestration for Free


You know what really grinds my gears? When I have a deep learning model that I want to train and I have to SSH into my AWS instance, install all the drivers and libraries, run my code and then forget to shut down my machine! Once, I left one running over the weekend, and it cost my employer over $10,000!

That won’t be grinding my gears any more, because Valohai just launched a free tier!

With the Valohai platform, you can pack up your code in a Docker container, send it to the cloud or an on-premise instance and just click go. Valohai then automatically starts your CPU/GPU farm, sets up the Docker container, runs your pipelined code (e.g. extraction -> transformation -> training -> inference), shows you the progress in real time on a dashboard and shuts down the machines once done.
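To make the idea of pipelined code concrete, here is a minimal sketch in plain Python of four steps chained so that each one consumes the output of the previous one. Everything in it is an assumption made for illustration: the function names (`extract`, `transform`, `train`, `infer`, `run_pipeline`) and the toy data are hypothetical and are not part of the Valohai platform or its API.

```python
from typing import List, Tuple


def extract() -> List[float]:
    """Hypothetical extraction step: pretend to load raw data."""
    return [1.0, 2.0, 3.0, 4.0]


def transform(raw: List[float]) -> List[float]:
    """Hypothetical transformation step: scale values relative to the maximum."""
    peak = max(raw)
    return [x / peak for x in raw]


def train(features: List[float]) -> float:
    """Hypothetical training step: the 'model' is just the mean of the features."""
    return sum(features) / len(features)


def infer(model: float, value: float) -> bool:
    """Hypothetical inference step: is the value above the learned mean?"""
    return value > model


def run_pipeline() -> Tuple[List[str], float, bool]:
    """Run the steps in order, recording which ones completed."""
    completed: List[str] = []

    raw = extract()
    completed.append("extract")

    features = transform(raw)
    completed.append("transform")

    model = train(features)
    completed.append("train")

    prediction = infer(model, 0.9)
    completed.append("infer")

    return completed, model, prediction


if __name__ == "__main__":
    steps, model, prediction = run_pipeline()
    print("completed steps:", " -> ".join(steps))
    print("model:", model, "prediction:", prediction)
```

On a real platform each step would typically run in its own container on its own machine, with the outputs stored and versioned between steps; the sketch only shows the order of the steps and how data is handed from one to the next.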

The Valohai Deep Learning Management platform builds on five cornerstones:

  1. Help Data Scientists train models,
  2. Ensure reproducibility of all training sessions,
  3. Make every training transparent to the entire team,
  4. Enable quick onboarding of new team members to your projects and
  5. Help in collaboration between teams and projects.

This is accomplished by Valohai’s machine orchestration (in the cloud and on-premise), version control for every part of the training run (from data to hyperparameters and from code to costs), automation of pipelined steps in training runs, standardized workflows and team collaboration tools across projects and training runs.

The foremost reason for Data Scientists to start using Valohai is access to a grid of CPU/GPU instances at the click of a button. Teams have trained everything from autonomous ferries to recommendation engines on hundreds of terabytes of data, and they chose Valohai because it lets them run training on hundreds of GPUs in parallel.

Today we’re happy to release this as a free-for-all tier

There is no trial period and there are no strings attached: just create an account and get started. Import your training code from Git (in any language or framework: Python, Perl, C, Java, TensorFlow, Caffe, Keras, Darknet, DL4J...), select a Docker image and click Run. You can choose to run it on AWS, GCP or Azure, and in the commercial tiers even on your own hardware. After your free credits run out, you pay only the cloud provider's prices, billed per second.


Sign up now for the free tier or check out tier features.

Fredrik Rönnlund
Software Engineer turned marketing lizard turned product dadbod turned ML nerd. In charge of growth at Valohai, i.e. the co-operation between products, marketing and sales.
