This tutorial will demonstrate how to take a single cell in a local Jupyter Notebook and run it in the cloud, using the Valohai platform and its command-line client (CLI).
Valohai now supports random search for hyperparameter optimization through what we call the Tasks feature. The aptly named paper "Random Search for Hyper-Parameter Optimization" showed random search to be an efficient way to find "neighborhoods" of likely-to-be-optimal hyperparameter values, which can then be searched more closely to find the really good values.
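To make the idea concrete, here is a minimal sketch of random search followed by one refinement round, in plain Python. It is not how Valohai's Tasks feature is implemented; the function names (`random_search`, `narrow_space`), the toy loss, and the parameter ranges are all assumptions made up for illustration.

```python
import random


def random_search(objective, space, n_trials=50, seed=0):
    """Sample each hyperparameter uniformly from its range; keep the lowest score."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_trials):
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def narrow_space(space, center, shrink=0.2):
    """Build a smaller search space around `center`, clamped to the original bounds."""
    narrowed = {}
    for name, (lo, hi) in space.items():
        half_width = (hi - lo) * shrink / 2
        narrowed[name] = (max(lo, center[name] - half_width),
                          min(hi, center[name] + half_width))
    return narrowed


def toy_loss(params):
    # Hypothetical stand-in for a validation loss (assumption for this sketch).
    return (params["learning_rate"] - 0.1) ** 2 + (params["dropout"] - 0.3) ** 2


# Hypothetical ranges, chosen only for the example.
space = {"learning_rate": (0.0, 1.0), "dropout": (0.0, 0.9)}

# Round 1: a coarse search to find a promising neighborhood.
coarse_params, coarse_score = random_search(toy_loss, space, n_trials=50, seed=42)

# Round 2: search again inside a narrower space around the best point so far.
fine_space = narrow_space(space, coarse_params)
fine_params, fine_score = random_search(toy_loss, fine_space, n_trials=50, seed=43)

# The refined round is not guaranteed to improve, so keep the better of the two.
best_params, best_score = min(
    [(coarse_params, coarse_score), (fine_params, fine_score)],
    key=lambda pair: pair[1],
)
print(best_params, best_score)
```

In a real project the objective would be a full training run that reports a validation metric, which is exactly the kind of work you would want to run as parallel executions rather than in a local loop.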
Since the rise of the deep learning revolution, springboarded by the Krizhevsky et al. 2012 ImageNet victory, people have thought that data, processing power and data scientists were the three key ingredients to building AI solutions. The companies with the largest datasets, the most GPUs to train neural networks on, and the smartest data scientists were going to dominate forever.
Watch a recording of the webinar on version control in machine learning that was held on the 22nd of November 2018. During the webinar we discussed the topics below and answered multiple questions from the attendees.
PocketFlow is an open-source framework from Tencent to automatically compress and optimize deep learning models. Edge devices such as mobile phones or IoT devices can be especially limited in computing resources, so sacrificing a bit of model performance for a much smaller memory footprint and lower computational requirements is a smart tradeoff.
Microsoft's Cognitive Toolkit, or CNTK, is an open-source framework for building deep learning models. This relatively new framework has been gaining traction, so we decided to make sure Valohai supports it well. One of its benefits over competing frameworks has been CNTK's ground-up support for multi-node, multi-GPU training, something that TensorFlow, for instance, has been struggling to tackle well. If you are working on really large datasets, you should give it a try.
Synthetic data is artificially created information rather than information recorded from real-world events. A simple example would be generating a user profile for John Doe rather than using an actual user profile. This way you can theoretically generate vast amounts of training data for deep learning models, with practically unlimited variation.
You might have heard that every individual subject to automated decision making by machine learning models has a right to an explanation of the result. I bet you feel drops of sweat forming on your forehead when you receive an inquiry from a manager saying that they need details about how a certain decision was made. If thinking about this scenario gives you chills, you are in the right place. Read on and learn how to tackle the transparency issue.
When meeting with teams that are working with machine learning today, there is one point above everything else that I try to teach: the importance of storing and versioning machine learning experiments, and especially how many things there actually are that need to be stored.
You know what really grinds my gears? When I have a deep learning model that I want to train and I have to SSH into my AWS instance, install all the drivers and libraries, run my code and then forget to shut down my machine! Once, I left one running over the weekend, which cost my employer over $10,000!