Developing a machine learning model for a new project starts with certain common groundwork and exploration, to understand your data and figure out the approaches to try. A popular choice for this groundwork is Jupyter, an environment where you write Python code interactively. In Jupyter notebook's cells you can evaluate and revise and it is an attractive, visual choice (and many times the right choice) – for this step of data science work. Since Jupyter kernels, the processes backing a notebook’s execution, retain their internal state while the code is being edited and revised, they’re a highly interactive, fast-feedback environment.
However, while convenient, Jupyter notebooks can be hard to reason about exactly because of this retention of state, since the state of your environment may have changed in a non-linear fashion (or worse yet, left in an inconsistent state) after re-evaluation of an earlier cell. It's entirely possible to have a saved notebook that can't be successfully evaluated after relaunching the kernel. Since production-grade code is supposed to be easily tested and reviewed, as we’ve learned as an industry, this isn’t desirable at all.
It can also be difficult to keep track of the exact versions of dependencies you've used during development. For instance, a model that worked fine with a certain version of TensorFlow might not run at all with a newer one down the line, and it's tedious to try and figure out what exactly was being run at the time of exploration.
Jupyter Notebook in Production. Or not..
Let's assume you've played nice and been fastidious enough to not run into these problems – all your dependencies are locked down and you've found out that you can actually run your notebook non-interactively with
jupyter nbconvert --to notebook --execute notebook.ipynb and maybe pipe the output into a file, for tracking results. You'll inevitably want to run your training code using different parameters (say, learning rates, network structures, etc.);
jupyter nbconvert --execute isn't really conducive for that, and editing the notebook, or maybe a separate configuration file, to change constants is just silly, too.
These are some of the reasons why we advocate for using regular, linear Python scripts instead of Jupyter/IPython notebooks when going from initial exploration work to something resembling production. Another thing is that you get to develop using your favorite editor/IDE, be it vim or Emacs or VSCode or PyCharm (which, by the way, has an excellent Scientific Mode – and support for notebooks too), instead of being confined to a browser. Switching from notebooks to regular code also lets you refactor your solution to a more modular, more easily testable and reviewable package.
Of course there are drawbacks; development is less interactive, and since there is no persistent state, everything needs to be evaluated or loaded from scratch at every invocation of your script. On the other hand this property makes you think about preprocessing your data to a faster-to-load format earlier. When you extract the preprocessing code from other code, it becomes easier to maintain and more reproducible, as well.
*image of stairs is derived from https://unsplash.com/photos/pKvmGR4qHrg