In the great talk “I Don’t Like Notebooks” (video and slides), Joel Grus lays out numerous criticisms of Jupyter Notebooks, perhaps the most popular environment for doing data science. I found the talk instructive — when everyone thinks something is great, you need people who are willing to criticize it so we don’t become complacent. However, I think the problem isn’t the notebook itself, but how it’s used: like any other tool, the Jupyter Notebook can be (and is) frequently abused.
Thus, I would like to amend Grus’ title and state “I Don’t Like Messy, Untitled, Out-of-Order Notebooks With No Explanations or Comments.” The Jupyter Notebook was designed for literate programming: mixing code, text, results, figures, and explanations together into one seamless document. From what I’ve seen, this notion is often completely ignored, resulting in awful notebooks flooding repositories on GitHub.
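The problems are clear: no helpful title, no explanation of what the code is meant to do, cells run out of order, and errors left in the output.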
The Jupyter Notebook can be an incredibly useful device for learning, teaching, exploration, and communication (here is a good example). However, notebooks like these fail on all these counts, and when these problems appear it’s nearly impossible to debug someone else’s work or even figure out what they are trying to do. At the very least, anyone should be able to give a notebook a helpful name, write a brief introduction, explanation, and conclusion, run the cells in order, and make sure there are no errors before posting the notebook to GitHub.
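As a rough illustration, these minimum standards can be checked mechanically before a notebook is posted. The sketch below is not part of any extension; `check_notebook` is a hypothetical helper that reads the `.ipynb` JSON directly and assumes the nbformat 4 layout (`cells`, `cell_type`, `execution_count`, `outputs`).

```python
import json
from pathlib import Path


def check_notebook(path):
    """Return a list of problems found in a .ipynb file (nbformat 4 layout assumed)."""
    path = Path(path)
    nb = json.loads(path.read_text(encoding="utf-8"))
    cells = nb.get("cells", [])
    problems = []

    # New notebooks are named Untitled, Untitled1, ... until someone renames them.
    if path.stem.startswith("Untitled"):
        problems.append("notebook still has a default 'Untitled' name")

    # Without markdown cells there is no introduction, explanation, or conclusion.
    if not any(cell.get("cell_type") == "markdown" for cell in cells):
        problems.append("no markdown cells explaining the work")

    code_cells = [cell for cell in cells if cell.get("cell_type") == "code"]

    # Execution counts of the cells that were run should increase from top to bottom.
    counts = [
        cell["execution_count"]
        for cell in code_cells
        if cell.get("execution_count") is not None
    ]
    if counts != sorted(counts):
        problems.append("code cells were run out of order")

    # An output of type "error" means a traceback was saved with the notebook.
    for cell in code_cells:
        if any(out.get("output_type") == "error" for out in cell.get("outputs", [])):
            problems.append("a code cell contains an error in its output")
            break

    return problems
```

Calling `check_notebook("analysis.ipynb")` on a tidy notebook should return an empty list; anything else is a prompt to clean up before committing.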
Rather than just complaining about the problem (it’s easy to be a critic but a lot harder to do something positive), I decided to see what could be done with Jupyter Notebook extensions. The result is an extension that, on opening a new notebook, automatically inserts a template of markdown cells, adds default imports, and prompts you to rename the notebook from Untitled.
The benefit of this extension is that it changes the defaults. By default, the Jupyter Notebook has no markdown cells, is unnamed, and has no imports. We know that humans are notoriously bad at changing default settings, so why not make the defaults encourage better practices? Think of the Setup extension as a nudge, one that gently pushes you to write better notebooks.
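To make the idea of better defaults concrete, here is a minimal sketch that writes a starter notebook with markdown cells and an imports cell already in place. It is not the Setup extension’s code, and the cell contents, function names, and chosen imports are all hypothetical; it only assumes the nbformat 4 JSON layout and uses the standard library.

```python
import json
from pathlib import Path


def markdown_cell(text):
    """Build a markdown cell in the nbformat 4 layout."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(source):
    """Build an unexecuted code cell in the nbformat 4 layout."""
    return {
        "cell_type": "code",
        "metadata": {},
        "execution_count": None,
        "outputs": [],
        "source": source,
    }


def starter_notebook(title):
    """Return a notebook dict with a title, imports, and placeholder sections.

    The section names and imports are illustrative placeholders, not the
    template shipped with the Setup extension.
    """
    cells = [
        markdown_cell(f"# {title}\n\nIntroduction: state the goal of this notebook."),
        code_cell("import numpy as np\nimport pandas as pd"),
        markdown_cell("# Analysis"),
        code_cell(""),
        markdown_cell("# Conclusions\n\nSummarize what was found."),
    ]
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


def write_starter_notebook(path, title):
    """Write the starter notebook to `path` and return the path."""
    path = Path(path)
    path.write_text(json.dumps(starter_notebook(title), indent=1), encoding="utf-8")
    return path
```

For example, `write_starter_notebook("churn-analysis.ipynb", "Churn Analysis")` produces a named notebook that opens with an introduction to fill in rather than a blank code cell.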
To use this extension:
1. Install Jupyter Notebook extensions (the `jupyter_contrib_nbextensions` package) if you have not already
2. Download the [setup](https://github.com/WillKoehrsen/Data-Analysis/tree/master/setup) folder (it has 3 files)
3. Run `pip show jupyter_contrib_nbextensions` to find where notebook extensions are installed. The location differs between my Windows machine (with Anaconda) and my Mac (without Anaconda), so check the output on your own system
4. Place the `setup` folder in `nbextensions/` under the above path
5. Run `jupyter contrib nbextensions install` to install the new extension
6. Run a Jupyter Notebook and enable `Setup` on the `nbextensions` tab (if you don’t see this tab, open a notebook and go to `edit > nbextensions config`)
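The locate-and-copy part of these instructions can also be scripted. The sketch below is one possible way to do it, not an official installer: `find_package_dir` and `copy_extension` are hypothetical helpers, and the script assumes the downloaded `setup` folder is on disk and that extensions live in an `nbextensions/` folder inside the installed package, as described above.

```python
import importlib.util
import shutil
from pathlib import Path


def find_package_dir(package="jupyter_contrib_nbextensions"):
    """Return the installed package's directory, or None if it is not installed.

    This plays the role of reading the location reported by `pip show`.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(list(spec.submodule_search_locations)[0])


def copy_extension(src_folder, package_dir):
    """Copy an extension folder into <package_dir>/nbextensions/<folder name>."""
    src_folder = Path(src_folder)
    target = Path(package_dir) / "nbextensions" / src_folder.name
    # dirs_exist_ok lets the copy be repeated after editing the extension (Python 3.8+).
    shutil.copytree(src_folder, target, dirs_exist_ok=True)
    return target
```

A typical use would be `copy_extension("setup", find_package_dir())` after checking that `find_package_dir()` did not return `None`, followed by `jupyter contrib nbextensions install` in the terminal as before.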
Now open a new notebook and you’re good to go! You can change the default template in `main.js` (see my article on writing a Jupyter Notebook extension for more details on how to write your own). The default template and imports are relatively plain, but you can customize them to whatever you want.
If you open an old notebook, you won’t get the default template, but you will be prompted to change the name from `Untitled` every time you run a cell.
Sometimes, a little bit of persistence is what you need to change your ways.
From now on, let’s strive to create better notebooks. It doesn’t take much extra effort and it pays off greatly as others (and your future self) will be able to learn from your notebooks or use the results to make better decisions. Here are a few simple rules for writing effective notebooks:
The Setup extension will not solve all notebook-related problems, but hopefully, the small nudges will encourage you to adopt better habits. It takes a while to build up best practices, but, once you have them down, they tend to stick. With a little bit of extra effort, we can make sure that the next talk someone gives about notebooks is: “I like effective Jupyter Notebooks.”