As many already said, I think it is a great tool for prototyping and data explor...

jre · on Aug 25, 2018

I'm not sure exactly what Netflix means by "if the job breaks", but I've used Jupyter notebooks in production for ML before.

If you consider the notebook as a way to augment your logs with plots, it might make a bit more sense.

Running a Jupyter notebook is a nice way to generate an HTML report for a job. For a typical ML pipeline, you first plot some stats about the input data, then train a model, plot the training loss, a confusion matrix, some example predictions, etc.
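
To make the report step concrete, here is a rough sketch of how a job could execute a notebook and render it to HTML with the `jupyter nbconvert` command line. The notebook name, the output directory, and both helper functions are hypothetical; this is one possible setup, not necessarily how any particular team does it.

```python
import subprocess
from pathlib import Path
from typing import List


def build_report_command(notebook: str, output_dir: str) -> List[str]:
    """Build the nbconvert command that executes a notebook and renders HTML."""
    return [
        "jupyter", "nbconvert",
        "--to", "html",              # render the result as an HTML page
        "--execute",                 # run every cell before rendering
        "--output-dir", output_dir,  # where the report is written
        notebook,
    ]


def render_report(notebook: str, output_dir: str) -> Path:
    """Run the notebook top to bottom and return the path of the HTML report."""
    # check=True makes the job fail loudly if any cell raises an error.
    subprocess.run(build_report_command(notebook, output_dir), check=True)
    # nbconvert names the output after the notebook by default.
    return Path(output_dir) / (Path(notebook).stem + ".html")
```

Calling `render_report("train_report.ipynb", "reports")` at the end of a job would leave `reports/train_report.html` next to the logs, assuming Jupyter and nbconvert are installed on the machine running the job.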

If some job gives a strange result (maybe that's what they mean by "break"), having the notebook rendered as an HTML page with all the plots is a very effective way to do a first round of diagnostics. You can also start the notebook with the same parameters and 'run' through your report step by step, which is a nice way to do interactive debugging.
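
Rerunning with the same parameters only works if the job records them somewhere. One simple way, sketched below, is to write them as JSON next to the report and load them back in the first cell of the notebook. The file name `params.json` and both helper names are assumptions for illustration, not something from the setup described above.

```python
import json
from pathlib import Path
from typing import Any, Dict


def save_params(params: Dict[str, Any], run_dir: str) -> Path:
    """Write the job parameters next to the report so the run can be replayed."""
    path = Path(run_dir) / "params.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(params, indent=2, sort_keys=True))
    return path


def load_params(run_dir: str) -> Dict[str, Any]:
    """Read back the parameters of an earlier run, e.g. in the notebook's first cell."""
    return json.loads((Path(run_dir) / "params.json").read_text())
```

When debugging, you would open the notebook, point `load_params` at the directory of the strange run, and step through the cells one at a time with exactly the inputs the job saw.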

Also, in this case the notebook itself was quite small in terms of lines of code. All the functions were implemented in modules, so the notebook is really just your 20-line 'main' function. That takes some discipline from the team.
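
As a toy illustration of that split, the sketch below keeps all the logic in plain functions (which would live in a module, here a hypothetical `pipeline.py`) and leaves the notebook with a handful of one-line calls. The "model" is just a majority-class predictor so the example stays runnable without any ML library; every name here is made up for the example.

```python
from collections import Counter
from typing import Dict, List, Tuple

# --- would live in a module, e.g. a hypothetical pipeline.py ---

def label_counts(labels: List[str]) -> Dict[str, int]:
    """Input-data stats: number of examples per class."""
    return dict(Counter(labels))


def train_majority(labels: List[str]) -> str:
    """Toy 'model': always predict the most frequent training label."""
    return Counter(labels).most_common(1)[0][0]


def confusion(truth: List[str], predicted: List[str]) -> Dict[Tuple[str, str], int]:
    """Confusion matrix as {(true_label, predicted_label): count}."""
    return dict(Counter(zip(truth, predicted)))


# --- the notebook itself: a short 'main', roughly one call per cell ---

def main(train_labels: List[str], test_labels: List[str]):
    stats = label_counts(train_labels)       # cell 1: input stats (would be plotted)
    model = train_majority(train_labels)     # cell 2: train
    preds = [model] * len(test_labels)       # cell 3: predict
    matrix = confusion(test_labels, preds)   # cell 4: evaluate (would be plotted)
    return stats, matrix
```

Because the functions are ordinary module code, they can be unit tested and reviewed like anything else, and the notebook only decides the order of the steps and what gets plotted.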