
As many have already said, I think it is a great tool for prototyping and data exploration, but when it comes to moving code to production, it makes very little sense to me to use it.

Netflix said that if the job breaks they can enter the notebook with the data and see what is wrong. To me it feels like they did development with zero safeguards and only check why once something breaks, instead of logging problems and dealing with edge cases in the code beforehand.



I'm not sure exactly what Netflix means by "if the job breaks", but I've used Jupyter notebooks in production for ML before.

If you consider the notebook as a way to augment your logs with plots, it might make a bit more sense.

Running a Jupyter notebook is a nice way to generate an HTML report for a job. For a typical ML pipeline, you first plot some stats about the input data, then train a model, then plot the training loss, a confusion matrix, some example predictions, etc.

If some job gives a strange result (maybe that's what they mean by "break"), having the notebook rendered as an HTML page with all the plots is a very effective way to do a first round of diagnostics. You can also start the notebook with the same parameters and 'run' through your report, which is a nice way to do interactive debugging.
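
For what it's worth, here is a rough sketch of how "run with parameters, then render to HTML" could be wired up. It assumes papermill and nbconvert, which I did not name above, and the notebook name, run id and parameter names are made up for illustration. It only builds the two command lines, it does not execute them.

```python
import shlex


def report_commands(notebook, run_id, parameters):
    """Build two CLI invocations: execute with parameters, then render HTML.

    Assumes the papermill and jupyter nbconvert command-line tools.
    """
    executed = f"{run_id}.ipynb"  # executed copy, kept so the run can be reopened
    run = ["papermill", notebook, executed]
    for name, value in parameters.items():
        # papermill takes one "-p name value" triple per parameter
        run += ["-p", name, str(value)]
    render = ["jupyter", "nbconvert", "--to", "html", executed]
    return run, render


if __name__ == "__main__":
    # Hypothetical notebook name and parameters, for illustration only.
    run, render = report_commands(
        "train.ipynb", "run-42", {"learning_rate": 0.01, "epochs": 5}
    )
    print(shlex.join(run))
    print(shlex.join(render))
```

Keeping the executed `.ipynb` next to the HTML is what makes the "open it with the same parameters and step through" part possible. With papermill, the injected values land after the cell tagged `parameters`.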

Also, in this case the notebook itself was quite small in terms of lines of code. All the functions were implemented in modules, so the notebook is really just your 20-line 'main' function. That does take some discipline from your team.
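
To make the "notebook as a thin main" idea concrete, here is a toy sketch, not our actual code. The function names, the threshold "model" and the data are all invented for illustration. The point is only that the logic lives in importable, testable functions and the notebook just calls them in order.

```python
# Functions like these would live in a regular module (hypothetical names).
from collections import Counter
from statistics import mean


def input_stats(rows):
    """Summarise the input data: row count and mean of the feature."""
    return {"n": len(rows), "mean_x": mean(x for x, _ in rows)}


def train_threshold(rows):
    """Toy 'model': a threshold halfway between the two class means."""
    pos = [x for x, y in rows if y == 1]
    neg = [x for x, y in rows if y == 0]
    return (mean(pos) + mean(neg)) / 2


def confusion_matrix(rows, threshold):
    """Count (actual, predicted) pairs for the threshold classifier."""
    return Counter((y, int(x > threshold)) for x, y in rows)


def main(rows):
    """What the notebook would contain: roughly one short cell per step."""
    stats = input_stats(rows)                    # cell 1: show input stats
    threshold = train_threshold(rows)            # cell 2: train
    matrix = confusion_matrix(rows, threshold)   # cell 3: diagnostics
    return stats, threshold, matrix


if __name__ == "__main__":
    data = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]  # made-up (feature, label) rows
    print(main(data))
```

In a real notebook each step would be followed by a plot of its result. Because the functions sit in a module, they can be unit tested and reviewed like any other code.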



