
> The Jupyter notebooks are version controlled like my .py modules.

That's interesting, could you give us some more details? Last time I tried to put a Jupyter notebook under git it was a mess. Do you use a tool other than git, or are there tools that help with version control? Or is it just your workflow that helps, like emptying all cell results before saving?



Our version control uses a monorepo, with non-branching workflows, development against the head version, and frequent commits/pushes to production.

For Jupyter notebooks, cell results get emptied before saving, unless declared public. That is to ensure data confidentiality, not really to help version control.
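To make the "emptied before saving" step concrete, here is a minimal sketch using only the standard library. It assumes the nbformat v4 JSON layout (a top-level "cells" list), and the `public` metadata flag is a hypothetical stand-in for however a notebook gets "declared public"; it is not our actual tooling.

```python
import json


def strip_outputs(notebook_json: str) -> str:
    """Return the notebook JSON with code-cell outputs removed.

    Assumes nbformat v4 (top-level "cells" list). The "public" metadata
    key is hypothetical: it stands in for whatever mechanism marks a
    notebook as safe to keep its outputs.
    """
    nb = json.loads(notebook_json)

    # Hypothetical opt-out: notebooks declared public keep their outputs.
    if nb.get("metadata", {}).get("public") is True:
        return notebook_json

    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop results, plots, tracebacks
            cell["execution_count"] = None  # drop the In[n] counter as well

    return json.dumps(nb, indent=1, ensure_ascii=False) + "\n"
```

Run something like this over the .ipynb text before committing and confidential results never reach the repository. As a side effect, the diffs get smaller too.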

Of course, standard diff gets confused by .ipynb files, since they are JSON with outputs and metadata embedded alongside the code. You need a notebook-aware tool like nbdime (notebook diff and merge).


Thanks for sharing nbdime! Been looking for something like this for a while.


Not OP, but I can recommend the handy https://github.com/kynan/nbstripout, which acts as a git filter that strips cell outputs so they never reach version control.

With that approach the notebooks are clean, but they are still fairly poor for evaluating diffs between versions. If code review / diffs are more important than preserving the notebook, then you could use a post-save hook to convert notebook input to a .py file and output to .html:

https://towardsdatascience.com/version-control-for-jupyter-n...
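For the "notebook input to a .py file" half, a rough stdlib-only sketch of the idea is below. It is not the hook from the linked article (which I'd expect to lean on nbconvert), just an illustration that assumes the nbformat v4 layout. The function name `notebook_to_script` is made up for this example.

```python
import json


def notebook_to_script(notebook_json: str) -> str:
    """Concatenate the source of all code cells into plain .py text.

    Assumes nbformat v4. Markdown cells and outputs are ignored.
    IPython magics (%matplotlib, !pip ...) are copied verbatim, so the
    result is meant for code review diffs, not necessarily for execution.
    """
    nb = json.loads(notebook_json)
    chunks = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The format allows "source" to be a string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        if source.strip():
            chunks.append(source.rstrip("\n"))
    # One blank line between cells keeps the diff readable.
    return "\n\n".join(chunks) + "\n"
```

Writing the returned text next to the notebook (e.g. analysis.ipynb and analysis.py) gives reviewers a plain-text file that ordinary diff handles well.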



