Sounds like they fixed a bunch of things that are broken about Jupyter notebooks...

notagoodidea · on Aug 28, 2020

As the sibling comment says, it is a very good way to explore data, and also to iterate on the design of scripts and document them.

Two examples from a previous work experience (remote sensing):

(1) A colleague was creating an SSH tunnel to create and explore data while the Jupyter process ran on the calculation server. He was able to launch heavy calculations, get a fast feedback loop for satellite images and shapefiles, manipulate the results, and write the explanation next to each cell. Just as you would use a real notebook, in fact. I had the same workflow, and when I realized that a script would be used more than once, I moved the code to a Python script with command-line argument support (just plug `argparse` into the script) and moved the text into comments in the script.

(2) Teaching: we held seminars about different APIs and tools and used Jupyter notebooks to teach everyone. The fast feedback loop was essential for anything with figures, plots, images, etc.
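
The notebook-to-script move described in (1) might look roughly like the sketch below. The argument names (`image`, `--shapefile`, `--threshold`) and the `run` function are hypothetical placeholders, not taken from the original workflow; only the use of `argparse` comes from the comment.

```python
import argparse


def build_parser():
    # Hypothetical arguments: the original comment does not name any.
    parser = argparse.ArgumentParser(description="Process one satellite image.")
    parser.add_argument("image", help="path to the input image")
    parser.add_argument("--shapefile", default=None,
                        help="optional shapefile used to clip the image")
    parser.add_argument("--threshold", type=float, default=0.5,
                        help="a value that used to be edited by hand in a cell")
    return parser


def run(image, shapefile=None, threshold=0.5):
    # Placeholder for the code that used to live in notebook cells.
    # The notebook's explanatory text would become comments here.
    return {"image": image, "shapefile": shapefile, "threshold": threshold}


def main(argv=None):
    # argv=None makes argparse read sys.argv; passing a list helps testing.
    args = build_parser().parse_args(argv)
    return run(args.image, args.shapefile, args.threshold)


# Example call with an explicit argument list instead of the real command line.
print(main(["scene.tif", "--threshold", "0.3"]))
```

In an actual script the last line would usually be replaced by an `if __name__ == "__main__":` guard that calls `main()` so the arguments come from the command line.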

Pluto.jl, while not yet perfect for me, addresses a lot of the broken things that made using Jupyter notebooks drive me crazy (I had to break up my cells in such a way that I could rerun everything when I needed to update the global space; it was awful).

Tarq0n · on Aug 28, 2020

I understand the value of notebooks, but I'd rather use them in a dedicated environment, like VS Code or RStudio.

jonniedie · on Aug 28, 2020

FWIW, I thought I remembered hearing that the maintainers of the Julia extension in VSCode are working on getting embedded Pluto notebooks working.

Jugurtha · on Aug 28, 2020

>I still don't understand why anyone would want to do work in their browser though.

One of our colleagues had around thirty students who had to prepare their final year projects in machine learning. We deployed our internal platform and gave them access so they wouldn't lose an academic year, as they were in a zone that was hit really hard by COVID-19.

They are mostly on Windows, are not comfortable with the CLI, have never used Git, they don't do Docker, have poor connectivity [4kB/s], don't have access to powerful machines, found it difficult to handle dependencies, and needed to work on 200+ GB datasets. They also were split into groups of two or three, and needed to be able to share the work with each other, and with their supervisor (our colleague).

So, one reason to use a browser-based solution is to de-couple the user's computer from the dependencies, internals, or infra the work happens on, and to simplify collaboration. That, or the main tool you rely on does not play nicely with other tools, even if you're proficient.

We started to push for remote work in late 2018, and we started really going after it in 2019 because the commute was draining our colleagues' energy. It really bothered us to see them arrive at work completely washed out, or see them worry about transportation at the end of the day, so we made remote work a priority. But they mainly trained models with notebooks, and there was a need to be able to do actual work as a team, so we built the tooling around our workflow and we've had to add in missing features to accommodate our colleagues who needed the notebook.

calebkaiser · on Aug 28, 2020

TL;DR: Working in a notebook in a browser is a terrible way to write a program but a fantastic way to explore data.

Even though notebooks are a common cause of headache in my world (I work on ML deployment), I think they're an incredibly valuable tool, and the familiar, visual interface of the browser plays a big part.

It clicked for me when I took a statistical genetics class taught by a team member of Hail.is (open source genomic analysis library). Coming from a dev background, I found working in a browser to be a clunky, awful experience—until I saw the way my classmates, most of whom were scientists by focus, used it. My instinct is to think of code in terms of the architecture of a program, but for them, code blocks were like buttons on a calculator. The speed at which they could iterate, and their ability to jump around, really drove home the value of the browser interface.

Would I want to write an API in one? Absolutely not. But for tinkering with genomic data? They're ideal in many ways.

Jeremy Howard of fast.ai talks about this a lot: https://twitter.com/jeremyphoward/status/1072555920029376512

WolfOliver · on Aug 28, 2020

I'd be curious which Jupyter notebook problems Pluto.jl handles better. Can you name some concrete points?

celrod · on Aug 28, 2020

1. Dependency graph for cells, letting it automatically rerun what's needed when you change one. This keeps everything up to date.

2. git-friendly.

WolfOliver · on Aug 28, 2020

I wonder how they are doing it. Simply re-evaluating every cell?

ddragon · on Aug 28, 2020

It's mentioned in the video [1]: it does static analysis of the code to create a graph of dependencies (for example, which cell uses a variable defined by another cell), so when you update any cell it finds which cells are affected by the change (the downstream nodes on a directed acyclic graph) and only evals the code in those (instead of running everything). Julia is particularly good for that kind of code analysis since it's a very Lispy language.

It also does a trick of creating new modules to manipulate scope, making deleted variables/imports/cells invisible (and therefore free to be garbage collected).

[1] https://youtu.be/IAF8DjrQSSk?t=596
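
The dependency-graph idea described above can be sketched in a few lines. This is an illustration in Python using the standard `ast` module, not Pluto's actual Julia implementation; the function names and the example cells are made up, and the name analysis is deliberately crude (it ignores function scopes, imports, and so on).

```python
import ast


def names_defined_and_used(source):
    """Return (defined, used) name sets for one cell's source code."""
    defined, used = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defined.add(node.id)
            elif isinstance(node.ctx, ast.Load):
                used.add(node.id)
    # Names the cell assigns itself are not treated as outside dependencies.
    return defined, used - defined


def downstream(cells, changed):
    """Return the ids of cells to re-run, in dependency order, after an edit."""
    info = {cid: names_defined_and_used(src) for cid, src in cells.items()}
    # Edge a -> b when cell b reads a name that cell a assigns.
    children = {cid: [] for cid in cells}
    for a, (defs, _) in info.items():
        for b, (_, uses) in info.items():
            if a != b and defs & uses:
                children[a].append(b)

    order, seen = [], set()

    def visit(cid):
        if cid in seen:
            return
        seen.add(cid)
        for child in children[cid]:
            visit(child)
        order.append(cid)

    # Reversed depth-first post-order is a topological order on a DAG.
    visit(changed)
    return order[::-1]


# Hypothetical notebook: cell "d" does not depend on "a".
cells = {
    "a": "x = 1",
    "b": "y = x + 1",
    "c": "z = y * 2",
    "d": "w = 10",
    "e": "total = z + w",
}
print(downstream(cells, "a"))  # ['a', 'b', 'c', 'e']
print(downstream(cells, "d"))  # ['d', 'e']
```

Editing cell `a` triggers `b`, `c`, and `e` but leaves `d` alone, which is the "only eval what is affected" behaviour the comment describes.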

jonniedie · on Aug 28, 2020

Reproducibility is a huge one for me. I’m often doing exploratory work that I’ll want to save in its current state and pick back up a few months (or even years) later. With Jupyter, this almost never works because I am constantly editing and running cells out of order as I’m exploring things. If I save at any given point, there is no guarantee that the notebook will be in the same state when I reopen it and re-run the cells.

With Pluto and other reactive notebooks, you have a guarantee that the code you see on the screen will produce the same results. So if you go back and edit cells out of order, save the notebook, then open it and re-run later, it will always be in the same state you left it in.
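
A toy illustration of the hidden-state problem described above, assuming a simplified model in which each cell is a string executed against one shared namespace. The `run_cells` helper and the cell contents are hypothetical; this simulates Jupyter-style execution and is not how either tool is implemented.

```python
def run_cells(cells, order):
    """Execute cell sources in the given order against one shared namespace."""
    namespace = {}
    for index in order:
        exec(cells[index], namespace)
    return namespace


# Three cells as they appear on screen, top to bottom.
cells = [
    "rate = 2",
    "total = 100 * rate",
    "rate = 5",
]

# During exploration: the third cell was run before the second was re-run.
explored = run_cells(cells, [0, 2, 1])

# Months later: the notebook is reopened and run top to bottom.
reopened = run_cells(cells, [0, 1, 2])

print(explored["total"], reopened["total"])  # 500 200
```

The saved notebook shows identical code in both cases, yet `total` differs. A reactive notebook sidesteps this by deriving the execution order from the dependencies between cells rather than from the order in which they happened to be run.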

Oreb · on Aug 28, 2020

In addition to what celrod said: Jupyter doesn't have an equivalent of Pluto's @bind macro, does it?

JustFinishedBSG · on Aug 28, 2020

It's not as elegant but kinda exists

https://ipywidgets.readthedocs.io/en/latest/

Oreb · on Aug 28, 2020

Thank you! I hadn't seen that.