The cell level Cmd + K shortcut only works on a given cell to create or edit code and fix errors. Just tested it and it generates markdown well (just start your prompt with "this is a markdown cell")
In the sidebar/chat window, it should be trivial to not parse the markdown and just show it raw. I'll work on it. In the main notebook, it's a bit harder but we are planning to allow multi-cell insertions but it will probably take 2-3 weeks.
Yeah the golden goose for me personally is the ability to say "create a jupyter notebook about x topic" and have an LLM spit out interspersed markdown (w/ inline latex) and python cells. It would be really cool if the LLM was good at segmenting those chunks and drawing stuff/evaluating output at interesting points. Quick example to illustrate the idea:
I find Cursor to be extremely good right up to that point - I can work with Jupyter via the VS code extension and quickly get mixed markdown like how you're describing now - but it cannot do the multi-cell output or intelligent segmenting described above. I currently split it apart myself from the big 'ol block of markdown output.
This is something we've experimented with and I know some other tools out there claim to do this, I've just found that there's a very simple issue with this: if the AI gets any step wrong, every subsequent step is wrong and then you have to review every bit of code/markdown bit by bit, and it ends up turning into more work than just doing the analysis step by step while guiding the AI. I'm optimistic that this will change over time as the AI gets better, but it's still quite fragile (although it demos really well...)
So if you had 3 markdown cells and 3 python cells, I would design the tool to pull all the content out of those cells and present it (sans all that ipynb markup, just contents, probably in markdown) to the model as the full context for every edit you want to make. So the tool would need to know how to transform a given notebook into a collection of markdown/python cells which it would present to the model to make edits. The model would need to return updated cells in the same format, and the tool would update the cells in the document (or just replace them entire with new cells from the response). I would be fine with this just blowing away all previous evaluation results.
Do you think that approach would work? Not sure if I'm misunderstanding the issue you're describing and I recognize it is likely much messier than I imagine.
This is something we're planning on doing - just generate a large bit of text with markdown text and code in the middle. This is actually how the newer models already generate code - with the only difference being there's only one code block.
Via the use of <thinking></thinking> blocks, it's pretty straightforward to get the the model to evaluate it's own work and plan the next steps (basically chain of thought) but then you can filter out the <thinking> block in the final output.
The last trick to making this actually work is to give the AI model evaluation power - make it be able to run certain inspection code to evaluate its decisions so far and feel that evaluation to the next set of steps.
Combining all of this, it's very possible to convert an AI chat into a multi-step markdown + code notebook that actually works.
If you'd like to be updated when we have this feature in, please leave a comment on the issue. Alternatively, my email is in my bio - feel free to email me so that when we have this feature, we can send you an update!