[dupe] Continue will generate, refactor, and explain entire sections of code (continue.dev)
102 points by pyinstallwoes on Dec 18, 2023 | 72 comments



This has been covered here before[0]. And I remember the questionable approach to adding some of its dependencies[1]. I like meilisearch and used it in one project, but downloading some binary at runtime isn't really trustworthy IMHO.

[0]: https://news.ycombinator.com/item?id=36882146

[1]: https://github.com/continuedev/continue/blob/33a0436193aa65a...


The download path is hardcoded to the binary release of a GitHub project.

I don't know any more trustworthy way to download MeiliSearch than this.

You could download the source and build it from scratch, but that also requires trusting the exact same GitHub repo that builds the binary.

If your threat model is that GitHub repo being compromised, you can't escape that either.
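Pinning a checksum at commit time would at least detect a release being swapped out after the fact. A rough Python sketch (the URL and digest below are placeholders, not the project's actual values):

    import hashlib
    import urllib.request

    # Placeholders: pin one specific release URL and its known SHA-256 digest.
    URL = "https://example.com/meilisearch-linux-amd64"
    EXPECTED_SHA256 = "0123..."  # the real digest, committed alongside the code

    def fetch_verified(url: str, expected: str) -> bytes:
        # Download the binary and refuse to use it if the hash doesn't match.
        data = urllib.request.urlopen(url).read()
        digest = hashlib.sha256(data).hexdigest()
        if digest != expected:
            raise RuntimeError("checksum mismatch: " + digest)
        return data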


The Windows version is pinned; the POSIX version depends on whatever MeiliSearch decides. A digital signature check would be great for supply-chain security, e.g. with cosign[0]. MeiliSearch probably won't make any effort there, since they want customers on their cloud offering.

[0]: https://docs.sigstore.dev/signing/signing_with_blobs/


My experience with these AI coding tools so far has been that while they're great for writing functions and explaining hard-to-understand errors, they're really let down by the fact they don't have high-level context of the project or business you're working in.

It seems it's very hard to make good decisions about how to structure higher-level things (modules, classes, dependency hierarchies, etc.) without that context, and the programmer is forced to give the tool very specific instructions to compensate. At some point the instructions just need to be specific enough that you might as well be writing code again.

I have no doubt they'll get there eventually, but it seems like being able to write entire projects effectively might coincide with the arrival of true AGI. There's just so much context to consider.


> they're great for writing functions and explaining hard-to-understand errors

Are they? I'd be interested to hear your experience on this. So far for me they have only really been able to summarise what I could find from the top few results searching online. They do a good job of summarising that, and might be quicker, but that's been it.

However, when I encounter an actually tricky issue, like a threading bug, or a null pointer exception/type error that's five levels removed from its source, these tools never manage it. Despite prompts saying I don't need the NullPointerException itself explained, just help figuring out how the value became null, the results are poor.

This might be my biases speaking, but it really does feel like I'm speaking to something that's good at transforming words and paragraphs into different formats but which has no actual understanding of the code.


I've had success with asking "what are some possible bugs with this block of code". Sometimes it spots errors that I wouldn't have thought of, or at least gives me some ideas for things to check.

Fixing tricky bugs often requires collecting additional information - stepping through code, looking at values of variables and making sure they are what you expect, etc. It's an iterative process and AI tools would need to be able to do the same thing - most humans wouldn't be able to solve errors like that just by looking at the code, and neither would an LLM.

I see the same issues with people who think AI is going to make scientific discoveries - it can't do that because making discoveries requires collecting data until something is certain or we have a clear picture. At that point, you don't need AI. AI won't be making discoveries until we can automate that entire process of forming a hypothesis, testing it / doing experiments, collecting data, refining your hypothesis, etc.


> So far for me they have only really been able to summarise what I could find from the top few results searching online

This has been my experience, except that the chat interface gets me to exactly the answer to my question considerably faster than a search engine.

I see these tools as search engines with much better user interfaces and customised responses.


Yeah, I guess that there's just not a lot of value for me in "what does this error message mean" because once you've worked with a language or framework for any reasonable amount of time you learn them and they sort of disappear into the background. LLMs do seem good for learning new systems though.


I just throw all relevant source code files (entirely) at it, paste the error message, and it usually shows me what's going on. Or at least it offers a hunch, which is a good next step towards finding the error.


But where's the limit? Say I have a really large project and the function in question is calling dozens of functions from all over the place. The time it takes me to follow each call chain and copy all the code probably takes longer than just debugging the problem myself.

I was hoping the GitHub or IntelliJ integration of Copilot would automate this; especially the latter has excellent static analysis of your code and could automatically provide the AI with relevant context, but they just don't.

Even when asking it to annotate a function, and specifically asking it to document any special cases or oddities the function might have, I never got much more than, e.g., updateFroobCache() annotated with "updates the froob cache". Wow, thanks.


> The time it takes me to follow each call chain and copy all the code probably takes longer than just debugging the problem myself.

Yes, and eventually, one of us who is doing this will get tired of it enough to automate the process. May even earn them a few bucks.


Start high level, then loop for additional context, each pass producing a more compressed summary? Symbols, then introspection, then expand/compress until enough summarized context fits within the window.
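Something like this loop, sketched in Python (summarize() here is a stub standing in for an actual LLM call):

    # Iteratively compress chunks until the combined text fits the window.
    def summarize(text: str) -> str:
        # Placeholder: a real implementation would call an LLM here.
        return text[: len(text) // 2]

    def compress(chunks: list[str], limit: int = 8000) -> str:
        # Character count as a crude proxy for tokens.
        text = "\n".join(chunks)
        while len(text) > limit and any(chunks):
            chunks = [summarize(c) for c in chunks]
            text = "\n".join(chunks)
        return text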


These tools are quite good when you need to write code in a language/framework you are not really familiar with. At the very least for scaffolding it saves a significant amount of time.


I think these situations are a bit paradoxical. If you don't know the language, you can't tell whether it's actually a good solution in that language, and I've seen so many bad answers that I'd be concerned if I weren't familiar with the language.


I feel the opposite. I seldom ask them for anything directly, but they are amazing at understanding the context and autocompleting highly app-specific code I was about to write anyway.


> they’re really let down by the fact they don’t have high-level context of the project or business you’re working in.

Currently, you need to treat your LLM like it is a junior programmer. AI-coding tools and junior programmers will not give you the code you want if you don't write a detailed prompt. In my experience, however, AI coding tools will provide you with something closer to what you want than a junior programmer would.


I am also very disappointed that it will generate outdated code, like JS code that uses var, or code that looks ten-plus years old because it doesn't use recent Array, String, or DOM APIs. I can tell it to rewrite it, but imagine all the newbies who will use these tools and ship outdated code and APIs. This proves once again that there is zero intelligence here, just interpolation of the code from its training data.


What’s your vision on the replaceability of the human programmer in the next, say, 10 years?


This is just two plugins, for VS Code and JetBrains, that let you summon GPT-4 or a code LLM, making it a viable alternative to Copilot.

Also, it's Apache-licensed.

A significant amount of work, of course; it seems polished.

There's also tracking:

"We track:

- the steps that are run and their parameters
- whether you accept or reject suggestions (not the code itself)
- the traceback when an error occurs
- the name of your OS
- the name of the default model you configured"


I really don't understand why all of the demos for the latest AI coding tools contain a part where the tool is asked to generate documentation based on... not much.

Looking at the doc strings generated in the video, I don't see how these add any value to the code on screen. Surely there are better use-cases than that.


The comments play to the strengths of LLMs – they're just text transformation and don't need any understanding of the "why" to look passably impressive at first glance.


#Comment this is a comment


# Comment this is a comment # Comment this is a comment


Now run that line through it again.


I had the experience of GitHub Copilot introducing hard-to-catch, sometimes even security-relevant bugs into my code due to missing context. I developed a rental-system web app in Django with a route that lets me delete a rental; it checks whether you are the creator of the rental OR the admin of the group. The frontend got this right, but the backend checked whether the request user is the rental creator OR whether the creator of the rental is the group admin.
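In code, with hypothetical model and field names, the difference was roughly:

    # Intended check: the requester must be the rental's creator or the group admin.
    def may_delete(request, rental):
        return request.user == rental.creator or request.user == rental.group.admin

    # What Copilot produced: compares the creator against the admin rather than
    # the requester, so anyone can delete admin-created rentals, and the admin
    # can't delete anyone else's.
    def may_delete_buggy(request, rental):
        return request.user == rental.creator or rental.creator == rental.group.admin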

That really made me freak out for a second; I started rechecking all the code I had autocompleted with Copilot.


It would become acceptable if the AI's error rate were as low as or lower than your own. But there is also the criticality of errors: even if the AI's error rate is below your own, the criticality of its errors may still be significantly higher. There is a bad zone where the error rate is low enough that you don't thoroughly verify each and every generated piece of code, but where the error rate and/or criticality is still substantially higher than your own. This can be compounded by the use of the AI resulting in more code (and thus more bugs) being written in a shorter time.
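To put assumed numbers on it: if you ship 1 critical bug per 10,000 lines at an average cost c, while the AI ships 0.5 bugs per 10,000 lines at an average cost of 4c, its expected damage per line is 0.5 × 4c / 10,000 = 2c/10,000, double yours; and if it also lets you write twice as much code in the same time, the expected damage per unit of time quadruples.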


Why attach this to an IDE? I would think that simply letting an LLM control a terminal and make a commit would be the better approach.

I recently tried it with "You are in a directory with a web app that does ... and you want to implement feature .... In each step, you can use ls or any other bash command and I will give you the output".

It was pretty hilarious how the LLM found its way around the codebase with ls, cat, find, grep, and awk, and actually even managed to edit the code that way and make a commit.

Giving the LLM a bit better tools, like a version of "cat" that prefixes the lines with numbers, and a "swap" command that can do "swap 179 250 ..." to swap out the lines 179 to 250 with "..." would probably be enough to empower the LLM to be pretty efficient.

The next step might be to let the LLM manage its context window by allowing it to remove the last output with a command like "forget". So when the LLM does "cat somefile" and realizes that the output is not interesting, it can follow up with "forget" so the output will be replaced with "You deemed the output to be not interesting".

Those tools would probably evolve to make coding and managing the context window more and more efficient. Like "nicecat 100 200" to see the lines 100 to 200 with numbers prefixed. "keep 200 300" to forget the last output except lines 200 to 300 etc.


> Giving the LLM a bit better tools, like a version of "cat" that prefixes the lines with numbers, and a "swap" command that can do "swap 179 250 ..." to swap out the lines 179 to 250 with "..." would probably be enough to empower the LLM to be pretty efficient.

ed is the standard (LLM) editor


How would ed help?

Say one of the files is talk.py and there is a function inside of it at line 17 which the LLM wants to change:

    17 def hello():
    18     print ("Hello!")
My approach would be to provide the LLM with a swap command, which takes a filename, first line, and last line, and then treats the rest of the standard input as the new content:

    swap talk.py 17 18
    def hello(name):
        print(f"Hello {name}!")
My feeling is that this is the most efficient way for an LLM to code.
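For concreteness, a minimal Python implementation of such a swap command (a sketch of the semantics described above):

    #!/usr/bin/env python3
    # "swap FILE FIRST LAST": replace lines FIRST..LAST (1-based, inclusive)
    # of FILE with whatever arrives on standard input.
    import sys

    def swap(path: str, first: int, last: int, new_text: str) -> None:
        with open(path) as f:
            lines = f.readlines()
        replacement = new_text if new_text.endswith("\n") else new_text + "\n"
        lines[first - 1:last] = [replacement]
        with open(path, "w") as f:
            f.writelines(lines)

    if __name__ == "__main__":
        # e.g.: swap talk.py 17 18   (new content piped in on stdin)
        swap(sys.argv[1], int(sys.argv[2]), int(sys.argv[3]), sys.stdin.read())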


This is how https://aider.chat works; it gives you answers in the form of git commits that include the prompt and a description of what it did.

It's an interesting UX.


> In each step, you can use ls or any other bash command and I will give you the output

If you're using the OpenAI API, you can also automate this using the "tools"/"tool_calls" function call feature. https://platform.openai.com/docs/api-reference/chat/create
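A minimal sketch of that wiring with the openai Python client (the run_bash tool and its schema are made up for illustration; only the tools/tool_calls plumbing is the actual API):

    from openai import OpenAI

    client = OpenAI()

    # Hypothetical tool: let the model ask for a bash command to be run.
    tools = [{
        "type": "function",
        "function": {
            "name": "run_bash",
            "description": "Run a bash command in the repo and return its output.",
            "parameters": {
                "type": "object",
                "properties": {"command": {"type": "string"}},
                "required": ["command"],
            },
        },
    }]

    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Find where the cache is updated."}],
        tools=tools,
    )

    # Any command the model wants to run shows up as a tool call.
    for call in response.choices[0].message.tool_calls or []:
        print(call.function.name, call.function.arguments)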


> It was pretty hilarious how the LLM found its way around the codebase with ls, cat, find, grep, and awk, and actually even managed to edit the code that way and make a commit.

Please do write more. I'd love to replicate that. I keep coming across applications where there is more "looking around the code" work than changing it. Even slight help could mean a lot here!


Yeah, I am really waiting for these tools to be proactive. I don't find the idea of asking whether there is a bug very sensible. It should tell me when it looks like I wrote a bug, or when it has a suggestion for a refactor.


> Why attach this to an IDE? I would think that simply letting an LLM control a terminal and make a commit would be the better approach.

This is a recipe for disaster, IMO. LLMs are not yet good enough to be given the ability to act on their own.


Depends on the model you use. With the tiny quantized 7B models, yeah, it's like having a high-functioning teenage intern... but there are bigger, more sophisticated models, like Falcon 180B. You need hella RAM to run that bad boy, though.


I'm thinking that of course you would review the commit and fix it if it's broken; you don't need to blindly trust it. The commit is just another way to see the code changes better.


Let it work on a fork and see what it comes up with.


Looks like the Open Interpreter project might be what you're looking for.


Something like this with low latency, high accuracy voice recognition would be great. Pointing at a variable with the cursor and asking "what's this for?" could be much easier than writing out an entire sentence.


I get your point, but sitting in my office talking to my computer is some sci-fi stuff I wouldn't feel comfortable doing yet :D


The ability to use self-hosted LLMs is a game-changer for me. Thank you!

Can it integrate with Emacs or Vim?

If not, does anyone know of a similar project that does?


Continue has a "headless" mode which might be what you need to integrate into other IDEs. The fact that it already supports two different IDEs is a good sign that support for more can be added in the future.


Out of interest, what are you self-hosting, and where? Also, isn't the core product here the tuning of an LLM to this use-case? If so, how does that work with self-hosting?


> Out of interest, what are you self-hosting, and where?

Nothing yet, but I've been playing around with a few models on vast.ai, and want to run them in my homelab soon. The low price of used 3090s and the increased performance and efficiency of new open-source models make this relatively accessible now.

> Also, isn't the core product here the tuning of an LLM to this use-case? If so, how does that work with self-hosting?

The core product here seems to be the ability to use any LLM inside an IDE. I've been avoiding proprietary LLMs for this purpose, so I'm interested in a solution that integrates with local models.


Thanks, and yeah I did a bit more reading and this does seem to be much more about the IDE integration than the model. I'm curious as to why, because the IDE integration seems to be the "easy bit" – I've used 3 different ones now and they're all about the same. I'd have expected that the main lever the product has in being better than others is having a custom model that understands code edits much more than others.


> I'd have expected that the main lever the product has in being better than others is having a custom model that understands code edits much more than others.

True, but this is not something this particular product would solve. There are already models specifically trained to work on code. What's appealing to me is the flexibility of being able to choose which one to use, rather than my workflow being tied to a specific product or company.

> the IDE integration seems to be the "easy bit"

I admittedly haven't researched this much, but this is not currently the case. There is no generic API for LLMs that IDEs can plug into, so all plugins must target a specific model. We ultimately need an equivalent of a Language Server Protocol for LLMs, and while such a project exists[1], it looks to be in its infancy, understandably so.

[1]: https://github.com/huggingface/llm-ls


Been using it for some months. Totally amazing!


Can you share more? For example, which model are you using it with, and how much does it cost per month? What do you think is the most helpful aspect of it?


I'm using it in VS Code to help me write my Flutter app. I used their preview model (ChatGPT 3.5?) most of the time; now I've switched to my own ChatGPT-4 API key to get more context. It has replaced StackOverflow for me, and it helped me learn Flutter, as it was able to explain its own code to me. Really great!


That defeats the whole purpose of using an open-source LLM, though. But the last time I tried using Phind, it failed miserably.


I spent a while trying to install and run this, but it never worked. Seems like it might be an interesting tool.

I've had more luck with https://aider.chat although for whatever reason I haven't found it useful enough to actually use.


Half-OT: is anyone building an agent-based coding team you can talk to via issues/PRs?


There are some that respond to easy-to-answer issues in projects, like dosu.dev, but they aren't at a level where they can actually create fixes or features in complex codebases without significant engineer review and changes, which just ends up increasing the burden rather than reducing it. The best the top-tier LLMs can manage right now is minor tech support.


This is spot on (I'm the founder of Dosu).

IMO, we are quite far from reliable issue-to-PR generation, even though it seems within reach.

For this reason, Dosu is focused on question answering, triaging, and general context gathering, because we see those as prerequisites to PR generation.


Ctrl+F "copilot". Nothing.

My only request for these tool-makers is: don't be shy. Tell me why I will stop paying for GitHub Copilot and pay for your tool. Make me sign up ASAP for a trial. Please don't pretend Copilot doesn't exist in the same universe. You probably have 5 seconds max (and I don't say this out of self-importance. I say this out of being a desperate little piece-of-shit who eliminates everything from their life that doesn't keep them on the path to their already less-than-ideal productivity :( I just can't afford that attention you demand)


Why use Copilot?


It helps reduce the number of hoops I have to jump through. Previously: hit a problem, use Google search or read the official docs, StackOverflow, etc. Now: ask Copilot and assess whether the solution makes sense to me. Often, it does.


I used it a few months ago and didn't find it that helpful. Is it immensely better with the latest update? I think it had one?


It's a limited resource, about as useful as Stack Overflow, with nearly the same pitfalls. However, in the hands of a developer who would otherwise lack confidence in making changes, it may be quite a bit worse.


I don't want to write in an extra window for code completion.

Maybe it's just me, but chatting in a separate window... what's the point? I can just open a tab in a browser with ChatGPT.

What I want is something that understands my code base and can give me tips on which functions to use, or where to find stuff, and so on. That's what costs me time.

Understanding someone else's code, or my old code. That takes time.

Writing code tests is what takes away my time.

Writing a bad version of bubble sort is not what I want.


Looks pretty good. I wonder how intuitive it'll feel to work with the chat panel open at the side — that'll take some adjustment, but I might give it a go and see how I get on. See if there's a net productivity increase and what the cost is versus just using ChatGPT+ when stuck/exploring.


There are so many of these tools now that it's impossible to choose.


It's weird that this doesn't obviously say what languages it handles. I see examples in TypeScript and Python, but nowhere is there a list of supported languages.


Has anyone tried this who can compare it to Cody by Sourcegraph? Keen to start using one of these tools, but not sure which is best.


I'm using Codeium and it's good. I only tried one other, which didn't turn out to be as free as you'd think (I don't remember the name, probably for the better).

Especially for the free ones, just install one and try it for a day or two.


Why don't you try them and figure out which is best? Someone'll have to do it...


Cody Pro is free till February. You can try it and be your own judge.


That doc-string generation part in their video is a hilarious example of how to write bad comments.


How good is it when compared with Copilot?


I tried both and went for Continue. It lives in its own pane, and it's much clearer how to interact with it.


Pricing? I can't see it on the website.






