This has been covered here before[0]. And I remember the questionable approach of adding some of its dependencies[1]. I like meilisearch and have used it in one project, but downloading some binary at runtime isn't really trustworthy IMHO.
The Windows version is pinned; the POSIX version depends on whatever meilisearch decides. A digital signature check would be great for supply chain security, e.g. with cosign[0]. meilisearch probably won't make any effort there, since they want customers on their cloud offering.
My experience with these AI coding tools so far has been that while they're great for writing functions and explaining hard to understand errors, they’re really let down by the fact they don’t have high level context of the project or business you’re working in.
It seems it's very hard to make good decisions about how to structure higher-level things (modules, classes, dependency hierarchies, etc.) without that context, and the programmer is forced to give the tool very specific instructions in order to compensate. At some point the instructions become so specific that you might as well be writing code again.
I have no doubt they'll get there eventually, but it seems like being able to write entire projects effectively might coincide with the arrival of true AGI. There's just so much context to consider.
> they're great for writing functions and explaining hard to understand errors
Are they? I'd be interested to hear your experience on this. So far for me they have only really been able to summarise what I could find from the top few results searching online. They do a good job of summarising that, and might be quicker, but that's been it.
However, when I encounter an actually tricky issue, like a threading bug or a generic null pointer exception/type error that's 5 levels removed from its source, these tools never manage it. Despite prompting that I don't need the NullPointerException explained to me and that it should figure out how this became null, the results are poor.
This might be my biases speaking, but it really does feel like I'm speaking to something that's good at transforming words and paragraphs into different formats but which has no actual understanding of the code.
I've had success with asking "what are some possible bugs with this block of code?". Sometimes it spots errors that I wouldn't have thought of, or at least gives some ideas for things to check.
Fixing tricky bugs often requires collecting additional information - stepping through code, looking at values of variables and making sure they are what you expect, etc. It's an iterative process and AI tools would need to be able to do the same thing - most humans wouldn't be able to solve errors like that just by looking at the code, and neither would an LLM.
I see the same issues with people who think AI is going to make scientific discoveries - it can't do that because making discoveries requires collecting data until something is certain or we have a clear picture. At that point, you don't need AI. AI won't be making discoveries until we can automate that entire process of forming a hypothesis, testing it / doing experiments, collecting data, refining your hypothesis, etc.
Yeah, I guess that there's just not a lot of value for me in "what does this error message mean" because once you've worked with a language or framework for any reasonable amount of time you learn them and they sort of disappear into the background. LLMs do seem good for learning new systems though.
I just throw all relevant source code files (entirely) on it, paste the error message and it usually shows me what's going on. Or at least it utters a hunch, which is a good next step to find the error.
But where's the limit? Say I have a really large project and the function in question is calling dozens of functions from all over the place. The time it takes me to follow each call chain and copy all the code probably takes longer than just debugging the problem myself.
I was hoping the GitHub or IntelliJ integration of Copilot would automate this. Especially the latter has excellent static analysis of your code and could automatically provide the AI with relevant context, but they just don't.
Even when asking it to annotate a function and specifically asking it to document any special cases or oddities it might have, I never got much more than e.g. updateFroobCache() annotated with "updates the froob cache". Wow, thanks.
Start high level, then loop for additional context, each pass being a more compressed summary? Symbols, then introspection, then expand/compress until there's enough summarized context within the window for it to act intelligently.
These tools are quite good when you need to write code in a language/framework you are not really familiar with. At the very least for scaffolding it saves a significant amount of time.
I think these situations are a bit paradoxical. If you don't know the language, you can't tell whether it's actually a good solution in that language, and I've seen so many bad answers that I'd be concerned if I weren't familiar with the language.
I feel the opposite. I seldom ask them for anything directly but they are amazing at understanding the context and autocomplete highly app-specific code I was about to write anyway.
> they’re really let down by the fact they don’t have high-level context of the project or business you’re working in.
Currently, you need to treat your LLM like it is a junior programmer. AI-coding tools and junior programmers will not give you the code you want if you don't write a detailed prompt. In my experience, however, AI coding tools will provide you with something closer to what you want than a junior programmer would.
I am also very disappointed that it will generate outdated code, like JS code that uses var, or code that looks 10+ years old because it doesn't use recent Array, String or DOM APIs. I can tell it to rewrite it, but imagine all the newbies who will use these tools and pick up outdated code and APIs.
This proves once again that there is zero intelligence here, just interpolation of code from its training data.
This is just two plugins, for VS Code and JetBrains, that let you summon GPT-4 or CodeLLM, and thus a viable alternative to Copilot.
Also, Apache Licensed.
A significant amount of work, of course, and it seems polished.
There's also tracking:
"We track:
- the steps that are run and their parameters
- whether you accept or reject suggestions (not the code itself)
- the traceback when an error occurs
- the name of your OS
- the name of the default model you configured
"
I really don't understand why all of the demos for the latest AI coding tools contain a part where the tool is asked to generate documentation based on... not much.
Looking at the doc strings generated in the video, I don't see how these add any value to the code on screen. Surely there are better use cases than that.
The comments play to the strengths of LLMs – they're just text transformation and don't need any understanding of why to look passably impressive on first glance.
I had the experience of GitHub Copilot introducing hard-to-catch, sometimes even security-relevant bugs into my code, due to missing context.
I developed a rental system web app in Django. It has a route that allows me to delete a rental, which checks whether you are the creator of the rental OR the admin of the group.
It was done right in the frontend, but in the backend it checked whether the request user is the rental creator OR whether the creator of the rental is the group admin.
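Roughly, the difference looks like this (simplified Python sketch, model and field names made up; this isn't the actual code):

    # intended check: the requester may delete if they created the rental
    # or they are the group's admin
    def can_delete(request, rental):
        return request.user == rental.creator or request.user == rental.group.admin

    # what Copilot autocompleted: it compares the creator to the admin instead,
    # so if a rental's creator happens to be the group admin, anyone passes the check
    def can_delete_buggy(request, rental):
        return request.user == rental.creator or rental.creator == rental.group.admin

The two versions even read similarly, which is exactly why it slipped past me at first.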
That really made me freak out for a second, and I started rechecking all the code I had autocompleted with Copilot.
It would become acceptable if the AI’s error rate is as low or lower as your own. But there is also the criticality of errors. Even if the AI’s error rate is below your own, the criticality of the errors may still be significantly higher. There is a bad zone where the error rate is low enough that you don’t thoroughly verify each and every generated piece of code, but where the error rate and/or criticality is still substantially higher than your own. This can be compounded by the use of the AI resulting in more code (and thus more bugs) being written in a shorter time.
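Purely illustrative numbers: if you normally write 1 serious bug per 1000 lines and the AI writes 0.5, but with the AI you produce three times as much code and its bugs are twice as likely to slip past review, you end up with roughly 0.5 × 3 = 1.5 bugs per unit of effort, each harder to catch, so the total risk goes up even though the per-line error rate went down.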
Why attach this to an IDE? I would think that simply letting an LLM control a terminal and make a commit would be the better approach.
I recently tried it with "You are in a directory with a web app that does ... and you want to implement feature .... In each step, you can use ls or any other bash command and I will give you the output".
It was pretty hilarious how the LLM found its way around the codebase with ls, cat, find, grep and awk, and actually even managed to edit the code that way and do a commit.
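The loop itself is trivial. A rough, untested Python sketch of what I mean (ask_llm() stands in for whatever chat-completion API you use, and you obviously want to run this in a sandbox, since it executes whatever the model says):

    import subprocess

    def ask_llm(messages):
        ...  # stand-in for your chat-completion API of choice

    messages = [{"role": "user", "content":
        "You are in a directory with a web app that does X and you want to "
        "implement feature Y. Reply with exactly one bash command per turn; "
        "I will reply with its output."}]

    for _ in range(30):  # hard cap on turns
        command = ask_llm(messages).strip()
        messages.append({"role": "assistant", "content": command})
        result = subprocess.run(command, shell=True, capture_output=True,
                                text=True, timeout=60)
        # feed the (truncated) output back as the next user turn
        messages.append({"role": "user",
                         "content": (result.stdout + result.stderr)[:4000]})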
Giving the LLM a bit better tools, like a version of "cat" that prefixes the lines with numbers, and a "swap" command that can do "swap 179 250 ..." to swap out the lines 179 to 250 with "..." would probably be enough to empower the LLM to be pretty efficient.
The next step might be to let the LLM manage its context window by allowing it to remove the last output with a command like "forget". So when the LLM does "cat somefile" and realizes that the output is not interesting, it can follow up with "forget" so the output will be replaced with "You deemed the output to be not interesting".
Those tools would probably evolve to make coding and managing the context window more and more efficient. Like "nicecat 100 200" to see the lines 100 to 200 with numbers prefixed. "keep 200 300" to forget the last output except lines 200 to 300 etc.
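Both "forget" and "keep" are really just edits on the message list you send back to the model. A minimal sketch, assuming the usual role/content chat-message format (untested):

    def forget(messages):
        # replace the last command output with a stub so it stops eating context
        messages[-1]["content"] = "You deemed the output to be not interesting."

    def keep(messages, first, last):
        # drop everything from the last output except lines first..last (1-based)
        lines = messages[-1]["content"].splitlines()
        messages[-1]["content"] = "\n".join(lines[first - 1:last])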
> Giving the LLM a bit better tools, like a version of "cat" that prefixes the lines with numbers, and a "swap" command that can do "swap 179 250 ..." to swap out the lines 179 to 250 with "..." would probably be enough to empower the LLM to be pretty efficient.
Say one of the files is talk.py and there is a function inside of it at line 17 which the LLM wants to change:
17 def hello():
18 print ("Hello!")
My approach would be to provide the LLM with a swap command, which takes filename, first line and last line and then treats the rest of the standard input as the new content:
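Something like this (untested Python sketch):

    #!/usr/bin/env python3
    # swap: replace lines FIRST..LAST (1-based, inclusive) of FILE with stdin
    # usage: swap FILE FIRST LAST
    import sys

    path, first, last = sys.argv[1], int(sys.argv[2]), int(sys.argv[3])
    new_lines = sys.stdin.read().splitlines(keepends=True)

    with open(path) as f:
        lines = f.readlines()

    lines[first - 1:last] = new_lines

    with open(path, "w") as f:
        f.writelines(lines)

The LLM could then rewrite the function above with "swap talk.py 17 18" followed by the new body on standard input, instead of having to emit the whole file.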
> It was pretty hilarious how the LLM found its way around the codebase with ls, cat, find, grep and awk, and actually even managed to edit the code that way and do a commit.
Please do write more. I'd love to replicate that. I keep coming across applications, where there is more "looking around the code" work than changing it. Even a slight help could mean a lot here!
Yeah, I am really waiting for these tools to be proactive. I don't find the idea of asking whether there is a bug very sensible. It should tell me when it looks like I wrote a bug, or when it has a suggestion for a refactor.
Depends on the model you use. The tiny quantized 7B models, yeah, it's like having a high-functioning teenage intern... but there are bigger, more sophisticated models, like Falcon 180B... you need hella RAM to run that bad boy though.
I'm thinking that of course you would review the commit and fix it if it's broken; you don't need to blindly trust it. The commit is just another way to see the code changes better.
Something like this with low latency, high accuracy voice recognition would be great. Pointing at a variable with the cursor and asking "what's this for?" could be much easier than writing out an entire sentence.
Continue has a "headless" mode which might be what you need to integrate into other IDEs. The fact that it already supports two different IDEs is a good sign that support for more can be added in the future.
Out of interest, what are you self-hosting and where? Also, isn't the core product here the tuning of an LLM to this use case? If so, how does that work with self-hosting?
> Out of interest, what are you self-hosting and where?
Nothing yet, but I've been playing around with a few models on vast.ai, and want to run them in my homelab soon. The low price of used 3090s and the increased performance and efficiency of new open source models makes this relatively accessible now.
> Also, isn't the core product here the tuning of an LLM to this use case? If so, how does that work with self-hosting?
The core product here seems to be the ability to use any LLM inside an IDE. I've been avoiding proprietary LLMs for this purpose, so I'm interested in a solution that integrates with local models.
Thanks, and yeah I did a bit more reading and this does seem to be much more about the IDE integration than the model. I'm curious as to why, because the IDE integration seems to be the "easy bit" – I've used 3 different ones now and they're all about the same. I'd have expected that the main lever the product has in being better than others is having a custom model that understands code edits much more than others.
> I'd have expected that the main lever the product has in being better than others is having a custom model that understands code edits much more than others.
True, but this is not something this particular product would solve. There are already models specifically trained to work on code. What's appealing to me is the flexibility of being able to choose which one to use, rather than my workflow being tied to a specific product or company.
> the IDE integration seems to be the "easy bit"
I admittedly haven't researched this much, but this is not currently the case. There is no generic API for LLMs that IDEs can plug into, so all plugins must target a specific model. We ultimately need an equivalent of a Language Server Protocol for LLMs, and while such a project exists[1], it looks to be in its infancy, understandably so.
Can you share more. For example with which model are you using it, how much does it cost per month. What do you think is the most helpful aspect of it?
I'm using it in VS Code to help me write my Flutter app. I used their preview model (ChatGPT 3.5?) most of the time; now I've switched to my own ChatGPT 4 API key to get more context. It has replaced StackOverflow for me and it helped me learn Flutter, as it was able to explain its own code to me. Really great!
There are some that respond to easy-to-answer issues in projects, like dosu.dev, but they aren't at a level where they can actually create fixes or features for complex codebases without significant engineer review and changes, which just ends up increasing the burden rather than reducing it. The best the top-tier LLMs can manage right now is minor tech support.
My only request for these tool-makers is: don't be shy. Tell me why I will stop paying for GitHub Copilot and pay for your tool. Make me sign up ASAP for a trial. Please don't pretend Copilot doesn't exist in the same universe. You probably have 5 seconds max (and I don't say this out of self-importance. I say this out of being a desperate little piece-of-shit who eliminates everything from their life that doesn't keep them on the path to their already less-than-ideal productivity :( I just can't afford that attention you demand)
It helps reduce the number of hoops I have to jump through. Previously: hit a problem, use Google search or read the official docs, StackOverflow, etc. Now: ask Copilot and assess if the solution makes sense to me. Often, it does.
It’s a limited resource about as useful as stack overflow, with nearly the same pitfalls. However, in the hands of a developer who would otherwise lack the confidence in making changes, it may be quite a bit worse.
I don't want to write in an extra window for code completion.
Maybe it's just me, but chatting in a separate window... what's the point? I can just open a tab in a browser with ChatGPT.
What I want is something that understands my code base and can give me tips on which functions to use or where to find stuff and so on. That's what costs me time.
Understanding someone else's code, or my old code. That takes time.
Writing tests for code is what takes away my time.
Writing a bad version of bubble sort is not what I want.
Looks pretty good. I wonder how intuitive it'll feel to work with the chat panel open at the side — that'll take some adjustment, but I might give it a go and see how I get on. See if there's a net productivity increase and what the cost is versus just using ChatGPT+ when stuck/exploring.
It's weird that this doesn't obviously say what languages it handles. I see examples in TypeScript and Python, but nowhere is a list of supported languages.
I'm using Codeium and it's good. I only tried one other one which didn't turn out to be as free as you'd think (I don't remember the name, probably for the better)
Especially for the free ones, just install one and try it for a day or two
[0]: https://news.ycombinator.com/item?id=36882146
[1]: https://github.com/continuedev/continue/blob/33a0436193aa65a...