Hacker News
Ask HN: What is the, currently, best Programming LLM (copilot) subscription?
85 points by yumraj on March 7, 2024 | 99 comments
Primarily interested in Go, Python, Swift, Android (Java) and Web stuff. So, relatively mainstream languages, nothing too esoteric.

What have people found to be the best copilot paid subscription, at the moment?




I have been a big user of ChatGPT-4 ever since it came out. I have used it in various forms: through the chatbot, through GitHub Copilot, etc. For the past year, I have looked high and low for an alternative that performed better. Nothing came close, until a few days ago, when Anthropic released Claude 3. For software development work, it is very, very impressive. I've been running GPT-4 and Claude 3 Opus side by side for a few days, and I have now decided to cancel all my AI subscriptions except Claude 3.

For software development work, no other model comes close.


+1, Claude 3 Opus is a game changer.

Since OP asked about a Copilot product, we (YC W23) actually built https://double.bot to do exactly this: high quality UX + most capable models. Although we don't have a subscription right now, it's free for the time being :)


At a glance, Double seems to be similar to Sourcegraph Cody, but Cody is half the price. What does Double offer that Cody doesn't?


Don’t have a subscription? What about this? [0]

Not that I mind a subscription, I’d pay it in a heartbeat if there were a JetBrains plugin.

[0] https://docs.double.bot/pricing


We implemented the subscription a few hours ago, and we still have a very generous free tier. Unfortunately, we learnt that it's not possible to have an uncapped free tier: we saw several abusers using our API to run their own apps, which is obviously against the spirit of a coding copilot :)

A JetBrains plugin is on the roadmap, coming soon!


Awesome, I'll be sure to try it out when the JetBrains plugin is available.


Looks great! Do you have plans to release a JetBrains plugin in the future?


Free is even better ;) I'll take a look, thanks.


For those who have read Vonnegut, the Claude logo is hilarious.


Any plans for Emacs support?


How do you get access to Claude 3? It doesn’t seem to be available yet.


I've only used it through third party services but I think it's available both on their site and as an API. Can't confirm though, I'm in Europe.


https://double.bot for VS Code works really well for me. https://writingmate.ai/labs for typical browser chat ui (looks ok on mobile too).


Wasn't aware that it's not widely available but if you can't find it anywhere else, we have it available in our free tier on Double: https://double.bot


Are you planning to offer plugins for other IDEs, like IntelliJ?


Isn't it available via bedrock already?


Only Sonnet, unless there is a special release group that has access to the other models.


Not sure what bedrock is, but last time I looked at their website it claimed it isn't available in my region.

I'm in Ireland, so I interpreted that as "Not out yet".


Here is a list of countries Claude is available: https://support.anthropic.com/en/articles/8461763-where-can-...


Lots of countries in that list... but none from the EU :(


Also no Canada for some reason??


> I'm in Ireland, so I interpreted that as "Not out yet".

What I interpret it as is "Not compliant with GDPR yet".


Another option: Kagi's chat feature lets you choose Claude 3 Opus (or GPT-4 or Mistral).


You have to pay for Kagi Ultimate ($25/month) to access it I think.


claude.ai


OpenAI API key with GPT4 plus aider[1]: https://github.com/paul-gauthier/aider

For reference against the actual product called “Copilot”: I would say this is actually useful, whereas I'd describe Copilot with words like “glorified autocomplete” and “slightly better IntelliSense”. It's only really good for periodically saving you some rote typing.

The primary limit with “aider” is the GPT-4 context window, so the main learning curve is learning to work within that.
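For anyone who hasn't tried this workflow, here is a minimal setup sketch (assuming a standard pip install of aider; exact commands and flags may vary by version, and `main.py`/`utils.py` are placeholder file names):

```shell
# Install aider and provide your OpenAI API key
pip install aider-chat
export OPENAI_API_KEY=sk-...   # placeholder; use your own key

# Start an aider session scoped to the files you want edited;
# keeping the file list small helps stay within the GPT-4 context window
aider main.py utils.py
```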

I have been curious about Sourcegraph's “Cody” product, if anyone here has tried it.


re Cody - I found it very similar to Codium, Copilot, et al. Impressive when it works, but it consistently has trouble identifying the right context to inject or using it effectively. These tools are at their best when writing code that does not require a lot of local context: speeding up standard operations that need a lot of boilerplate, or using publicly documented, stable APIs, is where they shine.

re getting some keys - for most tasks I have not yet found a better tool than finding a chat interface to your liking, setting a fairly generic “you're an expert programmer” system prompt with output instructions tuned to your needs, and manually adding relevant context (copy/paste) to your messages when relevant. I can't wait for big context windows and RAG methods to improve enough to replace all that with one of these assistants, but it's just not there yet.


I love aider but it feels like a giant black box that consumes copious credits.

The output is amazing though, so I keep using it. But I feel like I have no insight or control. Maybe I should read the docs more carefully.


I use mine in earnest, with the full context window filled, prompting probably 20-30x a week, and have not gotten a monthly bill over $5 USD in a long while. When I first started using it last year, when GPT-4 access was limited and tokens were expensive, it was a lot more; I think my bills were around $200 USD per month at that point.


I switched from GitHub Copilot to Sourcegraph and have not wanted to switch back yet.

The output is more or less of similar quality as far as I can tell. However, what makes me like Sourcegraph's Cody more is the more advanced interface it gives me in VS Code. It lets me do all the things I want (get quick answers to code-related questions, refactor code, write tests, and explain confusing code), all one hotkey away.


In addition to paid subscriptions, you might want to consider running an LLM locally. One of a number of projects enabling this approach is "continue" - https://github.com/continuedev/continue


The issue with local, IMO, is that you're going to have to compromise on quality. And I really don't want to waste my time talking with inferior models when better models are widely available.


Yeah, at the time of writing this comment, it feels like if you want to easily convince me that your local LLM is good enough, just do a side-by-side comparison of prompts with your LLM vs GPT-4. It doesn't even have to be better; it just has to be roughly in the same ballpark.


I'd like to try this, do you have some prompts you like to use for evaluation?


people who think that their local LLMs are in the same ballpark as GPT or Claude are delusional


Haven't found any good local model yet that rivals or comes close to GPT-4


Have you tried miqu q5?


I have dabbled with the quant version (exl), still not the same. Which quant/version are you using? There are so many that the output shifts dramatically!


I'm using gguf q5 and I find the output degrades going to even q4. Mistral has said they still intend to open newer bigger models, so I'm really looking forward to that.


The AI Assistant in JetBrains is the best I've encountered. In Pycharm, specifically.

I've tried using my OpenAI subscription directly to use GPT4, but copy-pasting between the browser and the lack of context (that GPT has of my code) is limiting.

I've tried Copilot, but it just produces large amounts of junk code, and it gets irritating.

JetBrains' AI Assistant is built into the IDE I use, it melds well with all the pre-existing coding assistance the IDE offers, the conversation interface is useful and can answer questions about specific parts of the code (I habitually highlight whatever I'm interested in and ask it about that), and it's good when I ask it to write tests.


I need to try it out. I have been using a mixture of the Copilot plugin for autocomplete and then a third party package that uses an API key for OpenAI. I enjoy the third party package because I can just use 3.5/4 directly with my own system prompt instead of the garbage interface for Copilot or ChatGPT directly.

Edit: Just tried it out again; honestly, pretty disappointing. It works, and it's good they released it (GitHub has been ignoring the JetBrains experience), but their implementation has a bloated UI and I don't like the way they handle prewritten prompts.

I really like this implementation. https://github.com/didalgolab/chatgpt-intellij-plugin


TIL that there are also Copilot plugins for JetBrains IDEs


I don't think it gets much love. I'm using it, and ... well I guess I get to stay inside my IDE, but other than that it just gets in the way.

Then again, I haven't given it much of a fair shake, perhaps. I do find AI Assistant a lot more compelling.


You also get them for Vim, Emacs, <insert editor here> just about.


No you cannot. Copilot plugins are currently only supported for NeoVim, VS Code, and JetBrains IDEs. Emacs is not on the list.


I must be imagining copilot.el, then.


Jinx!


Phind released a new model I think about a week or two back and I added their VS code plug-in.

I really like it because it makes using GPT 4 or their proprietary model much easier than copying and pasting between a browser and VS code.

It is a paid subscription, but it includes 500 messages per day, which is way better than the time window on ChatGPT for me.


Cursor is really great. It will do codebase search as well as actually apply the code it suggests from the chat interface. It has some other neat features, like checking your staged changes for bugs.


One thing that's really annoying is that it doesn't offer suggestions or fixes for (compile) errors, which some other plug-ins do.


Copilot++ in Cursor is a beast, it's my favorite feature of this app.


ChatGPT 4 with manual explanations and code snippets seems to work best so far. IDE-integrated tools have the potential, but are still really immature.

In particular, I haven't found any tools that can do any of the following:

- Comprehend a large codebase. All assistants seem to have a pretty narrow scope for the context, like the current file or a few of its neighbors. E.g. they will write code that already exists somewhere as a ready-to-use utility function if that function isn't mentioned anywhere near the location I'm working on.

- Automate anything. IDE integration is extremely minimal in everything I've seen so far. A copilot can spit out a new version of a piece of code that can be dropped in with a click, but that's the extent of it. It otherwise can't replace a human in the loop, even if the work is simple, trivial, and highly repetitive. E.g. a regex-powered "replace all" still cannot be replaced with a natural language query.

- Do any repetitive work without making mistakes or hallucinating something. E.g. a copilot can try to generate a unit test for me, and the overall structure can be quite decent, but I can never be sure it won't miss or invent something if it involves creating some complex object and checking all its fields after some transformation.


This is great, going to incorporate this into our roadmap.

What else should an IDE integrated coding copilot be able to do?


Who is "our"? Are you affiliated with OP in some way?


I've been pretty happy with Codeium's free offering. It does good context completions for me and, based on context, gives me the next bits I'm about to type.

I used Cody at work, and the best feature I found was for chore work - I could highlight a struct (for example) and give it instructions for how to transform it, and it did a good job there.


Consider the Phind Model:

> Phind Model beats GPT-4 at coding, with GPT-3.5 speed and 16k context (https://phind.com)

https://news.ycombinator.com/item?id=38088538


If you use JetBrains products, I've been pretty happy with their AI assistant. I feel like it's less disruptive than GitHub Copilot. If I ask it to write some code, it does a pretty decent job, otherwise it stays out of my way.


I don't think it's the best assistant by ability, but if you're like me and you bought a lifetime Tabnine subscription for $29 when it debuted on HN in 2018 (https://news.ycombinator.com/item?id=18393364), maybe reinstall it and try it out. I've been using it and I would say it is better than nothing.


The creator of tabnine also just released supermaven. I’ve liked it so far because it’s fast and holds a lot more of my project in context (300k tokens) so it’s better at using data structures or functions I’ve defined elsewhere.


What I like most about Supermaven is how it's trying to complete the diff, not the file. It knows the field I just added in another file. And it works towards a commit. I still have to figure out which files to open, and I hope there will be solutions to that.


For me, personally, Copilot is the best because you are typing and it is just suggesting. Very often I type something, wait 2 seconds for its answer (knowing it is going to get it right 80% of the time) and press tab. Sure, it misses a ton, but I see it as a great autocorrect that is always there trying to help. Now, copilot chat and everything else are very bad. I don't like them.

When I'm being "intentional", I usually just use ChatGPT-4 directly. I've had a super pleasant experience with Cursor, too. You can just open Cursor and say "write a readme for me", and it will scan your project and write a decent readme. It is also eeeeextremely goooood at "I need to know where in the codebase this is happening", and it finds it. But for day-to-day work I still prefer ChatGPT-4.

So I would say:

- If you want a nice autocorrect, copilot

- If you want something that knows your project and tries to help, cursor

- If you just want something smart, chatgpt with gpt-4.

I would say copilot lets me be 15% faster daily, and gpt-4 about 30% faster (when I need it), but I need it less often - maybe twice per day.


Have you run into issues with Copilot where it does things like:

- Doesn't close brackets / adds too many closing brackets

- Spams completions while you're writing comments

- Doesn't auto import the required libraries after accepting a suggestion

- No multi cursor mode

- Refuses to name variables

- Refuses to trigger in the middle of a line

- Lower quality code than ChatGPT (probably because they use a smaller model)

Asking because I was also a heavy Github Copilot user but after 1-2 years of constant subpar UX and seemingly no effort from the Copilot team to fix it, I just went ahead and built my own extension that fixes all of the above: https://double.bot

Also it wasn't on your list but you mentioned GPT-4 is smart. You should try the new Claude 3 Opus, we implemented it on Monday and have been getting super positive feedback from users!


I’d love to always know how much code I’m sending to a third party when using a tool like this.

What about autocompletion on sensitive files like .env?


Even the free ChatGPT-3.5 beats paid GitHub Copilot in code quality and problem solving.

I used Copilot for a while, and it's no more than very smart autofill and good at generating boilerplate.

ChatGPT actually solves (some, minor) problems. I kept using Copilot because of comfort, and because it had the context. But I stopped using it 2-3 months ago, and haven't looked back.


Was an early adopter of Github Copilot but I quickly turned it off because of the noise. I really like AI for "gap filling" (for example, I used AI extensively to help me write my bash deployment scripts for simpatico.io) and porting tasks, both of which lend themselves to copy/paste into a browser.


When you say noise do you mean the autocomplete just spamming you all the time, like for example in the middle of writing a comment?


Yes, it would often offer unwanted and/or incorrect suggestions that I found distracting. The 10% of suggestions that were useful did not justify it. My usage pattern turns out to be quite copy/paste friendly: it leaves my editor quiet but still gives me the ability to answer "knowledge gap" type questions.


I was asking myself the same question not so long ago, because GitHub Copilot's bugs kept bothering me, and this is what got us to go into YC and build https://double.bot

If what you are looking for is a high quality UX plus access to the most capable models (today that'd be the GPT-4 killer, Claude 3 Opus), then I think you'll like what we've built.

Double has similar features to all the other copilots (code autocomplete, chat, etc.), but we've put particular care into getting the small details right (even the smallest things, like making sure our autocomplete closes brackets appropriately). If you're coming from GitHub Copilot, you should check out this side-by-side comparison: https://docs.double.bot/copilot


A lot of us are already paying for API access from openai/azure/claude

Can Double's plugin accept bring-your-own-key, like cursor.sh?

Would love to use an extension with my favorite editor instead of adapting a whole separate editor like cursor.sh

Thanks


I'm biased, but Sourcegraph Cody has changed the way I write code.

https://sourcegraph.com/cody

And we just shipped support for Claude 3 (Opus and Sonnet) as well as using any local LLM via Ollama for completions and chat.


Does the Claude 3 support only apply to the VSCode plugin right now? I'm using the Jetbrains plugin and I don't see an option to use Claude 3.


Currently it is only VS Code, but adding support for JetBrains in the next week or so. :)


Sounds great! I just subscribed to Cody, looking forward to testing the JetBrains support.


This isn't completely answering the question you asked but https://www.blackbox.ai/ was my favorite after testing out some free ones.

I know I liked it better than TabNine, CodeGPT, and Codeium in VS Code. I'm pretty sure I tested a few others from the top of the list when you search the extension marketplace, but I forgot which ones.

Blackbox just seemed more intuitive in use and had better answers. They don't have many followers on Twitter; I'm surprised more people aren't talking about it.


The giant typos in "assistant" throughout the page give this such an unprofessional feel.


Have a look at StarCoder 2. And here's my db of 300+ LLMs:

https://lifearchitect.ai/models-table/


So basically your db says Claude 3 Opus is the best coding LLM, correct?


I'm using supermaven these days. Big enough context to understand my full codebase well (I am often working on relatively small projects though), and it's super fast. They trained a custom model, I believe on edit sequences.

These copilots are way more useful when their knowledge context is cross-file; anything restricted to the current active file, or that makes me do work to select some file subset, I'm gonna pass on.


www.phind.com is what I've found to be the best generally, using searches. Their recent 70B model seems really good.


> Phind Model beats GPT-4 at coding, with GPT-3.5 speed and 16k context (phind.com)

https://news.ycombinator.com/item?id=38088538


Duet AI from Google. I initially tried Copilot and it was sometimes slow in responding. Duet, I feel, is hyper fast, and it sometimes suggests elegant solutions. I don't know if that has to do with it being trained on Google's own code base, but I felt this particularly with Go code.


I’m pretty happy with Continue, which is a free VS Code extension. You can use an OpenAI API key or local models. I don’t think it does autocomplete, just pair programming, but I’ve found AI autocomplete to be more of a distraction than anything, so it’s good for me.


The beta version includes autocomplete: https://continue.dev/docs/walkthroughs/tab-autocomplete


Is Claude 3 pay-per-use or a fixed monthly fee? I've used it a bit with the free 5 USD they give you, and it's looking so, so much better than GPT-4... the problem is I have the feeling it's also much more expensive.


Software dev is one of those careers where top performers earn a lot of money, so IMO a tool that generates higher quality code is a no-brainer, even if it's 'expensive'. Similar to a Bloomberg terminal if you are in IB (those things are ~$25k/yr).

That being said if you want to keep playing with Claude 3 Opus, we have it available on our free tier at https://double.bot


For the answer of greatest validity, you should e.g. check out the "HumanEval" LLM benchmark on the HuggingFace website. That is one of the best objective sources of info on this issue. Currently Claude 3 is the best, far superior to other models: roughly 85% of code tasks done correctly, vs. approximately 65% for ChatGPT 4. This is a giant difference (about 15% errors vs. 35%, so Claude 3 makes roughly 2.3x fewer errors in terms of code quality).
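A quick sanity check of the error-rate arithmetic (using the approximate pass rates this commenter quotes, not official benchmark figures):

```python
# Approximate HumanEval pass rates quoted above
# (this commenter's figures, not official numbers)
claude3_pass = 0.85
gpt4_pass = 0.65

# Error rates are the complements of the pass rates
claude3_err = 1 - claude3_pass  # ~0.15
gpt4_err = 1 - gpt4_pass        # ~0.35

# On these numbers, GPT-4's error rate is roughly 2.3x Claude 3's
print(round(gpt4_err / claude3_err, 1))  # 2.3
```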


In my limited testing, Greptile, posted here recently, was really good for understanding an existing codebase, which is a large part of real-world programming.


Supermaven slaps


ChatGPT 4, but I'm biased (paid to use)


How do I get paid to use LLMs? That's the dream.


I think he means he paid to use ChatGPT.


Neither GitHub Copilot nor GPT-4 is worth your time. At best they partially guess the name of a function you're thinking about typing; at worst they give you almost a correct answer. I've been shocked by how close those models will get to almost understanding what you're attempting to do, while still fundamentally getting it wrong. Last month, after a while of realizing I was spending more time correcting suggestions than I was saving, I stopped using them, and I will need to see some major improvements before I can feel comfortable using them again.


This resonates with my recent experience using Bard (not the latest version of Gemini). It would produce something that initially seemed surprisingly good, but then when I actually tried to run it, it turned out to be totally broken. I'd ask it to fix the error; it would magically do so, but then be broken in a different way. It felt like pair programming with a junior programmer who just didn't quite get it.

This was just me interacting through text prompts. I could imagine some kind of more integrated solution, where you can provide some basic test cases and the system runs those cases against the code proposed by the LLM; that could go a long way towards improving this.

For now it seems mainly useful as a way of getting a quick first draft of some code which I then have to fix up and get fully working myself.


I see a whole lot of potential in these tools, and in some domains they are starting to deliver on some of the promise. But by and large I agree with this statement - they're actually costing me time because I have to do the research to see where they went wrong. I'm better off learning it properly and idiomatically from scratch right now.


I don't get this at all. What kind of code are you writing that you have to literally go and research what it spat out?

In my experience 90% of code is 90% the same as another piece of code in the same repo, with small differences, and copilot will make you fly writing that code.

If you can't read the output code, does it mean the rest of your codebase is similarly unreadable?

The complexity in a codebase or a system is usually from different parts integrating or an overall architecture, but that's totally different to an individual function


"What kind of code are you writing that you have to literally go and research what it spat out" - so in a recent case I was trying to work with Elasticsearch. I'm not an expert in that, so I asked it to do some things. It hallucinated a bunch, and I ended up having to dive in and learn it deeply anyway.

In that case I think I was better off not relying on the tool. I do find it nice to steer me in a direction, but the things I use tend to be niche enough that I don't get the benefit many others do.

I also have a feeling you and I are using it in different capacities.


This can happen when you are working with new codebases or APIs. For example, recently I tried to build this small GNOME extension [0], but I had zero experience with the API, so I tried ChatGPT.

Even though the structure of the code in the file was ok, it called some APIs that did not exist, it created a new var `this._menu` for the dropdown that was not needed (this.menu already exists) and in the end I still had to go through gnome extensions docs to figure out how to do it right.

Overall I don't regret using it, but the experience wasn't magical, as I guess we all want it to be.

[0]: https://github.com/onel/keyboard-cat-defense


Agreed that Github Copilot is not worth anyone's valuable time. You should check out the new Claude 3 Opus model, it's noticeably better. Right off the bat, it's less 'lazy' with its generations, and for me, it has been able to solve bugs that GPT-4 could not solve.

Just this week we made it available on https://double.bot (VS Code Copilot extension). We have been getting similar feedback from multiple users.


I hadn't explicitly asked that, but that is what I've been curious about too: are these currently good enough? Which is why I wanted to start with whichever is currently the best, rather than bang my head against ones that are not good.

I had tried a simple experiment of generating a basic Go REST service and some XML parsing code using Bard and ChatGPT, and they were actually not bad. But that was very simple, brand-new code.



