Hacker News
Show HN: Review GitHub PRs with AI/LLMs (coderabbit.ai)
2 points by gillh on July 14, 2023 | 4 comments
Hello HN readers!

Our team has built an AI/LLM-driven code review tool that helps improve developer velocity and code quality.

Its unique features are:

- Line-by-line code change suggestions: Reviews the changes line by line and provides code change suggestions that can be directly committed from the GitHub UI.

- Continuous, incremental reviews: Reviews are performed on each commit within a pull request rather than a one-time review on the entire pull request.

- Cost-effective and reduced noise: Incremental reviews reduce noise and cost by tracking only the files changed between commits and the base of the pull request (see the sketch after this list).

- Chat with the bot: Supports conversation with the bot in the context of lines of code or entire files, helpful in providing context, generating test cases, and reducing code complexity.

- Smart review skipping: By default, skips in-depth review for simple changes (e.g., typo fixes) and when changes look good for the most part.
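
To make the incremental-review bullet concrete, here is a minimal Go sketch of one way change tracking between commits can work. The git invocation, the commit refs, and the changedFiles helper are illustrative assumptions, not CodeRabbit's actual implementation.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// changedFiles lists the files that differ between two commits using
// `git diff --name-only`; the three-dot form compares against the merge
// base, which is how a pull-request diff is usually computed.
func changedFiles(from, to string) ([]string, error) {
	out, err := exec.Command("git", "diff", "--name-only", from+"..."+to).Output()
	if err != nil {
		return nil, err
	}
	return strings.Fields(string(out)), nil
}

func main() {
	// Hypothetical refs: the PR base, the last commit that was reviewed,
	// and the head that was just pushed.
	base, lastReviewed, head := "origin/main", "abc1234", "def5678"

	inPR, _ := changedFiles(base, head)                // everything the PR touches
	sinceReview, _ := changedFiles(lastReviewed, head) // what changed since the last review

	prFiles := map[string]bool{}
	for _, f := range inPR {
		prFiles[f] = true
	}

	// Only files that are both part of the PR and touched since the last
	// reviewed commit need another look, which keeps prompts small.
	for _, f := range sinceReview {
		if prFiles[f] {
			fmt.Println("re-review:", f)
		}
	}
}
```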

We would love for the HN community to try it out on their GitHub repos and share feedback! We will happily answer any technical questions about the prompt engineering behind this project.



At the time of writing, the first sample image on that page is this:

https://coderabbit.ai/assets/section-1-f9a48066.png

which recommends adding a "maxIterations" counter to the "for len(executedComponents) ..." loop here:

https://github.com/fluxninja/aperture/blob/26e00ea818c7c28da...

HOWEVER

- the review has failed to notice the logic using "numExecutedBefore" (around line 377) that already prevents the specific bug it is suggesting a fix for

- the suggested change decrements "maxIterations" inside the "for ... range circuit.components {" loop, which means it isn't counting iterations of the outer loop; it's counting components visited (sketched below)
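
To make that second point concrete, here is a self-contained, paraphrased sketch of the loop shape (not the actual aperture source; the components and dependencies are stand-ins). With a dependency chain that needs one outer pass per component, a bound of 6 would be generous if it counted outer passes, yet the suggested placement trips it after only 6 component visits:

```go
package main

import "fmt"

// Paraphrased sketch of the loop shape in question; this is NOT the actual
// aperture source. Components and their dependencies are stand-ins.
type component struct {
	name string
	deps []string
}

func main() {
	// A dependency chain: each pass of the outer loop can execute only one
	// new component, so finishing needs len(components) = 4 outer passes.
	components := []component{
		{"d", []string{"c"}},
		{"c", []string{"b"}},
		{"b", []string{"a"}},
		{"a", nil},
	}
	executed := map[string]bool{}

	maxIterations := 6 // arbitrary demo bound; generous if it counted outer passes
	visits := 0

	for len(executed) < len(components) {
		for _, comp := range components {
			if executed[comp.name] {
				continue
			}
			ready := true
			for _, d := range comp.deps {
				if !executed[d] {
					ready = false
				}
			}
			if ready {
				executed[comp.name] = true
			}

			// Decrementing here, inside the range over components (as the
			// suggestion does), counts component visits per pass rather than
			// iterations of the outer loop.
			visits++
			maxIterations--
			if maxIterations <= 0 {
				fmt.Printf("gave up after %d component visits, %d/%d components executed\n",
					visits, len(executed), len(components))
				return
			}
		}
		// A genuine iteration bound would be decremented here, once per outer pass.
	}
	fmt.Println("all components executed after", visits, "component visits")
}
```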

This kind of suggestion is particularly nasty because it's unlikely that the test suite populates enough components to hit "maxIterations" - so an inattentive reader could accept it, get a green build, and then deploy a production bug!


Good observation. We were able to prevent issues caused by limited context by sending the entire file in the prompt, and the results were pretty amazing. We eventually reverted that change for two reasons:

1. Limited tokens: an 8K-token context window is sometimes not enough to hold the entire file. Perhaps 32K tokens (or Claude 2's 100K tokens) could circumvent this.

2. Cost: GPT-4 is super expensive. Our current usage is roughly $20 a day, but when we sent entire files, usage shot up to around $60 a day.

Our hope is that both the cost and token limit improve in the future so that we can send the entire file in each review request.
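
For illustration, here is a rough Go sketch of the kind of guard this trade-off implies: estimate whether the whole file fits the prompt budget and fall back to just the diff hunk when it does not. The 4-characters-per-token heuristic, the budget number, and the helper names are assumptions for the example, not the production logic.

```go
package main

import "fmt"

// estimateTokens is a crude heuristic (~4 characters per token for English
// text and code); a real implementation would use a proper tokenizer such
// as tiktoken. It exists here only to make the budget check concrete.
func estimateTokens(s string) int {
	return len(s) / 4
}

// buildContext sends the whole file when it fits the budget and otherwise
// degrades to just the diff hunk. The fallback policy is a stand-in, not
// CodeRabbit's production logic.
func buildContext(wholeFile, diffHunk string, budgetTokens int) string {
	if estimateTokens(wholeFile) <= budgetTokens {
		return wholeFile
	}
	return diffHunk
}

func main() {
	file := "package main\n// ... imagine a few thousand lines of source here ...\n"
	hunk := "@@ -370,7 +370,9 @@\n+ // changed lines go here\n"

	// Roughly what is left of an 8K-token window after the system prompt,
	// the instructions, and the model's reply are accounted for (assumed).
	const promptBudget = 6000

	context := buildContext(file, hunk, promptBudget)
	fmt.Println("approximate prompt tokens:", estimateTokens(context))
}
```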


It looks pretty promising, but I think one of the problems with AI-checked PRs is that they give plausible-sounding answers that aren't really correct, which is why Stack Overflow has banned them from the platform. The line-by-line checking could help, though.


It’s all about providing relevant context in the prompts and using a better model where reasoning is needed, e.g., GPT-4.

If you are curious, some of our prompts are in the open-source version.

Here is the link: https://github.com/coderabbitai/openai-pr-reviewer
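
As a rough illustration of that split, here is a small Go sketch of routing summarization to a cheaper model and the actual review to GPT-4, and of assembling a per-hunk prompt with surrounding context. The prompt wording, the file path, and the choice of gpt-3.5-turbo for the cheap tier are assumptions for the example; the real prompts live in the repository linked above.

```go
package main

import "fmt"

// pickModel routes work that does not need deep reasoning (e.g. summarizing
// a hunk) to a cheaper model and the actual review to GPT-4. The choice of
// gpt-3.5-turbo for the cheap tier is an assumption for this sketch.
func pickModel(needsReasoning bool) string {
	if needsReasoning {
		return "gpt-4"
	}
	return "gpt-3.5-turbo"
}

// reviewPrompt assembles a per-hunk prompt with the context the model needs:
// the file path, a short summary of the change so far, and the hunk itself.
// The wording is illustrative, not the project's actual prompt.
func reviewPrompt(path, summary, hunk string) string {
	return fmt.Sprintf(
		"You are reviewing a pull request.\nFile: %s\nSummary of the change so far: %s\n\nDiff hunk:\n%s\n\nPoint out bugs and suggest concrete, committable fixes.",
		path, summary, hunk)
}

func main() {
	// Hypothetical inputs for the example.
	path := "circuit.go"
	summary := "Adds an iteration guard to the circuit execution loop."
	hunk := "@@ -375,6 +375,7 @@\n+ maxIterations--\n"

	fmt.Println("summarize with:", pickModel(false))
	fmt.Println("review with:", pickModel(true))
	fmt.Println(reviewPrompt(path, summary, hunk))
}
```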



