
JP here! Would love to answer your questions!

We listed a bunch of ideas for larger improvements in the blog: Instant app; Up-to-date docs; Prompt/product-first workflows; Browser IDE; Local/on-prem models; Live collaboration; Parallel agents; Code variants; Shared context; Open source sharing; MCP marketplace; Integrated CI; Monitoring/production agents; Security agents; Sketching.

What would you like us to build?




The obvious thing would be LSP interrogation, which would allow the token context to be significantly smaller than entire files. If you have one file open, and you are working on a function that calls out to N other modules, instead of packing the context with N files, you get ONLY the sections of those files the LSP tells you to look at.
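
Rough sketch of the interaction I mean, talking JSON-RPC over stdio to a language server (pylsp here just as an example; the repo paths are placeholders):

    # Ask the server where a symbol is defined, then pack only that Range
    # into the prompt instead of N whole files.
    import json, subprocess

    server = subprocess.Popen(["pylsp"], stdin=subprocess.PIPE,
                              stdout=subprocess.PIPE)

    def send(method, params, msg_id=None):
        msg = {"jsonrpc": "2.0", "method": method, "params": params}
        if msg_id is not None:
            msg["id"] = msg_id
        body = json.dumps(msg).encode()
        server.stdin.write(b"Content-Length: %d\r\n\r\n" % len(body) + body)
        server.stdin.flush()

    send("initialize", {"processId": None,
                        "rootUri": "file:///path/to/repo",
                        "capabilities": {}}, msg_id=1)
    send("textDocument/definition",
         {"textDocument": {"uri": "file:///path/to/repo/main.py"},
          "position": {"line": 41, "character": 10}}, msg_id=2)
    # (Reading the Content-Length framed responses is omitted for brevity.)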


Yes! This is high on our list. Context window compression is a big deal, and this is one of the main ways to do it, IMO.

Have you tried any tools that do this particularly well?


One thing that I think would be cool, and that could perhaps be a good starting point, is a TDD agent. How I imagine this working:

The user (who is a developer) writes tests, and a description of the desired application. The agent attempts to build the application, compiles the code, runs the tests, and automatically feeds any compiler errors and test failures back to the agent so that it can fix its own mistakes without input from the user.

Based on my experience of current programming agents, I imagine it'll take the agent a couple of attempts to get an application that compiles and passes all the tests. What would be really great to see is an agent (with a companion application probably) that automates all those retries in a good way.

I imagine the hardest parts will be interpreting compiler output, and (this is where things get real tricky) test output, and translating that into code changes in the existing code base.
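
The retry loop itself seems mechanical enough, though. Something like this, where ask_agent() is a stand-in for whatever sends the prompt to the agent and applies its patch, and the build/test commands are whatever your toolchain uses:

    import subprocess

    def run(cmd):
        return subprocess.run(cmd, capture_output=True, text=True)

    for attempt in range(5):
        build = run(["make", "build"])        # swap for your compiler
        if build.returncode != 0:
            ask_agent("The build failed, fix it:\n" + build.stderr)
            continue
        tests = run(["make", "test"])         # swap for your test runner
        if tests.returncode == 0:
            break                             # compiles and all tests pass
        # Feed the failures straight back in; no human in the loop.
        ask_agent("These tests failed, fix the code:\n" + tests.stdout)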


Yeah, this is a great workflow! What's more, agents are particularly good at writing tests, since they're simpler and mostly linear, so they can even help with that part.

As to your point about automating retries: with my last prototype I played a lot with having agents do multiple parallel implementations, then pick the first one that works, or let you choose (or even have another agent choose).
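
Roughly this shape (generate_variant and tests_pass are hypothetical stand-ins for "ask an agent for one candidate implementation" and "build it and run its tests"):

    from concurrent.futures import ThreadPoolExecutor, as_completed

    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(generate_variant, i) for i in range(4)]
        winner = None
        for fut in as_completed(futures):
            workdir = fut.result()
            if tests_pass(workdir):   # first green variant wins
                winner = workdir
                break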

Have you tried any tools that have this workflow down, or at least approach it?


I have not! But I've often been frustrated when an agent gives me code that doesn't compile, and I keep thinking that would be a solvable problem. One computer program should be able to talk to the other.


This is going to sound a bit odd, but I suggest you detail what your tools do well and what they struggle with. For example I love Haxe, which is a niche programming language primarily for game development.

The vast majority of the time I try to use an LLM with it, the code is essentially useless, as it will try to invent methods that don't even exist.

For example, if your coding agents are really only good at JavaScript and a little bit of Python, tell me that front and center.


Good point! In that sense we're similar to most AI coding agents in that the languages we do well are the languages the mainstream LLMs do well. We might zoom in and add really good support for particular languages though (not decided yet), in which case we'll def mention that front and center!

Have you found any LLMs or coding agents that work well with Haxe? It might be a bit too niche for us (again, not sure yet), but I'd be very curious to see what they do well!


https://www.greptile.com/

This works well; however, it literally needs to digest an entire repository. So, for example, if I feed it the repository for a Haxe framework, it'll work much better than something like ChatGPT.


Thanks! That does look like a great tool.


In my unqualified opinion, LLMs would do better at niche languages (or even specific versions of mainstream languages) and niche frameworks if they were better at consulting the documentation for the language or framework. For example, the user could give the LLM a link to the docs or an offline copy, and the LLM would prioritise the docs over its pretrained code. Currently this is not feasible because 1. the limited context window is shared with the actual code, and 2. RAG is a one-way injection into the LLM; the LLM usually won't "ask for a specific docs page" even when it probably should.
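
To make point 2 concrete, one fix could be exposing the docs as a tool the model can call, rather than one-shot RAG. A sketch in the common OpenAI-style function-calling format (fetch_docs_page and the docs layout are made up for illustration):

    tools = [{
        "type": "function",
        "function": {
            "name": "fetch_docs_page",
            "description": "Fetch one page of the framework's documentation",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {"type": "string",
                             "description": "e.g. 'api/StringTools'"},
                },
                "required": ["path"],
            },
        },
    }]

    def fetch_docs_page(path):
        # Serve from the offline copy the user pointed the LLM at.
        with open("docs/" + path + ".md") as f:
            return f.read()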


100% agreed on both points. Point 1 relates to https://news.ycombinator.com/item?id=43486526 as well. It's one of the biggest challenges, though maybe it'll automatically get better through models with bigger context windows (we can't assume that though)?


Local Agent, 100%.

If I'm just exploring ideas for fun or scratching my own itch, I have no desire to be thinking about a continuous stream of expenditure happening in the background when I have an Apple Silicon Mac with 64GB of RAM fully capable of running an agentic stack with tool calling etc.

Please make it trivial to set up and use a llamafile or similar as the LLM for this.
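
For what it's worth, llamafile already serves an OpenAI-compatible API on localhost:8080 by default, so the wiring could be as small as pointing a standard client at it (model name and prompt are placeholders):

    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8080/v1",
                    api_key="not-needed")  # local server ignores the key

    resp = client.chat.completions.create(
        model="local",  # llamafile serves whatever model it was built with
        messages=[{"role": "user", "content": "Refactor this function..."}],
    )
    print(resp.choices[0].message.content)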


I agree, this would be good to have soon, especially as good models keep getting smaller, and hardware keeps getting cheaper.


Your timeline is indeed crazy fast. Did you recruit the 9 others in your first week? Did you pitch and secure funding in that week too?


In roughly the last 2 weeks, yes. It helped that everyone involved also activated their network, so we got a multiplicative effect. Can't speak to funding for now unfortunately.



