Seeing the $ every time I do something, even if it's $0.50, can be a little stressful. We should have an option to hide the per-request cost and just show a progress bar for the current top-up.
The obvious thing would be LSP interrogation, which would allow the token context to be significantly smaller than entire files. If you have one file open, and you are working on a function that calls out to N other modules, instead of packing the context with N files, you get ONLY the sections of those files the LSP tells you to look at.
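Something like this rough sketch of the idea, in Python: pyright-langserver is just an example server binary, and the initialize/initialized handshake that a real client has to do first is omitted for brevity.

    import json
    import subprocess

    def lsp_frame(payload: dict) -> bytes:
        # LSP messages are JSON-RPC bodies behind a Content-Length header.
        body = json.dumps(payload).encode()
        return b"Content-Length: " + str(len(body)).encode() + b"\r\n\r\n" + body

    server = subprocess.Popen(["pyright-langserver", "--stdio"],
                              stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    # Ask where the symbol under the "cursor" is defined.
    server.stdin.write(lsp_frame({
        "jsonrpc": "2.0", "id": 1,
        "method": "textDocument/definition",
        "params": {
            "textDocument": {"uri": "file:///project/app.py"},
            "position": {"line": 42, "character": 10},
        },
    }))
    server.stdin.flush()

    # The response is a URI plus a start/end range; the agent reads only
    # that span into the model context instead of the whole file.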
One thing that I think would be cool, and that could perhaps be a good starting point, is a TDD agent. How I imagine this working:
The user (who is a developer) writes tests and a description of the desired application. The agent attempts to build the application, compiles the code, runs the tests, and any compiler errors and test failures are automatically fed back to it so that it can fix its own mistakes without input from the user.
Based on my experience with current programming agents, I imagine it'll take the agent a couple of attempts to get an application that compiles and passes all the tests. What would be really great to see is an agent (probably with a companion application) that automates all those retries in a good way.
I imagine the hardest parts will be interpreting compiler output and (this is where things get really tricky) test output, and translating that into code changes in the existing code base.
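The retry loop itself is simple enough; something like this sketch, where generate_initial_code, write_source_tree, and llm_fix are hypothetical wrappers around whatever agent/LLM you use, and the make commands stand in for the project's real toolchain:

    import subprocess

    MAX_RETRIES = 5

    def run(cmd):
        p = subprocess.run(cmd, capture_output=True, text=True)
        return p.returncode == 0, p.stdout + p.stderr

    code = generate_initial_code(spec, tests)    # hypothetical agent call
    for attempt in range(MAX_RETRIES):
        write_source_tree(code)                  # hypothetical: write files to disk
        ok, output = run(["make", "build"])      # stand-in for the real toolchain
        if not ok:
            # Feed the raw compiler output back so the agent can fix it.
            code = llm_fix(code, "compiler output:\n" + output)  # hypothetical
            continue
        ok, output = run(["make", "test"])
        if ok:
            break                                # compiles and all tests pass
        code = llm_fix(code, "test output:\n" + output)

The hard part, as you say, isn't the loop; it's whether the model can turn that output into the right code changes.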
Yeah, this is a great workflow! What's more, agents are particularly good at writing tests, since they're simpler and mostly linear, so they can even help with that part.
As to your point about automating retries: with my last prototype I played a lot with having agents do multiple parallel implementations, then picking the first one that works, or letting you choose (or even having another agent choose).
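Roughly like this sketch, where build_and_test is a hypothetical function that runs one full agent attempt in an isolated working directory and returns a result only if it compiles and passes the tests:

    from concurrent.futures import ThreadPoolExecutor, as_completed

    def first_working_implementation(spec, tests, n=4):
        with ThreadPoolExecutor(max_workers=n) as pool:
            futures = [pool.submit(build_and_test, spec, tests, seed=i)  # hypothetical
                       for i in range(n)]
            for done in as_completed(futures):
                result = done.result()
                if result is not None:   # this attempt passed all the tests
                    for f in futures:
                        f.cancel()       # only stops attempts not yet started
                    return result
        return None                      # nothing worked; surface the failures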
Have you tried any tools that have this workflow down, or at least approach it?
I have not! But I've often been frustrated when an agent gives me code that doesn't compile, and I keep thinking that should be a solvable problem. One computer program should be able to talk to the other.
This is going to sound a bit odd, but I suggest you detail what your tools do well and what they struggle with. For example, I love Haxe, which is a niche programming language primarily for game development.
The vast majority of the time I try to use an LLM with it, the code is essentially useless, as it will try to invent methods that don't even exist.
For example, if your coding agents are really only good at JavaScript and a little bit of Python, tell me that front and center.
Good point! In that sense we're similar to most AI coding agents in that the languages we do well are the languages the mainstream LLMs do well. We might zoom in and add really good support for particular languages though (not decided yet), in which case we'll def mention that front and center!
Have you found any LLMs or coding agents that work well with Haxe? It might be a bit too niche for us (again, not sure yet), but I'd be very curious to see what they do well!
This works well; however, it literally needs to digest an entire repository. For example, if I feed it the repository for a Haxe framework, it'll work much better than something like ChatGPT.
In my unqualified opinion, LLMs would do better at niche languages (or even specific versions of mainstream languages) and at niche frameworks if they were better at consulting the documentation for the language or framework. For example, the user could give the LLM a link to the docs or an offline copy, and the LLM would prioritise the docs over its pretrained knowledge. Currently this is not feasible because 1. the docs compete for limited context with the actual code, and 2. RAG is one-way injection into the LLM; the LLM usually won't "ask for a specific docs page" even when it probably should.
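What I'm imagining is something like this sketch, where call_llm is a hypothetical wrapper around whatever chat API you use (with a fetch_docs_page tool declared to the model), and docs is a dict built from an offline copy of the documentation:

    # Two-way docs lookup: instead of one-shot RAG injection, the model can
    # ask for a specific page by path and get exactly that page back.

    def fetch_docs_page(path: str) -> str:
        return docs.get(path, "no such page; available: " + ", ".join(sorted(docs)))

    messages = [{"role": "user", "content": task}]   # task: the coding request
    while True:
        reply = call_llm(messages)                   # hypothetical wrapper
        if not reply.tool_calls:
            break                                    # model answered with code
        for call in reply.tool_calls:
            page = fetch_docs_page(call.arguments["path"])
            messages.append({"role": "tool", "content": page})
    print(reply.content)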
100% agreed on both points. Point 1 relates to https://news.ycombinator.com/item?id=43486526 as well. It's one of the biggest challenges, though maybe it'll automatically get better as models gain bigger context windows (we can't assume that, though).
If I'm just exploring ideas for fun or scratching my own itch, I have no desire to be thinking about a continuous stream of expenditure happening in the background when I have an Apple Silicon Mac with 64 GB of RAM fully capable of running an agentic stack with tool calling etc.
Please make it trivial to set up and use a llamafile or similar as the LLM for this.
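For what it's worth, a llamafile running in server mode exposes an OpenAI-compatible endpoint on localhost:8080 by default, so this could be as simple as letting us change the base URL. A sketch with the openai Python client:

    from openai import OpenAI

    # Point an OpenAI-compatible client at the local llamafile server
    # instead of a paid API. The api_key just has to be non-empty, and the
    # model name is ignored in favor of whatever model the llamafile loaded.
    client = OpenAI(base_url="http://localhost:8080/v1",
                    api_key="sk-no-key-required")
    resp = client.chat.completions.create(
        model="local",
        messages=[{"role": "user", "content": "Write FizzBuzz in Haxe."}],
    )
    print(resp.choices[0].message.content)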
In roughly the last 2 weeks, yes. It helped that everyone involved also activated their network, so we got a multiplicative effect. Can't speak to funding for now unfortunately.
Our backers have no interest in fake metrics. ;) It's a good way to quickly get feedback, which is key to our strategy. Totally fine to keep using Roo Code (or Cline) of course!
I think that this is kind of an obvious "optimization" for making application generation much more reliable. Just because the models can generate code for any of 1000 different platforms doesn't mean that you need all of them. Narrowing the scope to a particular platform makes it much more feasible to get working applications without manual debugging due to out-of-date library references etc.
I think something like the approach you have demonstrated here will relatively quickly become the standard for "no-code" application development.
Completely agree. It's useful not just for targeting one specific language, but also for all the other APIs we have, and for things like RAG to search for importable modules on the platform. Duplicating all of that across many platforms is a lot of work!
The economist in me says "just show the prices", though the psychologist in me says "that's hella stressful". ;)