Are there examples or case studies of more complex apps being built by LLMs? I've seen some interesting examples, but they were all small and simple. I'd love to see more case studies of how well these tools perform in more complex scenarios.
My gut feeling is we're still a few LLM generations away from this being really usable, but I'd love to hear how the authors are thinking about this.
Can you give an example of complex? I’ve used ChatGPT to help me build an app that authenticates a user using OAuth. That information creates a user in the backend (Rails). That user can then import issues tagged with specific information from a 3rd party task management tool (Linear). The titles of these issues are then listed in the UI. From there, the user can create automatic release notes from those issues. They can provide a release version, description, tone, audience, etc.
All of that (issues list, version, tone, etc) is then formulated into a GPT prompt. The prompt is structured such that it returns written release notes. That note is then stored and the user can edit it using a rich text editor.
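To make that prompt-building step concrete, here is a minimal sketch in Python. The app itself is Rails, and none of this comes from its code: `ReleaseRequest`, `build_release_notes_prompt`, the field names, and the prompt wording are all hypothetical, chosen only to illustrate folding the issue titles and release settings into one prompt string.

```python
from dataclasses import dataclass


@dataclass
class ReleaseRequest:
    """User-supplied settings for one set of release notes (hypothetical fields)."""
    version: str
    description: str
    tone: str
    audience: str


def build_release_notes_prompt(issue_titles: list[str], request: ReleaseRequest) -> str:
    """Combine imported issue titles and release settings into one prompt string."""
    # One bullet per imported issue title.
    issues = "\n".join(f"- {title}" for title in issue_titles)
    # The wording below is an assumption, not the prompt the app actually uses.
    return (
        f"Write release notes for version {request.version}.\n"
        f"Release description: {request.description}\n"
        f"Tone: {request.tone}\n"
        f"Audience: {request.audience}\n"
        "Cover each of the following completed issues:\n"
        f"{issues}\n"
        "Return only the release notes."
    )
```

The resulting string would then be sent to whichever model API the app uses, and the response stored as the editable note.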
Once the first note is created the system can help the user write future notes by predicting release version, etc.
This isn’t that complex imo, but I’m curious to see if this is what people consider complex.
How about a 2-million-line legacy app spanning 5 languages, including one created by a guy who left the company 14 years ago, which has a hand-rolled parser and is buggy?
A Line Of Business app. With questionable specs. Where inputs are cross-dependent and need to be filtered. Some fields being foreign keys to other models.
> Line of business (AKA LOB) is a term that describes a business’s product or service, the resources used, and the process for delivering value to a market segment. It could be the primary or one of the main processes that bring revenue.
> For example, manufacturing dry-erase markers is a line of business. Everything that happens from concept, developing the markers, marketing, selling, to fulfillment, and staying competitive makes up the business line. So, a LOB could also describe a product line.
I don't have a specific definition of complex in mind. Seeing more examples of this with the prompts used + output and the overall steps is exactly what I'm asking for. I'm particularly interested in how the success rate changes as the code base evolves. Are LLMs effective in empty repos? Are they effective on large repos? Can prompts be tweaked to work on larger repos?
Around 3 hours (not straight - I would hack on it for 30 minutes to an hour at a time). I spent another 1.5 hours or so styling it, but I did that outside of ChatGPT.