Does your work not depend on existing code bases, product architectures and nontrivial domain contexts the LLM knows nothing about?
Every thread like this over the past year or so has had comments similar to yours, and it always remains quite vague, or when examples are given, it’s about self-contained tasks that require little contextual knowledge and are confined to widely publicly-documented technologies.
Context is the most challenging bit. FWIW, the codebases I'm working on are still small enough that I rarely need to include more than 12 files in context. And I find that as I grow the context beyond that, results degrade significantly.
So I don't know how this would go in a much larger codebase.
What floored him was simply how much of my programming I was doing with an LLM / how little I write line-by-line (vs edit line-by-line).
If you're really curious, I recorded some work for a friend. The first video has terrible audio, unfortunately. This second one I think gives a very realistic demonstration – you'll see the model struggle a bit at the beginning:
I know that you are getting some push-back because of your exuberance regarding your use of LLMs in development, but let me just say I respect that when someone told you to "put up or shut up" you did. Good on you!
So you spend 10 minutes writing a free-text description of the test you want, telling it exactly how you want the test written; then 4-5 minutes trying to understand whether it did the right thing; restart because it did something crazy; then a few minutes manually fixing the diff it generated?
MMmm.
I mean, don't get me wrong; this is impressive stuff; but it needs to be an order of magnitude less 'screwing around trying to fix the random crap' for this to be 'wow, amazing!' rather than a technical demonstration.
You could have done this more quickly without using AI.
I have no doubt this is transformative technology, but people using it are choosing to use it; it's not actually better than not using it at this point, as far as I can tell.
Stoked you watched, thanks. (Sorry the example isn't the greatest/lacks context. The first video was better, but the mic gain was too high.)
You summed up the workflow accurately. Except, I read your first paragraph in a positive light, while I imagine you meant it to be negative.
Note the feedback loop you described is the same one as me delegating requirements to someone else (i.e. s/LLM/jr eng). And then reading/editing their PR. Except the feedback loop is, obviously, much tighter.
I've written a lot of tests, I think this would have taken 3-4x longer to do by hand. Surely an hour?
But even if all things were roughly equal, I like being in the navigator seat vs the driver seat. Editor vs writer. It helps me keep the big picture in mind, focused on requirements and architecture, not line-wise implementation details.
> I've written a lot of tests, I think this would have taken 3-4x longer to do by hand. Surely an hour?
It seems to me that the first few tests are almost complete copy-pastes of older tests. In the update case, a simple copy-paste would have gotten you code closer to the final test than what the model produced.
The real value is only in the filtered test that chooses randomly (btw, I have no idea why that's beneficial here), and the one which checks that both consumers got the same info. Those can be done in a few minutes with the help of the already-written insert test and the original version of the filtered test.
I’m happy that more people can code with this, and it’s great that it makes your coding faster. It makes coding more accessible. However, there are a lot of people who can do this faster without AI, so it’s definitely not for everybody yet.
> I've written a lot of tests, I think this would have taken 3-4x longer to do by hand. Surely an hour?
I guess my point is I'm skeptical.
I don't believe what you had at the end would have taken you that long to do by hand. I don't believe it would have taken an hour. It certainly would not have taken me or anyone on my team that long.
I feel like you're projecting that if you scale this process up to, say, 5 LLMs running in parallel, then you'd spend maybe 20% more time reviewing 5x PRs instead of 1x PR, but get 5x as much stuff done in the end.
Which may be true.
...but, and this is really my point: It's not true, in this example. It's not true in any examples I've seen.
It feels like it might be true in the near-to-moderate future, but that rests on a lot of underlying assumptions:
- LLMs get faster (probably)
- LLMs get more accurate and less prone to errors (???)
- LLMs get more context size without going crazy (???)
- The marginal cost of doing N x code reviews is < the cost of just writing code N times (???)
These are assumptions that... well, who knows? Maybe? ...but right now? Like, today?
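To make that last assumption concrete, here is a toy calculation with entirely made-up numbers (the 1.0h/0.75h/0.25h figures are hypothetical, not from the thread). Running N agents in parallel only pays off if reviewing an LLM-written PR is meaningfully cheaper than writing the change yourself:

```python
def hours_saved(n_agents, write_hrs, review_hrs):
    """Hours saved by reviewing n LLM-written PRs instead of
    writing the n changes by hand (hypothetical numbers)."""
    return n_agents * (write_hrs - review_hrs)

# If writing a change takes 1.0h and reviewing the LLM's version
# takes nearly as long (0.75h), 5 agents save only 1.25h total...
print(hours_saved(5, 1.0, 0.75))  # 1.25
# ...but if review drops to 0.25h per PR, the same 5 agents save 3.75h.
print(hours_saved(5, 1.0, 0.25))  # 3.75
```

The whole "5x throughput for 20% more time" picture lives or dies on that review-vs-write gap, which is exactly the assumption being questioned.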
The problem is: If it was actually making people more productive then we would see evidence of it. Like, actual concrete examples of people having 10 LLMs building systems for them.
...but what we do see is people doing things like this, which seem (to me at least) either worse than or on par with just doing the same work by hand.
A different workflow, certainly; but not obviously better.
LLMs appear to have an immediate, right-now disruptive impact on particular domains. Learning is one: it's extremely clear that having a wise coding assistant to help you gain simple cross-domain knowledge is highly impactful (look at Stack Overflow). But despite all the hand-waving and all the people talking about it, the actual concrete evidence of a 'Devin' that actually builds software, or even meaningfully improves programmer productivity (not 'is a tool that gives some marginal benefit to existing autocomplete'; actually improves productivity), is ...
...simply absent.
I find that problematic, and it makes me skeptical of grand claims.
Grand claims require concrete tangible evidence.
I've no doubt that you've got a workflow that works for you, and thanks for sharing it. :) ...I just don't think it's really compelling, currently, for most people to work that way; I don't think you can reasonably argue it's more productive, or more effective, based on what I've actually seen.
I used it for some troubleshooting in my job as a Linux sysadmin. I like how I can just ask it a question and explain the situation, and it goes through everything with me, like a person.
How worried are you that it's giving you bad advice?
There are plenty of, "Just disable certificate checking" type answers on Stack Overflow, but there are also a lot of comments calling them out. How do you fact check the AI? Is it just a shortcut to finding better documentation?
In my opinion it’s better at filtering down my convoluted explanation into some troubleshooting steps I can take, to investigate. It’s kind of like an evolved Google algorithm, boiling down the internet’s knowledge. And I’ve had it give me step by step instructions on “esoteric” things like dwm config file examples, plugins for displaying pictures in terminal, what files to edit and where…it’s kind of efficient I think. Better than browsing ads. Lol.
I think that Greptile is on the right track. I made a repo containing the C# source code for the Godot game engine, and its answers to "how do I do X" questions, where X is some obscure technical feature (like how to create a collision query using Godot's internal physics API), are much better than those of the other AI solutions, which rely on general training data.
However, there are some very frustrating limitations to Greptile, so severe that I basically only use it to ask implementation questions about existing codebases, not for anything like general R&D:
1) answers are limited to about 150 lines.
2) it doesn't re-analyze a repo after you link it in a conversation (you need to start a new conversation, and re-link the repo, then wait 20+ min for it to parse your code)
3) it is very slow (maybe 30 seconds to answer a question)
4) there's no prompt engineering
I think it's a bit strange that no other ai solution lets you ask questions about existing codebases. I hope that will be more widespread soon.
I work at Greptile and agree with the first three criticisms. 1) is a bug we haven't been able to fix; 2) has to do with the high cost of re-indexing, and we will likely start auto-updating the index when LLM costs come down a little; 3) has to do with LLM speed. We pushed some changes that cut time-to-first-token by about half, but there's a long way to go.
Re: prompt engineering, we have a prompt guide if that helps; is that what you were getting at?
No idea about the product, but I would like to congratulate you guys on what is maybe the greatest name ever. Something about it seems to combine "fierce" with "cute", so I think you should consider changing your logo to something that looks like Reptar.
Comments should get more thoughtful and substantive, not less, as a topic gets more divisive.
Eschew flamebait. Avoid generic tangents. Omit internet tropes.
Be kind. Don't be snarky. Converse curiously; don't cross-examine. Edit out swipes.
Please don't comment about the voting on comments. It never does any good, and it makes boring reading.
Not-so-subtly mocking the top-level for not replying "yet", when they replied almost immediately after with a video of the relevant workflow, was not a move that made you look smart or nice.
>when they replied almost immediately after with a video of the relevant workflow
Wow. Such wrong claims.
I had already replied to you in a sibling comment, refuting your points, but will give one more proof (not that I really need to):
_acco, the top level commenter relevant to this discussion, commented at some time, say x.
layer8 commented, replying to _acco, 7 hours ago (as can be seen on the page at the time of my writing this comment, i.e. right now).
I then replied to layer8, 6 hours ago.
_acco replied back to layer8 5 hours ago.
All this is visible right now on the page; and if people check it a few hours later, the relative time deltas will remain the same, obviously. (But not if they check after 24 hours, in which case all comments will show as one day ago.)
So there was a 1 hour gap between layer8's comment and mine, and a 2 hour gap between layer8's comment and _acco's reply.
If you think 2 hours is the same as "almost immediately", as you said above, I have nothing more to say to you, except that our perceptions of time are highly different.
I meant immediately after your reply. At the time I posted, your and acco_'s replies to layer8 both showed as "3 hours" ago. Now they both show as "13 hours ago". Really, I'm being generous in assuming they didn't reply before you.
Ed: ah, since the time I wrote this comment, your respective comments are now at 14 and 13 hours. Congrats on your <1hr lead.
I was careful to check the spelling before mentioning their name, unlike you, even when I referred to them earlier. The fact that you cannot even get the position of an underscore in a name correct seems to indicate that you are sloppy, which leads me to my next point.
2. pompous:
You said:
>Really, I'm being generous in assuming they didn't reply before you.
This is the pompous bit. Generous? Laughable.
I neither need nor want your generosity. If anything, I prefer objectivity, and that people give others the benefit of the doubt, instead of assuming bad intentions on their part: I had actually checked for a second comment by _acco (in reply to layer8) just before I wrote my comment to layer8, the one that got all of you in a tizzy. But you not only got the times wrong (see your edit, and point 3 below), but also assumed bad faith on my part.
3. fake.
You first said above that both those replies to layer8 showed as 13 hours ago, then edited your comment to say 14 and 13 hours. It shows that you don't use your brains. The feature of software showing time deltas in the form of "hours ago" or "days ago", versus an exact timestamp, is pretty old by now. It dates back to Web 2.0 or earlier; maybe it was started by some Web 2.0 startups or by Google.
If you think you are so clever as to criticize me without proof, or say that you are generous in your assumptions about me, you should have been equally clever or generous about the time delta point above, and so realized that I could have replied to layer8 before _acco, which was indeed the case. Obviously I cannot prove it, but the fact that I got _acco's name correct, while you did not, lends credence to my statement. It shows that I took care while writing my comment.
4. So you are fake because you don't bother to think before bad-mouthing others, and even more fake because you did not apply (to yourself) your own made-up "rule" in this other comment of yours, where you criticized my comment as being neither smart nor nice, so not of value:
I should not have had to write such a long comment to refute your silly and false allegations, and I will not always do that, but decided to do it this time, to make the point.
And, wow: you managed to pack 3 blunders (being inaccurate, pompous and fake) into a comment of just a few lines. That's neither smart nor nice. Instead, it's toxic.
Actually, your inaccurateness (inaccuracy? GIYF) is even worse than I said above. My comment a few levels above literally uses the name _acco at least four times - I checked just now. And your comment was in reply to that. So even after reading that person's name four times in my comment, you still got its spelling wrong. Congrats. (Yeah, I can snark too, like you did to me upthread.)
You seem to have gotten so worked up over my misplaced underscore that you yourself forgot how those ubiquitous rounded timestamps work. When I first wrote my comments, they were indeed the same: 3, and later 13, hours. In the few minutes between my looks at the page after I wrote my later comment, the timestamp on yours just happened to cross the threshold where it rounded up to 14 instead of down to 13. (And if I were "sloppy", do you really think I would have looked again and corrected my comment?) Presumably if I had looked a bit later, they would both have said 14 hours. Hence the <1 hour lead.
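For what it's worth, the rounding behavior at issue here is easy to demonstrate. A minimal Python sketch (all timestamps made up for illustration) of how a site that floors time deltas to whole hours can make two comments posted almost an hour apart display the same "N hours ago" label:

```python
from datetime import datetime

def relative_label(posted, now):
    """Round a time delta down to whole hours (or days past 24h),
    the way forum-style relative timestamps typically do."""
    hours = int((now - posted).total_seconds() // 3600)
    if hours >= 24:
        return f"{hours // 24} day(s) ago"
    return f"{hours} hour(s) ago"

now = datetime(2024, 5, 1, 20, 0)
a = datetime(2024, 5, 1, 6, 10)  # earlier comment
b = datetime(2024, 5, 1, 7, 0)   # posted 50 minutes later

print(relative_label(a, now))  # 13 hour(s) ago
print(relative_label(b, now))  # 13 hour(s) ago -- the 50-minute gap is invisible
```

A few minutes later, the earlier comment crosses the 14-hour threshold first, so the two labels briefly disagree before matching again, which is exactly the "14 and 13 hours" situation described above.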
Anyway, yeah, I worry less about being nice to people who've already shown themselves to be clowns, in a sub thread that's flagged to death. You got me there. FWIW I was originally hoping to enlighten you a bit as to why you were being downvoted, as a small help to you.
These are the four simple lines that I wrote above:
>Solid questions and comments, layer8.
>I notice that the person you replied to has not replied to you yet.
>It may be that they will, of course.
>But your points are good.
(Italics mine, and they were not in my original comment.)
You, above:
>when they replied almost immediately after with a video of the relevant workflow,
I did check the time intervals between the top level comment and layer8's comment, before my first reply. It is over an hour now, so I cannot see the exact times in minutes any more, but IIRC, there was a fairly long gap (in minutes). And I also think I noticed that the top level person did reply to someone else, but not to layer8, by the time I wrote my comment.
So I don't see anything wrong in what I said. I even said that they may reply later.
You consider that to be:
>"Not-so-subtly mocking"?
Jeez. I think you are wrong.
Then I have nothing further to say to you, except this:
>was not a move that made you look smart or nice.
Trying to look smart or nice is not the goal in online discussions. At least, I don't think so. You appear to think that. The goal (to me) is to say what you think, otherwise, why write at all?
I could just get an LLM to write all my comments, and not care about its hallucinations.
I don't try to be smart or nice, nor the reverse. I just put my considered thoughts out there, just like anyone else. Obviously I can be right or wrong, just like anyone else can be. And some points can be subjective, so cannot be said to be definitely either right or wrong.
If a comment is not at least one of smart or nice, it's a waste of space and attention. That may not be your purpose, but don't act shocked when people respond with negativity.
> Every thread like this over the past year or so has had comments similar to yours, and it always remains quite vague, or when examples are given, it’s about self-contained tasks that require little contextual knowledge and are confined to widely publicly-documented technologies.
What exactly floored your colleague at Microsoft?