
Aren't we just reinventing programming languages from the ground up?

This is the loop (and honestly, I predicted it way before it started):

1) LLMs can generate code from "natural language" prompts!

2) Oh wait, I actually need to improve my prompt to get LLMs to follow my instructions...

3) Oh wait, no matter how good my prompt is, I need an agent (aka a for loop) that goes through a list of deterministic steps so that it actually follows my instructions...

4) Oh wait, now I need to add deterministic checks (aka, the code that I was actually trying to avoid writing in step 1) so that the LLM follows my instructions...

5) <some time in the future>: I came up with this precise set of keywords that I can feed to the LLM so that it produces the code that I need. Wait a second... I just turned the LLM into a compiler.

The error is believing that "coding" is just accidental complexity. "You don't need a precise specification of the behavior of the computer" is the assumption that would have to hold for LLM agents to actually be viable. And I cannot believe that there are software engineers who think that coding is accidental complexity. I understand why PMs, CEOs, and other fun people believe this.

Side note: I am not arguing that LLMs/coding agents aren't nice. T9 was nice, autocomplete is nice. LLMs are very nice! But I am getting a bit too fed up with seeing everyone believe that you can get rid of coding.


Specific workflows that I use in jj vs git:

1. Stacked PRs. I like to be kind to my reviewers by asking them to review small, logically self-contained pull requests. This means I often stack up chains of PRs, where C depends on B depends on A, and A is being reviewed. If I receive feedback on A, jj enables me to incorporate that change within A, and flows those changes down into my dependent branches. I can then merge and close out A while continuing to work on B and C. Achieving this in raw git is labour-intensive and error-prone.

2. Easily fix up commits. I like to work with atomic commits, and sometimes I realize that I've made a typo in a comment, or a small error, or missed a test case. jj makes it really trivial to timewalk back to the commit in question, fix it and resume where I left off (see the sketch after this list).

3. Decompose a big PR into multiple PRs. This is the flip side of point 1: I can take my own big PR and rearrange and partition the commits into A, B and C so that they can easily be reviewed.
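For point 2, a minimal sketch of that fixup flow (the change IDs here are placeholders, and details may vary between jj versions):

    jj new <change-id>    # start a temporary change on top of the commit that needs the fix
    # ...fix the typo, or add the missing test case...
    jj squash             # fold the fix into <change-id>; descendant commits rebase automatically
    jj edit <tip-id>      # jump back to where you left off

The automatic rebase of descendants after `jj squash` is also what makes the stacked-PR flow in point 1 cheap: amend A, and B and C follow along.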

In general, jj seems to encourage and reward you for being disciplined with your commits by enabling you to be more flexible in how you stage, review and ship your code.

On the flip side, if you're the kind of person who is used to typing `git commit --all --message "xxx"` you might not get as much value from jj until that changes.


> As a bonus, we look forward to fewer violations (exhibit A, B, C) of our strict no LLM / no AI policy,

Hilarious how the offender on "exhibit A" [1] is the same one from the other post that made the frontpage a couple of days ago [2].

[1] https://github.com/ziglang/zig/issues/25974

[2] https://news.ycombinator.com/item?id=46039274


It is ultimately a hardware problem. To simplify it greatly, an LLM neuron is just a weighted sum of its inputs pushed through a simple activation function, producing a single output. A human brain neuron takes in thousands of inputs and produces thousands of outputs, to the point that some inputs start being processed before they even get inside the cell, by structures on the outside of it. An LLM neuron is a crude approximation of this. We cannot manufacture a human-level neuron that is small, fast, and energy-efficient enough with our manufacturing capabilities today. A human brain has something like 80 or 90 billion of them, and there are other types of cells that outnumber neurons by, I think, two orders of magnitude. The entire architecture is massively parallel and has a complex feedback network instead of the LLM's rigid, mostly feed-forward processing. When I say massively parallel I don't mean a billion tensor units. I mean a quintillion input superpositions.

And the final kicker: the human brain runs on something like two dozen watts. An LLM takes a year of running on a few MW to train and several kW to run.

Given this I am not certain we will get to AGI by simulating it in a GPU or TPU. We would need a new hardware paradigm.


I'm regularly asked by coworkers why I don't run my writing through AI tools to clean it up, and instead spend time iterating on it and re-reading it, perhaps with a basic spell checker and maybe a grammar check.

That's because, from what I've seen to date, it'd take away my voice. And my voice -- the style in which I write -- is my value. It's the same as with art... Yes, AI tools can produce passable art, but it feels soulless and generic and bland. It lacks a voice.


I think the test should be whether another company could offer a service giving the same result.

So for instance this would be fine:

- A web hosting company refusing clients, because you can use another one

- Amazon refusing to sell the item themselves, because other companies can sell on the Amazon marketplace

- A bank refusing a client, since you can just use another bank

This would not be fine:

- Apple refusing to list an app on the App Store, since no one else can do so

- Amazon refusing to allow an item on its marketplace at all, since there is no other way to sell to people who buy on Amazon

- Google banning something from their search engine, since that's the only way to reach people who search with only Google

- PayPal refusing a client, since using PayPal is the only way to easily accept payment from PayPal account holders


> the idea that technology forces people to be careless

I don't think anyone's saying that about technology in general. Many safety-oriented technologies force people to be more careful, not less. The argument is that this technology leads people to be careless.

Personally, my concerns don't have much to do with "the part of coding I enjoy." I enjoy architecture more than rote typing, and if I had a direct way to impose my intent upon code, I'd use it. The trouble is that chatbot interfaces are an indirect and imperfect vector for intent, and when I've used them for high-level code construction, I find my line-by-line understanding of the code quickly slips away from the mental model I'm working with, leaving me with unstable foundations.

I could slow down and review it line-by-line, picking all the nits, but that moves against the grain of the tool. The giddy "10x" feeling of AI-assisted coding encourages slippage between granular implementation and high-level understanding. In fact, thinking less about the concrete elements of your implementation is the whole advantage touted by advocates of chatbot coding workflows. But this gap in understanding causes problems down the line.

Good automation behaves in extremely consistent and predictable ways, such that we only need to understand the high-level invariants before focusing our attention elsewhere. With good automation, safety and correctness are the path of least resistance.

Chatbot codegen draws your attention away without providing those guarantees, demanding best practices that encourage manually checking everything. Safety and correctness are the path of most resistance.


> Chrome is an excellent browser with leading standards support.

Google learned it can be "standards compliant" if it submits a draft spec to WHATWG/W3C, and while the comment and revision process is still ongoing, roll out those features in Chrome and start using them in YouTube, Gmail, Google docs, and AMP. Now Firefox and Safari are forced to implement those draft specs as well or users will leave in droves because Google websites are broken. Soon enough, Google's draft spec is standardized with minimal revisions because it's already out there in the wild.

The debate, revision, and multistakeholder aspects of the standards process have been effectively bypassed, à la IE6 and ActiveX, but Chrome can claim to be on the cutting edge of standards compliance. This is a case of Goodhart's law.


> Is Google actively sabotaging Mozilla

Oh, Google did sabotage Mozilla: https://archive.is/2019.04.15-165942/https://twitter.com/joh...


> Each of these 'phases' of LLM growth is unlocking a lot more developer productivity, for teams and developers that know how to harness it.

I still find myself incredibly skeptical that LLM use is increasing productivity. Because AI reduces cognitive engagement with tasks, it feels to me like AI increases perceived productivity but actually decreases real productivity in many cases (and this probably compounds as AI-generated code piles up in a codebase, since there isn't an author who can attach context as to why decisions were made).

https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...

I realize the author qualified his or her statement with "know how to harness it," which feels like a cop-out I'm seeing an awful lot of in recent explorations of AI's relationship with productivity. In my mind, like TikTok or online dating, AI is just another product motion toward maximizing comfort above all else, as cognitive engagement is difficult and not always pleasant. In a nutshell, it is another instant gratification product from tech.

That's not to say that I don't use AI, but I use it primarily as search to see what is out there. If I use it for coding at all, I tend to primarily use it for code review. Even when AI does a good job at implementation of a feature, unless I put in the cognitive engagement I typically put in during code review, its code feels alien to me and I feel uncomfortable merging it (and I employ similar levels of cognitive engagement during code reviews as I do while writing software).


I was told to use ANY language in an interview. I asked them if they were sure, so I solved it with J. They were not too pleased and asked me if I could use another language, so I did Prolog and we moved on to the next question. Then the idiot had the audacity to say I should not use "J and Prolog" but any commonly known language. I asked if assembly was fine, and they said no. Perhaps Python or JavaScript. I did the rest in Python; needless to say I didn't get the job. :-)

My $0.02 as a response to several comments I read in this thread: I was diagnosed with ADHD in my 40s and got Concerta. My belief is that ADHD is not a disease, nor a disability (even though it acts like one very frequently) and in fact there is evidence that ADHD is an important part of our evolution as a species.

The problem mostly lies with the modern way of life and what is expected by society at large. In that context I try to feel ok when I daydream while I have countless boring things to take care of, just as I totally feel ok when I hyperfocus on a creative endeavor.

The meds are just a tool that I use no more than two times per week in order to take better care of myself and others. It is not a therapy and it's not me. I believe that Rejection Sensitive Dysphoria is very real for people like us, but the worst version of it is when you reject yourself because you are different and you try hard to be someone else.


I think that Twitter (and by extension, Bluesky) is designed in such a way that it promotes hostility and division. You can't really have a good discussion when the format makes people limit their posting to super short messages; it means people just dump hot takes on each other and wind up shouting past each other. So in that sense I certainly would call it the platforms' fault. Twitter (and Bluesky/Mastodon) are toxic to our society and we would be far better off if they were never created.

I actually wonder about current CO2 levels and concentrations.

We've increased atmospheric CO2 by roughly half in human history, much of that in the last 100 years alone. They say measurable drowsiness sets in at around 1,000 ppm, and when you consider that atmospheric CO2 is well above 400 ppm and indoor conditions often more than double that, I wonder if we're not going to hit a measurable stupefaction of the world. Perhaps it's already happening.


I’m an AI skeptic. I’m probably wrong. This article makes me feel kinda wrong. But I desperately want to be right.

Why? Because if I’m not right then I am convinced that AI is going to be a force for evil. It will power scams on an unimaginable scale. It will destabilize labor at a speed that will make the Industrial Revolution seem like a gentle breeze. It will concentrate immense power and wealth in the hands of people who I don’t trust. And it will do all of this while consuming truly shocking amounts of energy.

Not only do I think these things will happen, I think the Altmans of the world would eagerly agree that they will happen. They just think it will be interesting / profitable for them. It won’t be for us.

And we, the engineers, are in a unique position. Unlike people in any other industry, we can affect the trajectory of AI. My skepticism (and unwillingness to aid in the advancement of AI) might slow things down a billionth of a percent. Maybe if there are more of me, things will slow down enough that we can find some sort of effective safeguards on this stuff before it’s out of hand.

So I’ll keep being skeptical, until it’s over.


I have been evaluating LLMs for coding use in and out of a professional context. I’m forbidden to discuss the specifics regarding the clients/employers I’ve used them with due to NDAs, but my experience has been mostly the same as my private use - that they are marginally useful for less than one half of simple problem scenarios, and I have yet to find one that has been useful for any complex problem scenarios.

Neither of these issues is particularly damning on its own, as improvements to the technology could change this. However, the reason I have chosen to avoid them is unlikely to change; that they actively and rapidly reduce my own willingness for critical thinking. It’s not something I noticed immediately, but once Microsoft’s study showing the same conclusions came out, I evaluated some LLM programming tools again and found that I generally had a more difficult time thinking through problems during a session in which I attempted to rely on said tools.


I have one link that illustrates what I mean: https://chatgpt.com/share/6802e229-c6a0-800f-898a-44171a0c7d... The line about "the latitudinal light angle that matches mid‑February at ~47 ° N." seems like pure BS to me, and in the reasoning trace it openly reads the EXIF.
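For context on what's sitting in that EXIF: a geotagged photo carries exact GPS coordinates in its metadata, which you can dump yourself with exiftool (the filename here is just a placeholder):

    exiftool -GPS:all photo.jpg    # prints GPSLatitude, GPSLongitude, etc. if the photo is geotagged

A model that reads that doesn't need any visual clues at all.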

A clearer example I don't have a link for; it was on Twitter somewhere: someone tested a photo from Suriname and o3 said one of the clues was left-hand traffic. But there was no traffic in the photo. "Left-hand traffic" is a very valuable GeoGuessr clue, and it seemed to me that once o3 read the Surinamese EXIF, it confabulated the traffic detail.

It's pure stochastic parroting: given that you are playing GeoGuessr honestly, and given that the answer is Suriname, the conditional probability that you mention left-hand traffic is very high. So o3 autocompleted that for itself while "explaining" its "reasoning."


The fundamental issue is that GitHub thinks the artifact you are reviewing is code. But the artifact I want to review is a series of commits. I make review comments like "please split this into a separate commit", "please add info XYZ to the commit message".

GitHub doesn't offer any way to review changes to those things. If the author force-pushes (which is normal and healthy if you are iterating on a series of patches, instead of on a blob of code) there's no way to diff the details I want to look at.

Compare Gerrit where for each individual commit you can diff between two versions of that commit, with a side-by-side UI showing the comments inline that the changes were made in response to.
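For comparing two versions of a series locally, git's own range-diff gets part of the way there; a rough sketch, assuming you noted the old tip before fetching the force-push (the ref names are placeholders):

    git fetch origin                                   # the "forced update" line shows the old and new tips
    git range-diff origin/main <old-tip> <new-tip>     # per-commit diff between the two versions of the series

That's roughly the comparison Gerrit's patchset UI gives you inline, but GitHub doesn't surface anything like it in the review flow.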

From speaking to friends, I believe this is because the prevailing attitude is "why would you force-push? Just push a new commit called 'respond to review comments'". When I said "but now your commit log is a mess", they say "no, you just squash the whole PR into a single commit when you merge it". So... yes, your commit log is a mess. Bear in mind there is also no support for dependencies between PRs, so basically you are throttled to one in-flight PR per area of work at a time, which means your commits are gonna end up being huge. Not really viable for a large project.

I have noticed that there are major projects like k8s underway on GitHub and they seem to get by, so maybe I'm missing something. I know that Go allows PRs, but if you want to do serious contributions they will funnel you to contribute via Gerrit instead.


LineageOS unfortunately dropped support for my Moto G4 relatively quickly after I installed it, and it was only supported up to Android 7.1. I have been running an unofficial build of 8.1 ever since, but that is also horribly outdated by now.

Honestly, the way I see it: The people behind the 2025 stuff etc have been on the "losing" side of the culture war since 1969.

And they're sick of it, they're desperate, and now are just laying their cards on the table.

I grew up in an evangelical church in the 80s -- albeit in Canada -- attending "Focus on the Family" events, Bible studies, etc. that promulgated a heavily socially conservative ethos -- so I feel like I have seen this narrative play out over a few decades ...

After the legalization of gay marriage they just collectively lost their shit. They see the stakes as being incredibly high. They see abortion as straight up murder. Winning the 2016 Trump presidency and taking over the US supreme court gave them a taste of blood, and a sense that they can finally reverse what they see as a profound descent into degeneracy. The trans rights stuff over the last few years has them totally incensed, as it's a full-on assault (to them) on the ontological reality of family, body, identity, etc. that they consider intrinsic and holy and fundamental.

I think they're full of shit, but that's I think how this world view shakes down. It's a war because they feel the stakes are incredibly high.

People on the right, or in boardrooms of various companies, who are aligning themselves with these people for what they see as strategic ends are playing with fire.


I prefer a flat structure where you're a junior developer if you're still in training. I've held positions where my title was senior while I worked on a few projects, both legacy and otherwise. However, I think labels like these are harmful and create an atmosphere of authority and gatekeeping. I still learn new and exciting things from everyone, including junior associates. On the other hand, I know people who label themselves as senior who could be called junior.

After enough time, none of these HR-provided labels matter and the work you do signals your level, not the other way around. Just a fair warning though: regardless of your level, you should know how to implement the solutions from the ground up without corporate support and IT setting everything up for you.


I'm still using Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25 on my desktop.

The internet is so much better like this! There is a 2010 lightweight mobile version of Google, and m.youtube has an obviously cleaner and better UI and not a single ad (apparently it's not worth showing you ads if you still appear to be using an iPhone on iOS 6).


Posted a reply here https://news.ycombinator.com/item?id=40466855

This is a specific reference on how constraints model contact between rigid bodies https://box2d.org/files/ErinCatto_UnderstandingConstraints_G...

Most games since Half Life 2 use constraint forces like this to solve collisions. Springs/penalty forces are still used sometimes in commercial physics solvers since they're easier to couple with other simulations, but they require many small timesteps to ensure convergence.
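To make that trade-off concrete, here is a rough sketch for a single contact of effective mass m along the contact normal (my notation, not taken from the linked reference). A penalty contact applies a spring force proportional to the penetration depth d,

    F = k\,d, \qquad \text{with explicit integration stable only for roughly } \Delta t \lesssim 2\sqrt{m/k},

so stiff contacts force tiny timesteps. A velocity-level constraint instead solves for a non-negative impulse \lambda subject to a complementarity condition,

    v^{+} = v^{-} + \frac{\lambda}{m}, \qquad v^{+} \ge 0, \quad \lambda \ge 0, \quad \lambda\, v^{+} = 0,

which enforces non-penetration directly and stays well behaved at game-sized timesteps.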


It really did not.

The blog described insufficiency of the prior detection method, and statistics of the rate of cheating. It did not address the point that the game, by inherited design (of the genre), already places a ceiling on the effectiveness of automated actions.

What's particularly irksome about the blog, besides the gratuitous language throughout, are expedient omissions of context in statements such as:

> When piloted optimally, scripter win rates hover around 80% in Ranked games

without contextualizing what Elo it's measured at or how it compares to a smurf, which you'd also expect to have an anomalous win rate before reaching its equilibrium Elo. This is then immediately followed by the discussion of the cheating statistics for above-Masters tier, misleading the uncareful reader into drawing the conclusion that cheaters can manage an 80% win rate in Masters+.

One might be inclined to dismiss this as carelessness in writing; however, these kinds of patterns are common throughout Riot's various "technical blogs", especially the posts around Valorant's initial release that claimed some earth-shattering netcode innovation, then described various bog-standard multiplayer practices without further elaboration, implicitly passing them off as their own invention without actually claiming so.

I'd suggest others here spot these patterns for themselves by reading some of Riot's other blog posts masquerading as technical deep dives that are really self-promotion aimed at beguiling "gamers".


> If Mozilla had created and pushed Nim instead, would it be as popular as what they did for Rust?

In my opinion, definitely not. I knew people excited about and working in Rust back in ~2016 when very few people knew about Rust. They proselytized me, I converted others, et cetera. It's all grassroots spread and has nothing to do with Mozilla marketing.

And that grassroots marketing spread rapidly and easily because of Rust's killer unique features.


Cool project and write up.

An aside - while I love the snark and making fun of these "legacy" systems, it has given me a window into my own maturity as an engineer. I was absolutely this cavalier and cocky about poorly implemented systems I've been a user or admin of in the past. But having now spent nearly a decade and a half getting paid for this work and seeing a lot of stuff and the evolution of best practices, I have much more empathy for the organizations and authors of these systems. There are very very few programs that ever achieve something like elegance and beauty when they collide with the real world.


Indeed it isn't: digital goods are non-rival, and as such fundamentally un-steal-able. Property as we usually understand it simply doesn't apply to them. There's still copyright infringement, but "unlawful breach of a state-granted monopoly" doesn't sound nearly as bad as stealing.

You wouldn't steal a car, would you?


It is just a standard eUICC card with an issuer certificate, which means you need the issuer's app to access low-level eUICC functions on a rootless Android. This is how esim.me enforces the subscription.

This also means you can use any LPA implementation to manage and install profiles on your own!

Some examples:

https://github.com/Truphone/LPAdesktop Needs a smart card reader and a PC to work

https://github.com/estkme-group/lpac Can use either a smart card reader or an actual modem with AT command support

https://gitea.angry.im/PeterCxy/OpenEUICC Needs a root on Android

Furthermore, I believe you could manage it via Windows settings if your Windows laptop has a WWAN card.


> The true magic of the early web was somebody genius but decidedly untechnical like David Bowie shitposting at his own fans.

No, the magic of the early web was that people treated their online identities as a secret alternative life, rather than a resume for recruiters, friends, potential partners, and other real-world acquaintances to look at.

The Internet of today is little more than a (distorted) mirror of people's offline lives. That's why the problems of today's Internet are the same as the problems of the real world. By contrast, the Internet of the 90s was an exciting world of its own, with rules that were dramatically different from those of everyday life.


This is the point where you need to check your privilege. I used Tor when living in a dictatorship to find out things which would destroy the moral fabric of society, such as information about LGBTQIA+ issues, what condoms are, pop music, and news that the government didn't want to spread.

