Hacker News | tyre's comments

I want some answers that Ja Rule might not have right now

thank you, I did a double take. Drugtrio is my new favorite Pokémon

I know we’re not supposed to make comments that don’t contribute anything, but that’s really hckin cool.

* have mercy on me dang


> that’s really hckin cool.

Not during reentry it’s not.


imo their competition for autonomous vehicles doesn’t come from car companies, but from tech companies.

Amazon has a lucrative incentive to automate its supply chain up to and including last mile delivery. Waymo has proven out the tech and could easily partner with Uber or Lyft for the rider experience and reach.

If you’re FedEx, for example, would you rather buy from Amazon or from Tesla? Who is more likely to be a sane and trustworthy partner?


I don't think that Uber or Lyft are going to invest in self-driving taxis. The capital model is completely different: Uber and Lyft are capital light by design, owning nothing more than the software (1). Someone needs to buy all of these self-driving machines, and someone needs to maintain them; their current model does neither, so that's not something they can offer a tech partner.

The reason you don't see more Waymo areas has nothing to do with rider pool or experience; it's because their tech requires pre-mapping everything with LiDAR several times. The advantage is that if you know what is static (because it was in all of that LiDAR mapping), then a simple difference algorithm can tell you everything that is dynamic in the environment. (Also, they are just starting to hit cities with significant precipitation: SF, LA, Austin, and Phoenix are all pretty dry cities, while they are going into Atlanta, Miami, DC, Denver, etc.)
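
To make that "difference algo" idea concrete, here is a minimal sketch (my own illustration with made-up function names, not Waymo's actual pipeline): snap the prior LiDAR map and the live scan onto a coarse voxel grid, then treat any live point whose voxel never appeared in the map as dynamic.

    # Illustrative sketch only; assumes the live scan is already
    # localized (aligned to the map's coordinate frame) and uses an
    # arbitrary 0.5 m voxel size.
    def voxel(p, size=0.5):
        x, y, z = p
        return (int(x // size), int(y // size), int(z // size))

    def find_dynamic(map_points, scan_points, size=0.5):
        # Everything seen during pre-mapping counts as "static".
        static = {voxel(p, size) for p in map_points}
        # Live points outside the static set are likely dynamic:
        # cars, pedestrians, cyclists, debris.
        return [p for p in scan_points if voxel(p, size) not in static]

A real system obviously needs localization, sensor-noise handling, and temporal filtering on top, but the core trick is just a set difference against the prior map.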

1: There is a lot of suspicion that much of their profit comes from drivers not understanding the depreciation of their vehicles, something that the accountants who work for Uber and Lyft understand very, very well.


Uber, and to a lesser extent Lyft, has been an extremely prolific investor in the autonomous vehicle space. They're absolutely paying attention to it.

Similarly, Waymo isn't bottlenecked by mapping or rain; I've seen plenty of them testing in Seattle and Tokyo, for example.


Yeah, and then Uber sold off its self driving research team.

I'm talking about after that, of course. They retained a massive investment in Aurora as part of that deal. They invested in Waabi not long after, then in Nuro and Avride, and started partnerships with Waymo, Motional, and others.

Uber spent billions trying to make self-driving work, until they gave up. Not "by design".

The benefit of having control is that they can adapt them to their priorities, similar to Apple designing its own chips when there were already viable producers in the market.

They won’t need to rely on others sharing their priorities, like making low-volume, high-cost early investments in batteries designed for a market (humanoid robots) that doesn’t exist.

If they then scale them up, they also have the benefit that there is no third-party supplier who can turn around and sell those to a competitor.


It’s also a way of getting people to read things about the subject that they otherwise wouldn’t. I read a lot of philosophy because it was relevant to a paper I was writing, but wasn’t assigned to the entire class.

Does it matter? They found 12 vulnerabilities. Clearly the signal-to-noise ratio was good enough that they could confirm these as real.

It doesn't look like they had 1 AI run for 20 minutes and then 30 humans sift through for weeks.


> Does it matter?

Yes. We have been on the receiving end of AI-generated bug reports, and in the vast majority of cases they are really bad, but you still need humans to sift through them. And when you ask the submitter questions, it's often clear that they just feed the questions to an LLM again to answer.

It costs a huge amount of human effort, so if the company that made this really has an AI-based solution with a far lower false-positive rate, that would be great.


> It doesn't look like they had 1 AI run for 20 minutes and then 30 humans sift through for weeks.

It does, though, look like they were running their AI over the codebase for an extended period of time (not per run, but across multiple runs over the course of a year).

> Does it matter?

Hell yes, false reports are the bane of the bug bounty industry.


I wish your recent interview had pushed much harder on this. It came across as politely not wanting to bring up how poorly this really went, even for what the engineer intended.

They were making claims without the level of rigor to back them up. There was an opportunity to learn some difficult lessons, but (and I don't think this was your intention) it came across to me as kind of access journalism: not wanting to step on toes while they get their marketing in.


Pushing would definitely stop the supply of interviews/freebies/speaking engagements.

The person you're responding to isn't a journalist; they're a mouthpiece. Pushing means they don't get these interviews anymore.

The quality of whatever they put out as a result of it is yours to take into consideration.


Why would he push back? His whole schtick is to sell only AI hype. He’s not going to hurt his revenue.

If I sell only AI hype why do I keep telling people that many systems built on top of LLMs are inherently insecure? https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/

That's a great way to tell on yourself that you've never read Simon's work.

On the contrary, we get to read hundreds of his comments explaining how the LLM in anecdote X didn't fail; it was the developer's fault, and they should know better than to blame the LLM.

I only know this because on occasion I'll notice there was a comment from them (I only check the name of the user if it's a hot take) and I ctrl-F their username to see 20-70 matches on the same thread. Exactly 0 of those comments present the idea that LLMs are seriously flawed in programming environments regardless of who's in the driver seat. It always goes back to operator error and "just you watch, in the next 3 months or years...".

I dunno, I manage LLM implementation consulting teams and I will tell you to your face that LLMs are unequivocally shit for the majority of use cases. It's not hard to directly criticize the tech without hiding behind deflections or euphemisms.


> Exactly 0 of those comments present the idea that LLMs are seriously flawed in programming environments regardless of who's in the driver seat.

Why would I say that when I very genuinely believe the opposite?

LLMs are flawed in programming environments if driven by people who don't know how to use them effectively.

Learning to use them effectively is unintuitive and difficult, as I'm sure you've seen yourself.

So I try to help people learn how to use them, through articles like https://simonwillison.net/2025/Mar/11/using-llms-for-code/ and comments like this one: https://news.ycombinator.com/item?id=46765460#46765940

(I don't ever say variants of "just you watch, in the next 3 months or years..." though, I think predicting future improvements is pointless when we can be focusing on what the models we have right now can do.)


I literally see their posts every (other) day, and it's always glazing something that doesn't fully work (but is kind of cool at a glance) or is really just hyped beyond belief.

Comments usually point out the issues or more grounded reality.

BTW I'm bullish on AI, going through 100s of millions of tokens per month.


the bare minimum of criticism to allow independence to be claimed?

I actually don't think this is true; of the people who cover LLMs, Simon Willison is certainly one of the more critical and measured.

I just don't think that's the case.

The claims they made really weren't that extreme. In the blog post they said:

> To test this system, we pointed it at an ambitious goal: building a web browser from scratch. The agents ran for close to a week, writing over 1 million lines of code across 1,000 files. You can explore the source code on GitHub.

> Despite the codebase size, new agents can still understand it and make meaningful progress. Hundreds of workers run concurrently, pushing to the same branch with minimal conflicts.

That's all true.

On Twitter their CEO said:

> We built a browser with GPT-5.2 in Cursor. It ran uninterrupted for one week.

> It's 3M+ lines of code across thousands of files. The rendering engine is from-scratch in Rust with HTML parsing, CSS cascade, layout, text shaping, paint, and a custom JS VM.

> It kind of works! It still has issues and is of course very far from Webkit/Chromium parity, but we were astonished that simple websites render quickly and largely correctly.

That's mostly accurate too, especially the "it kind of works" bit. You can take exception to the "from-scratch" claim if you like. It's a tweet; the lack of nuance isn't particularly surprising.

In the overall genre of CEOs over-hyping their companies' achievements, this is a pretty weak example.

I think the people making out that Cursor massively and dishonestly over-hyped this are arguing with a straw man version of what the company representatives actually said.


> That's mostly accurate too, especially the "it kind of works" bit. You can take exception to "from-scratch" claim if you like. It's a tweet, the lack of nuance isn't particularly surprising.

> In the overall genre of CEO's over-hyping their company's achievements this is a pretty weak example

I kind of agree, but kind of not. The tweet isn't too bad when read from an experienced engineer's perspective, but if we're being real, the target audience was probably technically clueless investors who don't and can't understand the nuance.


What people take issue with is the claim that agents built a web browser "from scratch" only to find by looking deeper that they were using Servo, WGPU, Taffy, winit, and other libraries which do most of the heavy lifting.

It's like claiming "my dog filed my taxes for me!" when in reality everything was filled out in TurboTax and your dog clicked the final submit button. Technically true, but clearly disingenuous.

I'm not saying an LLM using existing libraries is a bad thing--in fact I'd consider an LLM which didn't pull in a bunch of existing libraries for the prompt "build a web browser" to be behaving incorrectly--but the CEO is misrepresenting what happened here.


Did you read the comment that started this thread? Let me repeat that, ICYMI:

> "So I agree this isn't just wiring up of dependencies, and neither is it copied from existing implementations: it's a uniquely bad design that could never support anything resembling a real-world web engine."

It didn't use Servo, and it wasn't just calling dependencies. It was terribly slow and stupid, but your comment is more of a mischaracterization than anything the Cursor people have said.


You're right in the sense that it didn't `use::servo`, merely Servo's CSS parser `cssparser`[0] and Servo's HTML parser `html5ever`[1]. Maybe that dog can do taxes after all.

[0] https://github.com/search?q=repo%3Awilsonzlin%2Ffastrender%2...

[1] https://github.com/search?q=repo%3Awilsonzlin%2Ffastrender+h...


Taffy is related to Servo too; it's apparently not officially part of the Servo project, but Servo does use it.

https://github.com/DioxusLabs/taffy

Used here (I think): https://github.com/servo/servo/tree/c639bb1a7b3aa0fd5e02b40d...


Servo uses Taffy for CSS Grid. It could also very easily use it for Flexbox, but they currently prefer to use their own implementation there.

It was originally a derivative of React Native's Yoga implementation of Flexbox, and is currently developed primarily as part of the Blitz engine.


I agree that "from scratch" is a misrepresentation.

But it was accompanied by a link to the GitHub repo, so you can hardly claim that they were deliberately hiding the truth.


Sorry, just to be clear: the defense against the charge that they pulled something out of their ass is that they linked to something that outed them? So they couldn't actually have been overstating it?

If anything, that proves the point that they weren't rigorous! They claimed a thing. The thing didn't accomplish what they said. I'm not saying that they hid it, but that they misrepresented the thing they built. My point is that the interview didn't directly and firmly pressure them on this.

Generating a million lines of code in parallel isn't impressive. Burning a mountain of resources in parallel isn't noteworthy (see: the weekly post about someone with an out-of-control EC2 instance racking up $100k in charges.)

It would have been remarkable if they'd built a browser from scratch, which they said they did, except they didn't. It was a 50 million token hackathon project that didn't work, dressed up as a groundbreaking example of their product.

As feedback, I hope in the future you'll push back firmly on these types of claims when given the opportunity, even if it makes the interviewee uncomfy. Incredible claims require incredible evidence. They didn't have it.


My goal in the interview was to get to as accurate a version of what they actually built and how they built it as possible.

I don't think directly accusing them of being misleading about what they had done would have supported that goal, so I didn't do it.

Instead I made sure to dig into things like what QuickJS was doing in there and why it used Taffy as part of the conversation.


3 days ago: (https://news.ycombinator.com/item?id=46743831)

> Honestly, grilling him about what the CEO had tweeted didn't even cross my mind.

Today:

> I don't think directly accusing them of being misleading about what they had done would have supported that goal, so I didn't do it.

I find it hard to follow how it didn't cross your mind when, for the same interview, you had also considered the situation and determined it didn't serve the interview's goal.


I don't think those two statements are particularly inconsistent.

It didn't cross my mind to grill him over his CEO's tweets.

I also don't think that directly accusing them of being misleading would support the goal of my interview - which was to figure out the truth of what they built and how.

If you like, I'll retract the fragment "so I didn't do it" since that implies that I thought "maybe I should grill him about what the CEO said... no actually I won't" - which isn't what happened.

So I guess you win?


> I agree that "from scratch" is a misrepresentation.

I believe in the UK the term for this is actually fraudulent misrepresentation:

https://en.wikipedia.org/wiki/Misrepresentation#English_law

And in this context it seems to go against The Consumer Protection from Unfair Trading Regulations 2008 and the Digital Markets, Competition and Consumers Act 2024:

https://www.legislation.gov.uk/uksi/2008/1277/made

https://www.legislation.gov.uk/ukpga/2024/13/section/226


I very much don't believe for a second anyone would manage to get a judgement against them on this in the UK.

For starters, the language is highly subjective, and they'd be able to show vast amounts of discourse about software engineering where "from scratch" often does not involve starting with nothing. They'd then go on to argue that the person suing had no actual reason to believe they would be able to replicate a setup described as a complex, large-scale experiment without much more information.

The person suing would have an uphill battle showing that whatever assumptions they made were something that was reasonable to infer based on that statement.

And to have a case, a consumer would also then need to have relied on this as a significant factor in choosing to buy their services.

But even if we assume the court would agree it is fraudulent, the remedy is only "directly consequential losses".

In other words, I doubt anyone would lose sleep over this risk.


How many non-developers were going to look at that? They knew exactly what they were doing by saying that.

> But it was accompanied by a link to the GitHub repo, so you can hardly claim that they were deliberately hiding the truth.

Well, yes and no; we live in an era where people consume headlines, not articles, and certainly not links to GitHub repositories in articles. If VCs and other CEOs read the headline "Cursor Agents Autonomously Create Web Browser From Scratch" on LinkedIn, the project has served its purpose, and it really doesn't matter if the code compiles or not.


> I think the people making out that Cursor massively and dishonestly over-hyped this are arguing with a straw man version of what the company representatives actually said.

It's far more dishonest to search for contrived interpretations of their statements in an attempt to frame them as "mostly accurate" when their statements are clearly misleading (and in my opinion, intentionally so).

You're giving them infinite benefit of the doubt where they deserve none, as this industry is well known for intentionally misleading statements. You're brushing off serious factual misrepresentations as a simple "lack of nuance", and you're trying to discredit the people who take issue with all of this.

With all due respect, that's not the behavior of a neutral reporter but someone who's heavily invested in maintaining a certain narrative.


According to the Twitter analytics you can see on the post (at least on Nitter), the original

> We built a browser with GPT-5.2 in Cursor. It ran uninterrupted for one week.

tweet was seen by over 6 million people.

The follow-up tweet, which includes the link to the actual details, was seen by fewer than 200,000.

That's just how Twitter engagement works, and these companies know it. Over 6 million people were fed bullshit. I'm sorry, but it's actually a great example of CEOs over-hyping their products.


That Tweet that was seen by 6 million people is here: https://x.com/mntruell/status/2011562190286045552

You only quoted the first line. The full tweet includes the crucial "it kind of works" line - that's not in the follow-up tweet, it's in the original.

Here's that first tweet in full:

> We built a browser with GPT-5.2 in Cursor. It ran uninterrupted for one week.

> It's 3M+ lines of code across thousands of files. The rendering engine is from-scratch in Rust with HTML parsing, CSS cascade, layout, text shaping, paint, and a custom JS VM.

> It kind of works! It still has issues and is of course very far from Webkit/Chromium parity, but we were astonished that simple websites render quickly and largely correctly.

The second tweet, with only 225,000 views, was just the following text and a link to the GitHub repository:

> Excited to continue stress testing the boundaries of coding agents and report back on what we learn.

> Code here: https://github.com/wilsonzlin/fastrender


The fact that the codebase is meaningless drivel has already been established; you don't need to defend them. It's just pure slop, and they're trying to get people to believe that it's a working browser. At the time he bragged about it, `cargo build` didn't even run! It was completely broken going back a hundred commits. So it was a complete lie to claim that it "kind of works".

You have a reputation. You don't need to carry water for people who mislead others to raise VC money. What's the point of your language-lawyering about the precise meaning of what he said?

“No no, you don’t get it guys. I’m technically right if you look at the precise wording” is the kind of silly thing I do all the time. It’s not that important to be technically right. Let this one go.


Which part of their CEO saying "It kind of works" are you interpreting as "trying to get people to believe that it’s a working browser"?

The reason I won't let this one go is that I genuinely believe people are being unfair to the engineer who built this, because some people will jump on ANY opportunity to "debunk" stories about AI.

I won't stand for misleading rhetoric like "it's just a Servo wrapper" when that isn't true.


> I won't stand for misleading rhetoric like "it's just a Servo wrapper" when that isn't true.

this level of outrage seems absent when it's misleading in the pro-"AI" direction


> "It kind of works"

https://github.com/wilsonzlin/fastrender/issues/98

A project that didn't compile at all counts as "kind of" working now?

> I won't stand for misleading rhetoric like "it's just a Servo wrapper" when that isn't true.

True, but at least if it were a wrapper it would actually kind of work, unlike this, which is the most obvious case of hyping up lies for investors I've witnessed in the last... well, week or so, considering how much bullshit spews out of the mouths of AI bros.


It did compile. It just didn't compile in GitHub Actions CI, since that wasn't correctly configured.

The linked GitHub issue has quotes from multiple people who were not able to compile it locally, not just in CI.

simonw has drunk the Kool-Aid on this one. There's no point trying to convince him. Relatedly, he made a prediction that AI would be able to write a web browser from scratch in 3 years. He really wants to see this happen, so maybe that's why he's defending these scammers.

It’s been fascinating watching you go from someone I could always turn to for a more sensible opinion about technology over the last 15 years into a sellout whose every message drips with motivated reasoning.

I feel like I spend way too much of my time arguing back against motivated reasoning from people.

"This project is junk that doesn't even compile", for example.


It's largely futile. There's a certain contingent that will not be convinced of this until they see what these tools can do firsthand, and they'll refuse to try to do this properly until it's everywhere.

Well, I have watched the show adaptation of Shogun, which features authentic Japanese language, and I enjoy the occasional omakase (in Brooklyn), so I’d say I’m pretty qualified to comment on Japanese rail over the past sixty years.

I managed to draw the Japan flag one time in middle school. Add me to the list of reputable sources.

I’ve read the Wikipedia article about Japan and had a friend living there. Beat that!

I grew up playing all the Mario games and wrote a dissertation on an Internet forum, so now I have a PhD in both Japanese and Italian culture!

And the other one was, as far as I remember, likely deliberate based on the pilot’s flight simulation data.

That one doesn't reflect well on the airline IMO. There should be systems in place to help employees cope with mental health issues so that they don't end up hijacking their own plane.
