
Imagine how many millions would drive a Ferrari if they gave them away for free

> Imagine how many millions would drive a Ferrari if they gave them away for free

0.

Ferrari is a luxury sports brand. What would be the point if they flooded the streets?


Good looking, powerful, reliable car?

> Good looking, powerful, reliable car?

How to say you don't own a Ferrari without saying you don't own a Ferrari.


It’s not true that people are only using it because it’s free.

It’s actually quite interesting to see these contradictory positions play out:

1. LLMs are useless and everyone is making a stupid bet on them. The users of LLMs are fooled into using them and the companies are fooled into betting on them

2. LLMs are getting so cheap that the investments in data centers won't pay off, because apparently they will get good enough to run on your phone

3. LLMs are bad: bad for the environment, bad for the brain, bad because they displace workers, and bad because they make rich people richer

4. AI is only kept afloat because there's a conspiracy by Nvidia, Oracle, and OpenAI to keep it propped up (something something circular economy)

5. AI is so powerful that it should not be built, or humanity will go extinct


It is true that none of the LLM providers are profitable though, so there is some number above free that they need to charge, and I am not convinced that number is compelling

None of the LLM providers being profitable is exactly the situation I would expect. On the contrary, it would be absurd if they were profitable at this stage! Why wouldn't they put the money back into R&D and marketing?

I'm not well versed in accounting terminology, but whatever the word is for operating costs, I am not convinced consumers will ever pay enough to cover them

Do you think if LLMs become 10 times more efficient it might cover the costs? What efficiency increase do you think would be enough?

It's a competitive environment; there's no way the data centers manage to capture that 10x efficiency improvement. There would be an expectation of 10x lower prices, because someone else would be offering that.

The problem I see, as someone who has implemented a bunch of AI solutions in a range of markets, is that the quality isn't good enough yet to even think about efficiency - even if the current AI were 100x more efficient it still wouldn't be worth paying for, because it doesn't deliver reliable and trustworthy results...

A) Huge straw man, since it isn't the same people making those points. None of them needs the others to be true to cause issues; they are independent concerns.

B) You're missing a few things like:

1. The hardware overhang of edge compute (especially phones) may make the centralized compute investments irrelevant as more efficient LLMs (or whatever replaces them) are released.

2. Hardware depreciates quickly. Are these massive data centers really going to earn their money back before a more efficient architecture makes them obsolete? Look at all the NPUs on phones which are useless with most current LLMs due to insufficient RAM. Maybe analogue compute takes off, or giant FPGAs, which can do on a single board what is done with a rack at the moment. We are nowhere near a stable model architecture, or a stable optimal compute architecture. Follow the trajectory of Bitcoin and Ethereum mining here to see what we can expect.

3. How does one company earn back its R&D when, the moment a model is released, the competition puts out comparable models within 6 months, possibly by using the very service that was provided to generate training data?


In this scenario Copilot is performing RAG, so the auditing occurs when Copilot returns hits from the vector search engine it's connected to - it seems there was a bug where it would only audit when Copilot referenced the hits in its result.

The correct thing to do would be to have the vector search engine do the auditing (it probably already does; it just isn't exposed via Copilot), because it sounds like Copilot is deciding if/when to audit the things that it does...
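
A minimal sketch of what I mean - log at the retrieval layer instead of trusting the model to mention its sources. The search client and audit sink here are hypothetical placeholders, not Copilot's actual internals:

  import datetime

  def audited_search(search_client, audit_log, user_id, query, top_k=5):
      # Run the vector search as usual.
      hits = search_client.search(query, top_k=top_k)
      # Write one audit entry per document returned, before the LLM sees
      # anything - whether or not the model ends up citing the hit.
      for hit in hits:
          audit_log.write({
              "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
              "user": user_id,
              "action": "retrieved_for_rag",
              "document_id": hit.document_id,
              "query": query,
          })
      return hits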


As someone else mentioned, the file isn't actually accessed by Copilot; rather, Copilot is reading the pre-indexed contents of the file in a search engine...

Really Microsoft should be auditing the search that Copilot executes; it's actually a bit misleading to audit the file as accessed when Copilot has only read the indexed content of the file. I don't say I've visited a website when I've found a result for it in Google


Oh, so there's a complete copy (or something that can be reassembled into a copy) completely OUTSIDE of audit controls. That's so much worse. :0


It's roughly the same problem as letting a search engine build indexes (with previews!) of sites without authentication. It's kinda crazy that things were allowed to go this far with such a fundamental flaw.


Yep. Many years ago I worked at one of the top brokerage houses in the United States; they had a phenomenal Google search engine in house that made it really easy to navigate the whole company and find information.

Then someone discovered production passwords on a site that was supposed to be secured but wasn’t.

Found such things in several places.

The solution was to make searching work only if you opted-in your website.

After that internal search was effectively broken and useless.

All because a few actors did not think about or care about proper authentication and authorization controls.


I'm unclear on what the "flaw" is - isn't this precisely the "feature" that search engines provide to both sides and that site owners put a ton of SEO effort into optimizing?


If you have public documents, you can obviously let a public search engine index them and show previews. All is good.

If you have private documents, you can't let a public search engine index and show previews of those private documents. Even if you add an authentication wall for normal users who try to open the document directly, they could still see part of the document in Google's preview.

My explanation sounds silly because surely nobody is that dumb, but this is exactly what they have done. They gave access to ALL documents, both public and private, to an AI, and then got surprised when the AI leaked some private document details. They thought they were safe because users would be faced with an authentication wall if they tried to open the document directly. But that doesn't help if Copilot simply tells you all the secrets in its own words.
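
A sketch of the guard that's missing here - trim search results to the caller's permissions before anything reaches the model. The search client and ACL service are hypothetical placeholders:

  def permission_trimmed_search(search_client, acl_service, user_id, query):
      hits = search_client.search(query)
      # Drop any document the caller couldn't open directly; the model should
      # never see content the user isn't allowed to read, previews included.
      return [h for h in hits if acl_service.can_read(user_id, h.document_id)]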


You say that, but it happens — "Experts Exchange", for example, certainly used to try to hide the answers from users who hadn't paid while encouraging search engines to index them.


That's not quite the same. Experts Exchange wanted the content publicly searchable, and explicitly allowed search engines to index it. In this case, many customers probably aren't aware that there is a separate search index that contains much of the data in their private documents that may be searchable and accessible by entities that otherwise shouldn't have access.


That's not necessarily what happened in the article. He wasn't able to access private docs. He was just able to tell Copilot to not send an audit log.


> Really Microsoft should be auditing the search that Copilot executes; it's actually a bit misleading to audit the file as accessed when Copilot has only read the indexed content of the file. I don't say I've visited a website when I've found a result for it in Google

Not my domain of expertise, but couldn't you at some point argue that the indexed content itself is an auditable file?

It's not literally a file necessarily, but if the indexed contents hold enough information to be considered sensitive, then where is the significant difference?


Not only could you do that, you should do that.


That makes sense on a technical level, but from a security and compliance perspective, it still doesn't really hold up


AIs almost by definition need everything indexed at all times to be useful. Letting one rummage through your stuff without 100% ownership is just madness to begin with, and avoiding deep indexing would make it mostly useless unless regular permission systems were put in (and then we're kinda back to where we were without AIs).


> I don't say I've visited a website when I've found a result for it in Google

I mean, it depends on how large the index window is, because if Google returned the entire webpage content without you ever leaving (AMP moment), you did visit the website. Fine line.


The challenge then is to differentiate between "I wanted to access the secret website/document" and "Google/Copilot gave me the secret website/document, but it was not my intention to access that".


Access is access. Regardless of whether you intended to view the document, you are now aware of its content in either case, and an audit entry must be logged.


Strongly agree. Consider the case of a healthcare application where, during the course of business, staff may perform searches for patients by name. When "Ada Lovelace" appears even briefly in the search results of a "search-as-you-type" for some "Adam _lastname", has their privacy been compromised? I think so, and the audit log should reflect that.

I'm a fan of FHIR (a healthcare API standard, but far from widely adopted), and it has a secondary set of definitions for audit log patterns (BALP) that recommends this kind of behaviour. https://profiles.ihe.net/ITI/BALP/StructureDefinition-IHE.Ba...

"[Given a query for patients,] When multiple patient results are returned, one AuditEvent is created for every Patient identified in the resulting search set. Note this is true when the search set bundle includes any number of resources that collectively reference multiple Patients."


What's the solution then? Chain 2 AIs, where the first one is fine-tuned on / has RAG access to your content and tells a second one, which actually produces the content, which files are relevant (and logs them)?

Or just a system prompt "log where all the info comes from"...


Someone please confirm my idea (or remedy my ignorance) about this rule of thumb:

Don't train a model on sensitive info, if there will ever be a need for authZ more granular than implied by access to that model. IOW, given a user's ability to interact w/ a model, assume that everything it was trained on is visible to that user.


I'm pretty sure what you're describing is the fact that Microsoft returns Graph scopes by default when you request a token. I agree it is very annoying and only really documented if you read between the lines...
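
A quick way to see what you actually get back, sketched with the msal Python library (the client ID is a placeholder, and the exact scopes returned depend on your app registration and admin consent):

  import msal

  app = msal.PublicClientApplication(
      "00000000-0000-0000-0000-000000000000",  # placeholder app (client) ID
      authority="https://login.microsoftonline.com/common",
  )
  result = app.acquire_token_interactive(
      scopes=["https://graph.microsoft.com/User.Read"],  # request only what you need
  )
  # The "scope" field shows what the token actually carries, which can be
  # broader than what you asked for.
  print(result.get("scope"))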


How can anyone still believe the AGI scam


If you think the possibility of AGI within 7-10 years is a scam then you aren't paying attention to trends.


I wouldn't call 7-10 years a scam, but I would call it low odds. It is pretty hard to be accurate about predictions over a 10-year window. But I definitely think the 2027 and 2030 predictions are a scam. The majority of researchers think it is further away than 10 years, if you are looking at surveys from the AI conferences rather than predictions in the news.


The thing is, AI researchers have continually underestimated the pace of AI progress

https://80000hours.org/2025/03/when-do-experts-expect-agi-to...

>One way to reduce selection effects is to look at a wider group of AI researchers than those working on AGI directly, including in academia. This is what Katja Grace did with a survey of thousands of recent AI publication authors.

>In 2022, they thought AI wouldn’t be able to write simple Python code until around 2027.

>In 2023, they reduced that to 2025, but AI could maybe already meet that condition in 2023 (and definitely by 2024).

>Most of their other estimates declined significantly between 2023 and 2022.

>The median estimate for achieving ‘high-level machine intelligence’ shortened by 13 years.

Basically every median timeline estimate has shrunk like clockwork every year. Back in 2021 people thought it wouldn't be until 2040 or so when AI models could look at a photo and give a human-level textual description of its contents. I think it is reasonable to expect that the pace of "prediction error" won't change significantly, since it's been on a straight downward trend over the past 4 years, and if it continues as such, AGI around 2028-2030 is a median estimate.


> "Back in 2021 people thought it wouldn't be until 2040 or so when AI models could look at a photo and give a human-level textual description of its contents."

Claim doesn't check out; here's a YouTube video from Apple uploaded in 2021, explaining how to enable and use the iPhone feature to speak a high level human description of what the camera is pointed at: https://www.youtube.com/watch?v=UnoeaUpHKxY


Exactly. There's one guy - Ray Kurzweil - who predicted in the late 90s that AGI will happen in 2029 (yes, the exact year, based on his extrapolations of Moore's law). Everybody laughed at him, but it's increasingly likely he'll be right on the money with that prediction.


Remember when he said nanobots would cure all our diseases by a few years ago?


I don’t actually, did he specify a year? He made a lot of predictions, many are wrong, but the AGI one is pretty amazing.


2020s was my understanding; he made this prediction around the time that he made the AGI one. I think he has recently pushed it back to 2030s because it seems unlikely to come true.


  > many are wrong, but the AGI one is pretty amazing.
If you make enough predictions eventually one will be right

Or at least one will be exciting


Those are differences of magnitude, AGI is a difference of kind.

No amount of describing pictures in natural language is AGI.


I never said it was sufficient for AGI, just that it was a milestone in AI that people thought was farther off than it turned out to be. The same applies to all the subsets of intelligence AI is reaching earlier than experts initially predicted, giving good reason to think AGI (perhaps a synthesis of these elements coming together in a single model, or a suite of models) is likely closer than the standard expert consensus.


The milestones you're citing are all milestones of transformers that were underestimated.

If you think an incremental improvement in transformers is what's needed for AGI, I see your angle. However, IMO, transformers haven't shown any evidence of that capability. I see no reason to believe that they'd develop it with a bit more compute or a bit more data.


It's also worth pointing out that in the same survey it was well agreed upon that success would come sooner if there was more funding. The question was a counterfactual prediction of how much less progress would be made if there was 50% less funding. The response was about 50% less progress.

So honestly, it doesn't seem like many of the predictions are that far off with this in context. That things sped up as funding did too? That was part of the prediction! The other big player here was falling cost of compute. There was pretty strong agreement that if compute was 50% more expensive that this would result in a decrease in progress by >50%.

I think uncontextualized, the predictions don't seem that inaccurate. They're reasonably close. Contextualized, they seem pretty accurate.


  > The thing is, AI researchers have continually underestimated the pace of AI progress
What's your argument?

That because experts aren't good at making predictions that non-experts must be BETTER at making predictions?

Let me ask you this: who do you think is going to make a less accurate prediction?

Assuming no one is accurate here, everybody is wrong. So the question is who is more or less accurate. Because there is a thing as "more accurate" right?

  >> In 2022, they thought AI wouldn’t be able to write simple Python code until around 2027.
Go look at the referenced paper[0]. It is on page 3, last item in Figure 1, labeled "Simple Python code given spec and examples". That line is just after 2023 and goes to just after 2028. There's a dot representing the median opinion that's left of the vertical line half way between 2023 and 2028. Last I checked, 8-3 = 5, and 2025 < 2027.

And just look at the line that follows

  > In 2023, they reduced that to 2025, but AI could maybe already meet that condition in 2023
Something doesn't add up here... My guess, as someone who literally took that survey, is that what's being referred to as "a simple program" has a different threshold.

Here's the actual question from the survey

  Write concise, efficient, human-readable Python code to implement simple algorithms like quicksort. That is, the system should write code that sorts a list, rather than just being able to sort lists.
  
  Suppose the system is given only:
    A specification of what counts as a sorted list
    Several examples of lists undergoing sorting by quicksort
Is the answer to this question clear? Place your bets now!

Here, I asked ChatGPT the question[1], and it got it wrong. Yeah, I know it isn't very wrong, but it is still wrong. Here's an example of a correct solution[2] which shows the (at least) two missing lines. Can we get there with another iteration? Sure! But that's not what the question was asking.

I'm sure some people will say that GPT gave the right solution. So what if it ignored the case of a singular array and assumed all inputs are arrays? I didn't give it an example of a singular array or of non-array inputs, but it just assumed. I mean, leetcode questions pull out way more edge cases than the ones I'm griping about here.
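
For illustration, this is the kind of defensive handling I have in mind - just a sketch of my own, not a claim about what the survey authors would count as correct:

  def robust_sort(values):
      # Guard against non-list input instead of assuming a list, and handle
      # the trivial empty / single-element cases explicitly.
      if not isinstance(values, list):
          raise TypeError("expected a list of numbers")
      if len(values) <= 1:
          return list(values)
      # Simple insertion sort on a copy of the input.
      result = list(values)
      for i in range(1, len(result)):
          key = result[i]
          j = i - 1
          while j >= 0 and result[j] > key:
              result[j + 1] = result[j]
              j -= 1
          result[j + 1] = key
      return result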

So maybe you're just cherry-picking. Maybe the author is just cherry-picking. Because their assertion that "AI could maybe already meet that condition in 2023" is not objectively true. It's not clear that this is true in 2025!

[0] https://arxiv.org/abs/2401.02843

[1] https://chatgpt.com/share/688ea18e-d51c-8013-afb5-fbc85db0da...

[2] https://www.geeksforgeeks.org/python/python-program-for-inse...


>Go look at the referenced paper[0]. It is on page 3, last item in Figure 1, labeled "Simple Python code given spec and examples". That line is just after 2023 and goes to just after 2028. There's a dot representing the median opinion that's left of the vertical line half way between 2023 and 2028. Last I checked, 8-3 = 5, and 2025 < 2027.

The graph you're looking at is of the 2023 survey, not the 2022 one

As for your question, I don't see what it proves. You described the desired conditions for a sorting algorithm and ChatGPT implemented a sorting algorithm. In the case of an array with one element, it bypasses the for loop automatically and just returns the array. It is reasonable for it to assume all inputs are arrays because your question told it that its requirement was to create a program that "turn any list of numbers into a foobar."

Of course I'm not any one of the researchers asked about their predictions in the survey, but I'm sure that if you told the 2022 or 2023 respondents "a SOTA AI in 2025 produced working human readable code based on a list of specifications, and is only incorrect by a broad characterization of what counts as an edge case that would trip up a reasonable human coder on the first try", they would say that it meets their threshold.


  > As for your question, I don't see what it proves.
The author made a claim

I showed the claim was false

The author bases his argument on this and similar claims. Showing his claim is false means his argument doesn't hold

  > and is only incorrect by a broad characterization
I don't know if I'd really call a single item an "edge case" so much as generalization.

But I do know I'd answer that question differently given your reframing.


Paying attention to trends is how you lose your money on a hype train.


Even if we spent 1 million years on LLMs it would not result in AGI; we are no closer to AGI with LLM technology than we were with toaster technology


“Would you like a toasted teacake?”


I can't believe this is so unpopular here. Maybe it's the tone, but come on, how do people rationally extrapolate from LLMs or even large multimodal generative models to "general intelligence"? Sure, they might do a better job than the average person on a range of tasks, but they're always prone to funny failures pretty much by design (train vs test distribution mismatch). They might combine data in interesting ways you hadn't thought of; that doesn't mean you can actually rely on them in the way you do on a truly intelligent human.


I think it's selection bias - a Y Combinator forum is going to have a larger percentage of people who are techno-utopianists than general society, and there will be many seeking financial success by connecting with a trend at the right moment. It seems obvious to me that LLMs are interesting but not revolutionary, and equally obvious that they aren't heading for any kind of "general intelligence". They're good at pretending, and only good at that to the extent that they can mine what has already been expressed.

I suppose some are genuine materialists who think that ultimately that is all we are as humans, just a reconstitution of what has come before. I think we’re much more complicated than that.

LLMs are like the myth of Narcissus and hypnotically reflect our own humanity back at us.


Maybe somehow this will be true in the future, but I am finding that as soon as you work on a novel, undocumented, or non-internet-available problem, it is just a hallucinating junior dev


The dirty secret is that most of the time we are NOT working on anything novel at all. It is pretty much a CRUD application and it is pretty much a master-detail flow.


Even for completely uninteresting CRUD work, you're better off with better deterministic tooling (scaffolding for templates, macros for code, better abstractions generally). Unfortunately, past a certain low level, we're stuck rolling our own for these things. I'm not sure why, but I am guessing it has to do with them not being profitable to produce and sell.


I work on novel technologies.


Yeah me too, better off buying something if it already exists


Universities are research-oriented; Kubernetes is a practical skill which you'd expect to learn at a trade school, no different from learning fluid dynamics vs plumbing


I have never met a single human being in person who did not believe that a crucial part of acquiring practical skills in software (not "trades") for young people is to go to university. Maybe people shouldn't theoretically expect this, but there is theory, and then there is reality. We need to make them match, in whichever direction.


Yes, I'm sure people say that, but that's because they don't think before they speak

You wouldn't expect the same from a doctor, lawyer, or engineer; the problem is that everyday people aren't aware that there is a difference between software development and computer science...


The auditors are using llms too!


Does software that produces files have an obligation to provide interoperability?


When they have a monopoly, places like the EU will frown on purposefully breaking compatibility.

It's called antitrust.


> When they have a monopoly, places like the EU will frown on purposefully breaking compatibility.

What exactly have they done about it?


Without knowing too much of the EU history, I have always understood that anti-trust pressure from the EU effectively forced Microsoft to publish the OOXML spec in the first place.


> Purposefully

According to who? With what proof? And how/why do they get to be the arbiters of that?


> According to who? With what proof?

They normally get asked to investigate by other interested parties, and then ask other independent experts in the field.

> And how/why do they get to be the arbiters of that?

By being the government?

Microsoft doesn't have to sell their software in Europe if they don't like the rules.


No, only if you have a quasi-monopoly on office applications in pretty much every single (Western) government, across all departments and sectors.


Lol, you'd hate to see what Blazor is doing then


Or Phoenix.LiveView for that matter.


I have no hate/love relationship with that matter. Tbh I don't care, but my phone gets hot when it has to load another 5/10/20/100MB single-page application that displays a few lines of nicely formatted text, an animated background, and a "subscribe" button.

By the way, GWT did it before.

