Code is also hard. You have to generate code that accounts for all possible exceptions or errors. If you want to automate a UI, for example, pushing a button can cause all sorts of feedback, errors, and consequences that need to be known to write the code.
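As a sketch of what "accounting for all possible feedback" means in practice, here is a minimal retry-and-verify wrapper; `click_fn` and `verify_fn` are hypothetical stand-ins for whatever UI framework is actually driving the button.

```python
import time

def click_with_retry(click_fn, verify_fn, retries=3, delay=0.1):
    """Attempt a UI action, check the UI reached the expected state,
    and retry when the action raises or the check fails."""
    last_err = None
    for _ in range(retries):
        try:
            click_fn()
            if verify_fn():           # did the UI end up where we expect?
                return True
            last_err = RuntimeError("verification failed")
        except Exception as e:        # timeouts, stale elements, surprise dialogs
            last_err = e
        time.sleep(delay)
    raise last_err
```

Generated code that omits this kind of scaffolding works in the demo and falls over on the first unexpected dialog.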
There's no Taiwanese silicon industrial complex, there's TSMC. The rest of Taiwanese fabs are irrelevant. Intel is the clear #3 (and looks likely-ish to overtake Samsung? We'll see).
glad I held out. The M4 is going to put downward pressure on prices across all previous generations.
edit: nvm, AMD is coming out with twice the performance of the M4 in two months or less. If the M2s become super cheap I will consider one, but the M4 came far too late. There are just way better alternatives now and very soon.
> AMD is coming out with twice the performance of M4 in two months or less
M4 Pro/Max/Ultra variants with double+ the performance from just scaling cores are probably also going to be announced at WWDC in a month, when they also announce their AI roadmap.
this is quite worrying for OpenAI, given the rate at which token prices have been plummeting thanks to Meta, and it's going to have to keep cutting its prices while capex remains flat. Whatever Sam says in interviews, just think the opposite and the whole picture comes together.
It's almost a mathematical certainty that people who invested in OpenAI will need to reincarnate in multiple universes to ever see that money again, but no bother; many are probably NVIDIA stockholders, which evens out the damage.
Well, Pareto is about optimisation, not either/or. I want a model that’s smart enough, while also being locally-executable.
I don’t know whether/when we’ll get there, and whether it will be improvements in models, or underlying model technology, or GPU/TPUs with larger memory at a consumer price point, or something else, that will deliver it.
That's just the middle of the Pareto frontier. Some people want one corner, others the other corner. It's like compression. You can have light compression and high speed, vice-versa, or a balance. You want balance. I want one of the corners.
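The compression analogy is easy to make concrete with zlib's level knob: one corner trades ratio for speed, the other the reverse. The data here is just repetitive filler for illustration.

```python
import zlib

data = b"the quick brown fox jumps over the lazy dog " * 1000

fast = zlib.compress(data, level=1)   # corner: prioritize speed
small = zlib.compress(data, level=9)  # corner: prioritize ratio

# Both corners are lossless; they only differ in where they sit
# on the speed/ratio frontier.
assert zlib.decompress(fast) == data
assert zlib.decompress(small) == data
assert len(small) <= len(fast) < len(data)
```

Everything between `level=1` and `level=9` is the middle of that frontier; which point is "best" depends entirely on the user, which is the poster's point about local versus hosted models.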
I agree with you somewhat. You are correct, unless they have a much better GPT model that they have not released for whatever reason. They are a year ahead of competitors, and GPT-4 is pretty old now. I find it hard to believe they don't have much more capable models now. We will see, though.
The polish of OpenAI's releases has been quite mature since GPT-4, or even 3.5.
They are no doubt sitting on ultra-polished stuff. Being the tip of the arrow, the cutting edge itself, might not be as efficient, but it sure shows you things you can't unsee.
When OpenAI can launch a video model a day after a competitor's announcement because it's ready to go, I get less and less skeptical every time they ship, because the quality of the first version isn't sliding backwards, even in new areas like video.
Maybe releasing it is strategic, or releasing it also requires supporting it infrastructure-wise, and then some. That might be a challenge.
My feeling is the next model, or an in-between release, may have massive efficiency and performance improvements without having to resort to brute-forcing it.
Meanwhile others who are following what OpenAI has done seem to be able to optimize it and make it more efficient whether it’s open source or otherwise.
Both are doing important work and I'm not sure I want to see it as a one winner take all game.
The way AI vendors respond suddenly to one another's launches feels like they are always ready to launch, continually adding functionality to something that could already ship.
It reminds me of when Microsoft spent a billion dollars advertising that Bing had a billion pages indexed. Google stayed quiet. Then, when Microsoft's money was spent, Google simply added a zero or two to its search page, back when it used to list how many pages it had indexed. They were just sitting on it, already done, and announced it when it was to their benefit.
Also, what will the effect of open models be on the LLM provider industry? What effect will Meta’s scorched earth policy of killing markets by releasing very good open models have?
I use LLMs constantly, but no longer in a commercial environment (I am retired except for writing books, performing personal research projects, and small consulting tasks). I now usually turn first to local models for most things: ellama+Emacs is good enough for me to mostly stop using GPT-4+Emacs or GitHub Copilot, and the latest open 7B, 8B, and 30B models running on my Mac seem sufficient for most of the NLP and data manipulation things I do.
However, it is also fantastic to have long context Gemini, OpenAI APIs, Claude, etc. available when needed or just to experiment with.
GPT-4 is not a single model. The GPT-4 that was initially released a year ago is way worse in benchmarks than the newest versions of it, and the original version has been beaten by quite a few other models by this point.
The newest version of GPT-4 is probably still overall the best model currently, but it is only a few months old, and the picture depends a lot on what benchmarks you are looking at.
E.g. for what we are doing at our company (document processing, etc.) Claude-3 Opus and Gemini-1.5 Pro are currently the better models. The newest GPT-4 even performed worse than a previous version.
So to me it definitely seems like the gap is getting smaller. Of course, OpenAI could come out with GPT-5 next week, and it could be vastly better than all other current models.
There's wide speculation that what will be branded as either GPT-4.5 or GPT-5 has finished pretraining now and is undergoing internal testing for a fairly near-term release.
My speculation is that internally they have much stronger models, like Q*, but they are unable to release them to the public even if they want to, for lack of compute, for safety, and for other reasons they probably see…
> My speculation is that internally they have much stronger models like Q*
People used to speculate the same about Google. Everyone hypes up their “secret, too powerful to release” models. Remember the dude who was convinced that there was a sentient AI in the machine? The light of actual public release tends to expose a lot of the hype.
That would be a reasonable assumption if OpenAI did not already have an established track record of repeatedly re-defining our fundamental expectations of what technology can do.
GPT-4 was already completed and secretly being tested on Bing users in India in mid-2022 (there were even Microsoft forum posts asking about the funny chatbot). Even after heavy quantization and the alignment tax GPT-4 is still the bar to beat. It's been two years and their funding has increased over 10x since then.
Short of a fundamental Hard Problem that they cannot overcome, their internal bleeding edge models can reasonably be assumed to possess significantly greater capabilities.
Honestly I'm pretty puzzled by this mystical fog that hangs over OpenAI's skunkworks projects - don't people leave for other jobs, go to conferences, etc.?
I'm surprised that nobody can tell what they in fact do or do not have.
With hundreds of billions on the line for the founders, and a whole lot of likely unvested stock options for the employees, it doesn't seem like anyone wants to open up about what's actually going on day to day.
I'm not saying Claude 3 and Gemini are better than GPT-4 in every aspect, but those two models can at least perform addition on arbitrarily long numbers, while GPT-4 struggles.
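For contrast, this is exactly the kind of task where ordinary code is trivially reliable: Python integers are arbitrary precision, so a deterministic checker for long-number addition (the sort of thing you would grade an LLM's answer against) is a few lines. The 201-digit length is an arbitrary choice for illustration.

```python
import random

# Deterministic reference for the task LLMs get graded on here:
# adding very long numbers. Python ints are exact at any length.
random.seed(0)
a = random.randrange(10**200, 10**201)   # a 201-digit number
b = random.randrange(10**200, 10**201)
total = a + b

assert total - b == a                    # exact arithmetic, no rounding
assert len(str(total)) in (201, 202)     # sum of two 201-digit numbers
```

That a classical one-liner is a perfect oracle is precisely why this benchmark is so unflattering for models that approximate digit-by-digit carrying.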
Everyone can say anything about open-source models, but they're comparing themselves to what OpenAI released a year ago. OpenAI hasn't shown all of its cards yet, and it already has a decent moat in place; some say they have no moat, but I disagree: they have one of the best moats possible, which is brand awareness.
Sora on its own could bring in billions in revenue; an open-source Sora will take at least another year, if not two, to come out, and then more time until it can run on commodity hardware. An open-source model that only runs on a dedicated H100 is actually less useful than a closed model behind an API call. Not to detract from open source: I think it's the way to go, but I'm just being pragmatic and realistic. There's a reason why MS Office is still the top productivity app in the world, even though dozens of open-source alternatives exist.
> they have one of the best moats possible which is brand awareness.
Do they though?
If you talk to "regular people", everybody knows ChatGPT, but nobody knows or cares about OpenAI. And most of them don't even really know that name. They call it ChatUuuuhm, ChatThingy, Chad Gippity, or similar.
I think they will just switch when something better comes along.
MS had yet to fully stabilize that lead a full decade after they had won the OS platform standard for IBM-compatible PCs. A platform-standard moat goes way, way beyond a brand advantage.
Azure, while significant, has no similar monopoly to support OpenAI. Do you really see a structural advantage for OpenAI beyond the Microsoft products integrating it?
a) A year after GPT-4 set the bar, it's still the best model, despite everyone else not having to do it first: just copy, and it's just software. And that's not for lack of trying by every other viable player on the planet, with unprecedented acceleration.
Imagine any other piece of software where the incumbent has a mere 2-3 year head start, during which they had to work out the entire product, and where everyone else, despite only having to copy and pressing the pedal through the floor, is still struggling to catch up.
b) The current models, including GPT-4, are still so bad. A few billions can be made just by continuing to play this game of improvements for a few years and getting better each year. I think people are wildly confused about how big this market is going to be when that happens. They are not squeezing hosting or compute; they are squeezing intelligence. Intelligence is the entire economy. The notion that there would ever not be room for multiple players here, whether through size, specialisation, or cost (as with all other intelligence), and that a few billion dollars are a big deal, is so strange to me.
c) The game will, at some point, be mostly about infra and optimization. People conclude that's a problem for the incumbents, when our entire industry is mostly about infra and optimization. AWS is infra and optimization. I think even the average HN tinkerer understands that therein lies a proposition that's not exactly equivalent to "just rent a few servers and do it yourself".
> A year after GPT-4 set the bar, it's still the best model
Debatable. Many people find Claude Opus superior, and I know I've found it consistently better for challenging coding questions. More importantly, the delta between GPT-4 and everything else is getting smaller and smaller. Llama 3 is basically interchangeable with GPT-4 for a huge number of tasks, despite its smaller size.
Many more do not, according to the LMSYS leaderboard.
> Llama 3 is basically interchangeable with GPT-4 for a huge number of tasks
Sure. I am sure the number approaches infinity, if you are willing to let the model inform the task. That's usually not what most people are looking for in a tool.
While they still call it GPT-4, the ones topping the rankings are newer iterations of it, despite retaining the same name. The latest one is from 2024-04-09. Sure, that one probably finished training a few months ago, but it is by no means a two-year head start.
Even more important, you know that GPT-4 will probably also not crack it, which is why the SOTA is not terribly interesting. The delta between GPT-4 and the competition has been closing, but why anyone would assume that this is a trend, and that it will continue from GPT-4.5 to the competition, or GPT-5 to the competition, instead of the other way around, is a mystery to me.
I am not saying it could not be true. But extrapolating from differences between current bad models to a future with better models is weird, especially when everyone seems to pretty much agree that scale is the difference between the two, and scale is hard and exclusive.
There’s a scatterplot that’s been circulating on Twitter. The trend lines show that since the time of GPT-2, open weights models have improved at a steeper rate than proprietary models, with the two on a path to intersect.
I would argue that's to be expected after the first generally accepted POC (GPT-3.5) was released, an entire industry was created with it, and other companies actually started copying and competing in a big way.
It seems a stretch to read this as a continuing trend, when (from what I gather everyone agrees on) the way to better models seems to be ever more efficient handling of ever larger amounts of money, compute and data, with no reasonable limits in sight on any of the three.
Scaling up LLMs is only going to go so far, and it will yield diminishing marginal returns on all of that money, compute, and data. It’s a regime of exponential increases in inputs for linear gains in the outputs - barring some technological breakthroughs which could come from anywhere, not just from OpenAI.
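A back-of-the-envelope way to see the "exponential inputs, linear outputs" claim: if loss follows a power law in compute, the shape reported in LLM scaling-law papers, then each 10x of compute buys only a fixed ratio and a shrinking absolute improvement. The constants below are invented purely for illustration.

```python
# Illustrative only: loss as a power law in compute, L(C) = a * C**-alpha.
# The constants are made up; treat this as a sketch of the shape, not data.
def loss(compute, a=10.0, alpha=0.05):
    return a * compute ** -alpha

l1 = loss(1e21)
l2 = loss(1e22)   # 10x more compute...
l3 = loss(1e23)   # ...and 10x more again

# Each 10x of compute buys the same multiplicative loss reduction,
# so the absolute improvement shrinks every time you pay the 10x.
assert abs(l2 / l1 - l3 / l2) < 1e-9
assert (l1 - l2) > (l2 - l3) > 0
```

Under this shape, ten-folding your inputs forever yields ever-smaller gains, unless a breakthrough changes the exponent, which, as the comment says, could come from anywhere.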
Nothing wrong with gambling (i.e. wagering real resources against an uncertain outcome), the entire financial market / all of capitalism is a chaotic orchestra of millions of gambles.
the statistics are not in its favour