Hacker News

I love the title "Big LLMs" because it means that we are now making a distinction between big LLMs and minute LLMs and maybe medium LLMs. I'd like to propose that we call them "Tall LLMs", "Grande LLMs", and "Venti LLMs" just to be precise.



I'd prefer to see olive sizes get a renaissance. I was always amused by Super Colossal when following my mom around a store as a little kid.

From a random web search, it seems the sizes above Large are: Extra Large, Jumbo, Extra Jumbo, Giant, Colossal, Super Colossal, Mammoth, Super Mammoth, Atlas.


How about wine bottle sizes since we're "bottling" a "distillation" of information...

https://en.wikipedia.org/wiki/Wine_bottle#Sizes


To get pedantic, wine is not a product of distillation.


That almost makes the metaphor more apt. Wine is the real deal, and brandy is the distilled approximation.


Needs more superlatives. “Biggest” < “Extra Biggest” < “Maximum Biggest”. :D


maximum_biggest_final_2


"Non Plus Ultra"

Followed by another company introducing their "Plus Ultra" model.


And I'd love to see data compression terminology get an overhaul. Do we need big LLMs or just succinct data structures? Or maybe "compact" would be good enough? (Yeah LLMs are cool but why not just, you know, losslessly compress the actual data in a way that lets us query its content?)
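To make the "query its content without decompressing" idea concrete, here's a toy Python sketch (class and names are hypothetical, not from any library) of the rank operation that succinct bitvectors support: count the 1-bits in a prefix in O(1) per query using a small precomputed index. Real succinct structures (wavelet trees, FM-indexes) also compress the underlying bits, which this toy skips.

```python
# Toy sketch of a succinct-structure query: rank(i) = number of 1-bits
# in bits[0:i], answered from a small block index instead of rescanning
# the raw data on every query.
class RankBitvector:
    BLOCK = 64  # bits covered by each index entry

    def __init__(self, bits: str):
        self.bits = bits
        # cumulative popcount at each block boundary
        self.blocks = [0]
        for i in range(0, len(bits), self.BLOCK):
            self.blocks.append(self.blocks[-1] + bits[i:i + self.BLOCK].count("1"))

    def rank(self, i: int) -> int:
        # popcount of the full blocks before i, plus a scan of at most
        # one partial block
        b, off = divmod(i, self.BLOCK)
        start = b * self.BLOCK
        return self.blocks[b] + self.bits[start:start + off].count("1")
```

The index costs one integer per 64 bits of data; production structures shrink that overhead to o(n) bits while keeping constant-time queries.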


Well the obvious answer is that LLMs are more than just pure search. They can synthesize novel information from their learned knowledge.


And the US ‘small’ LLMs will actually be slightly larger than the ‘large’ LLMs in the UK.


I wonder how skinny people get dressed overseas: I wear a European S, which translates to XXS in the US, but there are many people skinnier than me, still within a "normal" BMI. Do they have to find XXXS? Do they wear oversized clothes? Choosing trousers is way easier because the system of cm/inches of length+perimeter correspond to real values.


It's a crazy experience being just physically larger than most of the world. Especially when the size on the label carries some implicit shame/judgement. Like I'm skinny, I'm pretty much the lowest weight I can be and not look emaciated / worrying. But when shopping for a skirt in Asian sizes I was a 4XL, and usually an L-2XL in European sizes. Having to shift my mental space to a US M being the "right" size for me was hard for many years. But like I guess this is how sizing was always kinda supposed to work.


The shame you feel is yours, it's not inherent to the sizing.


The shame is inherent to the crushing expectations put on women's appearances and the pressure to be small. It manifests in clothing sizing for the same reason it manifests standing on a scale, it's a measure of your smallness. And what makes it insidious is that the measures are juuust comparable enough across different people to make people feel bad for not having the same numbers as someone 5" shorter than you.

And my experience isn't unique in any way here and it's really hard to not see it pervasive through our culture.


Short men also tend to voice the same complaints you have. They tend to have the added strain of absolute helplessness in their situation.


Uniqlo sizing looks pretty similar to what we have in Europe...


> Choosing trousers is way easier because the system of cm/inches of length+perimeter correspond to real values.

They're not merely real values, they're also rational.


I'm not so sure, there's pi involved here!


I worked at a Norwegian hospital once which had sizes from xxl (ekstra ekstra liten, "extra extra small") to xxs (ekstra ekstra stor, "extra extra large"). So it's simple: you cross the ocean, you go from size xxl to xxs without having to do anything at all...

I should say though, that's the only place I've seen this particular localization.


We ordered swag T-shirts for a conference from two providers, but EU provider L's were actually larger than US L!


It's funny you say that, but when travelling abroad I wondered how Europeans and Japanese stay sufficiently hydrated.


For healthy adults, thirst is a perfectly adequate guide to hydration needs. Historically normal patterns of drinking - e.g. water with meals and a few cups of tea or coffee in between - are perfectly sufficient unless you're doing hard physical labour or spending long periods of time outdoors in hot weather. The modern American preoccupation with constantly drinking water is a peculiar cultural phenomenon with no scientific basis.


Don't many medications dehydrate you though? And Americans are on a lot of medications.


I've always understood constantly drinking water as a ruse to use the bathroom more often, which is helpful for Americans with sedentary lifestyles.


If you are thirsty you are already dehydrated.


Try getting a kidney stone and then find out if adequate hydration is what you want to squeak by with.


Diabetes causes dehydration


Is this a thing about how restaurants in some European countries charge for water?


It's a joke about Americans carrying around giant water bottles.


And for public toilets. I mean restrooms.


> The UK

You mean the EU, right? The UK isn't covered by the AI act.

/s


Big LLM is too long as a name. We should agree on calling them BLLMs. Surely everyone is going to remember what the letters stand for.


I still like Big Data Statistical Model



Bureau of Large Land Management


I want to apologize for this joke in advance. It had to be done.

We could take a page from Trump’s book and call them “Beautiful” LLMs. Then we’d have “Big Beautiful LLMs” or just “BBLs” for short.

Surely that wouldn’t cause any confusion when Googling.


Weirdly enough, the ITU already chose the superlative for the bigliest radio frequency band to be Tremendous:

- Extremely Low Frequency (ELF)

- Super Low Frequency (SLF)

- Ultra Low Frequency (ULF)

- Very Low Frequency (VLF)

- Low Frequency (LF)

- Medium Frequency (MF)

- High Frequency (HF)

- Very High Frequency (VHF)

- Ultra High Frequency (UHF)

- Super High Frequency (SHF)

- Extremely High Frequency (EHF)

- Tremendously High Frequency (THF)

Maybe one day some very smart people will make Tremendously Large Language Models. They will be very large and need a lot of computer. And then you'll have the Extremely Small Language Model. They are like nothing.

https://en.wikipedia.org/wiki/Radio_frequency?#Frequency_ban...


"The Overwhelmingly Large Telescope (OWL) was a conceptual design by the European Southern Observatory (ESO) organisation for an extremely large telescope, which was intended to have a single aperture of 100 metres in diameter. Because of the complexity and cost of building a telescope of this unprecedented size, ESO has decided to focus on the 39-metre diameter Extremely Large Telescope instead."

https://en.m.wikipedia.org/wiki/Overwhelmingly_Large_Telesco...


AFAIK "tremendously" was chosen partly because the range includes 1 "T"Hz.


I like tremendous as an adjective for a frequency range because etymologically it can be traced to the Latin word for 'shaking'. Tremendous, horrendous, terrible all kinda mean "makes you shake".

Horrendous being based on the Latin root for "trembling with fear", tremendous on another Latin root meaning "shaking from excitement", and terrible deriving from a Latin root for, again, "trembling with fear".


I hope they go with "Ludicrous" like in Spaceballs.


It bothers me that the level below 3 Hz is not given the name "Tremendously low". Now it's not symmetrical. I hope the ITU is happy...


XKCD telescope sizes also could provide some guidance

https://xkcd.com/1294/


TLLM is close to TLM


I've sat in more than one board meeting watching them take 20 minutes to land on t-shirt sizes. The greatest enterprise sales minds of our generation...


I've seen things you people wouldn't believe.

I’ve seen corporate slogans fired off from the shoulders of viral creatives. Synergy-beams glittering in the darkness of org charts. Thought leadership gone rogue… All these moments will be lost to NDAs and non-disparagement clauses, like engagement metrics in a sea of pivot decks.

Time to leverage.


... destroyed by madness, starving hysterical! Buying weed in a store then meeting with someone off Craigslist to score eggs.


Name them like clothing sizes: XXLLM, XLLM, LLM, MLM, SLM, XSLM, XXSLM.


i did this!

XXLLM: ~1T (GPT4/4.5, Claude Opus, Gemini Pro)

XLLM: 300~500B (4o, o1, Sonnet)

LLM: 20~200B (4o, GPT3, Claude, Llama 3 70B, Gemma 27B)

~~zone of emergence~~

MLM: 7~14B (4o-mini, Claude Haiku, T5, LLaMA, MPT)

SLM: 1~3B (GPT2, Replit, Phi, Dall-E)

~~zone of generality~~

XSLM: <1B (Stable Diffusion, BERT)

4XSLM: <100M (TinyStories)

https://x.com/swyx/status/1679241722709311490
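The tiers above could be sketched as a simple lookup. This is a hypothetical helper, not anything from the linked post; it uses the boundaries given in the comment, resolving the gaps between ranges at each tier's lower bound.

```python
# Map a model's parameter count (in billions) to the size tiers proposed
# above. Boundaries are the comment's, with gaps between ranges assigned
# to the tier whose lower bound they exceed.
def size_class(params_b: float) -> str:
    tiers = [
        (1000, "XXLLM"),  # ~1T class
        (300,  "XLLM"),   # 300~500B
        (20,   "LLM"),    # 20~200B
        (7,    "MLM"),    # 7~14B
        (1,    "SLM"),    # 1~3B
        (0.1,  "XSLM"),   # <1B
    ]
    for floor, name in tiers:
        if params_b >= floor:
            return name
    return "4XSLM"        # <100M
```

So a 70B model lands in "LLM" and an 8B model in "MLM", matching the examples in the list.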


MLM... uh oh


I hate those ponzi schemes! Never buy a cutco knife or those crappy herbalife supplements.

Alternatively, just make sure you keep things consensual, and keep yourself safe, no judgement or labels from me :)


I've been labeling LLMs as "teensy", "smol", "mid", "biggg", "yuuge". I've been struggling to figure out where to place the lines between them though.


itsy-bitsy <= 3B

teensy 4B to 29B

smol 30B to 59B

mid 60B to 99B

biggg 100B to 299B

yuuge 300B+


But of course these are all flavors of "large", so then we have big large language models, medium large language models, etc, which does indeed make the tall/grande/venti names appropriate, or perhaps similar "all large" condom size names (large, huge, gargantuan).


Why not LLLM for large LLMs and SLLM for small LLMs, assuming there is no middle ground?


M, LM, LLM, LLLM, L3M, L4M.

Gotta leave room for future expansion.


Hopefully the USB making team does NOT step into this...

LLM 3.0, LLM 3.1 Gen 1, LLM 3.2 Gen 1, LLM 3.1, LLM 3.1 Gen 2, LLM 3.2 Gen 2, LLM 3.2, LLM 3.2 Gen 2x2, LLM 4, etc...


2L4M


VLLM, Super VLLM, Almost Large Language Model


What makes it a Small Large Language Model? Why not just an SLM?


Smedium Language Model


Lousy Smarch weather


If we can’t have fun with names, why even be in IT?


S and L cancel out, so it's just an LM.


Small !== -Large


SLM is a widespread term already.


Slim pickings, then?


LLM, LLM 2.0, LLM 3.0, Mini LLM, Micro LLM, LLM C.


LLM 95, LLM 98, LLM Millennium Edition, LLM NT, LLM XP, LLM 2000, LLM 7

I really appreciated the way they managed to come up with a new naming scheme each time, usually used exactly once.


Could always go with the Bungie approach for the Marathon series: LLM, LLM2, LLM∞, ℵ₁ — https://alephone.lhowon.org

(Obviously ∞ is for the actual singularity, and ℵ₁ is the thing after that).


Are you sure that ℵ1 is the thing after that?

https://en.m.wikipedia.org/wiki/Continuum_hypothesis

;-)


LLM 3.11 for Workgroups


Can we have a tiny LLM that can run on a smartphone now?


Apple Intelligence has an LLM that runs locally on the iPhone (15 Pro and up).

But the quality of Apple Intelligence shows us what happens when you use a tiny ultra-low-wattage LLM. There’s a whole subreddit dedicated to its notable fails: https://www.reddit.com/r/AppleIntelligenceFail/top/?t=all

One example of this is “Sorry I was very drunk and went home and crashed straight into bed” being summarized by Apple Intelligence as ”Drunk and crashed”.


I think the real problem with LLMs is we have deterministic expectations of non-deterministic tools. We’ve been trained to expect that the computer is correct.

Personally, I think the summaries of alerts are incredibly useful. But my expectation of accuracy for a 20 word summary of multiple 20-30 word summaries is tempered by the reality that there's gonna be issues given the lack of context. The point of the summary is to help me determine if I should read the alerts.

LLMs break down when we try to make them independent agents instead of advanced power tools. A lot of people enjoy navel gazing and hand waving about ethics, "safety" and bias... then proceed to do things with obvious issues in those areas.


Larger LLMs can summarize all of this quite well though.


Determinism isn't the issue though. Many responses are fine. The displayed one is bad, whether chosen deterministically or not. Some alternatives:

- Passed out drunk

- Crashed in bed

- Slacking because drunk

...

The issue isn't a lack of context; it's that even the available context was handled poorly.


No. Smartphone only spin animated gif while talk to big building next to nuclear reactor. New radio inside make more efficient.


Is a tiny large language model equivalent to a normal sized one?


Yes, it's called an MLM (Medium Language Model).


I expect that the phone will only do the prompt parsing


I want a tiny_phone_based LLM to do thought tracking and comms awareness..

I actually applied to YC in like ~2014 or so for this:

- JotPlot - I wanted a timeline for basically giving a histo timeline of comms between me and others - such that I had a sankey-ish diagram for when and with whom and via which method I spoke with folks, and then each node was the message, call, text, meta links...

I think it's still viable - but my thought process is too currently chaotic to pull it off.

Basically looking at a timeline of your comms and thoughts and expanding into links of thought - now with LLMs you could have a Throw Tag of some sort whereby you have the bot do work on research expanding on certain things and putting up a site for that idea on LOCALHOST (i.e. your phone) so that you can pull up data relevant to the convo - and it's all in a timeline of thought/stream of consciousness

hopefully you can visualize it...


I had a thought that I think some people value social media (e.g. Facebook) essentially for this. Like giving up your Facebook profile means giving up your history or family tree or even your memories.

So in that sense, maybe people would prefer a private alternative.


I read this in Sam Wattersons voice with a pipe abt maybey an inch from his beard,

(Fyi I was a designer at fb and while it was luxious I still hated what I saw in zucks eyes every morn when I passed him.

Super diff from Andy Grove at intel where for whateveer reason we were in the sam oee schekdule

(That was me typing with eues ckised as a test (to myself, typos abound


Terrible names, to be honest. My proposal: Hyper LLMs, Ultra LLMs, Large LLMs, Micro LLMs, Mobile LLMs.


LLM M4 Ultra Pro Max 16e (with headphone jack)


GPT Inside


LLM already has one large in it…


If we can have a "Personal PIN Identification Number", we can have a "Large LLM Language Model".


What about Impersonal PIN anonymization letter?


Redundundant


What does a 20 LLM signify?


or "DietLLM, RegularLLM, MealLLM and SuperSizedLLMWithFries"


It's too bad vLLM and VLM are taken, because it would have been nice to recycle the VLSI solution to describing sizes - get to very large language models and leave it at that.


After very large language models, the next step is mega language models, or MLMs. As a bonus, it describes the VC funding scheme that backs them too.


we could also look to magnetoresistance and go for giant, colossal, extraordinary


Doesn't the first L in LLM mean large already?

It's like saying Automated ATM. Whoever wrote it barely knows what the acronym means.

This whole article feels like written by someone who doesn't understand the subject matter at all


We’re fine with “The Big Friendly Giant” and the Sahara Desert (“desert desert”); Big LLM could join the family of pleonasms.

https://en.m.wikipedia.org/wiki/Pleonasm


When it's a different language it's fine.


Yes, that's the point of the comment and the whole discussion here. LLMs are already Large so what should the prefix be? Big LLM is a strong contender. I'm also pretty sure the creator of redis is not "someone who doesn't understand the subject matter at all".


It's very common for experts on one subject to take a jab at another subject and depend on their reputation while their skillset doesn't translate at all.


Almost everyone says ‘PIN number’ as well.


Dismissed, Big LLM will live on along with Big Data.


Well, big data for me was always clear -- when data sizes are too large to use regular tools (ls, du, wc, vi, pandas).

I.e. when pretty much every tool or script I used before doesn't work anymore and I need a special tool (gsutil, bq, dask, slurm), it's a mind shift.


Then there will be "decaf LLM"


Pro, max, ultra…


"big large language model" renminds me uncomfortably of "automated teller machine machine"


“There are 2 hard problems in computer science: cache invalidation, naming things, and off-by-1 errors.”




