
It's a blob that costs over $10,000,000 in electricity to compile. Even if they released everything, only the rich could push go.
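
Back-of-envelope, assuming an industrial rate of ~$0.04/kWh (the rate and the steady 100 MW draw below are assumptions for illustration, not published figures):

    # What does $10M of electricity buy, and how long does it run a datacenter?
    price_per_kwh = 0.04                       # USD; assumed industrial rate
    budget_usd = 10_000_000
    energy_kwh = budget_usd / price_per_kwh    # 250,000,000 kWh = 250 GWh
    hours_at_100mw = energy_kwh / 100_000      # 100 MW = 100,000 kW -> 2,500 h
    print(f"{energy_kwh / 1e6:.0f} GWh, ~{hours_at_100mw / 24:.0f} days at 100 MW")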



There is an argument to be made for the archeological preservation of the provenance of models, especially the first few important LLMs, for study by future generations.

In general, software rot is a huge issue, and many projects that may be of future archeological importance are increasingly non-reproducible: dependencies are often not vendored and checked into source control, but instead downloaded at compile time from servers that offer no strong guarantee of future availability.
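
A minimal sketch of the kind of guard that vendoring enables, in Python; the artifact path and checksum here are hypothetical placeholders:

    import hashlib
    from pathlib import Path

    # Hypothetical vendored dependency plus the sha256 recorded when it was
    # checked into source; the build should fail loudly if they ever diverge.
    ARTIFACT = Path("vendor/some-dependency-1.2.3.tar.gz")
    RECORDED_SHA256 = "placeholder-recorded-at-vendoring-time"

    def sha256_of(path: Path) -> str:
        h = hashlib.sha256()
        with path.open("rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    if sha256_of(ARTIFACT) != RECORDED_SHA256:
        raise SystemExit(f"checksum mismatch for {ARTIFACT}; refusing to build")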


This comment is cooler than my Arctic Vault badge on GitHub.

Who were the countless unknown contemporaries of Giotto and Cimabue? Of Da Vinci and Michelangelo? Most of what we know about Renaissance art comes from 1 guy - Giorgio Vasari. We have more diverse information about ancient Egypt than the much more recent Italian Renaissance because of, essentially, better preservation techniques.

Compliance, interoperability, and publishing platforms for all this work (HuggingFace, Ollama, GitHub, HN) are our cathedrals and clay tablets. Who knows what works will fill the museums of tomorrow.


In today's Dwarkesh interview, Zuckerberg talks about energy becoming a limit for future models before cost or access to hardware does. Apparently the current largest datacenters consume about 100 MW, but Zuck is considering future ones consuming 1 GW, which is the output of a typical nuclear reactor!

So, yeah, unless you own your own world-class datacenter, complete with the nuclear reactor needed to power the training run, training is not an option.


On a sufficiently large time scale, the real limit on everything is energy. “Cost” and “access to hardware” are mere proxies for the energy available to you. This is the idea behind the Kardashev scale.
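
Sagan's continuous version of the scale makes this concrete: K = (log10(P) - 6) / 10 with P in watts, so Type I sits at 10^16 W. A quick sketch:

    from math import log10

    def kardashev(power_watts: float) -> float:
        # Sagan's interpolation: Type I ~1e16 W, Type II ~1e26 W, Type III ~1e36 W
        return (log10(power_watts) - 6) / 10

    print(kardashev(1e9))   # a 1 GW datacenter   -> K = 0.3
    print(kardashev(2e13))  # humanity at ~20 TW  -> K ~ 0.73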


A bit odd to see this downvoted... I'm not exactly an HN newbie, but I still haven't fully grasped why people downvote here - simply not liking something (regardless of relevance or correctness) often seems to be the reason, and sometimes the reasons seem even pettier.

I think Zuck's discussion of energy as the limiting factor was one of the more interesting and surprising things to come out of the Dwarkesh interview. We're used to talk of $1B, $10B, $100B training runs becoming unsustainable, and of chip shortages as an issue, but (to me at least!) it was interesting to see Zuck say that energy will become the bottleneck before those do, partly because of the lead times and regulation involved in expanding power supply and bringing it into new data centers. The sheer magnitude of the projected power consumption is also interesting.


There is an odd contingent, or set of contingents, on here that does seem to downvote by ideology rather than for lack of facts or lack of courtesy. It's a bit of a shame, but I'm not sure there's much to be done.





