Hacker News | rahen's comments

I use Emacs (org-mode with TeX and Beamer exports) for almost everything; it's my office suite, among other things. The only time I still need LibreOffice is for diagrams and charts, and even that is slowly being replaced by Mermaid.

https://ridaayed.com/posts/create-diagrams-in-emacs-org-mode...


Seconded on Emacs. I use it for almost everything. I started using it for text editing and it slowly crept into basic web browsing, RSS reading, word processing (with org-mode and Markdown), presentations (through Pandoc/Beamer) and Elisp/C programming. It's gotten to the point where the only four programs I use daily on my Mac are LibreWolf, FreeTube, Logic Pro and Emacs.

Seriously if I could use Logic Pro inside Emacs I would.


This is the first thing that came to my mind. Why pick an OCI container instead of an LXC container since it's a stateful workload?

Going OCI here only makes sense for temporary, disposable sessions.


This is odd. The current hash rate is around its nominal 5 GH/s, and no pool or individual miner seems to be above 50%:

https://miningpoolstats.stream/monero

This Qubic group claims to concentrate 3 GH/s of hashing power, yet there has been no increase in the global hash rate either:

https://www.coinwarz.com/mining/monero/hashrate-chart
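A quick back-of-envelope check on those figures (a sketch only; the 5 GH/s and 3 GH/s numbers are the ones quoted above, not independent measurements):

    # Rough sanity check on the claimed numbers.
    network_hashrate = 5.0e9   # ~5 GH/s nominal global Monero hash rate (per the stats site)
    claimed_qubic = 3.0e9      # hash rate Qubic claims to control

    # If Qubic's hash power were already part of the current network, its share would be:
    share_if_included = claimed_qubic / network_hashrate
    # If it were added on top, the network total should have jumped to roughly:
    expected_total = network_hashrate + claimed_qubic

    print(f"share if already counted: {share_if_included:.0%}")          # ~60%, i.e. >51%
    print(f"expected total if added:  {expected_total / 1e9:.0f} GH/s")  # ~8 GH/s, not observed

Either way the public charts should show something, a pool sitting near 60% or a jump toward ~8 GH/s total, and neither is visible.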

Could this be just bait?


Dumb question: I took a look at https://miningpoolstats.stream/ethereumclassic for Ethereum Classic, and f2pool.com seems to have ~64% of the total hashrate... is that a takeover as well?


I mean, it means that ETH Classic's ledger is rewritable on a whim by that pool, if it has central control.


You don't mean to suggest that a scammy cryptocurrency entity that is currently bragging about attacking a competing system might ... lie to people???? Is that possible?


Peek at the % of unknown miners in the pie chart at the bottom.

Also https://moneroconsensus.info/


LXD was forked as Incus, and it’s an absolute delight.

Seamless LXC and virtual machine management with clustering, a clean API, YAML templates and a built-in load balancer: it's like Kubernetes for stateful workloads.
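As a taste of how clean the API is, here's a minimal sketch that lists instances over the local REST endpoint. The socket path is an assumption (I believe the default is /var/lib/incus/unix.socket, but it can vary by distro), and in practice you'd use the incus CLI or a proper API client rather than raw HTTP:

    # Minimal sketch: query the local Incus REST API over its Unix socket.
    import json
    import socket

    SOCKET_PATH = "/var/lib/incus/unix.socket"  # assumption: default path, adjust for your install

    def incus_get(path):
        # Bare-bones HTTP GET over the Unix socket; this is only meant to show
        # the shape of the REST API, not to replace a real client.
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
            s.connect(SOCKET_PATH)
            s.sendall(f"GET {path} HTTP/1.0\r\nHost: incus\r\n\r\n".encode())
            raw = b""
            while chunk := s.recv(4096):
                raw += chunk
        return json.loads(raw.split(b"\r\n\r\n", 1)[1])

    # /1.0/instances lists the containers and VMs the daemon manages.
    print(incus_get("/1.0/instances"))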


Incus is fantastic. I think Proxmox is where everyone is migrating to after the VMware/Broadcom fiasco, but people should seriously consider Incus as well.


This is worth undertaking. macOS's stricter approach to handling some questionable hacks in Emacs could improve the codebase across all platforms. The PGTK frontend (the Wayland-native frontend), for instance, was derived from the macOS version. It replaced much of the messy X11 code with a cleaner, more modular Cairo-based frontend, which could be further enhanced by adopting a cross-platform, more future-proof toolkit such as SDL.

https://appetrosyan.github.io/posts/emacs-widget

Hopefully, similar improvements can address the issues with large locks and the lack of proper threading.


The problem with the PGTK frontend is that it is notoriously, EXTREMELY slow. The latency on user input compared to the X11 (especially Lucid) builds has some people reverting to X11/Lucid.

When I do run Linux I run Wayland (I daily drive macOS), but better than both is what you already allude to: the Emacs widget toolkit effort, which aims to replace the GUI frontend with SDL and also, potentially, to introduce an actor-style framework (akin to BEAM's) for communication, decoupling the GUI.


It's not just slow, it's also broken.

Maybe broken is the wrong word, but it handles chained chords differently, and that breaks my workflow at least.

(C-s s C-s, for instance, requires you to release and press s again, while Lucid doesn't need that.)


I wonder how hard it would be to run the pgtk version on macOS


I run Emacs on macOS, specifying the `emacs-pgtk` build as the package in my Nix config. Seems to work quite well for me.


You shouldn't root Graphene; it breaks its security model and is certainly the reason why Revolut doesn't work on your phone. It works like a charm on mine.


Just build it yourself. Or you can install glibc on both Void and Alpine, if you want the pre-built binary.


I know most people couldn’t care less about this, but those gimmicky animations probably consume more computing power than the entire Apollo project, which strikes me as unnecessary and wasteful. Given the choice, I’d much rather have a clean, efficient interface.

I tend to like Material Design in comparison. It’s clean, efficient, and usable. I just hope Google won’t try to "improve" it with annoying gimmicks and end up making things worse, like Apple did here.


"Flat" design is equally offensive by not demarcating controls as controls, or their state in an intuitive way.

Just as we were finally seeing UI step away from that BS, Apple jumps all the way back into much-scorned, cheesily-excessive skeuomorphism... adding a raft of failed ideas from 20 years ago.


Since this is in contrast to "wildly not flat and full of visual gimmicks": the modern "flat" style has severe (and very stupid) issues, yeah. But "flat" has been around for a very long time in touch UIs with clear control boundaries: just draw a box around the control, maybe give it a background color.


That's better than plain text that just happens to be a hidden control, but text with a background color might just be... text with a background color, for emphasis. Or it's text with a background color, to distinguish it from editable text. A background color does not tell the user that it's a control.

A box around it? Slightly better, but still doesn't convey state. Sure, you can fill it in when it's "on," but that's still guesswork on the part of the user if he arrives to find it filled in already.


I'm pretty sure my Amiga 1000 had more computing power than the entire Apollo project. I mostly used it for games.


No need for an RPi 5. Back in 1982, a dual or quad-CPU X-MP could have run a small LLM, say with 200–300K weights, without trouble. The Crays were, ironically, very well suited to neural networks; we just didn't know it yet. Such an LLM could have handled grammar and code autocompletion, basic linting, or documentation queries and summarization. By the late 80s, a Y-MP might even have been enough to support a small conversational agent.

A modest PDP-11/34 cluster with AP-120 vector coprocessors might even have served as a cheaper pathfinder in the late 70s for labs and companies who couldn't afford a Cray 1 and its infrastructure.

But we lacked both the data and the concepts. Massive, curated datasets (and backpropagation!) weren’t even a thing until the late 80s or 90s. And even then, they ran on far less powerful hardware than the Crays. Ideas and concepts were the limiting factor, not the hardware.


> a small LLM, say, with 200–300K weights

A "small Large Language Model", you say? So a "Language Model"? ;-)

> Such an LLM could have handled grammar and code autocompletion, basic linting, or documentation queries and summarization.

No, not even close. You're off by 3 orders of magnitude if you want even the most basic text understanding, 4 OOM if you want anything slightly more complex (like code autocompletion), and 5–6 OOM for good speech recognition and generation. Hardware was very much a limiting factor.


I would have thought the same, but EXO Labs showed otherwise by getting a 300K-parameter LLM to run on a Pentium II with only 128 MB of RAM at about 50 tokens per second. The X-MP was in the same ballpark, with the added benefit of native vector processing (not just some extension bolted onto a scalar CPU), which performs very well on matmuls.

https://www.tomshardware.com/tech-industry/artificial-intell...
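The arithmetic behind that "same ballpark" claim holds up as an order-of-magnitude sketch (assuming ~2 FLOPs per weight per generated token; the peak figures below are rough):

    # Order-of-magnitude check: is 50 tok/s from a 300K-weight model plausible on that hardware?
    params = 300_000
    flops_per_token = 2 * params                         # ~2 FLOPs per weight (multiply + add)
    tokens_per_sec = 50
    required_flops = flops_per_token * tokens_per_sec    # ~30 MFLOPS sustained

    pentium_ii_flops = 300e6      # very rough peak for a ~300 MHz Pentium II
    cray_xmp_flops = 200e6        # rough per-CPU peak of an X-MP, vectorized

    print(f"needed:      {required_flops / 1e6:.0f} MFLOPS")
    print(f"Pentium II:  {pentium_ii_flops / 1e6:.0f} MFLOPS (rough peak)")
    print(f"Cray X-MP:   {cray_xmp_flops / 1e6:.0f} MFLOPS per CPU (rough peak)")

That's roughly 30 MFLOPS sustained, well within reach of either machine on vectorized matrix-vector work.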

John Carmack has also hinted at this: we might have had AI decades earlier; obviously not large GPT-4-class models, but useful language reasoning at a small scale was possible. The hardware wasn't that far off. The software and incentives were.

https://x.com/ID_AA_Carmack/status/1911872001507016826


> EXO Labs showed otherwise by getting a 300K-parameter LLM to run on a Pentium II with only 128 MB of RAM at about 50 tokens per second

50 tokens/s is completely useless if the tokens themselves are useless. Just look at the "story" generated by the model presented in your link: each individual sentence is somewhat grammatically correct, but they have next to nothing to do with each other; they make absolutely no sense. Take this, for example:

"I lost my broken broke in my cold rock. It is okay, you can't."

Good luck tuning this for turn-based conversations, let alone for solving any practical task. This model is so restricted that you couldn't even benchmark its performance, because it wouldn't be able to follow the simplest of instructions.


You're missing the point. No one is claiming that a 300K-param model on a Pentium II matches GPT-4. The point is that it works: it parses input, generates plausible syntax, and does so using algorithms and compute budgets that were entirely feasible decades ago. The claim is that we could have explored and deployed narrow AI use cases decades earlier, had the conceptual focus been there.

Even at that small scale, you can already do useful things like basic code or text autocompletion, and with a few million parameters on a machine like a Cray Y-MP, you could reasonably attempt tasks like summarizing structured or technical documentation. It's constrained in scope, granted, but it's a solid proof of concept.

The fact that a functioning language model runs at all on a Pentium II, with resources not far off from a 1982 Cray X-MP, is the whole point: we weren’t held back by hardware, we were held back by ideas.


> we weren’t held back by hardware

Llama 3 8B took 1.3M GPU-hours to train on H100-80GB GPUs.

Of course, it didn't take 1.3M wall-clock hours (~150 years). So, many machines with 80 GB each were used.

Let's do some napkin math: 150 GPUs with a total of 12 TB of VRAM, running for a year.

So, what would be needed to train a 300K parameter model that runs on 128MB RAM? Definitely more, much more than 128MB RAM.

Llama 3 8B runs on 16 GB of VRAM. Let's imagine that's our Pentium II of today. You need at least 750 times what's needed to run it in order to train it (12 TB / 16 GB = 750). So you would have needed ~100 GB of RAM back then, running for a full year, to get that 300K model.

How many computers with 100GB+ RAM do you think existed in 1997?

Also, I only did RAM. You also need raw processing power and massive amounts of training data.
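For what it's worth, here is that napkin math written out explicitly (a sketch of the argument above, not a validated scaling law; the 750x figure is simply 12 TB divided by 16 GB):

    # Napkin math from the argument above, made explicit.
    gpu_hours = 1.3e6                              # reported H100-80GB GPU-hours for Llama 3 8B
    hours_per_year = 24 * 365
    gpus_for_a_year = gpu_hours / hours_per_year   # ~148 GPUs running for a year
    total_vram_tb = gpus_for_a_year * 80 / 1000    # ~12 TB of VRAM

    inference_vram_gb = 16                                           # rough VRAM to *run* Llama 3 8B
    train_to_run_ratio = total_vram_tb * 1000 / inference_vram_gb    # ~750x

    # Applying the same (very crude) ratio to a 300K model that runs in 128 MB:
    ram_needed_gb = 0.128 * train_to_run_ratio     # ~96 GB
    print(f"{gpus_for_a_year:.0f} GPUs for a year, ~{total_vram_tb:.0f} TB VRAM, "
          f"ratio ~{train_to_run_ratio:.0f}x, so ~{ram_needed_gb:.0f} GB of RAM needed in 1997")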


You’re basically arguing that because A380s need millions of liters of fuel and a 4km runway, the Wright Flyer was impossible in 1903. That logic just doesn’t hold. Different goals, different scales, different assumptions. The 300K model shows that even in the 80s, it was both possible and sufficient for narrow but genuinely useful tasks.

We simply weren’t looking, blinded by symbolic programming and expert systems. This could have been a wake-up call, steering AI research in a completely different direction and accelerating progress by decades. That’s the whole point.


"I mean, today we can do jet engines in garage shops. Why would they needed a catapult system? They could have used this simple jet engine. Look, here is the proof, there's a YouTuber that did a small tiny jet engine in his garage. They were held back by ideas, not aerodynamics and tooling precision."

See how silly it is?

Now, focus on the simple question. How would you train the 300K model in 1997? To run it, someone has to train it first.


Reductio ad absurdum. A 300K-param model was small enough to be trained offline, on curated datasets, with CPUs and RAM capacities that absolutely existed at the time, especially in research centers.

Backprop was known. Data was available. Narrow tasks (completion, summarization, categorization) were relevant. The model that runs on a Pentium II could have been trained on a Cray, or over a longer period on any reasonably powerful 90s workstation. That's not fantasy: LeNet-5, with its ~65K weights, was trained on a mere Sun workstation in the early 90s.
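To make the "backprop was known and the scale was tiny" point concrete, here's a toy character-level next-token model trained with hand-written backprop in NumPy. It's purely illustrative (the data, vocabulary and layer width are made up, and it has nothing to do with the EXO Labs model); the point is that none of the math is beyond what a late-80s workstation could grind through:

    # Toy character-level next-token model trained with hand-written backprop (NumPy).
    # Illustrative only: made-up data, ~2K parameters, one hidden layer.
    import numpy as np

    text = "the cat sat on the mat. the dog sat on the log. " * 20
    chars = sorted(set(text))
    stoi = {c: i for i, c in enumerate(chars)}
    V, H = len(chars), 64                       # vocabulary size, hidden width
    rng = np.random.default_rng(0)
    W1 = rng.normal(0, 0.1, (V, H))             # input / embedding weights
    W2 = rng.normal(0, 0.1, (H, V))             # output weights
    lr = 0.5

    X = np.array([stoi[c] for c in text[:-1]])  # current character
    Y = np.array([stoi[c] for c in text[1:]])   # next character to predict

    for step in range(300):
        h = np.tanh(W1[X])                      # hidden activations, shape (N, H)
        logits = h @ W2                         # shape (N, V)
        logits -= logits.max(axis=1, keepdims=True)
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)       # softmax over the vocabulary
        loss = -np.log(p[np.arange(len(Y)), Y]).mean()

        # Backprop by hand: through softmax/cross-entropy, W2, tanh, then W1.
        dlogits = p.copy()
        dlogits[np.arange(len(Y)), Y] -= 1
        dlogits /= len(Y)
        dW2 = h.T @ dlogits
        dh = (dlogits @ W2.T) * (1 - h ** 2)
        dW1 = np.zeros_like(W1)
        np.add.at(dW1, X, dh)                   # scatter-add gradients per input character
        W1 -= lr * dW1
        W2 -= lr * dW2

    print(f"final loss: {loss:.3f}, parameter count: {W1.size + W2.size}")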

The limiting factor wasn’t compute, it was the conceptual framing as well as the datasets. No one seriously tried, because the field was dominated by symbolic logic and rule-based AI. That’s the core of the argument.


> Reductio ad absurdum.

My dude, you came up with the Wright brothers comparison, not me. If you don't like fallacies, don't use them.

> on any reasonably powerful 90s workstation

https://hal.science/hal-03926082/document

Quoting the paper now:

> In 1989 a recognizer as complex as LeNet-5 would have required several weeks’ training and more data than were available and was therefore not even considered.

Their own words seem to match my assessment.

Training time and data availability determined how much this whole thing could advance, and researchers were aware of those limits.


I think a quad-CPU X-MP is probably the first computer that could have run (not trained!) a reasonably impressive LLM if you could magically transport one back in time. It supported a 4 GB (512 MWord) SRAM-based "Solid State Drive" with a supported transfer bandwidth of 2 GB/s, and about 800 MFLOPS of CPU performance on something like a big matmul. You could probably run a 7B parameter model with 4-bit quantization on it with careful programming, and get a token every couple of seconds.
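Rough numbers behind that estimate (a sketch using only the figures given above):

    # Back-of-envelope for a 4-bit 7B model streamed from the X-MP's SSD.
    params = 7e9
    bytes_per_weight = 0.5                           # 4-bit quantization
    weight_gb = params * bytes_per_weight / 1e9      # ~3.5 GB, fits in the 4 GB SSD
    ssd_bandwidth_gb_s = 2.0                         # claimed SSD transfer bandwidth

    # Single-token generation streams every weight once (matrix-vector work),
    # so the SSD bandwidth sets a floor of:
    sec_per_token = weight_gb / ssd_bandwidth_gb_s   # ~1.75 s/token
    print(f"{weight_gb:.1f} GB of weights, >= {sec_per_token:.2f} s per token from bandwidth alone")

Which lines up with "a token every couple of seconds", provided careful vector programming keeps the arithmetic itself from becoming the bottleneck.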


This sounds plausible and fascinating. Let’s see what it would have taken to train a model as well.

Given an estimate of 6 FLOPs per parameter per token and roughly 300B training tokens, training a 7B parameter model would require about 1.26×10^22 FLOPs. That translates to roughly 500,000 years on an 800 MFLOPS X-MP, far too long to be feasible. Training a 100M parameter model (on a proportionally smaller dataset) would still take nearly 70 years.

However, a 7M-parameter model would only have required about six months of training, and a 14M one about a year, so let's settle on 10 million. That's already far more capable than the 300K model I mentioned earlier.
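Spelling that estimate out (a sketch assuming compute = 6 × params × tokens, ~300B tokens for the 7B case and a dataset scaled down with model size for the smaller ones, against the ~800 MFLOPS figure above):

    # Training-time napkin math on an ~800 MFLOPS X-MP, assuming compute = 6 * params * tokens.
    XMP_FLOPS = 800e6
    SECONDS_PER_YEAR = 3.15e7

    def years_to_train(params, tokens):
        return 6 * params * tokens / XMP_FLOPS / SECONDS_PER_YEAR

    print(f"7B  params, 300B tokens: {years_to_train(7e9, 3e11):,.0f} years")   # ~500,000 years
    print(f"10M params, 200M tokens: {years_to_train(1e7, 2e8):.2f} years")     # ~6 months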

Moreover, a 10M parameter model would have been far from useless. It could have performed decent summarization, categorization, and basic code autocompletion, and even powered a simple chatbot with a short context, all of that in 1984, which would have been pure sci-fi back in those days. And pretty snappy too: maybe around 10 tokens per second, if not a little more.

Too bad we lacked the datasets and the concepts...


So if I understand correctly, the hardware paradigm is shifting to align with the now-dominant neural-based software model. This marks a major shift, from the traditional CPU + OS + UI stack to a fully neural-based architecture. Am I getting this right?

