Hacker News | minimaltom's comments

This is a great writeup! Thank you!

Reminds me of when I did Noogler training back in the day; one of the talks described a cascading failure at a datacenter, starting with a cat that was too curious near a power conditioner and briefly conducted.


This reminds me of the cat incident at a facility I worked at.

It's cold up here in the winter; sadly, the residual heat from even totally passive components like switchgear is enough to warm things up enough to attract them. 0.001% of 1 MW of power is still quite warm. (I have no idea how much switchgear actually leaks, but I know it stays warm outdoors even in winter.)
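For scale, a back-of-the-envelope using the numbers above:

```python
# Back-of-the-envelope: even a tiny leakage fraction of a 1 MW feed
# is a meaningful heat source for a cold cat.
power_watts = 1_000_000        # 1 MW
leakage_fraction = 1e-5        # 0.001%, the figure above
heat_watts = power_watts * leakage_fraction
print(f"{heat_watts:.1f} W")   # roughly a small heat mat's worth of warmth
```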

And, yeah, the rest of the writeup also reads like an amalgamation of panic-inducing experiences from my own life.



They absolutely have backups; I presume they were ineffective or also down for _reasons_.

Nah, even insane token costs don't come close to the costs of labor.

Most likely this is just 'AI-washing': dressing up a layoff made for economic reasons (such as propping up shrinking margins) as something more palatable to investors (AI).


Between this, the emotions paper, Golden Gate Claude, etc., it doesn't seem like such a stretch that Anthropic is doing some kind of activation steering as part of training (and that it's part of their lead).

It could be helpful in getting what the models learn from RL to generalize.

This attack class lets you escalate from any user to UID 0. Not running as root won't save you; in fact, this attack targets processes that are not running as root.

However, if you are in a user namespace where UID 0 doesn't map to system-wide capabilities, and you don't share the page cache for the system's setuid binaries, this attack doesn't lead to LPE.
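As a concrete check of the first condition (a minimal sketch assuming Linux's /proc uid_map format; the helper name is mine):

```python
# Sketch: decide whether UID 0 inside the current user namespace maps to
# real host root. Each line of /proc/<pid>/uid_map on Linux reads
# "<inside-start> <outside-start> <length>".
def uid0_maps_to_host_root(uid_map_text: str) -> bool:
    for line in uid_map_text.splitlines():
        inside, outside, length = (int(x) for x in line.split())
        if inside <= 0 < inside + length:
            # UID 0 falls in this mapping; translate it to the outer UID.
            return outside + (0 - inside) == 0
    return False  # UID 0 is unmapped in this namespace

# Host namespace: identity map, so UID 0 really is root.
print(uid0_maps_to_host_root("0 0 4294967295"))  # True
# Typical rootless container: UID 0 maps to e.g. 100000 on the host.
print(uid0_maps_to_host_root("0 100000 65536"))  # False
```

In the second case, "root" inside the container has no system-wide capabilities, which is exactly the situation where the attack stops short of LPE.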


setuid binaries are not the only way to get root. E.g. one can change /etc/crontab or /etc/passwd, or add a trojan to /bin/ls and wait until an admin types 'ls'.

It's not always as easy as you imply: all the attack vectors you mentioned require root on the host before you can make the change or install the trojan.

The attack gives you the ability to overwrite any cached page, so you don't need to be root to "edit" /etc/passwd.

Not the host system's, assuming we're talking about a compromised VM running as a non-root user.

I assume you mean container, not VM. But yes, container makes it harder.

Worth adding that you can only use these vectors to corrupt the page cache for files reachable in your mount namespace.

Usually with containers, almost nothing is shared with the host's namespaces (though it's likely shared with other containers' namespaces; hopefully none of those are --privileged).
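A minimal sketch of auditing that surface: enumerating setuid binaries reachable from the current mount namespace. The `owner` parameter is my addition so the idea can be tried without root:

```python
import os
import stat

# Sketch: walk a filesystem root and collect regular files with the setuid
# bit whose owner matches (root by default) -- the files whose shared
# page-cache pages would matter for this attack class.
def setuid_binaries(root="/", owner=0):
    found = []
    for dirpath, _dirs, filenames in os.walk(root, onerror=lambda e: None):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: don't follow symlinks
            except OSError:
                continue
            if (stat.S_ISREG(st.st_mode)
                    and st.st_mode & stat.S_ISUID
                    and st.st_uid == owner):
                found.append(path)
    return found
```

Running `setuid_binaries("/")` inside a minimal container image typically returns far fewer hits than on a full host, which is the point being made above.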


> That is, if the batch signal on a parameter exceeds its leave-one-out noise, update it; if not, skip it. This is a one-line change to Adam that accelerates grokking by 5x, suppresses memorization in PINNs, and improves DPO fine-tuning, eliminating the need for validation sets entirely.

Does anyone understand the formula they give above this sentence? Is this just the classic "skip updating parameters with high gradient/loss variance across batches/samples"?
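If I had to guess at the mechanics, it'd be something like this NumPy sketch (per-sample gradients assumed available; the leave-one-out noise estimator here is my reading of the quoted sentence, not necessarily the paper's exact formula):

```python
import numpy as np

def masked_adam_step(params, per_sample_grads, m, v,
                     lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam step (bias correction omitted for brevity) that skips
    parameters whose batch 'signal' doesn't beat a leave-one-out noise
    estimate. Assumes 1-D params and grads of shape (n_samples, n_params)."""
    n = per_sample_grads.shape[0]
    g = per_sample_grads.mean(axis=0)                 # batch signal
    # Leave-one-out means: drop sample i, average the rest.
    loo = (g[None, :] * n - per_sample_grads) / (n - 1)
    noise = loo.std(axis=0)                           # leave-one-out noise
    mask = np.abs(g) > noise                          # update only where signal wins
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g ** 2
    params = params - lr * mask * m / (np.sqrt(v) + eps)
    return params, m, v
```

Parameters whose per-sample gradients keep flipping sign get a large `noise` relative to `|g|` and simply get skipped that step, which would indeed make it a variance-based skip rule.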


What is classic about "skip updating parameters with high gradient/loss variance in multiple batches/samples"? Do you have a particular algorithm in mind that uses this heuristic?

There have been multiple papers discussing how updating only the parameters whose update directions agree leads to less overfitting and better generalization. Lemme see if I can find 'em.

https://arxiv.org/abs/2411.16085 - sets updates to 0 where there's disagreement in the sign of the parameter update - got accepted!

https://arxiv.org/pdf/2412.18052 - discards gradient updates from batches/minibatches that disagree, where "disagree" means exceeding a cosine-distance threshold (they found something like 0.97 to be optimal)
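A toy version of the sign-agreement idea from the first paper (my paraphrase with two batches, not the paper's exact algorithm):

```python
import numpy as np

# Sketch: zero the update wherever two batches disagree on the sign of a
# parameter's gradient; average the gradients where they agree.
def sign_agreement_update(grad_a, grad_b, lr=1e-2):
    agree = np.sign(grad_a) == np.sign(grad_b)
    return -lr * agree * (grad_a + grad_b) / 2.0
```

E.g. `sign_agreement_update(np.array([1.0, -1.0]), np.array([2.0, 1.0]))` updates the first parameter but leaves the second untouched, since the batches point in opposite directions there.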


My guess is they want to do async I/O explicitly as part of their event loop, and blocking a thread in a syscall waiting for an IOP (a la std::fs) isn't the vibe.

Ah good point, complete brain fart on my part.

high surface energy?

mahm/father I yearn for the spicy surface


The archive-handling code was in lean-zip; it just seems the verifiers forgot to write proofs for it (still a bug).

That's not the main finding of the article, however. The main bug found was actually in the Lean runtime, affecting all proofs using scalar arrays where the size of the array is not bounded.


Is it fair when it comes to formally-verified software to only consider bugs that violate a proof, and ignore everything else?

Formally-verified software is usually advertised as "look ma, no bugs!", not "look ma, no bugs*" (*as long as this complicated chain of lemmas appropriately represents the correctness of everything we care about).

In boating there's often debate about right-of-way rules in certain situations, and some people are quick to point out that giant tanker ships should give way to tiny sailboats, and get all worked up about it.* The best answer I've heard: they're dead right! That is to say, as right as they are dead (if they didn't yield), lol. In the same vein, I think someone who assumed formally-verified software was perfect and got hacked or whatever is going to be a bit wiggly about the whole thing.

* Technically the rules prioritize the tankers if they are "restricted in ability to maneuver", but everyone loves to argue about that.


Nobody with experience in the field advertises formally-verified software like that, and it is understood that the spec itself may well be wrong. It is also understood that the non-verified parts may have bugs (surprise). There is no news here.


Unless "with experience in the field" == academia, disagree. In particular I remember the early discourse & hype around Wireguard, it was discussed as if perfection was an achieved outcome.

