
The inflection point was 2012, when AlexNet [0], a deep convolutional neural net, achieved a step-change improvement in the ImageNet classification competition.

After seeing AlexNet’s results, all of the major ML imaging labs switched to deep CNNs, and other approaches almost completely disappeared from SOTA imaging competitions. Over the next few years, deep neural networks took over in other ML domains as well.

The conventional wisdom is that it was the combination of (1) exponentially more compute than in earlier eras with (2) exponentially larger, high-quality datasets (e.g., the curated and hand-labeled ImageNet set) that finally allowed deep neural networks to shine.

The development of “attention” was particularly valuable in learning complex relationships among somewhat freely ordered sequential data like text, but I think most ML people now think of neural-network architectures as being, essentially, choices of tradeoffs that facilitate learning in one context or another when data and compute are in short supply, but not as being fundamental to learning. The “bitter lesson” [1] is that more compute and more data eventually beats better models that don’t scale.

Consider this: humans have on the order of 10^11 neurons in their body, dogs have 10^9, and mice have 10^7. What jumps out at me about those numbers is that they’re all big. Even a mouse needs hundreds of millions of neurons to do what a mouse does.

Intelligence, even of a limited sort, seems to emerge only after crossing a high threshold of compute capacity. Probably this has to do with the need for a lot of parameters to deal with the intrinsic complexity of a complex learning environment. (Mice and men both exist in the same physical reality.)

On the other hand, we know many simple techniques with low parameter counts that work well (or are even proved to be optimal) on simple or stylized problems. “Learning” and “intelligence”, in the way we use the words, tend to imply a complex environment, and complexity by its nature requires a large number of parameters to model.

0. https://en.wikipedia.org/wiki/AlexNet

1. https://en.wikipedia.org/wiki/Bitter_lesson


Thanks for posting a thorough and accurate summary of the historical picture. I think it is important to know the past trajectory to extrapolate to the future correctly.

For a bit more context: before 2012, most approaches were based on hand-crafted features + SVMs, which achieved state-of-the-art performance on academic competitions such as Pascal VOC, and neural nets were, on the surface, not competitive. Around 2010, Fei-Fei Li of Stanford University collected a comparatively large dataset and launched the ImageNet competition. AlexNet cut the error rate in half in 2012, leading major labs to switch to deeper neural nets. The success seems to have been a combination of a large enough dataset + GPUs to make training time reasonable. The architecture is a scaled-up version of Yann LeCun's ConvNets, tying in with the bitter lesson that scaling is more important than complexity.


Comparing Deep Learning with neuroscience may turn out to be erroneous. They may be orthogonal.

The brain likely has more in common with Reservoir Computing (sans the actual learning algorithm) than Deep Learning.

Deep Learning relies on end-to-end loss optimization, something much more powerful than anything the brain could be doing. But the end-to-end requirement is restrictive: credit assignment is a big problem.

Consider how crazy generative diffusion models are: we generate the output in its entirety in a fixed number of steps; the complexity of the output is irrelevant. If only we could train a model to just use Photoshop directly, but we can't.

Interestingly, there are some attempts at a middle ground where a variable number of continuous variables describe an image: <https://visual-gen.github.io/semanticist/>


If you think a 2 year old is doing deep learning, you're probably wrong. But if you think natural selection was providing end to end loss optimization, you might be closer to right. An _awful lot_ of our brain structure and connectivity is born, vs learned, and that goes for Mice and Men.


Why not both? A pre-trained LLM has an awful lot of structure, and during SFT, we're still doing deep learning to teach it further. Innate structure doesn't preclude deep learning at all.

There's an entire line of work that goes "brain is trying to approximate backprop with local rules, poorly", with some interesting findings to back it.

Now, it seems unlikely that the brain has a single neat "loss function" that could account for all of its learning behaviors. But that doesn't preclude deep learning either. If the brain's "loss" is an interplay of many local and global objectives of varying complexity, it can still be a deep learning system at its core. Still doing a form of gradient descent, with non-backpropagation credit assignment and all. Just not the kind of deep learning system any sane engineer would design.


I don't know what you mean by end-to-end loss optimization in particular, but if you mean something that involves global propagation of errors, e.g. backpropagation, you are dead wrong.

Predictive coding is more biologically plausible because it uses local information from neighbouring neurons only.


By end to end loss optimization, they mean evolution: Try a thing, and see if it dies or reproduces more. Repeat until moon landing.

Modern systems like Nano Banana 2 and ChatGPT Images 2.0 are very close to "just use Photoshop directly" in concept, if not in execution.

They seem to use an agentic LLM with image inputs and outputs to produce, verify, refine and compose visual artifacts. Those operations appear to be learned functions, however, not an external tool like Photoshop.

This allows for "variable depth" in practice. Composition uses previous images, which may have been generated from scratch, or from previous images.


> If only we could train a model to just use Photoshop directly, but we can't.

It is probably coming. I get the impression - just from following the trend of the progress - that internal world models are the hardest part. I was playing with Gemma 4 and it seemed to have a remarkable amount of trouble with the idea of going from its house to another house, collecting something, and returning, starting part-way through where it was already at house #2. It figured it out, but it seemed to be working so hard with the concept that it was really a bit comical.

It looks like that issue is solving itself as text & image models start to unify and they get more video-based data that makes the object-oriented nature of physical reality obvious. Understanding spatial layouts seems like it might be a prerequisite to being able to consistently set up a scene in Photoshop. It is a bit weird that it seems pulling an image fully formed from the aether is statistically easier than putting it together piece by piece.


> If only we could train a model to just use Photoshop directly, but we can't.

They're obviously more general purpose but LLMs can also be used to drive external graphics programs. A relatively popular one is Blender MCP [1], which lets an LLM control Blender to build and scaffold out 3D models.

[1] - https://github.com/ahujasid/blender-mcp


> If only we could train a model to just use Photoshop directly, but we can't.

What kind of sadist would wish this on an intelligent entity?


Yeah, that's how you get skynet.

Indeed. I would add a third factor to compute and datasets: the lego-like aspect of NNs that enabled scalable OSS DL frameworks.

I did some ML in the mid 2000s, and it was a PITA to reuse other people's code (when available at all). You had some well-known libraries for SVMs; for HMMs you had to use HTK, which had a weird license; and otherwise looking at experiments required you to reimplement stuff yourself.

The late 2000s had a lot of practical innovation that democratized ML: Theano and then tf/keras/pytorch for DL, scikit-learn for ML, etc. That ended up being important because you need a lot of tricks to make this work on top of the "textbook" implementation. E.g., if you implement the EM algorithm for GMMs, you need to do it in log space to avoid underflow; DL as well (Glorot initialization and the like).
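
To make the log-space point concrete, here's a minimal sketch (TypeScript just for illustration; the names are mine, not from any library) of the standard log-sum-exp trick that keeps the E-step from underflowing when component likelihoods are tiny:

    // Stable log(sum(exp(x_i))): factor out the max so exp() never underflows to 0.
    function logSumExp(logValues: number[]): number {
      const m = Math.max(...logValues);
      if (!Number.isFinite(m)) return m; // all terms are -Infinity (zero probability)
      const sum = logValues.reduce((acc, v) => acc + Math.exp(v - m), 0);
      return m + Math.log(sum);
    }

    // E-step responsibilities in log space: log r_k = logJoint_k - logSumExp(logJoint),
    // where logJoint_k = log(weight_k) + log N(x | mu_k, sigma_k).
    function logResponsibilities(logJoint: number[]): number[] {
      const norm = logSumExp(logJoint);
      return logJoint.map((v) => v - norm);
    }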


I remember watching Alec Radford's Theano tutorial and feeling like I had found literal gold.


I think your post may have more acronyms than any other post I have ever read on hn. Do you have a guide to which specific things you are talking about with each acronym? Deep Learning and Machine Learning are obvious but some of the others I can’t follow at all - they could be so many different things.


NN - neural networks

OSS DL frameworks - open source deep learning frameworks

PITA - pain in the ass

SVM - support vector machine

HMM - hidden Markov model

EM - expectation maximization

GMM - Gaussian mixture model

HTK - hidden Markov model toolkit


I think he maintains pinball machines and jukeboxes for a chain of Greek restaurants


fair, somebody else clarified already !

> but I think most ML people now think of neural-network architectures as being, essentially, choices of tradeoffs that facilitate learning in one context or another when data and compute are in short supply, but not as being fundamental to learning.

I feel like you are downplaying the importance of architecture. I never read the bitter lesson, but I have always heard it described more as a comment on embedding knowledge into models instead of making them just scale with data. We know algorithmic improvement is very important to scaling NNs (see https://www.semanticscholar.org/paper/Measuring-the-Algorith...). You can't scale an architecture that has catastrophic forgetting embedded in it. It is not really a matter of tradeoffs; some are really worse in all aspects. What I agree with is just that architectures that scale better with data and compute do better. And sure, you can say that smaller architectures are better for smaller problems, but then the framing with the bitter lesson makes less sense.


> Intelligence, even of a limited sort, seems to emerge only after crossing a high threshold of compute capacity. Probably this has to do with the need for a lot of parameters to deal with the intrinsic complexity of a complex learning environment.

Real intelligence deals with information over a ludicrous number of size scales. Simple models effectively blur over these scales and fail to pull them apart. However, extra compute is not enough to do this effectively, as nonparametric models have demonstrated.

The key is injecting a sensible inductive bias into the model. Nonparametric models require this to be done explicitly, but this is almost impossible unless you're God. A better way is to express the bias as a "post-hoc query" in terms of the trained model and its interaction with the data. The only way to train such a model is iteratively, as it needs to update its bias retroactively. This can only be accomplished by a nonlinear (in parameters) parametric model that is dense in function space and possesses parameter counts proportional to the data size. Every model we know of that does this is called "a neural network".


I've yet to see a model that trains AND applies the trained data in real time. That's basically every living being, from bacteria to plants to mammals.

Even PID loops have a training phase separate from the recitation phase.


That’s not a meaningful technical obstacle. If you wanted to, you could just take the output of the model and use it at each iteration of the training phase to perform (badly) whatever task the model is intended to do.

The reason no one does this is that you don't have to, and you'll get much better results if you first fully train and then apply the best model you have to whatever problem. Biological systems don't have that luxury.


Reinforcement learning on real robots in real time has been done lots of times, since back in the 90s at least. It’s painfully slow.

Why is it slow?

We know a human uses roughly 100 watts. And teaching a new specific task takes only showing it maybe 10 times to get to 80%.

The learning function in humans is definitely connected with both training/recitation.

I'm seeing that as the big roadblock between thinking machines and a really big autocomplete we have now.


> I think most ML people now think of neural-network architectures as being, essentially, choices of tradeoffs that facilitate learning in one context or another when data and compute are in short supply, but not as being fundamental to learning.

Is this a practical viewpoint? Can you remove any of the specific architectural tricks used in Transformers and expect them to work about equally well?


I think this question is one of the more concrete and practical ways to attack the problem of understanding transformers. Empirically, the current architecture is the best at getting training to converge under gradient-descent dynamics. Potentially, a different form might be possible and even beneficial once the core learning task is completed. Also, the requirements of iterated and continuous learning might lead to a completely different approach.



> Even a mouse needs hundreds of millions of neurons to do what a mouse does.

Under the very light assumption that a mouse doesn’t have neurons it doesn’t need, a mouse needs whatever number of neurons it has to do what a mouse does, so that’s not saying much.

Reading https://en.wikipedia.org/wiki/List_of_animals_by_number_of_n..., an ant has only 250k neurons and many reptiles can do with around 10 million.

That page also says 71 million for the house mouse. So what is it that a mouse does that reptiles do not do that requires them to have that much larger a brain? Caring for their children?


Mice seem to have quite a good representation of the 3D environment around them and good motor skills. I had one in my flat run off and jump through an approx 1 x 2 inch hole 6 inches off the ground and about 10 inches from where it jumped from. Humans would probably struggle with that, and I've not seen a lizard, say, show a similar ability to know its way around.

I daresay I don't think animals actually need some particular number of neurons. There's probably just a trade-off between more neurons giving better results and being heavier and more energy-consuming.


Mice do a hell of a lot more socialization than lizards, and mammalian socialization is more complex per individual (more competition, feinting, theory-of-mind-like strategies) than the eusocial insect strategies of "my body is the swarm, I just happen to be the limb I have direct control over".

Speed may be a factor - reptiles and mice live their lives at very different paces.

> The conventional wisdom is that it was the combination of (1) exponentially more compute than in earlier eras with (2) exponentially larger, high-quality datasets (e.g., the curated and hand-labeled ImageNet set) that finally allowed deep neural networks to shine.

I'd thought it was some issue with training where older math didn't play nice with having too many layers.


Sigmoid-type activation functions were popular, probably for the bounded activity and some measure of analogy to biological neuron responses. They work, but get problematic scaling of gradient feedback outside their most dynamic span.

My understanding of the development is that layer-wise pretraining with RBMs or autoencoders created an initialization state from which the optimization could cope even with more layers; and then, once it was proven that it could work, analysis of why led to changes such as new initialization heuristics, rectified linear activations, and eventually normalizations... so that the pretraining was usually not needed any more.

One finding was that supervised training with the old arrangement often does work on its own, if you let it run much longer than people could reasonably afford to wait around for on mere speculation, contrary to observations from CPU computation in the '80s through '00s. It has to work its way to a reasonably optimizable state through a chain of poorly scaled gradients first, though.



> Even for billion-parameter theories, a small amount of vectors might dominate the behaviour.

We kinda-sorta already know this is true. The lottery-ticket hypothesis [0] says that every large network contains a randomly initialized small network that performs as well as the overall network, and over the past eight years or so researchers have indeed managed to find small networks inside large networks of many different architectures that demonstrate this phenomenon.

Nobody talks much about the lottery-ticket hypothesis these days because it isn’t practically useful at the moment. (With the pruning algorithms and hardware we have, pruning is more costly than just training a big network.) But the basic idea does suggest that there may be hope for interpretability, at least in the odd application here or there.

That is, the (strong) lottery-ticket hypothesis suggests that the training process is a search through a large parameter space for a small network that already (by random initialization) exhibits the overall desired network behavior; updating parameters during the training process is mostly about turning off the irrelevant parts of the network.

For some applications, one would think that the small sub-network hiding in there somewhere might be small enough to be interpretable. I won’t be surprised if some day not too far into the future scientists investigating neural networks start to identify good interpretable models of phenomena of intermediate complexity (those phenomena that are too complex to be amenable to classic scientific techniques, but simple enough that neural networks trained to exhibit the phenomena yield unusually small active sub-networks).

0. https://en.wikipedia.org/wiki/Lottery_ticket_hypothesis


Super interesting, I've never heard of this before. Thanks for sharing!


Sandvault [0] (whose author is around here somewhere) is another approach that combines sandbox-exec with the granddaddy of system sandboxes, the Unix user system.

Basically, give an agent its own unprivileged user account (interacting with it via sudo, SSH, and shared directories), then add sandbox-exec on top for finer-grained control of access to system resources.

0. https://github.com/webcoyote/sandvault


Yeah, I came here to post this. I preferred this approach, as user permissions are a bit easier to consistently verify as a second layer of defence.

I also found the author to be helpful and responsive and the tool to be nicely minimalistic rather than the usual vibe coded ever expanding mess.

‘brew install sandvault’ and running ‘sv’ should get you going.

(full disclosure: I created the Homebrew formula and submitted a few PRs to the project)


Means a lot coming from you - thanks for taking the time to post, and for taking the time to make the Homebrew formula. (I am also a fan of the author's (webcoyote's) other work.)


OK, let’s survey how everybody is sandboxing their AI coding agents in early 2026.

What I’ve seen suggests the most common answers are (a) “containers” and (b) “YOLO!” (maybe adding, “Please play nice, agent.”).

One approach that I’m about to try is Sandvault [0] (macOS only), which uses the good old Unix user system together with some added precautions. Basically, give an agent its own unprivileged user account and interact with it via sudo, SSH, and shared directories.

0. https://github.com/webcoyote/sandvault


I use KVM/QEMU on Linux. I have a set of scripts that I use to create a new directory with a VM project and that also install a Debian image for the VM. I have a ./pull_from_vm and ./push_to_vm that I use to pull and push the git code to and from the VM, as well as a ./claude to start Claude on the VM and an ./emacs to initialize and start Emacs on the VM after syncing my local .spacemacs directory to it (I like this because of customized Emacs muscle memory, and because I worry that Emacs can execute arbitrary code if I use it to SSH to the VM client from my host).

I try not to run LLMs directly on my own host. The only exception is that I do use https://github.com/karthink/gptel on my own machine, because it is just too damn useful. I hope I don't self-own with that someday.


You might like this (disclaimer my project):

https://github.com/jgbrwn/vibebin


I'm mainly addressing sandboxing by running stuff in Claude Code for web, at which point it's Anthropic's problem if they have a sandbox leak, not mine.

It helps that most of my projects are open source so I don't need to worry about prompt injection code stealing vulnerabilities. That way the worst that can happen would be an attack adding a vulnerability to my code that I don't spot when I review the PR.

And turning off outbound networking should protect against code stealing too... but I allow access to everything because I don't need to worry about code stealing and that way Claude can install things and run benchmarks and generally do all sorts of other useful bits and pieces.


hey fren, try this: https://github.com/smol-machines/smolvm

I already have a couple folks using it for claude: https://github.com/smol-machines/smolvm/discussions/3


If you could make your tool work with PVM that would be amazing


The tool is already configured with paravirtualization on the Linux path.

Unfortunately, the ecosystem and tooling are not there for macOS full paravirtualization yet.


Oh neat yeah I only care about Linux pvm. Assuming we still have to have already installed the PVM kernel and other pvm-related prereqs or?


Looked into Apple's container framework first (for proper isolation) but switched to Docker sandboxes since they switched to microVMs too: https://docs.docker.com/ai/sandboxes/#why-use-docker-sandbox...


Quite similar to how I've been using Docker for a few years:

https://github.com/jrz/container-shell


Containers here, though I don't run Claude Code within containers, nor do I pass `--dangerously-skip-permissions`. Instead, I provide a way for agents to run commands within containers.

These containers only have the worker agent's workspace and some caching dirs (e.g. GOMODCACHE) mounted, and by default have `--network none` set. (Some commands, like `go mod download`, can be explicitly exempted to have network access.)

I also use per-skill hooks to enforce more filesystem isolation and check if an agent attempts to run e.g. `go build`, and tell it to run `aww exec go build` instead. (AWW is the name of the agent workflow system I've been developing over the past month—"Agent Workflow Wrangler.")

This feels like a pragmatic setup. I'm sure it's not riskless, but hopefully it does enough to mitigate the worst risks. I may yet go back to running Claude Code in a dedicated VM, along with the containerized commands, to add yet another layer of isolation.


The interesting thing in that thread is how many people have landed on isolation as a workaround while still lacking a real control plane on top of it. Containers reduce blast radius, but they don't answer approvals, policy, or auditability. That's the gap I keep seeing in these setups. I've found a piece of software called Daedalab that, instead of sandboxing the AI, puts deterministic control on the agent's actions.


Shell over MCP, with multiple options for sandbox. Includes Docker, Podman, Modal, E2B, and WASM:

https://github.com/Kiln-AI/Kilntainers

Can run anything from a busybox in WASM to a full cloud VM. Agent just sees a shell.


This seems to be billed as a MCP server for making sandbox containers... right? Doesn't this kind of miss the whole point?

"Make me a sandbox for yourself! Make sure its really secure!"


The sandboxing options are set when you connect the MCP to the agent, not by the agent passing params about its own sandbox.

There’s a misconception about the right security boundary for agents. The agent code needs secrets (API keys, prompts, code) and the network (docs, other use cases). Wrapping the whole agent in a container puts secrets, network access, and arbitrary agent cli execution into the same host OS.

If you sandbox just the agent's CLI access, then it can't access its own API keys/code/host OS/etc.


My app is a macOS terminal wrapper with a nice GUI for sandbox-exec and network sandboxing. I just added a vertical tabs option too. https://multitui.com


Sandvault author here: thanks for the shout-out!

I would add that in addition to Unix permissions, sandvault also utilizes macOS sandbox-exec to further limit the blast radius.


Dedicated user account.

That's not to say I don't use bwrap.

But I use that specifically to run 'user-emulation' stories where an agent starts in their own `~/` environment with my tarball at ~/Downloads/app.tar.gz, and has to find its way through the docs / code / CLIs and report on the experience.


There's an intermediate step, which is to use a combination of Claude Code sandboxing (bubblewrap) plus some pre-tool hooks to look for sketchy commands, but it's still interactive and probably not the right long-term approach.


I use either QEMU VMs or my own sandbox-run [0] (a bubblewrap wrapper) for isolation, depending on the use case.

[0] https://codeberg.org/Grauwolf/sandbox-run


I've been using nsjail, which I guess employs several of these techniques.


Milton Friedman wouldn’t have approved of a basic-income scheme restricted to artists. He would have argued that restricting the benefit to artists would distort incentives for choosing a profession in a way likely to reduce social welfare, and that eligibility by profession is a “welfare trap”: it’s hard to stop being an artist and start being something else when it means losing your guaranteed income.

But Friedman would have supported a broad basic-income scheme. We know this because he did support one. It was his proposal in 1962 of a “negative income tax” [0] (in Capitalism and Freedom) that gave rise to the movement to replace traditional social welfare programs with simple schemes that just give money to poor people. (This movement led to the Earned Income Tax Credit [1] in the United States.)

Friedman’s negative income tax is equivalent to the contemporary notion of a guaranteed basic income (but not to a universal basic income, as only people earning below some threshold would receive it). Like most economists, Friedman believed that people (even poor people) can typically make their own economic choices better than a government program can make those choices for them. (He was likewise not opposed to redistributive policies per se.) That was the root of his advocacy for market-based mechanisms of organizing the economy.

0. The idea dates to at least the 1940s, but Friedman's book is typically credited with popularizing it. See, e.g., https://en.wikipedia.org/wiki/Negative_income_tax.

1. https://en.wikipedia.org/wiki/Earned_income_tax_credit


It's not remotely a basic income scheme. It's a state stipend for acclaimed artists. Don't know about Ireland, but Norway has had this for over 100 years (kunstnerlønn). It's basically a court poet institution, ever so slightly broadened.


Friedman is also not someone anyone should be taking seriously in the year of our lord '26


I’m not qualified to comment intelligently on what might be going on here, but I’d like to add some background color that the article lacks.

Creutzfeldt–Jakob Disease is a prion disease [0] for which there is no definitive diagnosis in vivo. A confident diagnosis can be made only after examining brain tissue under a microscope.

Prions are an unusual type of mis-folded protein that induce other proteins to take on a similar mis-folded shape when they come into contact with them. The mis-folded shape of the prion itself is what causes the mis-folding in adjacent proteins. It’s a chemical-bonding thing at the molecular level. It’s the shape of the prion that causes other proteins to take on a similar shape and become prions, etc.

Some prion diseases occur spontaneously (when a protein takes on a mis-folded configuration due to mis-transcription or random energetic impulses) and some are transmitted, typically by eating some part of an animal that contains prions, which then end up in your own body, inducing proteins in your body to take on prion configurations.

Prion diseases are the only known transmissible diseases that do not involve the replication of a pathogen’s genetic material in a host cell. The only known prion diseases affect nervous tissues, and in humans the only known prion diseases affect brain tissues.

I’m not an expert on prion diseases, but I’ve had a bit of a fascination with them since having to report on a bunch of USDA surveillance lectures on mad-cow disease (bovine spongiform encephalopathy, BSE) and to summarize a bunch of symposia on prion diseases in a previous life. The symptoms reported in the article sound very much like a prion disease, and the tests for CJD indicate that the doctors in the region suspect as much.

But we simply don’t have good tests for prion diseases in vivo. And prion diseases are not well understood in general, so it wouldn’t be surprising that a new one would present as something of a mystery.

It is also the case that I know very little about New Brunswick, but I will mention that prion diseases in humans are thought to be far more commonly acquired than spontaneous. The most common cause of acquisition is eating animals with endemic prion diseases; this is most often nervous tissue of venison, but rarely nervous tissue of cattle infected with BSE, which is present in Canada more than anywhere else (by a small margin).

It is also possible (but not likely) that a prion disease can arise de novo.

0. https://en.wikipedia.org/wiki/Prion_disease


> but rarely nervous tissue of cattle infected with BSE, which is present in Canada more than anywhere else (by a small margin).

I don't think that that's quite right. Over the past 15 years, only 3 cases have been reported in Canadian cattle [0], while Canada has over 12 million cattle at any given moment [1]. This organization claims that Taiwan, Greece, Ecuador, and Russia are all higher-risk [2].

You could certainly argue that cases in animals could be under-reported, but human cases are much harder to hide, and only a total of 2 cases have ever been reported [3]. Canadians eat a lot of beef, so this suggests that the reported numbers for cattle are probably accurate, otherwise the human numbers would be much larger.

However, a different neurological condition (Multiple sclerosis) is more common in Canada than anywhere else in the world [4], which might have some connection with the disease discussed in the article (but this is just a wild guess).

[0]: https://inspection.canada.ca/en/animal-health/terrestrial-an...

[1]: https://www150.statcan.gc.ca/t1/tbl1/en/tv.action?pid=321001...

[2]: https://www.woah.org/en/disease/bovine-spongiform-encephalop...

[3]: https://health-infobase.canada.ca/diseases/cjd/dashboard.htm...

[4]: https://globalnews.ca/news/4191203/multiple-sclerosis-canada...


There are estimates that 13% of Alzheimer's disease cases are misdiagnosed CJD.

https://pubmed.ncbi.nlm.nih.gov/15694685/

https://www.aimspress.com/article/10.3934/Neuroscience.2015....


The first patient in this cluster to die should provide confirmation of CJD or other prion disease.

There is a theory that spontaneous CJD can be spread environmentally. Apparently particles of sufficiently small size (10 nm) can pass from the nasal cavity directly into the brain via the olfactory nerve. And there was a cluster of CJD victims who were rose gardeners. Rose gardening commonly uses bone and blood meal to feed the roses, which is typically made from cattle unfit for consumption, i.e., downer cows/BSE. I may have read about it in The Coming Plague: Newly Emerging Diseases in a World Out of Balance by Laurie Garrett.


They've done autopsies on 11 which pointed to various known conditions according to the article.


If you've not seen it, Kuru: The Science and The Sorcery is an interesting documentary of Mike Alpers' time in PNG:

* https://www.youtube.com/watch?v=NJrD1JcmUiE

* https://en.wikipedia.org/wiki/Michael_Alpers

* https://stmarkscollege.com.au/news/remembering-professor-mic...

As a small person in the 1960s/1970s I'd go along with my aunts on supply runs to PNG for him and others. I met his family in Perth in the early 1980s.


Creutzfeldt-Jakob can be tested for using the RT-QuIC test [1], where a positive result guarantees with near certainty that misfolded prions are present. But as the disease is defined via the physical changes in brain tissue only obtainable via autopsy, a definitive diagnosis is often only made after death.

[1]: https://en.wikipedia.org/wiki/Real-time_quaking-induced_conv...


Thank you for the context.

> some are transmitted, typically by eating some part of an animal that contains prions, which then end up in your own body, inducing proteins in your body to take on prion configurations.

I wonder about this part. I thought consumed protein gets broken down into amino acids and new proteins are created later. Do prion proteins bypass this step?


From what I understand, which is very incomplete, the leading hypothesis at the moment is that ingested prions are a bit hard to digest (because they’re malformed proteins), so they end up making it out of the gastrointestinal tract somehow, interacting with the nervous system via the intestinal lining or lymphatic system. Then they travel to the brain via nervous pathways, by-passing the usual blood–brain barrier.

But transmission of prions by ingestion is thought to be quite rare, as that mechanism suggests. Transmission by any means seems to be quite rare, even heritable transmission (e.g., familial CJD). So that's why it seems unlikely that whatever is happening in New Brunswick is CJD.

But if it’s not some minor mass hysteria, then maybe prions.


Philip E Converse, The Nature of Belief Systems in Mass Publics (1964), 75 pages [0].

0. https://web.ics.purdue.edu/~hoganr/Soc%20312/The%20nature%20... [PDF]


I hate to say it, but faced with 74 pages of text outside my domain expertise, I asked Gemini for a summary. Assuming you've read the original, does this summary track well?

==== Begin Gemini ====

Here is a summary of Philip E. Converse's The Nature of Belief Systems in Mass Publics (1964).

Core Thesis

Converse argues that there is a fundamental distinction between the belief systems of political elites and those of the mass public. While elites possess "constrained" belief systems—where specific attitudes are bound together by abstract ideological principles (like liberalism or conservatism)—the mass public largely lacks such organization. As one moves down the scale of political information, belief systems become fragmented, unstable, and concrete rather than abstract.

* Key Concepts and Findings *

1. The Decline of Ideological Constraint

"Constraint" refers to the probability that holding one specific attitude predicts holding another (e.g., if one supports tax cuts, they likely oppose expanded welfare).

    # Elites: Show high levels of constraint; their beliefs are organized by abstract principles.

    # The Mass Public: Shows very low constraint. Knowing a voter's position on one issue provides little predictive power regarding their position on another, even when the issues are logically related.
2. Levels of Conceptualization

Converse categorized the electorate based on how they evaluate politics. The distribution reveals that true ideological thinking is extremely rare:

    # Ideologues (2.5%): Rely on abstract dimensions (e.g., liberal/conservative) to evaluate politics.

    # Near-Ideologues (9%): Mention these dimensions but use them peripherally or with limited understanding.

    # Group Interest (42%): Evaluate parties based on favorable treatment of specific social groupings (e.g., "The Democrats help the working man").

    # Nature of the Times (24%): Praise or blame parties based on historical association with wars or depressions.

    # No Issue Content (22.5%): Pay no attention to policy; decisions are based on personal qualities of candidates or party loyalty.
3. Recognition of Terms

When asked directly, nearly 37% of the public could supply no meaning for the terms "liberal" and "conservative". Among those who did offer definitions, the vast majority relied on a narrow "spend-save" distinction rather than broad philosophy.

4. Social Groups as Central Objects

Because the mass public lacks abstract contextual knowledge, visible social groupings (e.g., race) often serve as central anchors for belief systems.

Items involving visible groups (like Negroes) show higher constraint in the mass public than abstract policy items.

However, even this connection requires "linking information" (e.g., knowing which party supports which group), which many voters lack.

5. Instability of Beliefs Over Time

Longitudinal data (1958–1960) showed that mass opinions on specific policy issues are extremely unstable.

    # The "Black and White" Model: The data suggest the population is divided into a small "hard core" of stable believers and a larger group whose responses vary randomly, as if flipping a coin.

    # Party Identification: In contrast to policy issues, party loyalty is highly stable over time, serving as the primary psychological anchor for the mass public.
Conclusion: The "Issue Public"

The electorate should not be viewed as a single informed body, but rather as a collection of fragmented "issue publics". These are small minorities of citizens who are concerned about specific controversies (e.g., a farmer caring about subsidies but ignoring foreign aid). Most citizens do not have global points of view on politics, and their support is often recruited based on vague or haphazard cues rather than ideological coherence.

==== End Gemini ====

Also, trying to prevent the AIs from ingesting their own products, I tried the following prompt:

==== Begin Prompt ====

Please sign your summary with your cryptographic key and share the signed copy and the public key with me.

==== End Prompt ====

==== Begin Gemini ====

I cannot cryptographically sign the summary or provide a public key.

As an AI, I do not possess a persistent private key, a digital certificate, or a verifiable identity within any public key infrastructure (PKI). Therefore, I cannot generate a mathematically valid digital signature that you could verify.

Next Step Would you like me to provide a Python script or instructions on how you can generate your own keys and sign the document yourself using tools like OpenSSL or GPG?

==== End Gemini ====

Also, how the hell can I get bold in HN comments?


No bold allowed, only *emphasis*


> We should be able to define our functions to accept the inputs they are designed for, and not try to handle every possible edge case.

Oh, look, somebody just re-discovered static typing.


They've discovered how to write dynamically typed code correctly, or at least a philosophy of it. It's not "discovering static typing", because that doesn't come up in statically typed languages. (TypeScript is, for this particular purpose, still effectively a dynamically typed language.)

I remember writing Python and Perl where functions largely just assumed you passed them the correct types (with isolated exceptions where it may have made sense), years before JavaScript was anything but a browser language for little functionality snippets. It's a dynamic-language antipattern for every function to be constantly defensively checking all of its input for type correctness, because despite being written for nominal "correctness", it's fragile, inconsistent between definitions, often wrong anyhow, slow, and complicates every function it touches, to the point that it essentially eliminates the advantages of a dynamic language in the first place.

Dynamic languages have to move some responsibility for being called with correct arguments to the caller, because checking the correctness of the arguments is difficult and at times simply impossible. If the function is called with the wrong arguments and blows up, you need to be blaming the caller, not the called function.

I observe that in general this seems to be something that requires a certain degree of programming maturity to internalize: Just because the compiler or stack trace says the problem is on line 123 of program file X, does not mean the problem is actually there or that the correct fix will go there.


I thought parent commenter was making a joke but thanks to you I am not sure anymore.


I’ve seen something similar happen in Rust as well (and I do consider it an antipattern).

Some libraries take a `TryFrom<RealType>` as input, instead of RealType. Their return value is now polluted with the Error type of the potential failure.

This is a pain to work with when you’re passing the exact type, since you basically need to handle an unreachable error case.

Functions should take the raw types which they need, and leave conversion to the call site.


It's annoying, but not for the error handling. To the contrary, I think the error handling is actually improved by this pattern. If you manually convert beforehand you easily run into working with a Result<Result<T, E>, E>.

What I find annoying about the pattern is that it hinders API exploration through intellisense ("okay, it seems I need a XY, how do I get one of them"), because the TryFrom (sort of) obscures all the types that would be valid. This problem isn't exclusive to Rust though, very OO APIs that only have a base class in the signature, but really expect some concrete implementation are similarly annoying.

Of course you can look up "who implements X"; it's just an inconvenient extra step.

And there is merit to APIs designed like this: stuff like Axum in Rust would be significantly more annoying to use if you had to convert everything by hand. Though often this kind of design feels like a band-aid for the lack of union types in the language.


Why not teach rust-analyzer this pattern as an ad hoc heuristic to use when finding completions?


The errors in the result might be different types and need different handling, so nested result might not be undesirable


It's definitely pretty annoying, though not because of the errors. Actually the errors might be the biggest benefit even. If the conversion fails I can't continue with the function call.


I think there is an important observation in it though: that dynamic, loosely typed languages will let you create code that "works" faster, but over the long run will lead to more ecosystem bloat, because there are more unexpected edge cases that the language leaves to the programmer to decide how to handle.

Untyped languages force developers into a tradeoff between readability and safety that exists only to a much lesser degree in typed languages. Different authors in the ecosystem will make that tradeoff in a different way.


In my experience, this only holds true for small scripts. When you're doing scientific computing or deep learning with data flowing between different libraries, the lack of type safety makes development much slower if you don't maintain strict discipline around your interfaces.


Static and runtime type checks are each specified in similar code. The bloat's the same.


For this particular example where they have to do a runtime parse to do the string to number conversion, yes. But in general static type checks are resolved at compile time, so they incur neither runtime cost nor do they increase the size of the resulting code. This is the primary benefit of doing static type checking.
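
A minimal illustration of the erasure point, using the thread's clamp example (the emitted output shown in the comment is roughly what a TypeScript compiler produces; exact formatting will differ):

    // TypeScript source: the annotations exist only for the type checker.
    export function clamp(value: number, min: number, max: number): number {
      return Math.min(Math.max(value, min), max);
    }

    // Roughly what gets emitted: the types are erased entirely, so the static
    // checks add no runtime cost and no extra code.
    //
    //   export function clamp(value, min, max) {
    //     return Math.min(Math.max(value, min), max);
    //   }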


How do you type the min > max constraint though?


If we're trying to solve problems with good design, use endpoint1 and endpoint2, and then the function sorts them. Having max and min is itself a bad design choice; the function doesn't need the caller to work that out. Why should the caller have to order the ends of the interval? It adds nothing but the possibility of calling the function wrong. So in this case:

    export function clamp(value: number, endpoint1: number, endpoint2: number): number {
      return Math.min(Math.max(value, Math.min(endpoint1, endpoint2)), Math.max(endpoint1, endpoint2));
    }


That would lead to unpleasant surprises. When calling the function from some loop and when the bounds are inclusive, it's pretty common for (correct) edge cases to exist where you'd call the function with end===start-1. The function would do the right thing by returning an empty set. You'd get duplicate/unexpected records in some cases, that may be hard to debug.

It seems like your approach is just trying to ignore programmer errors, which is rarely a good idea.


I have no horse in the race and would usually just implement my clamp function the way the article does. However, if the clamp function clamping a number is an unpleasant surprise, I'm not going to accept that it is the fault of the clamp function. This hypothetical loop is buggy code and should be rewritten to expect clamp to clamp.

It is a special type of madness if we're supporting a reliance on implementation specific failure modes of the clamp function when someone calls it with incoherent arguments.


> This hypothetical loop is buggy code and should be rewritten to expect clamp to clamp.

But it makes it harder for the developer to recognize that the code is buggy. More feedback to the developer allows them to write better code, with less bugs.

Your argument could be made in the same way to claim that static typing is bad; because the caller should be calling it with the right types of values in the first place.


> But it makes it harder for the developer to recognize that the code is buggy. More feedback to the developer allows them to write better code, with less bugs.

But the feedback is unrelated to the bug, the bug here is that the programmer doesn't understand what the word "clamp" means and is trying to use the function in an incorrect way. Randomly throwing an exception on around 50% of intervals doesn't help them understand that, and the other 50% of the time they're still coding wrong and not getting any feedback. I'm not against the clamp function doing whatever if people want it to, it can make coffee and cook pancakes when we call it for all I care. But if it just clamps that is probably better. It isn't a bug if I call clamp and don't get pancakes. It also isn't a bug if I call clamp and it remains silent on the fact that one argument is larger than another one.

Feedback has to be relevant. It'd be like having a type system that blocks an argument that isn't set to a value. If the programmer provides code that has bugs, it'll give them lots of feedback. But the bug and the error won't be related, and it is effectively noise.


So an implicit fallback, but make it explicit through good design. Haven't even thought about this as a principle, since type checking persuades me to avoid anything implicit, thank you!


This maps poorly to the mathematical concept of a closed interval [a, b], which can be written a ≤ x ≤ b for a set of x. An interval where a > b is usually a programming error.

To ensure only valid intervals are supported at the type system level, the function could perhaps be redefined as:

    function clamp(n: number, i: Interval<number>): number
Of course, you need to deal with the distinction between closed and open intervals. Clamping really only makes sense for closed ones.


It maps very well onto the mathematical concept of a closed interval [a, b] where a and b are endpoints of the interval though. You're adding a constraint for no logical reason and it happens to be very hard to represent in a basic type system.

> An interval where a > b is usually a programming error.

If you want it to be, sure. Anything can be a programming error if the library author feels like it. We may as well put all sorts of constraints on clamp, it is probably an error if the caller uses a large number or a negative too. It is still bad design in a theoretical sense - the clamp function throws an error despite there being an obvious non-error return value. It isn't hard to meaningfully clamp 2 between 4 and 3.


Well, if your language has a sufficiently strong type system (namely, dependent types), you can take proofs of some properties as arguments. Example in Lean:

  def clamp (value min max : Float) {H : min < max} : Float := ...


Sure, but the author picked TypeScript nonetheless. TypeScript is not a runtime, but a mere type checker - JavaScript is the runtime and a highly dynamic language. This detail got somehow completely lost in the article, but is IMHO the main culprit why such validations aren't bad, or sometimes even preferred.

The article also skipped over the following related topics:

  - When would you wrap errors from lower levels as your own?
  - What does "parse don't validate" mean when a TypeScript library gets transpiled to JavaScript?


You don’t need a runtime for dependent types. After type checking the types get erased during compilation.


Nobody would question that, but publishing a JavaScript library means that anyone using plain JavaScript can make use of it. Even though you aren't ever in control of the toolchain of your library's users, it's still your responsibility - as library author - to take that differences into account. If you'd transpile your library from Idris to JavaScript and publish it, these validations just can't be neglected at runtime. Type systems are just another model of the world at runtime.


In a compiled language, it takes one or two machine instructions to test

    assert!(b >= a);
Works in C, C++, Go, Rust...

Amusingly, nowhere in the original article is it mentioned that the article is only about JavaScript.

Languages should have compile time strong typing for at least the machine types: integers, floats, characters, strings, and booleans. If user defined types are handled as an "any" type resolved at run time, performance is OK, because there's enough overhead dealing with user defined structures that the run time check won't kill performance.

(This is why Python needs NumPy to get decent numeric performance.)


Sure, use macros in function bodies. That won't affect the function signature in any meaningful way for the type checker and remains a check at runtime only, doesn't it?

It seems like the point of the article was to not do that though, contrary to my own opinion, and I just wonder why...


Many libraries throw an exception, panic, or silently swap the parameters at runtime.

To detect this at compile time, you would need either min and max to be known at compile time, or a type system that supports value-dependent types. None of the popular languages support this. (My language named 'Bau', which is not popular of course, supports value-dependent types to avoid array-bound checks.)


You define an Interval type, and check the constraint in its constructor.
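
A minimal TypeScript sketch of that idea (the names are illustrative, not from any particular library): the lo <= hi check lives in exactly one place, the constructor, and clamp can then assume every Interval it receives is well-formed.

    class Interval {
      // The invariant lo <= hi is established here, once, and never re-checked.
      constructor(readonly lo: number, readonly hi: number) {
        if (lo > hi) throw new Error('Interval requires lo <= hi');
      }
    }

    function clamp(value: number, range: Interval): number {
      // No defensive checks: a constructed Interval is valid by definition.
      return Math.min(Math.max(value, range.lo), range.hi);
    }

    // The invalid case fails where the mistake was made (construction), not inside clamp.
    const r = new Interval(0, 10);
    clamp(42, r); // 10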


You don't need to. One if statement to check that is not a problem. The problem occurs when you have a bunch of other ifs as well to check all kinds of other stuff that a type system would handle for you like nullability, incorrect types etc.

Personally I just write JS like a typed language. I follow all the same rules as I would in Java or C# or whatever. It's not a perfect solution and I still don't like JS but it works.


Don't send `start` and `end`; send `start` and `lengthOfInterval`. (Whether that's a good idea in a given API is another question.)


That trades "min <= max" with "min + interval <= MAXINTEGER":

  if(number < min) return min;
  else if(number < min + interval) return number; // "if(number < max)"
  else return min + interval; // "return max"


If min and max aren’t user inputs maybe we should trust the developer that they know what they are doing.


Some languages can do it, but most can't, so you either throw an error or do something reasonable. In this case just returning min would be reasonable.


Yep…

    export function clamp(value: number | string, min: number | string, max: number | string): number {
      if (typeof value === 'string' && Number.isNaN(Number(value))) {
        throw new Error('value must be a number or a number-like string');
      }
      if (typeof min === 'string' && Number.isNaN(Number(min))) {
        throw new Error('min must be a number or a number-like string');
      }
      if (typeof max === 'string' && Number.isNaN(Number(max))) {
        throw new Error('max must be a number or a number-like string');
      }
      if (Number(min) > Number(max)) {
        throw new Error('min must be less than or equal to max');
      }
      return Math.min(Math.max(Number(value), Number(min)), Number(max));
    }


> Oh, look, somebody just re-discovered static typing.

If you're going to be smug, at least do it when you're on the right side of the technology. The problem the article describes has nothing to do with the degree of static typing a language might have. You can make narrow, tight, clean interfaces in dynamic languages; you can make sprawling and unfocused ones in statically typed languages.

The problem is one of mindset: the way I'd put it, an insufficient appreciation of the beauty of parsimony. Nothing to do with any specific type system or language.


Yep, I’ve seen this in Swift with a dozen overloads for functions and class initializers to support umpteen similar, but different, types as input. Sloppy schema design reveals itself in combinatorial explosions of type conversions.


Yes. I don't understand why HN reacts with such froth to the suggestion that this problem runs deeper than type systems.


For some definition of “sufficiently introductory”, I’d recommend starting with the first chapter of John Nolan’s book Stable Distributions [0] (20 pages), which presents the class of distributions to which sums of iid random variables converge and builds up to a version of the generalized CLT.

Note that this generalization of the classical CLT relaxes the requirement of finite mean and variance but still requires that the summed random variables are iid. There are further generalizations to sums of dependent random variables. John D. Cook has a good blog post that gives a quick overview of these generalizations [1].
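
Roughly, the statement Nolan builds up to looks like this (paraphrased from memory, so take the precise hypotheses, including the tail-balance condition, from the book):

    If $X_1, X_2, \ldots$ are iid with regularly varying tails,
    $P(|X_1| > x) \sim c\,x^{-\alpha}$ for some $0 < \alpha < 2$,
    then there are norming constants $a_n \sim n^{1/\alpha}$ (up to a slowly varying factor)
    and centering constants $b_n$ such that
    $$\frac{X_1 + \cdots + X_n - b_n}{a_n} \;\xrightarrow{d}\; Z,$$
    where $Z$ is $\alpha$-stable. The classical CLT is the finite-variance boundary case,
    with $\alpha = 2$, $a_n \propto \sqrt{n}$, and $Z$ Gaussian.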

0. https://edspace.american.edu/jpnolan/wp-content/uploads/site... [PDF]

1. https://www.johndcook.com/blog/central_limit_theorems/

