
Interestingly, a lot of the math and physics people in the ML community are considered "grumpy researchers," a joke made apparent by this starter pack[0].

From my personal experience (undergrad physics, worked as an engineer, came to CS & ML because I liked the math), there's a lot of pushback.

  - I've been told that the math doesn't matter/you don't need math.
  - I've heard very prominent researchers say "fuck theorists" 
  - I've seen papers routinely rejected for improving training techniques, with reviewers saying "just tune a large model"
  - I've seen papers rejected despite showing improvements when comparisons are conditioned on compute constraints, because "not enough datasets" or "but does it scale" (questions that can always be asked but require exponentially more work to answer; see the sketch below)
  - I've been told I'm gatekeeping for saying "you don't need math to make good models, but you need it to know why your models are wrong" (yes, this is a reference)
  - when pointing out math or statistical errors I'm told it doesn't matter
  - and much more. 
I've heard this from my advisor, dissertation committee, bosses[1], peers, and others (of course, HN). If my experience is not rare, I think it explains the grumpy group[2]. But I'm also not too surprised, given how common it is in CS for people to claim that everything is easy or that leet code is proof of competence (as opposed to evidence of it).
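
To make the compute-conditioning point concrete, here's a minimal sketch of what such a comparison looks like (the method names, parameter counts, and FLOP budget are all hypothetical; 6*N*D is the standard training-FLOPs approximation from the scaling-law literature):

  # Give the baseline and the proposed technique the same FLOP budget,
  # then compare quality, instead of comparing at a fixed model size.
  BUDGET = 1e19  # training FLOPs allotted to each method (assumed)

  def tokens_for(n_params, budget=BUDGET):
      # Choose D so that the training cost 6*N*D spends exactly the budget.
      return budget / (6 * n_params)

  for method, n_params in [("baseline", 125e6), ("proposed", 110e6)]:
      print(f"{method}: {n_params:.2e} params, {tokens_for(n_params):.2e} tokens")
  # Train each configuration to its budget and compare validation loss.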

I think the problem is unfortunately a bit bigger, but it isn't unsolvable. Really, it is "easily" solvable, since it just requires us to make different decisions, meaning _each and every one of us_ has a direct impact on making this change. Maybe I'm grumpy because I want to see this better world. Maybe I'm grumpy because I know it is possible. Maybe I'm grumpy because it is my job to see problems and try to fix them lol

[0] https://bsky.app/starter-pack/roydanroy.bsky.social/3lba5lii... (not perfect, but there's a high correlation and I don't think that's a coincidence)

[1] Even after _demonstrating_ how my points directly improve the product, more than doubling performance on _customer_ data.

[2] not to mention the way experiments are done; physicists have it stressed to them that empirics alone are not enough. https://www.youtube.com/watch?v=hV41QEKiMlM



Is this in academia?

Arguably, the emergence of quant hedge funds and private AI research companies is at least as much a symptom of the dysfunctions of academia (and of how society compensates academics, monetarily and otherwise) as it is of Wall Street's and Silicon Valley's ability to treat former scientists better.


  > Is this in academia?
Yes and no. Industry AI research is currently tightly coupled with academic research. Most of the big papers you see are either directly from the big labs or done in partnership with them. Not even universities like Stanford have sufficient compute to train a GPT from scratch (maybe enough for a DeepSeek). Here's Fei-Fei Li discussing the issue[0]. Stanford has something like 300 GPUs[1], and those have to be split across labs.
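
For a sense of scale, here's a back-of-envelope check using the standard C ~ 6*N*D approximation for training FLOPs (GPU count from the article above; the hardware and utilization figures are my assumptions):

  N = 175e9              # GPT-3 parameter count
  D = 300e9              # GPT-3 training tokens
  C = 6 * N * D          # ~3.15e23 total training FLOPs

  peak = 312e12          # A100 BF16 peak FLOP/s, dense
  mfu = 0.4              # assumed model FLOPs utilization
  gpus = 300

  days = C / (gpus * peak * mfu) / 86400
  print(f"~{days:.0f} days")  # ~97 days, and only if all 300 GPUs run one job

Roughly three months of the entire fleet for a single pretraining run, which is exactly why it doesn't happen.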

The thing is that there's always a pipeline. Academia does most of the low-level research, say TRL[2] 1-4, partnerships happen between 4-6, and industry takes over the rest (with some wiggle room on those numbers). Much of ML academic research right now is tuning large models made by the big labs; that isn't low-TRL work. Additionally, a lot of research is rejected for not outperforming technologies that are already at TRL 5-7. See Mamba for a recent example. You could also point to KANs, which are probably around TRL 3.

  > Arguably, the emergence of quant hedge funds and private AI research companies is at least as much a symptom of the dysfunctions of academia
Which is where I, again, both agree and disagree. It is not _just_ a symptom of the dysfunction of academia but _also_ of industry. The reason I pointed out the grumpy researchers is that a lot of these people had been discussing techniques DeepSeek used long before DeepSeek used them. DeepSeek looks like what happens when you set these people free, which is my argument: we should do that. Scale Maximalists (also called "Bitter Lesson Maximalists," though I dislike the term) have been dominating ML research, and DeepSeek shows that scale isn't enough. Hopefully that gives the mathy people more weight. But then again, isn't the common way monopolies fall that they become too arrogant and incestuous?

So mostly, I agree; I'm just pointing out that there is a bit more subtlety, and I think we need to recognize that to make progress. There are a lot of physicists and mathy people who like ML and have been doing research in the area, but they are often pushed out because of the thinking I listed. Part of the quant industry's success is recognizing that the strong math and modeling skills of physicists generalize well: you go after people who understand that an equation describing a spring isn't only useful for springs, it is useful for anything that oscillates. Understanding math at that level is very powerful, and boy are there a lot of people who want the opportunity to demonstrate it in ML; they just never get similar GPU access.
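
To spell out the spring example, the small-oscillation equation is literally the same across domains (in LaTeX notation):

  m\ddot{x} + kx = 0                  % mass on a spring,        \omega = \sqrt{k/m}
  L\ddot{q} + q/C = 0                 % charge in an LC circuit, \omega = 1/\sqrt{LC}
  \ddot{\theta} + (g/\ell)\theta = 0  % pendulum, small angle,   \omega = \sqrt{g/\ell}

Each one is \ddot{u} + \omega^2 u = 0; solve it once and you've solved all of them. That habit of seeing the shared structure is the skill the quant funds hire for.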

[0] https://www.ft.com/content/d5f91c27-3be8-454a-bea5-bb8ff2a85...

[1] https://archive.is/20241125132313/https://www.thewrap.com/un...

[2] https://en.wikipedia.org/wiki/Technology_readiness_level



