The author seems to be of the opinion that the creators of DeepSeek will either be unable to, or will not see the value in, optimizing the 'second stage' RL component of the 'new' (post-pre-training RL) way of training frontier foundation models. Every competent programmer in China is now looking for low-level PTX optimizations for EVERY SINGLE STAGE of the pipeline. They will now likely not publish any of it.


There isn't a "the pipeline". You'd have to work at DeepSeek for your low-level work to affect it.


Peer (or any type of institutional) review needs to be implemented at the national level--same as the funding for the original research. Why would you pay for research and not check that it is correct? Congress needs to fund a new science agency that explicitly does this. I have suggested before that part of graduate training should be replicating select studies that are published (a national review board could select those that seem the most high impact). State-funded schools could take this on, and students would probably learn at least as much doing this as they do in their other studies.


What about all the funding that's not done at a national level? There's still a large amount of research funding that is private.


Sure, and that's on the private funders to ensure they are getting what they pay for. Google pays for plenty of research--since they are the payer, it's their responsibility to ensure its accuracy to whatever degree satisfies them. Institutions like the FDA are supposed to regulate private research when it comes to market (i.e., pharmaceuticals and the like). Whether or not the FDA and related agencies are effective is a different, but just as important, question. Taxpayers desperately need a formal, funded system to verify the science they are paying for--particularly for biomedical research, where the incentives for fraud are so high.


I think that's reasonable for the "institutional review" aspect of the OP. But regarding the "peer review" aspect, I don't think it works, at least within the bulk of the current framework. Peer review is typically performed by separate organizations, independent of the funding organization. To expect a national-level organization to essentially take over the duties of peer-review journals is a very big ask (in no small part because the current system benefits from free labor from the reviewers).


> To expect a national-level organization to essentially take over the duties of peer-review journals is a very big ask

Fair, but what is the alternative that would actually work? What is the budget of all of the journals compared to the NSF+NIH? Is medical research that is true, and actually actionable, worth as much as an F-whatever fighter jet? People will have to decide.


The tradeoff is a different argument and a digression, so while it's likely we agree, I'll side-step it here. I do think it's an uphill battle to pursue a massive reappropriation of funds, though.

There are some alternatives that I'm aware of. Here are a few:

1) One is to allow journals to focus on less-than-great results. Right now the focus is on novelty, so there is an incentive to show that your work has some new, great outcome. But there's also value in showing "Hey, we thought this idea had legs but it turns out it didn't." Publishing that work should be part of science, but right now it's not. (As a side benefit, you could prevent a lot of researchers wasting effort on the same idea simply because they weren't aware that other people already tried, and failed.)

2) Journals can put a premium on sharing your data and code during the review process. Right now, it's often just up to the author and there are lots of veils to hide behind that essentially give the impression of sharing data, but not in a very useful way.

3) Give value to replicating work. Maybe not as much prestige as creating new work, but showing that it can be replicated obviously has value to society as a whole. Most of the time this won't get published, except in the cases where it's sensationalized, like fraud. (This effect is related to #1)

4) Journals can do a better job vetting their reviewers. They struggle to get timely reviews and reach for anyone who accepts the duty. Reviewers may agree to review something they have little background in, and as a result it's easier for bad articles to slip through the system.


I don’t disagree with any of these points, it’s just that I’ve been involved in open science circles, where these things are always mentioned, and I just don’t see any material progress (maybe I’m not looking that closely though). I think the reason for the lack of progress is mainly funding—so until someone gets serious about funding (billionaires or taxpayers), it just seems like the same merry-go-round. It’s very expensive to replicate biomedical studies—but it’s the only thing that works. Maybe the tide is turning though, and simply incentivizing/protecting grad students to become whistleblowers will do more good, but I fear this case was more the exception than the rule.


I think the difference is between a ground-up or top-down approach. Maybe both are needed. My current stance is that while a top-down approach would work, there's very little chance of it happening. For one, government research funds have largely flat-lined in the last 20 years, and expecting them to take on more costs for managing peer review would likely exacerbate the problem. I also don't see the govt clamoring for additional administrative burden. I don't think replication has to be the only method (although I think it's probably the best). Opening the data to the public can do a lot to suss out bad practices or outright fraud, as we saw with the Ariely situation. The progress has been slow, for sure, but I think there is some. For example, there are now journals that specialize in publishing "non-surprising" results.


Peer review can't catch manipulated data. Replication studies can, but that's a whole separate concern.


For dependent types, I would look at Idris [1]. Adding Univalence in a satisfying way is, I think, still somewhat of a research question (I could be wrong, and if anyone has any additional insight I would be interested to hear it); e.g., see this thread about Univalence in Coq [2]. There are some implementations in Cubical Type Theory, but I am not sure what the state of the art is there [3].

[1] https://www.idris-lang.org
[2] https://homotopytypetheory.org/2012/01/22/univalence-versus-...
[3] https://redprl.org


This is not going to catch on any time soon.

Assume that regular software has, on average, one bug every 50 lines. (All numbers made up on the spot, or your money back.) Let's suppose that Idris can reduce that to absolutely zero. And let's suppose that totally-working software is worth twice as much as the buggy-but-still-mostly-working slop we get today.

But Idris is harder to write. Not just a bit harder. I'd guess that it's maybe 10 times as hard to write as Javascript. So we'd get better software, but only 1/10 as much of it. Take your ten favorite web applications or phone apps. You would only have one of them - but that one would never crash. Most people won't make that trade. Most companies that produce software won't make it, either, because they know their customers won't.

Well, you say, what about safety-critical software? What about, say, airplane flight control software? Surely in that environment, producing correct software matters more than producing it quickly, right?

Yes, but also you're in the world of real-time embedded systems. Speed matters, but so does provably correct timing. Can you prove that your software meets the timing requirements in all cases, if you wrote it in Idris? I believe that is, at best, an unsolved problem. So what they do is write in carefully chosen subsets of C++ or Rust, with a careful eye on the timing (and with the help of tools).


I've been dabbling with Idris and Agda and Coq. I think I'm pretty much settling on Agda, because I can appeal to Haskell for help. It's tough finding things that aren't just proofs; actually running a program isn't hard, there just don't seem to be many people who do it. I've got some toy projects in mind, and I'm going to lean hard on https://github.com/gallais/aGdaREP (grep in Agda). I can't tell you if it's ten times harder - that seems high. It's different, sure. I'm having a tougher time than with, say, Prolog. But most of the bumps and bruises are from lack of guidance around, uh, stuff.

So given that context, it doesn't sound too tough to add a cost to the type for each operation, function call, whatever, and have the type checker count up the cost of each call. So you'd have real proof that you're under some threshold. I wouldn't put the Agda runtime on a flight control computer. But I think I could write a compiler now, for, like, a microcontroller, that would count up (or spend a time budget, doesn't matter).

Doing this for a more sophisticated computer, while staying resource efficient, would be way, way harder. But if you modeled it as "everything's a cache miss" and don't mind a bunch of no-ops all the time, that would be a pretty straightforward adaptation of the microcontroller approach.
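Roughly the kind of thing I have in mind, as a minimal sketch in Lean 4 syntax (caveat: all the names here are made up, and a real system would want a much richer cost model than a bare Nat index):

    -- Index a computation by an upper bound on "steps"; the type checker
    -- adds the bounds together when computations are sequenced.
    structure Budgeted (cost : Nat) (α : Type) where
      run : α

    -- Charge one unit for a primitive operation.
    def tick {α : Type} (a : α) : Budgeted 1 α := ⟨a⟩

    -- Sequencing sums the two bounds in the type.
    def Budgeted.andThen {α β : Type} {m n : Nat}
        (x : Budgeted m α) (f : α → Budgeted n β) : Budgeted (m + n) β :=
      ⟨(f x.run).run⟩

    -- Two ticks cost 2; the annotation is a statically checked budget,
    -- so claiming `Budgeted 1 Nat` here would be a type error.
    def twoSteps : Budgeted 2 Nat :=
      (tick 20).andThen fun x => tick (x + 1)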


I would recommend trying Lean4 because I think it is better suited to programming. Lean has a Rust-like toolchain manager; a build system (cf. `.agda-lib`); much more developed tactics (including `termination_by`/`decreasing_by`); more libraries (mathlib, and some experimental programming-oriented libraries for sockets, web, games, unicode...); common use of typeclasses in stdlib/mathlib; `unsafe` per declaration (cf. per module in Agda); sound opaque functions (which must have a nonempty return type) used for `partial` and FFI; "unchained" do-notation (early `return`, imperative loops with `break`/`continue`, `let mut`); and easier (more powerful?) metaprogramming and syntax extensions. And in Agda you can't even use Haskell's type constructors with type classes (e.g. monad-polymorphic functions), which makes it more difficult to make bindings to Haskell libs than to C libs in Lean.
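For a taste of the do-notation point, here's a toy example (assuming recent Lean 4; nothing here beyond the standard library):

    -- Mutable locals, an imperative loop, and early return in a do-block.
    def firstIndexOf (target : Nat) (xs : List Nat) : Option Nat := Id.run do
      let mut i := 0
      for x in xs do
        if x = target then
          return some i   -- early return exits the whole do-block
        i := i + 1
      return none

    #eval firstIndexOf 3 [5, 3, 9]  -- some 1

Writing the same thing in Agda means hand-rolling the recursion (or a fold) rather than reaching for a loop.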

There are features in Agda/Idris (and probably Coq, about which I sadly know almost nothing) that are absent from Lean and are useful when programming (coinduction, Setω, more powerful `mutual`, explicit multiplicity, cubical? etc.), but I'd say the need for them is less common.


> So we'd get better software, but only 1/10 as much of it

How much software do you think we need? 10x less sounds about right to me.


You can produce formally verified software, at greater expense than just JavaScripting it. But is it software that does what we need it to do?

Verification proves that software works according to its "spec". But is the spec "correct"?

Writing NON-formally-verified software 10x faster means you have a much better chance of figuring out how you should fix your spec.


It should be possible to embed cubical type theory in Idris.[1][2]

1: https://arxiv.org/abs/2210.08232

2: https://chat.openai.com/share/aadb7a0a-08a4-4951-b877-cb2f61...


Bob Harper wrote a really good blog entry that expounds on this as Computational Trinitarianism [1].

Michael Shulman also wrote about the extension to Homotopical Trinitarianism [2].

For a good summary with links, see [3].

[1] Computational Trinitarianism, https://existentialtype.wordpress.com/2011/03/27/the-holy-tr...

[2] Homotopical Trinitarianism, http://home.sandiego.edu/~shulman/papers/trinity.pdf

[3] nCatLab, https://ncatlab.org/nlab/show/computational+trilogy
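For a concrete flavor of the proofs-as-programs leg of the correspondence, here is a tiny sketch in Lean (nothing specific to the articles above, just the idea itself):

    -- The same shape of term is both a proof and a program.
    -- As a proof: A ∧ B implies B ∧ A.
    example {A B : Prop} : A ∧ B → B ∧ A :=
      fun ⟨a, b⟩ => ⟨b, a⟩

    -- As a program: swap the components of a pair.
    def swap {α β : Type} : α × β → β × α :=
      fun ⟨a, b⟩ => ⟨b, a⟩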


Lacking time to study, is this basically what Phil Wadler has written and talked about, like "Propositions as types"?


To Mr. Harper's observation we should add that the trinitarian theory of computation needs to express itself in reality, and it does so through the quadrant of hardware: transistors, memory, electricity, and machine code. Thus the three, standing on the four, represent computation in action.


In principle it doesn't have to be transistors specifically. Any switching device will suffice.


How would you characterize/contrast Regent vs. Chapel? What do you see are the main drawbacks/benefits of each?


Chapel's killer feature, in my opinion, is being able to take something that looks like a simple loop and turn it into distributed code (via domain maps). To the extent that you can write your program in terms of that feature, you can get very clean programs. But once you step outside of that feature, you're basically back to full-on parallel programming (i.e., explicit PGAS programming with nearly every parallel programming construct under the sun). Chapel offers a nice syntax, but it's semantically not so different from writing SHMEM or MPI or one of these other explicitly parallel programming models.

Regent does also support loop auto-parallelization, though it's not the focus of the language and not generally how people write idiomatic Regent programs. Regent fundamentally is a task-based programming model. "Task" is a fancy word for a function that can run in parallel. The key is that (a) tasks execute with sequential semantics, and (b) inside of a task, you can do basically whatever you want. The compiler doesn't need to analyze the code aside from verifying that you're passing data around correctly. This means the set of programs you can write in the "nice" subset of the language is much, much larger. The vast majority of Regent programmers never encounter any explicitly parallel programming constructs, there is no way to make code that deadlocks or races, etc. On the other hand, organizing programs in terms of tasks does still take effort and a degree of cognitive shift. You still have to divide the program into parts that can be parallelized, even if you're not responsible for parallelizing them.


Thanks, this is helpful. It seems like (based on your reply) there are people successfully using Regent for scientific computing (I'm assuming); do you think the language is a viable choice for industry, or are there particular milestones you're looking to reach?


Yes, we focus mostly on scientific computing. One of our major apps is S3D, which we ported to Regent from Legion C++ [1]. This has been the first time the domain scientists have been able to modify the code themselves; previously the C++ was too much for them. There are other examples of Regent apps elsewhere in this thread.

If by "industry" you mean in areas related to HPC, then Regent is likely to be applicable to what you want. The further you get away from HPC, the less likely that would be. You probably wouldn't use Regent to write a web server, though it's not impossible....

Right now my biggest item is making sure we support all the DOE supercomputers, which means adding support for Intel GPUs. (Regent currently supports NVIDIA and AMD.)

[1]: https://theory.stanford.edu/~aiken/LegionRetreat22/slides/ch...


Thanks, yes, I was thinking along the lines of HPC type applications in industry.


Bought current house without an agent (very desirable part of Los Angeles). Selling now without an agent. If you buy with an agent you put yourself at a disadvantage because the selling agent will need to split the commission (typically 2.5% apiece or so). When you make an offer on a home without a buying agent, suddenly your offer looks a lot more attractive to the selling agent, who is the only point of contact the seller has into what is happening with their property in terms of offers. People wonder how we got our house so cheap—bank on the real estate agents being greedy. They are the worst, period. I have not met a single one who will not double-end a deal in 10 years in the LA market. Not sure how the current sale will go, but I will not work with an agent; I’ve dealt with too many to make that mistake.


I’ve seen this happen many times in LA. Never in SF, and once in Oakland. I don’t have much experience outside California but helped friends in Chicago buy a condo. It definitely helped that they didn’t have an agent and leveraged the listing agent.

Not having an agent is generally a big asset when buying a home, much more than anything an agent will bring to you.


>definitely helped that they didn’t have an agent and leveraged the listing agent.

Ooof. Gotta be careful with this one and understand agency law in your respective state. That listing agent may not actually be working for you, the buyer, even if they help you fill out the paperwork.


They're not working for the seller either, they're working for themselves. They are not a fiduciary to any party. You can get screwed by an agent regardless of who hired them.


>If you buy with an agent you put yourself at a disadvantage because the selling agent will need to split the commission

I can’t speak to California law, but this isn’t explicitly true in either market where I work. Non-agency is a thing in some places, and depending on the terms of the listing contract the listing agent might pocket both sides anyway if there isn’t a buyer’s agent.

I’d also argue that there are a lot of properties where a buyer benefits from expertise on the part of an agent - either negotiating strategies or local market concerns. In my market, for example, there are environmental and construction issues, and value-add considerations, that a buyer won’t understand without doing meaningful research on their own.


The Quantenkoffer by qutools is an interesting kit for learning about quantum stuff [1].

Also, Qiskit Pulse was fun to play around with (haven't in a while, don't know what the current capabilities are) [2].

[1] https://qutools.com/quantenkoffer_science-kit/

[2] https://qiskit.org/documentation/apidoc/pulse.html


Absolutely spot on. I actually do algorithm design, usually over a period of weeks (at least), and leetcode is a joke for the serious algorist (I’m sure I’d fail an interview based on it). Nothing has illustrated the robotic nature of the leetcode expert quite like this result.


I see several people arguing that if you are poor at leetcode then you cannot be a serious algorist. Can you explain why that argument is wrong?


Do tell!

What sort of job leads you to design algorithms?


Mathematical Biology/Bioinformatics. We have to think very carefully about every step in the process of extracting information from large, diverse datasets--often writing things from scratch, combining/transforming things in novel ways, and implementing new mathematical ideas efficiently enough to be computable on large datasets.


Try Eugenia Cheng’s new book, The Joy of Abstraction. Riehl’s book is very technical.


I second this recommendation; Cheng's book is probably the most approachable of the recent wave of category theory texts which does not assume the whole edifice of abstract algebra as a starting point.

Most CT texts introduce categories around page 1, and the Yoneda lemma a handful of pages in. Cheng's book builds up intuitions until chapter 8, where categories are defined over the course of several pages, and supposedly (I'm not there yet) ends the book with the Yoneda lemma.

You might think this means the book is mostly fluff. First: if you read it and think that, you're probably a more advanced reader than she's targeting. Second: goodness no, it's packed to the brim with categorical intuitions; there's a whole way of thinking that she's trying to motivate. Categories are just a formalization of this way of thinking; if you're not onboard with the thinking, the formalization is going to be hollow to you no matter what.

Do recommend.


Thanks, I will look at it.


Something like this: https://en.wikipedia.org/wiki/Theater_of_All_Possibilities, where they were guided by the idea we could be artists, explorers, and scientists.

