
It's a minimum standard, though. Of course the goal is reproducibility in a broader sense, but that's not an excuse to do research in a one-off way where nobody can show how to get those numbers again a year or so after publication.

The coding standards are often abysmal, unexpectedly so. Often even the help of the original authors isn't enough to reproduce the figures from a paper, because settings and commands get forgotten. Part of the analysis was done in one language, another part in Excel. Some of the code has disappeared. Some of the libraries no longer work. Some people left and their academic storage space was wiped, so the intermediate steps, results, and notes are gone. You wouldn't believe it.

Once a paper is published, researchers have little incentive to document things or maintain the materials. They got the publication, they put it on their CV. On to the next project! No time to waste on work that's already completed. New work leads to new publications; fiddling with old code for the sake of some hypothetical later reader is a waste of time from a researcher's career perspective. And most papers are never even attempted to be reproduced.




The researcher's job is to do an experiment properly and document it in a paper. If we require more than this, we will damage the scientific process for two reasons: (1) companies are unwilling to release software developed by their researchers, so they will publish even less; and (2) universities don't have the money and staff to produce and maintain software to these standards, so professors will be forced to publish fewer papers.


"These standards" are pretty low. Currently it's a free-for-all chaos. Theoretically papers are reproducible from the documentation found in the paper but that is a lie. It is never reproducible just from the paper. Lots of stuff is done in the background that is not known to the reader. For all we know, they can even tweak their numbers to be 2% better and if someone can't get the results of the paper from the released code, the authors can just ignore it or say, the problem is not on their side, or that the paper numbers were generated with a slightly different code than the released version etc. I've seen this many times on Github, issues getting closed or deleted without comment etc. There is zero accountability.

It's slowly changing, though many people are gritting their teeth, because they can't torture the data as much if things are out in the open.


> "These standards" are pretty low. Currently it's a free-for-all chaos.

I disagree. It is not perfect, but it is a process that has enabled scientific development for centuries. If we keep adding rules that researchers need to follow, it will become even harder to do scientific research, and most institutions won't have the resources to keep up.


Maybe not rules, but cultural expectations among researchers. If you want your work to actually get used, make it usable. I guess it's different in different fields; I'm most familiar with CS and machine learning. There, it's increasingly a community expectation that the code is available, or else people are very skeptical of the numerical results. That doesn't mean the other parts of the paper are discounted: if there are genius ideas in the explanation, it's still valuable. But people can do any number of things to squeeze out a 1% benchmark advantage, and I only trust that if I have the code (sure, GPU ML code isn't fully bitwise reproducible today, but that's being worked on; it's becoming technically possible, even if not everyone has heard about it or understands why). Even without bitwise repeatability, I want the code and instructions for running the published experiments. I don't believe benchmark claims otherwise. Simply too much noise is being pumped into the literature. Too much data torturing for career reasons, for visa reasons, for reasons of "I must publish or I won't finish my PhD or I'll get fired, and the reviewers will only let me publish if I beat the benchmark, so I'll redo my analysis over and over until I get 1% better".
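To make that concrete, here's a minimal sketch of the kind of seeding/determinism scaffolding I'd want shipped alongside a paper's run instructions (assuming PyTorch; the function name and the exact set of flags are illustrative, not taken from any particular release):

    # Minimal reproducibility scaffolding (sketch, assuming PyTorch).
    # Even with this, bitwise equality isn't guaranteed across GPUs or driver versions.
    import os
    import random

    import numpy as np
    import torch

    def seed_everything(seed: int = 0) -> None:
        random.seed(seed)
        np.random.seed(seed)
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        # Prefer deterministic kernels; raises if an op has no deterministic implementation.
        torch.use_deterministic_algorithms(True)
        torch.backends.cudnn.benchmark = False
        # Some CUDA ops additionally require this environment variable.
        os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

Even that much, plus a pinned environment file and the exact commands used for the published runs, lets a reader check whether the numbers land in the right ballpark.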

The culture needs to improve. Benchmarks shouldn't be everything, but reviewers are inexperienced. There are many causes in many parts of the system. In other fields the issues are different: there it's more about having to obtain statistically significant results or your career fails.

It's a good thing that people are waking up to this. It's not about punishing individual researchers; it's about our collective intellectual immune system. We can't digest this firehose of papers if it's poisoned to such an extent. It's not about charity or burden. It's about being skeptical when we know we're dealing with unreliable data. Science is a massive endeavor with massive quality differences between works, researchers, and groups. Blind trust is no longer enough if you care about keeping your beliefs curated.



