The researcher's job is to do an experiment properly and document it in a paper. If we require more than this, we will damage the scientific process for two reasons: (1) companies are not willing to make software developed by their researchers available, so they will publish even less; and (2) universities don't have the money and staff to produce and maintain software to these standards, so professors will be forced to publish fewer papers.
"These standards" are pretty low. Currently it's free-for-all chaos. In theory, papers are reproducible from the documentation in the paper itself, but that is a lie: a paper is never reproducible from the paper alone. Lots of work happens in the background that the reader never sees. For all we know, authors can tweak their numbers to be 2% better, and if someone can't reproduce the paper's results from the released code, the authors can simply ignore them, claim the problem is not on their side, or say the published numbers were generated with a slightly different version of the code than the released one. I've seen this many times on GitHub: issues closed or deleted without comment, and so on. There is zero accountability.
It's slowly changing, though, and many people are gritting their teeth, because they can't torture the data as much if everything is out in the open.
> "These standards" are pretty low. Currently it's a free-for-all chaos.
I disagree. It is not perfect, but it is a process that has enabled scientific development for centuries. If we keep creating more and more rules for researchers to follow, doing scientific research will become even harder, and most institutions won't have the resources to continue.
Maybe not rules, but cultural expectations among researchers. If you want your work to actually get used, make it usable. I suppose this differs by field; I'm most familiar with CS and machine learning. There, it is increasingly a community expectation to either provide access to the code or face deep skepticism about the numerical results. That doesn't mean the rest of the paper is discounted: if the explanation contains genuinely good ideas, it is still valuable. But people can do any number of things to squeeze out a 1% benchmark advantage, and I only trust such claims if I have the code. (Sure, GPU ML code is not fully bitwise reproducible today, but that is being worked on; it is becoming technically possible, even if not everyone has heard about it or understands why it matters.) Even without bitwise repeatability, I want the code and instructions for how to run the published experiments. I don't believe benchmark claims otherwise. Simply too much noise is being pumped into the literature. Too much data torturing for career reasons, for visa reasons, for reasons of "I must publish so I don't get fired and can finish my PhD, and the reviewers will only let me publish if I beat the benchmark, so I'll redo my analysis over and over until I'm 1% better".
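Even without full bitwise determinism on GPUs, most run-to-run variance comes from unseeded randomness, and the "code plus instructions" bar is mostly about pinning that down. A minimal toy sketch in plain Python (the function name and the "metric" here are invented for illustration, not from any real paper or library):

```python
import random

def run_experiment(seed: int) -> float:
    """Toy stand-in for a published experiment: a fixed seed,
    documented in the run instructions, makes the run repeatable."""
    rng = random.Random(seed)  # local RNG, so no global state leaks in
    samples = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    return sum(samples) / len(samples)  # the reported "metric"

# Anyone re-running with the seed from the paper's instructions
# gets exactly the same number, so the claim can be checked.
assert run_experiment(42) == run_experiment(42)
```

The point isn't the toy code itself but the contract: release the code, state the seed and the exact command, and a skeptical reader can regenerate the table in the paper instead of taking it on faith.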
The culture needs to improve. Benchmarks shouldn't be everything, but reviewers are inexperienced; there are many reasons across many parts of the system. In other fields the issues are different: there it's more about having to obtain statistically significant results or your career fails.
It's a good thing that people are waking up to this. It's not about punishing individual researchers; it's about our collective intellectual immune system. We can't digest this firehose of papers if it's poisoned to such an extent. It's not about charity or burden. It's about being skeptical when we know we're dealing with unreliable data. Science is a massive endeavor with massive quality differences between works, researchers, and groups. Blind trust is no longer enough if you care about keeping your beliefs curated.