How to support open-source software and stay sane (nature.com)
193 points by sohkamyung on July 1, 2019 | 70 comments



Argh, this article is going to give researchers the misleading impression that releasing code as open-source is complicated, that you need to maintain it, ask yourself difficult questions, be an expert in programming, find funding to keep it around, etc.

But in my field, in most cases the source code is never released at all. That's a far bigger problem than not having support to use it.

So fellow academics, please don't use this article as an excuse to not release your code. When in doubt, just push the thing to Gitlab as is, add a README that says "This is research code for paper X and it is unmaintained.", and disappear. It's not ideal, it's not the best way to do science -- but it's much better than not releasing your code.

Related: the CRAPL http://matt.might.net/articles/crapl/


> But in my field, in most cases the source code is never released at all. That's a far bigger problem than not having support to use it.

I agree but I think it's good for people to know that almost any moderately successful open project will attract whiners, complainers, entitled people, and in some cases outright abuse.

From the perspective of society, having no open code is worse than having open code that's unmaintained. We're agreed here.

From an individual contributor's perspective, opening yourself up to varying forms of whining and abuse "for the social good" (and not, say, for your tenure or publication count or whatever a researcher cares about in the moment) is a bigger problem than just sitting quietly on stuff you don't want to become a drain on your life.


What exactly are the whiners going to do if you just ignore them?


If you use an online presence that is traceable across the internet, some subset of them will bother you everywhere or decide to make your life miserable when you ignore them. As such, anything that can remotely be seen as connected to you with any longevity is best put online without your name and with a throwaway nickname.

Given that precaution, nothing! Sadly a lot of people don't take this precaution.


If the code is released anonymously, what good is it though (in the context of research code)? Ideally it should be connected to the paper to allow for replication, and what academic would want to release a paper anonymously?


I think we agree then that releasing code connected to a paper is something I wouldn't recommend anyone actually do, even if I do like it when it does happen :)


Leaving aside the outright abusive, probably nothing. But how excited would you be to receive in your inbox "Your [beloved project] didn't solve my pet problem the way I wanted! This software sucks and the devs are lazy bastards who don't listen to the community". Then imagine getting one of these a month, a week, even a day if your project is very successful.

It's of course nothing like actual abuse or threats of violence but it's an emotional toll that I certainly wouldn't want to pay. I have better things to do with my life.


Publish the repo and then immediately archive it. The code is out there but people won't be able to leave issues, comments, etc.


Note that the CRAPL license is not an open source license. It is extremely restrictive about what you can do with the code. It doesn't even give you permission to share the code with anyone else.

It's also poor for people distributing code under the license. It jokes about how the program was constructed without thought or design, but that's the very basis of the program's copyright protection. If you ever try to enforce the license, you might regret having to spend time arguing the implications of that joke in court.

Nobody should ever use CRAPL. Use the MIT or GPL licenses and set expectations for maintenance or support to zero in the README as you said.

Previous discussion: https://news.ycombinator.com/item?id=9670497


Great article, and fantastic to see a spotlight on an issue that I've thought a lot about.

The sad part is that to a lot of scientists and researchers, software and software engineers aren't something worth paying for. It's not uncommon to see "programmer" jobs that ask for 3+ years of experience but offer <$15 an hour in the US. Sometimes they're "volunteer intern" positions. Of course the people who end up filling these positions aren't usually actual developers, so the software gets built poorly, eventually gets scrapped, and the cycle continues.

Management also hasn't really evolved past the 90's. Non-technical scientists often want 100% control and to make every decision, but don't want to spend any time on it. This means developers often have little to no specs to work with, spend much of their time guessing at what the scientists want, and then have to go back and fix everything afterwards.

>“That’s really the tragedy of the funding agencies in general,” says Carpenter. “They’ll fund 50 different groups to make 50 different algorithms, but they won’t pay for one software engineer.”

This is the crux of my frustration. It's not even 50 different algorithms often. A lot of the time, 50 different research groups will be working on very similar programs, and none will be able to deliver a working version.

Though the article mentions that research funding does exist, clicking on one of those funding pages and looking through their examples reveals that only ~1/10 of their websites are actually still active, and they aren't old sites. Again this goes back to the whole "scientists don't value software" thing. I've seen scientists happily sign off on spending $20,000+ on hardware components that would usually cost <$100 to make, but balk at contributing $50 yearly to support open source.

I got lucky that I managed to find a place where I get paid fairly, and my boss is actually technical and can manage tech projects well, but these places are few and far between.


A lot of the points you bring up are related to cost. Here's the thing about cost: let's say it costs $100k/year to hire a good software engineer capable of writing scientific code (able to program and test complex algorithms, write HPC code, turn whitepapers into code), which might even be an underestimate depending on how benefits are paid out and the area. You can also fund 3 more grad students for that kind of money. The grad students will directly convert a PI's money into authorships, while the software engineer's contribution will be only indirect, and likely take years to pay off.

Plus, with only a single software engineer, there's a good chance you get unlucky and end up with someone clueless/lazy. You would probably need 3-4 software engineers to make a functioning team with best practices and hedge your bets against accidentally hiring someone who sucks. So now we're talking 10+ grad students.

Open source software is a bit different because many labs can band together to fund things they find useful. But again there are still issues with cost-effectiveness. I'm guessing most lab contributors to OSS would want some sort of quid-pro-quo which may not be realistic for all OSS projects. And by funding OSS you are also funding competing labs' abilities to use the same features you use, which is good for science in general but not good for people's careers sometimes


You're right, and more generally all of the issues are related to the fact that scientific incentives don't typically align with good development. At the end of the day, over a period of 3+ years, I'd rather have the results of 3-4 software engineers compared to 10-12 grad students. However, for <3 years, I'd choose the grad students. Pretty much every incentive in science (e.g. grants, awards) prioritizes being prolific over a short period of time.


Here in Oxbridge there's pretty generous funding for hiring software engineers to maintain some biotech software, and much lower pressure to publish. I know of many open positions. And they tend to fund you for a very long time. I know people who have worked in such positions for more than a decade, sometimes 20 years.

The problem is more the pay gap with industry. Even though they tend to pay well by academic standards (e.g. no PhD required, yet pay is much higher than a senior postdoc), the salary is still way below what industry in London or Oxbridge offers.

Furthermore, you tend to be surrounded by non-technical people which may be tough in the long term. Nobody appreciates what you do. Not even your boss, who may know zero about computers.

The bottom line is that positions end up being vacant for a long time and tend to be filled by biologists with a bit of coding experience, underqualified IT people or, rarely, really competent individuals who want a break from industry.


> The problem is more the pay gap with industry. Even though they tend to pay well by academic standards (e.g. no PhD required, yet pay is much higher than a senior postdoc), the salary is still way below what industry in London or Oxbridge offers.

I know plenty of good programmers who would love to do scientific programming (because they love science) and would immediately accept less pay.

The problem, in my opinion, lies rather in the non-monetary working conditions. For example, in Germany it is nearly impossible to get a permanent employment contract when working at a scientific institute and doing something even remotely related to scientific work. Even worse: you are not allowed to work more than 6+6 years (before and after the doctorate) in a fixed-term position at a scientific institute. Once this time is over, you are not even allowed to take any non-permanent contract at a scientific institute (the infamous/insane Wissenschaftszeitvertragsgesetz (WissZeitVG)).

No programmer (even if (s)he has a great passion for science) will be willing to work under such extraordinarily bad conditions.


That sounds bad indeed.

Here it is a bit better. Most positions I know of are de facto permanent. On paper they are not, as most labs go through 5-year funding cycles, so there's a tiny chance of losing funding. It's quite rare for big labs.

Besides, places like the MRC have created permanent research assistant positions, which are actually permanent and put zero pressure on publications.

As you say, connecting with interested and talented programmers is another problem. I feel that Nature Jobs postings, which were already a big leap forward for rusty uni administrators, are not good enough.


Why hire an engineer? Why not walk over to the CS department (it might have a different name) and talk to a professor there. They can set you up with plenty of undergrads who need this experience, and they should be able to guide them into something that is maintainable long term.

Note that I said "should" there. Knowing how to write maintainable programs seems to be lacking in research.


Because undergrads are on average very bad programmers. They're enthusiastic but don't know what they don't know. Good programmers are above all very experienced, disciplined, clear thinkers. Undergrads are stressed by their remaining workload, so their responsibilities are split and they just can't put in the necessary time.


I was such a CS undergrad working in a lab once upon a time. I don't think you really want that because the undergrad will probably only be working for like 4-15hr/week potentially for only a single semester. For a summer position, sure it's 40hr/week but still only for about 10-15 weeks.

And still, you're getting a generic CS undergrad's caliber of work and responsibility, which I would say on average is not great. They might not be familiar with version control, best practices, etc., and could end up writing code just as bad as the scientists'.

I think if hiring a team you would need at least one somewhat experienced full-time software engineer to act as team lead/PM for the other developers, whether fulltime or students.


Yeah, picking up undergrads (or even grad students) from the CS department is not a surefire way to end up with code that follows best practices, is well maintained/documented, etc. And I'd argue that if your team lead doesn't have that experience, you're more likely to end up with something that doesn't follow good practices (especially with regards to any sort of test suite).


All the replies about undergrad quality are correct. I stand by my statement though: we need to figure out how to solve this problem, and research is sorely lacking here.


This would be a good solution, but from what I've seen with psychologists and statisticians, it's unlikely to happen, for reasons I don't fully understand. Another thing is that undergraduates often learn by adopting the norms of the institution that they're in (e.g. using version control, linters, etc.), but when they're brought in as the technical person, they don't have that opportunity to improve (this is my personal anecdote from being that CS undergrad at one time).


I’ve been that CS undergrad in the past who aided another department’s research, and the quality of my work back then was every bit as bad as you might expect. Of course I, like many of my CS peers, arrogantly thought I was doing great work at the time...

For sure, talented undergrad CS developers exist, but in my experience there are far fewer of them than many people might think. Experience counts for a lot, especially when trying to deliver even average quality software work.

As for something made by undergrads that is “maintainable” to a standard broadly comparable with an experienced developer? Maybe if you get lucky...


> plenty of undergrads who need this experience

The fact that they need the experience should already tell you that they're not yet competent at what they are needed for. They should be put on low-impact projects that don't matter, or low-impact projects that do matter with solid mentoring. They should not be made to write software for something that is high-impact for you with little to no mentoring.


Incidentally, this is exactly what my lab does


The thing that terrifies me about the state of affairs is what happens when the software gives the wrong result because it's poorly written. If a scientist has input B and expects output X, and writes code that accepts B, and it happens to output X, chances are they'll write the paper and send it off for publication, even if there's a bug in the code and the real output should have been Y. I genuinely believe that this happens more often than most of us would find acceptable. Hell, there have probably been instances of the code outputting Y and the scientist scratching their head like, "Hmmm, that's not right, something's wrong," and massaging the code until it eventually outputs X, and then saying, "Fixed it!" and moving on.

I know for a fact it's happened at least once, and led to a scientific controversy that lasted for decades. Unfortunately, I don't remember the specifics; if someone recognizes my vague description please step in with a citation. But one group of scientists published a paper saying a certain dynamic system behaved in a certain way, and a second group published a paper saying it behaved in a different way, and the two results were completely incompatible with each other. Significant public disagreement ensued. One group published their code, and the other group attacked it saying it was poorly written, etc. The second group did not release their code. Decades later, some other scientist at the second institution released the code, and after a code review, it was found that the data was incorrectly initialized. They needed to initialize the particles with "random" initial velocity vectors, but the scientist who wrote the code didn't know how to do it correctly, and wrote an ad-hoc algorithm that gave the initial velocity vectors significant bias along an axis. But the paper was already written, peer reviewed, published, and cited, so even though the paper was wrong, the result was still accepted by (half of) the scientific community. AFAIK the paper was never retracted.
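To make the failure mode concrete, here is a hypothetical Python sketch (not the actual code from that incident) of how "random" direction sampling goes wrong: drawing the spherical angles uniformly clusters the velocity directions along the polar axis, whereas normalising Gaussian samples gives an isotropic distribution.

    import numpy as np

    rng = np.random.default_rng(0)

    def biased_directions(n):
        # Naive: uniform polar angle -> directions pile up near the +/- z poles.
        theta = rng.uniform(0.0, np.pi, n)
        phi = rng.uniform(0.0, 2.0 * np.pi, n)
        return np.column_stack([np.sin(theta) * np.cos(phi),
                                np.sin(theta) * np.sin(phi),
                                np.cos(theta)])

    def isotropic_directions(n):
        # Correct: normalise 3D Gaussian samples -> uniform on the sphere.
        v = rng.normal(size=(n, 3))
        return v / np.linalg.norm(v, axis=1, keepdims=True)

    # The mean |z| component exposes the bias: ~0.64 (naive) vs ~0.50 (isotropic).
    print(np.abs(biased_directions(100_000)[:, 2]).mean())
    print(np.abs(isotropic_directions(100_000)[:, 2]).mean())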


I have seen and fixed code that was first written as a prototype, then used as-is in a follow-on project, and so on, until it turned out to be used for deciding whether or not to grant certain subsidies. Except, whoops, in one spot it interpreted kilometers as meters without dividing by 1000. Along with literally dozens of other outright bugs. But nobody cares, really.
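As a hypothetical sketch of that class of bug (the function and numbers are made up, not the actual subsidy code): nothing in the code says which unit a bare number is in, so a missing conversion slips through unnoticed.

    def eligible_buggy(distance, threshold_km=50):
        # "distance" arrives in metres from one source but is compared as if it
        # were kilometres -- the missing /1000 silently changes the answer.
        return distance <= threshold_km

    def eligible(distance_m, threshold_km=50):
        # Put the unit in the name and convert exactly once.
        return distance_m / 1000.0 <= threshold_km

    print(eligible_buggy(3000))  # False: 3 km wrongly treated as 3000 km
    print(eligible(3000))        # True: 3 km is within the 50 km threshold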


> Non-technical scientists

How can you have a ‘non-technical scientist’? All science is inherently technical.

Are you using ‘technical’ as a short-hand for ‘can program’? Stop doing that - programming is not the only technology.


I'm using technical in a broad sense. One effect of scientific research becoming bigger is that roles are becoming more specialized. Often the people leading the research have primarily spent their time writing grants or papers, while less senior professors and post-docs carry out the actual tasks. When I first started I was a bit shocked at how many of the big names knew basically nothing about the process of how their own research is carried out.

The effects are more pronounced depending on what field we're talking about. For example, in physics, I'd imagine most people have at least the fundamentals of programming down, even if software design may be lacking. In the field I work in (neuroimaging), a lot of PIs are doctors or neuropsychologists, and might barely even know how to use a computer.


What an incredibly pointless thing to argue about. I think we all knew what he meant and, this being a specialist forum, we benefit from using shorthand in speech that is a lossy representation of ideas to those not initiated.

I hate to have to create this conversation fork but I really wish people wouldn't make comments like this. They're so low signal.


I disagree - I think reducing anyone who doesn’t happen to program to being ‘non-technical’, with the implication that they don't know what they're talking about, is insidious.


It's clear he means "Scientists who don't want too much input into the engineering decisions". You're reading an implication of competency where there is none.


Two other points to consider: equipment is a one-off expense while staffing is a continuous expense, and the pots of money for equipment and staff may be different.


Completely right, I've been trying to hype up the "Github sponsorships" program as a way of changing the thinking around software (e.g. tacking OSS onto grants as required equipment), but haven't found much support.


When said scientists want to build some new equipment, they get involved and they use professionals. When said scientists need to build their new lab, they hire professionals. When they need code for their research projects, they... ?


> The sad part is that to a lot of scientists and researchers, software/software engineers isn't something worth paying for.

I have to ask how you came to this conclusion. Did you have anecdotes from them supporting this or did you just conclude "they are not paying for it therefore software is not worth paying for to them"?


I think the original poster is applying the "cheap talk" heuristic. People may say they value things, but we see what's really valued by what they pay for (given the resources they have)...


Maintenance is a challenge wherever software is developed in-house without a dedicated development team. Development is often led by one person and becomes very difficult when they depart. It seems like all the regular maintenance challenges are present in these situations, just exacerbated. Not sure what organizations that aren't software-focused can do to improve their situation in this regard.


One thing we can do as users is champion the idea that open-source authors don't owe us anything. Having support or getting help with problems is great, but the author's already done us a huge favor by writing the software we needed in the first place, and they aren't required to go beyond that or do anything specifically for an individual.


Yeah, this is what I expected the article to be about: what drives me insane as an open-source developer is how much more polite and grateful a paying customer who has an outage at the worst possible time will be for the help I'm contractually obligated to give them, than so many random people on forums are about the product not having a feature they want.

It makes me want to give my customers the source and tell them they can do whatever they want, and then ignore the rest of the community except for high-quality pull requests.


Mostly agree; however, at some point the authors DO NEED to do something beyond creating the thing, or else face the extinction of that piece of software.

I think most people who have created something will generously bend over backwards to help individuals in the early stages of its lifecycle. You can see that all the time on GitHub.

The problems come when the project takes off to the point where there isn't enough support for the number of people using it, BUT the software isn't mature/popular/fit enough to be "under the wing" of a larger organization that can afford to pay for its maintenance and evolution.

Is there a way to bridge the gap between author-generosity support and corporate/organizational stewardship? We do have the social networks in place to allow that; they're just focused on different objectives.


Wasn't there a thing recently where an author just gave away one of his node.js libraries, and then it was used maliciously by the requester to attempt to hijack bitcoin wallets?

Found it: https://arstechnica.com/information-technology/2018/11/hacke...

I don't blame anyone in this scenario because the culture of open source projects and their interplay with enterprise encourages it


Exactly why the archive button exists :)


I think an answer to your final comment is quite simple: invest in teaching proper software engineering practices, especially if it's not your focus. Get one or two people (potentially outside of the group/collaboration) who are experts and have them teach the group.

I can say that in high energy physics the environment is very much moving in the right direction. My collaboration has a dedicated tutorial three times a year which includes tutorials for things such as git and CMake (an overview of the concepts of version control and build systems is introduced as well, along with the definition of what a software release is). Just a few years ago this didn't exist; if you wanted to be proficient with these tools or understand the lingo you had to be self-taught, and a lot of people don't have the time to do that, so when they had to do it, it was like pulling teeth. Spending 2 to 3 days of a week 3 times a year is not a super serious commitment, and the material from the previous tutorial is always available for the next (always with some minor improvements/fixes). It gets people to a productive state a lot faster than just giving them a problem with our massive software stack and saying "oh, and if you don't know git, google it."

I strongly believe all research groups that use software need well-defined teaching material for how to be a productive user of and contributor to the local software stack. It's not a hard problem to solve and helps eliminate what are actually fake hard problems.


As an academic, I think that's a great first step. But there are many structural issues beyond this: advisors and students both under pressure to do whatever it takes to get the paper out and move on to the next project, students needing/wanting to just graduate and move on, the usual yardsticks of academic achievement being behind in recognizing software as a legitimate product of research, etc. (*) On top of which, lots of different kinds of software are developed in research institutions, most of it just rapid prototypes but some of it used by many people every day, and it's not clear one process fits all.

If anyone out there has suggestions of useful resources, I'm all ears!

(*) Those of us who care about software do try to work on all these issues, but progress is slow.


I'm not sure how well this approach works on average, but I haven't had a great experience. I'm a software engineer supporting a research group that's mostly CS PhD's. I take any chance I get to teach good software engineering practices, but they mostly just don't care.


Yeah I hear you and empathize with this. That is especially difficult to deal with, but I think the model I describe with the three-times-a-year dedicated event helps, because people can really direct their focus for those few days and it's not random factoids as they come up.


Agreed, I've done some automation work at my job for stuff that has exploded in volume over the last few years and wasn't feasible to keep doing by copy/pasting stuff through Excel anymore.

It's in Python. I've avoided any external dependencies, kept inputs to CSV files that can be made from existing Excel sheets, and the code is fairly well commented.

But there used to be two people here who've written at least a line of Python in their lives. Now it's just me, and if I leave I have no illusions that it'll be maintained.

The best thing to do is write instructions for whoever will need to run it, and they can hope that they never need it to do anything new.
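For what it's worth, the shape of such a script can stay pretty small. A minimal sketch of the stdlib-only, CSV-in/CSV-out approach (the file and column names here are made up) looks something like this, which also keeps the hand-off instructions short:

    import csv
    from pathlib import Path

    def summarise(in_path, out_path):
        # Read an Excel-exported CSV and write per-category totals.
        totals = {}
        with Path(in_path).open(newline="") as fh:
            for row in csv.DictReader(fh):
                totals[row["category"]] = totals.get(row["category"], 0.0) + float(row["amount"])
        with Path(out_path).open("w", newline="") as fh:
            writer = csv.writer(fh)
            writer.writerow(["category", "total"])
            writer.writerows(sorted(totals.items()))

    if __name__ == "__main__":
        summarise("export.csv", "summary.csv")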


I work at a place almost like that and I know the solution but nobody wants to pay for it.

We develop software in house. I am the single dev on staff; we have a few contractors that we use for legacy ERP system programming/maintenance/modifications. I work with people who have CS degrees; however, it is still hard for them to understand why I'm spending time on layered architecture instead of using some basic OOP. Thankfully, they trust me and allow me to do what is required.

The BEST solution I see is having more than one full stack dev but then you are paying 2x more. Also, have some kind of standard and review any outsourced work. I picked up a legacy app after an outside contractor and it has been a disaster to work with.

The source code provided was out of date, he could not produce the source code that was running in production. No naming conventions were used, literally everything was just generic names like command1, textbox1, and so on. This could have been easily caught if any competent junior looked at the code. Some methods with tens of if statements, methods that are 1k+ lines long, almost zero OOP.

If a company does have a single dev and cannot afford another they really need to stress maintenance and verify in some way that the dev is capable of producing a maintainable project. Therefore, they maybe should hire someone to help with the hiring process but hiring SEs is difficult even when experienced people are doing the hiring.


In academia lots of this software is built by doctoral students or postdocs on limited contracts: they are gone after a short while. In many other places the people writing software outside of IT departments typically stay longer with the company.

The question now is: which is better? In one case there is a single "God"; in the other it's passed on for generations, while everybody mostly cares about their research and not long-term maintainability.


I think part of it is funding--but part of it is also recognition. In a number of fields, the development of scientific software receives little recognition compared to the science--even if the impact is large in terms of the discoveries it enables. So, for say tenure or career advancement, it might be "nice" for you to do it, but nowhere near as important as publishing high impact papers (at least in many fields). Especially if a graduate student or postdoc commits too much time to it instead of research results, they risk not being able to continue in their field (though they probably have more options if they decide to leave science to become software developers).


Speaking about biomedical software: I suggested[1] making a Julia flavor of the Biostar Handbook[2], an amazing introduction to the field of bioinformatics and genomics from the authors of the Biostars[3] Q&A site. Porting algorithms to Julia will greatly improve the speed and maintainability of the corresponding programs.

[1] https://discourse.julialang.org/t/biostar-handbook-computati...

[2] https://www.biostarhandbook.com/

[3] https://www.biostars.org/


Great article, thrilled to see PIs coming out and saying explicitly that the funding agencies are making a huge mistake funding discovery-driven science at the cost of long-term production work.


Determining worthiness of projects to fund is a really difficult battle. Do you go based on popularity? Importance? If something is worthy, how much is it funded? For how long? Who is paid to do the work?

Even other projects that tried to address this like the Core Infrastructure Initiative seem to have unintended consequences. For example OpenSSL got CII funding, then used some portion of that to relicense as Apache 2 which breaks compat with the more free LibreSSL fork, weakening the overall community.


if people paid the price of a coffee for good software, post-docs could attach themselves to a research group and fund themselves by maintaining quality software products - freelance scientist FTW.

the problem here is that lots of people want _other_ people to work for free. if you're not being paid, it's a hobby and you don't owe anyone anything. if the science system relies on free labour and refuses to support it, that's a very different conversation and the results are predictable.


"...Scientists writing open-source software often lack formal training in software engineering, which means that they might never have learnt best practices for code documentation and testing..."

This is the most ridiculous statement. I have had the chance to work in both environments. Given that experience, for the quality of the end result I will take a scientist (preferably a physicist or mathematician) with self-taught software development skills over a formally trained agile guru any time.

Of course there are exceptions but ...


I have the exact opposite opinion, plus my cousin is a robotics/ML professor who has to deal with this same issue in his research group.

Indeed, tons of scientists really have no idea about style, testing, etc. They're happy to just write an imperative C/C++/Python program with no docs whatsoever, run it, and be done with it.


My experience with academic code is that it's all ad-hoc software that's written to solve a single problem for a single paper. Once the final draft of the paper is peer-reviewed and published, there is no longer any reason to ever touch that code again.

From a practical standpoint, why invest in future-proofing your code if it's going to be thrown away once the current paper is finished? No need to make the code readable for the next author, no need to document it, no need to make your code non-monolithic because you're never going to build on top of it later.

On top of this, they were never educated and trained as software engineers. If you're lucky, they may have a pure CS background (and some of the worst code I've seen has been written by academic CS people who are explicitly not software engineers), but most likely they come from various academic disciplines that don't teach how to write code.

I used to work at an academia-focused NLP company, and while they did have some well-structured, long-lived codebases that were reused across several projects, there were also large piles of code that you could tell were intended to be used once and then forgotten about.

This also reminds me of the way Japanese console developers, such as Squaresoft, used to treat their source code in the '90s. Once the game shipped, they would just wipe the source code from their hard drives and their backups to save space. Hey, it's a pre-2006 console game, it's never going to get patched after release, and this was long before the nostalgia bug bit and publishers realized there was money in porting old games to new platforms. If you believe that code is only ever going to be used once, you are going to treat it as throwaway code, and that includes actually throwing it away at the end.

As a result, many later ports of '90s console games were remade from scratch, consist of old ROMs running in an emulator (sometimes romhacked if the game had never been translated before, like Trials of Mana), were rebuilt from a third-party port of the game, or, if they're really lucky, were rescued from early beta code stored on an old computer someone forgot to wipe (this is how the FF7 PC version got made, in fact). There was a Twitter thread about this recently, and it's absolutely fascinating.


Why would scientists write good code with documentation if it is not recognized for their career?


I was talking about scientists who switched from science to writing software as a product for whatever reason. They surely produce good docs (properly documenting their work is the very basis of being a scientist).

As for writing imperative code in C/C++/Whatever other language they seem to choose: nothing is wrong with that.


You're basically saying scientists can become good programmers if they become professional programmers. Of course they can be, and I doubt anyone would argue that. Some of the best programmers I know have zero college education at all, so of course it's possible for mathematicians and scientists to do the same.

But that isn't anywhere near the point of the article. It's talking about scientists that did NOT switch to writing software professionally, but continue to write it merely as a tool to accomplish their primary scientific goals.


"...It's talking about scientists that did NOT switch to writing software as a product, but as merely a way to accomplish their primary scientific goals. ..."

Sure, for this they would do what is just enough to solve their specific problem. Just as ANY sound business would do.


But the point of the article is that this isn't sustainable. If the person who wrote it leaves, or if they need to do a similar task later, or another lab is doing similar work, all the effort will be repeated. It would make a lot more sense for the community to invest in more open source tools that generalize to cover more research domains.


Writing medium-sized amounts of imperative code is wrong if it is going to be shared with any other person than a single developer writing it. And even if only a single person will ever see it, you are still better off writing it in components once the project reaches a certain size.


I don't agree.

There's nothing inherently wrong with writing imperative code, and being imperative doesn't imply the code isn't broken up into components or that it's necessarily difficult to maintain.


Imperative programming and components are two different things. The latter could be implemented with numerous approaches, including imperative programming.


That's true, but when someone mentions writing something in an "imperative style", I think it's common for that to actually mean one huge file that executes sequentially, which is not amenable to testing, makes it hard for someone else to work on a different part of the code without running into lots of merge conflicts, is very hard to refactor, etc.
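As an illustration of the alternative (a made-up sketch, not anyone's real project): the code below is still imperative, but because each step is a small function, the pieces can be imported and tested on their own and two people can work on different parts without constant merge conflicts.

    def parse_measurements(lines):
        # One responsibility: turn raw text lines into numbers.
        return [float(line) for line in lines if line.strip()]

    def normalise(values):
        # Another responsibility: scale so the maximum is 1.0.
        peak = max(values)
        return [v / peak for v in values]

    def report(values):
        return f"{len(values)} values, mean {sum(values) / len(values):.3f}"

    def main():
        with open("run_01.txt") as fh:
            print(report(normalise(parse_measurements(fh))))

    if __name__ == "__main__":
        main()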


> Scientists writing open-source software often lack formal training in software engineering

I'm at an institute of computational biology and that is precisely the problem we are tackling right now. We have a lot of clever people doing a lot of clever things, but a large part of effective software development is developing some good habits that those without training have often never heard of (e.g. how to write clean code, defensive programming, etc.)

As one of the more experienced "developers" (a.k.a. self-taught programmers) on our team, I've been writing up what I consider to be the basic principles of good software development for computational scientists. (https://terranostra.one/posts/Principles-of-Software-Develop..., if you're interested.) Later this week, we're going to have a group meeting to discuss what would be a good method of teaching these principles to new members of the institute. Excited to see what will come of that discussion ;-)
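As a taste of what the defensive-programming part looks like in practice, here's a small made-up example (the function and its arguments are illustrative, not from our codebase): validate inputs at the boundary and fail with a clear message instead of letting a silently wrong value propagate into the results.

    def mean_expression(counts, library_size):
        # Average read count, normalised by library size.
        if not counts:
            raise ValueError("counts is empty; nothing to average")
        if any(c < 0 for c in counts):
            raise ValueError("negative read counts are not meaningful")
        if library_size <= 0:
            raise ValueError(f"library_size must be positive, got {library_size}")
        return sum(counts) / len(counts) / library_size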


I don't think it's ridiculous to expect a person trained in a specialty profession to be better than an untrained one. But yeah, the difference is that purely software-oriented training teaches you nothing about the domain and the domain-specific challenges. So maybe having that domain expertise is a net benefit even if they are not the best at programming...


Again, in my experience university-educated scientists are better educated and as a result can better understand the actual problems that need to be solved, and find very good and practical ways to solve them.

As for what passes for "formal training" in modern colleges... well, I'd better not go there.


Oh, I'm not disparaging scientists! I'm just not seeing how the claim is ridiculous.



