Hacker News

Great article, and fantastic to see a spotlight on an issue that I've thought a lot about.

The sad part is that to a lot of scientists and researchers, software and software engineers aren't something worth paying for. It's not uncommon to see "programmer" jobs asking for 3+ years of experience that offer under $15 an hour in the US. Sometimes they're "volunteer intern" positions. Of course, the people who end up filling these positions usually aren't actual developers, so the software gets built poorly, eventually gets scrapped, and the cycle continues.

Management also hasn't really evolved past the 90s. Non-technical scientists often want full control over every decision, but don't want to spend any time exercising it. This means developers often have little to no specs to work with, spend their time guessing at what the scientists want, and then have to go back and fix everything afterward.

>“That’s really the tragedy of the funding agencies in general,” says Carpenter. “They’ll fund 50 different groups to make 50 different algorithms, but they won’t pay for one software engineer.”

This is the crux of my frustration. It's not even 50 different algorithms often. A lot of the time, 50 different research groups will be working on very similar programs, and none will be able to deliver a working version.

Though the article mentions that research funding does exist, clicking on one of those funding pages and looking through their examples reveals that only about 1 in 10 of the funded websites are still active, and they aren't old sites. Again, this goes back to scientists not valuing software. I've seen scientists happily sign off on spending $20,000+ on hardware components that would usually cost under $100 to make, but balk at contributing $50 a year to support open source.

I got lucky that I managed to find a place where I get paid fairly, and my boss is actually technical and can manage tech projects well, but these places are few and far between.



A lot of the points you bring up are related to cost. Here's the thing about cost: say it costs $100k/year to hire a good software engineer capable of writing scientific code (programming and testing complex algorithms, writing HPC code, turning whitepapers into code), which might even be an underestimate depending on benefits and location. For that kind of money you could also fund 3 more grad students. The grad students will directly convert a PI's money into authorships, while the software engineer's contribution will only be indirect, and will likely take years to pay off.

Plus, with only a single software engineer, there's a good chance you get unlucky and end up with someone clueless/lazy. You would probably need 3-4 software engineers to make a functioning team with best practices and hedge your bets against accidentally hiring someone who sucks. So now we're talking 10+ grad students.

Open source software is a bit different, because many labs can band together to fund things they find useful. But there are still issues with cost-effectiveness: I'd guess most lab contributors to OSS would want some sort of quid pro quo, which may not be realistic for all OSS projects. And by funding OSS you are also funding competing labs' ability to use the same features you use, which is good for science in general but not always good for people's careers.


You're right, and more generally all of the issues are related to the fact that scientific incentives don't typically align with good development. At the end of the day, over a period of 3+ years, I'd rather have the results of 3-4 software engineers compared to 10-12 grad students. However, for <3 years, I'd choose the grad students. Pretty much every incentive in science (e.g. grants, awards) prioritizes being prolific over a short period of time.


Here in Oxbridge there's pretty generous funding for hiring software engineers to maintain biotech software, and much lower pressure to publish. I know of many open positions, and they tend to be funded for a very long time. I know people who have held such positions for more than a decade, sometimes 20 years.

The problem is more the pay gap with industry. Even though these positions pay well by academic standards (no PhD required, yet the pay is much higher than a senior postdoc's), the salary is still way below what industry in London or Oxbridge offers.

Furthermore, you tend to be surrounded by non-technical people, which may be tough in the long term. Nobody appreciates what you do, not even your boss, who may know next to nothing about computers.

The bottom line is that positions end up staying vacant for a long time and tend to be filled by biologists with a bit of coding experience, underqualified IT people, or, rarely, really competent individuals who want a break from industry.


> The problem is more the pay gap with industry. Even though these positions pay well by academic standards (no PhD required, yet the pay is much higher than a senior postdoc's), the salary is still way below what industry in London or Oxbridge offers.

I know plenty of good programmers who would love to do scientific programming (because they love science) and would immediately accept lower pay.

The problem, in my opinion, lies rather in the non-monetary working conditions. For example, in Germany it is nearly impossible to get a permanent employment contract at a scientific institute if you do anything remotely related to scientific work. Even worse: you are not even allowed to work more than 6+6 years (before and after the doctorate) in fixed-term positions at scientific institutes. Once that time is up, you cannot take any further non-permanent contract at a scientific institute (the infamous/insane Wissenschaftszeitvertragsgesetz, WissZeitVG).

No programmer (even one with a great passion for science) will be willing to work under such extraordinarily bad conditions.


That sounds bad indeed.

Here it is a bit better. Most positions I know of are de facto permanent. On paper they are not, as most labs go through 5-year funding cycles, so there's a small chance of losing funding. It's quite rare for big labs.

Besides, places like the MRC have created permanent research assistant positions, which are actually permanent and put zero pressure on publications.

As you say, connecting with interested and talented programmers is another problem. I feel that Nature Jobs postings, which were already a big leap forward for rusty uni administrators, are not good enough.


Why hire an engineer? Why not walk over to the CS department (it might have a different name) and talk to a professor there? They can set you up with plenty of undergrads who need this experience, and they should be able to guide them toward something that is maintainable long term.

Note that I said "should" there. Knowledge of how to write maintainable programs seems to be lacking in research areas.


Because undergrads are, on average, very bad programmers. They're enthusiastic, but they don't know what they don't know. Good programmers are above all experienced, disciplined, clear thinkers. Undergrads are stressed by their remaining coursework, so their responsibilities are split and they just can't put the necessary time in.


I was such a CS undergrad working in a lab once upon a time. I don't think you really want that, because the undergrad will probably only be working 4-15 hours/week, potentially for only a single semester. A summer position is 40 hours/week, sure, but still only for about 10-15 weeks.

And still, you're getting a generic CS undergrad's caliber of work and sense of responsibility, which I'd say is not great on average. They might not be familiar with version control, best practices, etc., and could end up writing code just as bad as the scientists'.

I think if you're hiring a team, you need at least one somewhat experienced full-time software engineer to act as team lead/PM for the other developers, whether they're full-time staff or students.


Yeah, picking up undergrads (or even grad students) from the CS department is not a surefire way to end up with code that follows best practices, is well maintained/documented, etc. And I'd argue that if your team leader doesn't have that experience, you're more likely to end up with something that doesn't follow good practices (especially with regard to any sort of test suite).


All the replies about undergrad quality are correct. I stand by my statement, though: we need to figure out how to solve this problem, and research on it is sorely lacking.


This would be a good solution, but from what I've seen with psychologists and statisticians, it's unlikely to happen, for reasons I don't fully understand. Another issue is that undergraduates often learn by adopting the norms of the institution they're in (e.g. using version control, linters, etc.), but when they're brought in as the lone technical person, they don't have that opportunity to improve. (This is my personal anecdote as that CS undergrad, once upon a time.)


I’ve been that CS undergrad in the past who aided another department’s research, and the quality of my work back then was every bit as bad as you might expect. Of course I, like many of my CS peers, arrogantly thought I was doing great work at the time...

For sure, talented undergrad CS developers exist, but in my experience there are far fewer of them than many people might think. Experience counts for a lot, especially when trying to deliver even average quality software work.

As for something made by undergrads that is “maintainable” to a standard broadly comparable with an experienced developer? Maybe if you get lucky...


> plenty of undergrads who need this experience

The fact that they need the experience should already tell you that they're not yet competent at what they are needed for. They should be put on low-impact projects that don't matter, or low-impact projects that do matter with solid mentoring. They should not be made to write software for something that is high-impact for you with little to no mentoring.


Incidentally, this is exactly what my lab does


The thing that terrifies me about this state of affairs: what happens when the software gives the wrong result because it's poorly written? If a scientist has input B, expects output X, and writes code that accepts B and happens to output X, chances are they'll write the paper and send it off for publication, even if there's a bug in the code and the real output should have been Y. I genuinely believe this happens more often than most of us would find acceptable. Hell, there have probably been instances of the code outputting Y, the scientist scratching their head ("Hmm, that's not right, something's wrong") and massaging the code until it eventually outputs X, then saying "Fixed it!" and moving on.

I know for a fact it's happened at least once, and led to a scientific controversy that lasted for decades. Unfortunately, I don't remember the specifics; if someone recognizes my vague description please step in with a citation. But one group of scientists published a paper saying a certain dynamic system behaved in a certain way, and a second group published a paper saying it behaved in a different way, and the two results were completely incompatible with each other. Significant public disagreement ensued. One group published their code, and the other group attacked it saying it was poorly written, etc. The second group did not release their code. Decades later, some other scientist at the second institution released the code, and after a code review, it was found that the data was incorrectly initialized. They needed to initialize the particles with "random" initial velocity vectors, but the scientist who wrote the code didn't know how to do it correctly, and wrote an ad-hoc algorithm that gave the initial velocity vectors significant bias along an axis. But the paper was already written, peer reviewed, published, and cited, so even though the paper was wrong, the result was still accepted by (half of) the scientific community. AFAIK the paper was never retracted.
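For anyone wondering how an initialization bug like that creeps in, here's a hypothetical sketch (not the actual code from that controversy; function names are invented): a common ad-hoc way to generate "random" direction vectors, sampling each component uniformly and then normalizing, is not isotropic, because directions cluster toward the corners of the cube. Normalizing Gaussian samples instead gives a uniform distribution on the sphere.

```python
import math
import random

rng = random.Random(0)

def naive_direction():
    # Ad-hoc approach: sample each component uniformly in [-1, 1],
    # then normalize. The resulting unit vectors are NOT uniformly
    # distributed over directions; they are biased toward the cube's
    # diagonals, so ensemble statistics pick up a directional bias.
    v = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def isotropic_direction():
    # Correct approach: components drawn from a standard normal and
    # then normalized are uniformly distributed on the unit sphere,
    # since the Gaussian is rotationally symmetric.
    v = [rng.gauss(0.0, 1.0) for _ in range(3)]
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]
```

Both functions produce unit vectors, which is exactly why the bug is so hard to spot: every individual output looks plausible, and only the aggregate distribution is wrong.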


I have seen, and fixed, code that was first written as a prototype, then used as-is in a follow-on project, and so on, until it ended up being used to decide whether or not to grant certain subsidies. Except, whoops, in one spot it interpreted kilometers as meters without dividing by 1000. Along with literally dozens of other outright bugs. But nobody cares, really.
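A hypothetical sketch of how that class of bug hides, and how making units explicit at function boundaries exposes it (the function names and the 5 km threshold are invented for illustration, not from the code described above):

```python
def eligible_buggy(distance):
    # Bug pattern: the caller passes kilometres, but the threshold
    # below is in metres. Nothing in the code flags the mismatch,
    # so a 7 km distance silently fails a 5 km check.
    return distance >= 5000

def km_to_m(km):
    # Doing the conversion explicitly, and naming the unit in every
    # identifier, makes a mismatch visible at the call site.
    return km * 1000.0

def eligible(distance_m):
    THRESHOLD_M = 5000.0  # 5 km, expressed in metres
    return distance_m >= THRESHOLD_M
```

`eligible_buggy(7)` returns False for a 7 km distance, while `eligible(km_to_m(7))` returns True. Unit-aware libraries (e.g. `pint`) push this kind of check into the type system instead of relying on naming conventions.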


> Non-technical scientists

How can you have a ‘non-technical scientist’? All science is inherently technical.

Are you using ‘technical’ as a short-hand for ‘can program’? Stop doing that - programming is not the only technology.


I'm using technical in a broad sense. One effect of scientific research getting bigger is that roles are becoming more specialized. Often the people leading the research have primarily spent their time writing grants or papers, while less senior professors and postdocs carry out the actual work. When I first started, I was a bit shocked at how many of the big names knew basically nothing about how their own research is carried out.

The effect is more pronounced in some fields than others. In physics, for example, I'd imagine most people have at least the fundamentals of programming down, even if their software design may be lacking. In the field I work in (neuroimaging), a lot of PIs are doctors or neuropsychologists, and might barely know how to use a computer.


What an incredibly pointless thing to argue about. We all knew what he meant, and, this being a specialist forum, we benefit from shorthand even when it's a lossy representation to the uninitiated.

I hate to have to create this conversation fork but I really wish people wouldn't make comments like this. They're so low signal.


I disagree - I think reducing anyone who doesn’t happen to program to being ‘non-technical’, with the implication that they don't know what they're talking about, is insidious.


It's clear he means "Scientists who don't want too much input into the engineering decisions". You're reading an implication of competency where there is none.


Two other points to consider: equipment is a one-off expense while staffing is a continuous expense, and the pots of money for equipment and staff may be different.


Completely right. I've been trying to hype up the GitHub Sponsors program as a way of changing the thinking around software (e.g. tacking OSS onto grants as required equipment), but haven't found much support.


When said scientists want to build new equipment, they get involved and use professionals. When they need to build a new lab, they hire professionals. But when they need code for their research projects, they... ?


> The sad part is that to a lot of scientists and researchers, software/software engineers isn't something worth paying for.

I have to ask how you came to this conclusion. Do you have anecdotes from them supporting it, or did you just conclude "they are not paying for it, therefore software is not worth paying for to them"?


I think the original poster is applying the "cheap talk" heuristic: people may say they value things, but we see what's really valued by what they pay for, given the resources they have.




