Sorry, but it's very easy to verify that these claims are crap by replicating th...

Palptine · on Jan 31, 2020

What's the chance of hitting all 4 of them, though? This is a good step in verification, but you are awfully quick to reject their claim.

folli · on Jan 31, 2020

Read the paper: they actually only match 2 inserts. The other two inserts were modified by the authors in such a way that they are made to match (Table 1).

Both inserts 1 and 2 also match to Streptococcus phage, but a bacteriophage would of course not be such a bold claim as HIV matches are.

Also, be aware that because of the scientific interest in HIV, hundreds of HIV strains have been sequenced, and the virus is known for its mutation rate (especially in these two proteins, gp120 and gag, as they are under pressure to mutate in order to evade the immune system). So in such a large library of protein sequences one is bound to find a match for a short 6-letter (amino acid) sequence. That's why E-values exist: to make a statement about statistical significance.
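To put a rough number on "bound to find a match", here is a minimal back-of-the-envelope sketch. It assumes all 20 amino acids are equally likely and independent (real proteins are not), counts only exact matches, and uses made-up database sizes. It is not how BLAST computes E-values (those come from Karlin-Altschul statistics, E = K·m·n·e^(-λS)); it only illustrates why short queries against big databases hit by chance.

```python
import math

def expected_exact_hits(k, db_residues, alphabet=20):
    """Expected number of exact occurrences of one fixed k-mer among
    db_residues random residues (uniform alphabet, sequence ends ignored)."""
    return db_residues / alphabet ** k

def prob_at_least_one(expected):
    """Poisson approximation: P(at least one hit) = 1 - exp(-expected)."""
    return 1.0 - math.exp(-expected)

# HYPOTHETICAL database sizes (total residues), for illustration only.
for db_residues in (10**6, 10**7, 10**8, 10**9):
    e = expected_exact_hits(6, db_residues)
    print(f"{db_residues:>13,d} residues: expected hits {e:9.3f}, "
          f"P(>=1) {prob_at_least_one(e):.3f}")
```

There are only 20^6 = 64 million possible 6-mers, so under these assumptions the expected number of chance hits reaches 1 once the database holds about 64 million residues, and that is before allowing any mismatches.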

folli · on Jan 31, 2020

For posteriority, here's the link to the Blast results for the second insert: https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Get&RID=39ACRKV...

im3w1l · on Feb 3, 2020

Hello posteriority here! I wanted to reference these comments in a discussion but the links expired :/

Palptine · on Jan 31, 2020

A large HIV database inflating matches is indeed a big concern. But dismissing the ones with mismatches sounds arbitrary: these segments were not arbitrarily selected, but are real insertions on top of SARS.

folli · on Jan 31, 2020

Again, take any random six-letter amino acid sequence and chances are high that it matches some HIV protein.

I just did the experiment with my first name, which is coincidentally 6 letters long, and lo and behold: a match to the HIV env protein!

Has my first name now been designed by a bioweapons facility?
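A rough sketch of why the name experiment is unsurprising, under loud assumptions: residues are uniform and independent, "near match" means at most d mismatches (plain Hamming distance, whereas BLAST actually scores with a substitution matrix), and the database size below is hypothetical. The function names are mine, not from any library.

```python
import math

def neighbourhood_size(k, max_mismatches, alphabet=20):
    """Number of length-k strings within Hamming distance max_mismatches
    of a fixed query: sum over i of C(k, i) * (alphabet - 1)^i."""
    return sum(math.comb(k, i) * (alphabet - 1) ** i
               for i in range(max_mismatches + 1))

def expected_near_hits(k, max_mismatches, db_residues, alphabet=20):
    """Expected number of database windows within max_mismatches of the
    query, for db_residues random residues (sequence ends ignored)."""
    return db_residues * neighbourhood_size(k, max_mismatches, alphabet) / alphabet ** k

DB_RESIDUES = 10**6  # HYPOTHETICAL database size, for illustration only
for d in (0, 1, 2):
    print(d, neighbourhood_size(6, d), expected_near_hits(6, d, DB_RESIDUES))
```

Allowing a single mismatch turns one target string into 115, so even this hypothetical million-residue database would be expected to contain about 1.8 near matches to any 6-letter query, first names included.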

AnimalMuppet · on Feb 1, 2020

> Has my first name now been designed by a bioweapons facility?

No, of course not. Your parents were designed by a bioweapons facility, so that they would choose that name.

;-)

folli · on Feb 1, 2020

You revealed my secret

nextos · on Jan 31, 2020

Large (but <100) E-values are sometimes considered weak evidence of some evolutionary process if you are querying against a huge database with closely related sequences. However, given the length of their first 2 hits, I'd tend to think this is random chance. The last 2 are more interesting.

And the fact that there's no known CoV with any of these inserts is quite intriguing.

folli · on Jan 31, 2020

The last two (look at Table 1) are interesting in that it's almost scientific misconduct, akin to photoshopping a picture in a scientific paper. They blasted the inserts, but apparently couldn't find any matches to HIV, so they just changed them until they found something.

45ure · on Jan 31, 2020

>These guys that published such a paper are either completely clueless or nefarious in trying to stir up conspiracy theories.

In your opinion, could this kind of subterfuge have been picked up by Dr. Eric Feigl-Ding, or someone of a similar calibre, before broadcasting this preprint for wider consumption? Thanks.

folli · on Jan 31, 2020

I have not heard of this guy before (I work in microbial genomics; from a quick search, he seems to be in the field of health economics), so I can't comment on that.

perseusprime11 · on Jan 31, 2020

Outside of the assertion that it is related to HIV, does the original argument around deliberate insertions hold any water?

folli · on Jan 31, 2020

Occam's razor says no. The coronavirus spike protein is responsible for receptor binding and entry into the cell. Different strains with different hosts bind to different receptors, so they have differences in their spike protein sequences. Mutations in the spike protein are expected in the evolution of coronaviruses.

45ure · on Jan 31, 2020

Your reply is much appreciated.