The shadiness around Facebook's proprietary dataset of 300 million photos is concerning and should draw more attention. At the very least, it is scientifically unacceptable - we should not high-five Big Tech researchers for intentionally unreproducible research. And if Meta is harvesting user photos for AI research and commercialization, they should tell their users about it directly (I am sure there is something buried in the TOS). Does the dataset include only public photos, or are Instagram DMs fair game? Does it include CSAM? Who cares!
Serious question: who are the people in the illustrations they used in the paper?[1] Are they Facebook/Instagram users? Did the authors ask permission to use their photos for an arXiv publication? Including their kids? Meta researchers really should be answering questions like this before they are asked - but these authors didn't even include an impact statement!
At some point in the near future, Facebook will use your accumulated posting & commenting history to sell you an A.I. form of yourself that can chat with people AND keep you on teh interwebz well past your death.
[1] https://arxiv.org/abs/2408.12569