
There is a similar word-embedding test that definitely rustles people's jimmies:

Doctor - Man + Woman = ?

What normally comes out is Nurse. What "they" think should come out is Doctor!

By "they" I mean people that get upset by this.



Yeah, besides the fact that this kind of compositionality is fairly specific to word2vec-style embeddings, research on the biases pre-trained models express is readily available. Linked a few below for those interested (with a quick sketch of [0]'s fix after the links). Most of the issues come down to the same phenomenon discussed here in the context of ImageNet: the input texts were biased, and the algorithm learned that bias.

[0] https://arxiv.org/abs/1607.06520 "Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings"

[1] http://proceedings.mlr.press/v97/brunet19a/brunet19a.pdf "Understanding the Origins of Bias in Word Embeddings"

[2] http://matthewkenney.site/biases.html "Google word2vec biases"
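
For the curious, the core of [0]'s "hard debiasing" is just linear algebra: project gender-neutral words off a gender direction. A minimal numpy sketch, with the simplifying assumption that the direction is a single he/she difference (the paper actually builds it via PCA over several gendered pairs):

    import numpy as np

    def neutralize(w, g):
        """Remove the component of word vector w along gender direction g."""
        g = g / np.linalg.norm(g)     # unit gender direction
        return w - np.dot(w, g) * g   # project out the biased component

    # Toy vectors for illustration; with real embeddings you'd use
    # g = model["he"] - model["she"], w = model["doctor"], etc.
    he = np.array([1.0, 0.2, 0.0])
    she = np.array([-1.0, 0.2, 0.0])
    doctor = np.array([0.3, 0.9, 0.1])

    print(neutralize(doctor, he - she))  # component along g is now ~0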



