
There is a similar word-embedding test that definitely rustles people's jimmies:

Doctor - Man + Woman = ?

What normally comes out is Nurse. What "they" think should come out is Doctor!

By "they" I mean people that get upset by this.



Yeah, besides the fact that this kind of compositionality is fairly specific to word2vec-style embeddings, research on the biases pre-trained models express is readily available. Linked a few below for those interested (with a quick sketch of [0]'s fix after the links). Most of the issues come down to the same phenomenon discussed here in the context of ImageNet: the input texts were biased, and the algorithm learned that bias.

[0] https://arxiv.org/abs/1607.06520 "Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings"

[1] http://proceedings.mlr.press/v97/brunet19a/brunet19a.pdf "Understanding the Origins of Bias in Word Embeddings"

[2] http://matthewkenney.site/biases.html "Google word2vec biases"
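
For the curious, the core of [0]'s "hard debiasing" is just linear algebra: project gender-neutral words off a gender direction. A minimal numpy sketch, with the simplifying assumption that the direction is a single he/she difference (the paper actually builds it via PCA over several gendered pairs):

    import numpy as np

    def neutralize(w, g):
        """Remove the component of word vector w along gender direction g."""
        g = g / np.linalg.norm(g)     # unit gender direction
        return w - np.dot(w, g) * g   # project out the biased component

    # Toy vectors for illustration; with real embeddings you'd use
    # g = model["he"] - model["she"], w = model["doctor"], etc.
    he = np.array([1.0, 0.2, 0.0])
    she = np.array([-1.0, 0.2, 0.0])
    doctor = np.array([0.3, 0.9, 0.1])

    print(neutralize(doctor, he - she))  # component along g is now ~0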



