Yes, but the paper you link to never calls it GPT or Generative Pretrained Transformers. It talks about training Transformers with generative pretraining, both of which were pre-existing concepts by that point.
I also looked at the OpenAI website in Sep 2018 and could find no reference to GPT or Generative Pretrained Transformers, so I think OP might be right about BERT using the name first.