Turing team member here. You can also try the demo at the link below, which operates at the pixel level for images (no metadata, which is how similar systems like search engines work). Another cool thing is that the model does OCR inherently, without any explicit training or inference set-up for OCR.
https://turing.microsoft.com/bletchley
(Team member of this project)
Just a clarification: both Microsoft and Nvidia have ownership of this model. Here is the Microsoft version of the same announcement.
We don't have an exact date, but we plan to share more details in a later submission. If you want access, please send an email to [turing_ AT _microsoft _DOT_ com] (remove the underscores and spaces).
“We are releasing a private demo of T-NLG, including its freeform generation, question answering, and summarization capabilities, to a small set of users within the academic community for initial testing and feedback.”
What’s the deal with these private demos? (GPT-2 was also essentially private). More importantly, why even announce the existence of a private demo to people who were not invited?
I'm honestly not trolling with this question, but can you explain what the practical applications of text generation are? From what I've seen of GPT-2, it's a cool toy, but I have never seen it create anything that seems like it would be useful to solve a problem (e.g., a human-computer interaction problem).
The only applications I can think of for text generation are malevolent ones: I'm sure it would be great at generating spam sites which can fool Google's PageRank algorithms, and it seems like you could easily use it in an information warfare / astroturf setting where you could generate the illusion of consensus by arming a lot of bots with short, somewhat convincing opinions about a certain topic.
Is there something obvious I'm missing? It seems too imprecise to actually deliver meaningful information to an end-user, so I'm frankly baffled as to what its purpose is.
SQuAD and GLUE are tasks for language representation models -- aka BERT-like. This is a language generation model -- GPT-like. Hence, the SQuAD/GLUE test sets are not really applicable. We are reporting on the WikiText and LAMBADA sets that OpenAI also uses for similar models (numbers are in the blog post).
* BERT & language representation models: They basically turn a sentence into a compact vector that represents it so you can then do some downstream task on it such as sentiment detection, or matching the similarity between two sentences etc.
* GPT & language generation models: Given some context (say a sentence), they can generate text to complete it, or to summarize it, etc. The task here is to actually write something.
Both are language representation models; text generation is just one way of training the model. BERT is also trained on a text generation task: it is asked to fill in gaps in the text (15% of the tokens are masked during training).
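For what it's worth, that fill-in-the-blank objective looks roughly like this at prediction time (a sketch with the Hugging Face transformers library; the 15% masking itself happens during training, and the checkpoint here is my assumption):

```python
# Sketch of BERT's masked-language-model objective: mask a token, predict it.
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

text = f"The capital of France is {tokenizer.mask_token}."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # (1, n_tokens, vocab_size)

# Find the masked position and take the highest-scoring token for it.
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
predicted_id = logits[0, mask_pos].argmax(dim=-1)
print(tokenizer.decode(predicted_id))  # e.g. "paris"
```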
Out of the box, given a sequence of n tokens, BERT returns a tensor of dimension (n_tokens, hidden_size) [1], where hidden_size has no relationship to the vocabulary. You can then fine-tune a model on this representation to do various tasks, e.g. sentiment classification. Thus BERT is said to be a language representation model.
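A quick sketch of that output shape, using Hugging Face transformers (my own illustration; the checkpoint is an assumption):

```python
# The encoder output is (batch, n_tokens, hidden_size); hidden_size is
# unrelated to the vocabulary size.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("BERT encodes this sentence.", return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state

print(hidden.shape)              # torch.Size([1, n_tokens, 768]) for bert-base
print(model.config.hidden_size)  # 768
print(tokenizer.vocab_size)      # ~30k -- no relationship to hidden_size
```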
Out of the box, given a sequence, GPT-2 returns a distribution over the vocabulary [2], from which you can draw the most likely next word. Thus GPT-2 is said to be a language generation model.
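The generation side, sketched the same way (Hugging Face transformers; the sampling strategy here is my own choice, greedy decoding works too):

```python
# GPT-2 gives a distribution over the vocabulary for the next token.
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("The meaning of life is", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # (1, n_tokens, vocab_size)

next_token_probs = torch.softmax(logits[0, -1], dim=-1)     # distribution over vocab
next_id = torch.multinomial(next_token_probs, num_samples=1)  # or .argmax() for greedy
print(tokenizer.decode(next_id))
```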
You could of course play with BERT's mask token and call it recursively to force BERT to generate something, and you could chop off some layers of GPT-2 to get a representation of your input sequence, but I think that is a little past the original question.
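If you're curious what the "BERT as a generator" hack looks like, here's a rough sketch: append a mask token, predict it, and repeat. (Hugging Face transformers; the greedy decoding and the fixed number of steps are my assumptions, and the output quality is usually poor, which is rather the point.)

```python
# Rough sketch: coax BERT into generating by repeatedly filling a [MASK]
# appended at the end. Not how BERT is meant to be used.
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

text = "The weather today is"
for _ in range(5):
    inputs = tokenizer(f"{text} {tokenizer.mask_token}", return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    next_id = logits[0, mask_pos].argmax(dim=-1)
    text = f"{text} {tokenizer.decode(next_id)}"
print(text)
```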
One is a language generation model, the other is a fill-in-the-blank model. It sounds like they might be similar, but in practice the objectives are different enough (in particular the "bi-directional" aspect of BERT-type models) that the models learn different things.
(Similar to the response for another question.)
BERT is a language representation model while Turing-NLG is a language generation model (similar to GPT). They are not directly comparable (each can potentially be massaged to mimic the other, but that's not something we have done yet).
Thanks for your kind words.
Yes, we would like to train a language representation model next. Our hunch is that something which mixes language representation and language generation would get the best of both worlds.
Happy to answer any questions.