Except it's unlikely, because they have a rather larger context and are modeling rather more than one-dimensional "what comes after this?" probabilities.
I'd say that LLMs are more resistant to GIGO: as long as the fraction of garbage in the training data is small, it'll just look like outliers to the larger model.
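A toy sketch of that intuition (nothing like actual LLM training, and the corpus and token names are made up): fit bigram counts on data where a small fraction of pairs are corrupted, and the garbage continuations end up with negligible probability mass.

    import random
    from collections import Counter, defaultdict

    random.seed(0)

    # Clean data: "cat" is always followed by "sat".
    pairs = [("cat", "sat")] * 1000
    # Garbage: ~2% corrupted pairs with random junk continuations.
    pairs += [("cat", random.choice(["xyzzy", "qwert", "blorp"])) for _ in range(20)]

    # Maximum-likelihood bigram estimate: P(next | prev) = count / total.
    counts = defaultdict(Counter)
    for prev, nxt in pairs:
        counts[prev][nxt] += 1

    total = sum(counts["cat"].values())
    for nxt, c in counts["cat"].most_common():
        print(f"P({nxt!r} | 'cat') = {c / total:.3f}")
    # P('sat' | 'cat') comes out around 0.98; each junk token gets well
    # under 1%, so greedy or low-temperature decoding essentially never emits it.

A model with more context than a bigram counter should do even better, since the garbage rarely forms coherent longer-range patterns it could latch onto.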