
I don't think there is any evidence of a language here unless you stress the definition to the point of absurdity. Feeding the output text back in does not even reliably produce the same kinds of images as the ones that text appeared in, which was the original premise of the claim. Obviously, probing some overconstrained high dimensional space where it's never rewarded for uncertainty has to produce something; that doesn't mean that something is a language.


> Obviously, probing some overconstrained high dimensional space where it's never rewarded for uncertainty has to produce something; that doesn't mean that something is a language.

But if one of those probes into unrewarded input produces a correlation, then SOME side effect is influencing it. That means there can be side effects ALL over the unrewarded space.

That being said, the rewarded space is tiny compared with the unrewarded space: 1 over infinity, for all intents and purposes. It's basically all possible grammatically correct English combinations vs. all possible combinations of letters.
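To put rough numbers on that size comparison, here is a back-of-the-envelope sketch. Every constant in it (alphabet size, prompt length, vocabulary size) is an assumption picked for illustration, not something measured from the model, and the English count is a deliberately generous upper bound that ignores grammar entirely.

```python
# Toy size comparison; every number here is an assumption chosen for illustration.
ALPHABET = 27        # 26 letters plus space (assumed character set)
PROMPT_CHARS = 50    # assumed prompt length in characters
VOCAB = 20_000       # assumed English vocabulary size
PROMPT_WORDS = 8     # assumed prompt length in words

# Every character string of the assumed length.
all_strings = ALPHABET ** PROMPT_CHARS

# Generous upper bound on English prompts: any sequence of vocabulary words,
# grammatical or not.
english_upper_bound = VOCAB ** PROMPT_WORDS

fraction = english_upper_bound / all_strings
print(f"all strings:     {all_strings:.3e}")
print(f"English (bound): {english_upper_bound:.3e}")
print(f"fraction:        {fraction:.3e}")
```

Under these assumptions the English-like fraction comes out far below 1e-30: vanishingly small, though finite rather than literally 1 over infinity.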

The unrewarded space is massively huge. Within that unrewarded space, given how massive it is, there is actually a very high probability that there are at least several sets of inputs in there that form a grammar and a consistent language with consistent outputs.

But these sets are hard to find; you can't just pick anything. If you pick one secret word that has a correlation with birds, then mash it up with English expecting it to stay coherent... well, that's simply an invalid set.

You need to find the OTHER words that work in conjunction with the secret bird word. There are likely millions, but they will probably be impossible to find. Library of Babel vibes, https://maskofreason.files.wordpress.com/2011/02/the-library....

This could actually be an interesting project: some algorithm that explores this space, attempting to map out connections. It would have to be another ML algorithm, and the search would likely still be never-ending. But as with the Library of Babel, in terms of probability something must be out there that works.


> The unrewarded space is massively huge. Within that unrewarded space, given how massive it is, there is actually a very high probability that there are at least several sets of inputs in there that form a grammar and a consistent language with consistent outputs.

I think this is a large logical leap. The unrewarded space may be huge, but it is not large enough that it's almost guaranteed (or even close) that we'll find something that looks like language in there. If we did find something that looked like language, even in the unrewarded space, it would be very surprising, which is why the initial post that inspired this response was so talked about! But we have not.


Given the size of the space, the fact that we already found one word indicates that the probability is high enough that tons of other language-like things exist in that space.

It's like if I gave you a lottery ticket and you won. It's more likely that the lottery ticket was rigged than that you actually won by chance. In other words, the fact that we even found a secret word is indicative that there's a lot going on in the unrewarded space.
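The lottery analogy is a Bayes' rule argument, and it can be made concrete with a minimal sketch. The prior and the win probabilities below are made-up illustrative values, and `posterior_rigged` is a hypothetical helper name, not anything from the thread or the paper.

```python
def posterior_rigged(prior_rigged, p_win_if_rigged, p_win_if_fair):
    """Return P(rigged | win) using Bayes' rule."""
    win_and_rigged = prior_rigged * p_win_if_rigged
    win_and_fair = (1.0 - prior_rigged) * p_win_if_fair
    return win_and_rigged / (win_and_rigged + win_and_fair)


# Assumed values: a 1-in-1000 prior that the ticket is rigged, a rigged ticket
# always wins, and a fair ticket wins 1 time in 10 million.
p = posterior_rigged(prior_rigged=1e-3, p_win_if_rigged=1.0, p_win_if_fair=1e-7)
print(f"P(rigged | win) = {p:.4f}")
```

With these numbers the posterior is above 0.999, which is the intuition behind the analogy. The conclusion is only as strong as the assumed prior and the assumption that the winning ticket was not selected after the fact.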


The word wasn't selected randomly, though, so it doesn't tell us much about the broader unrewarded space. And there's a fair amount of evidence that (1) the tokenization of the word includes a bunch of syllables that correspond to the Latin names of birds, and (2) this same technique doesn't work with other reproduced nonsense text (e.g. using the nonsense text produced when asking about workers working in offices gives you more nature scenes, not offices). This makes it difficult to argue that what is going on here is even a secret vocabulary, let alone a secret language.


I mean the attributes you mentioned just reduce the search space.

Something is going on if even some random word produces a correlation of some sort.



