
The combination of the Voynich Manuscript's history of translation claims, taken together with the origin of this work (http://www.telegraph.co.uk/education/educationnews/3326436/I...), leads me to view this with at least something of an air of caution.



I read the paper and found it to be intelligible and interesting, making reasonable attempts to justify sound, plausible hypotheses. Obviously he may not be right, but his explanations are well worth reading.


This describes many, many translation attempts of the Voynich Manuscript, though. At this point it's hard for me to consider another one of these attempts worth reading, because they all seem to follow the same pattern.

Try and identify plants/stars/places in the text by their real names, work backwards from those names, find some other semi-intelligible parts of the manuscript based on that, then handwave away the 80% of the manuscript that has to be gibberish because the information density is too low to be some kind of natural language... why is another attempt along these lines interesting?


then handwave away the 80% of the manuscript that has to be gibberish because the information density is too low to be some kind of natural language

Well, if you'd read more carefully, you'd know that Stephen Bax is precisely claiming that it is a natural language. And I don't know as much as you regarding the other translation attempts, but from what I gathered, they made assumptions which are incorrect from a linguistic perspective (e.g., words "too long" or "too short" for it to be a natural language, which in fact only proves that it's likely not a European language).


I think you mis-read that sentence. The 'because ...' applies to 'has to be gibberish' not 'handwave away the 80%'. He's saying that it is impossible to be a natural language because the information density is too low and that this author as well as all others willfully ignore that reality.


So, presumably you think these folk are wrong then -

Probing the Statistical Properties of Unknown Texts: Application to the Voynich Manuscript

While the use of statistical physics methods to analyze large corpora has been useful to unveil many patterns in texts, no comprehensive investigation has been performed on the interdependence between syntactic and semantic factors. In this study we propose a framework for determining whether a text (e.g., written in an unknown alphabet) is compatible with a natural language and to which language it could belong. The approach is based on three types of statistical measurements, i.e. obtained from first-order statistics of word properties in a text, from the topology of complex networks representing texts, and from intermittency concepts where text is treated as a time series. Comparative experiments were performed with the New Testament in 15 different languages and with distinct books in English and Portuguese in order to quantify the dependency of the different measurements on the language and on the story being told in the book. The metrics found to be informative in distinguishing real texts from their shuffled versions include assortativity, degree and selectivity of words. As an illustration, we analyze an undeciphered medieval manuscript known as the Voynich Manuscript. We show that it is mostly compatible with natural languages and incompatible with random texts. We also obtain candidates for keywords of the Voynich Manuscript which could be helpful in the effort of deciphering it. Because we were able to identify statistical measurements that are more dependent on the syntax than on the semantics, the framework may also serve for text analysis in language-dependent applications.

http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjourna...


Oh, I don't really have an opinion one way or the other (as I don't have enough information on it). I was just clarifying the intent of my grandparent.


Well, in clarifying someone else's intent you appear to have gone a bit further and seemed to be supportive of them about what is and isn't reality in this case, which doesn't seem to make much sense if you don't actually have an opinion on this subject.

How do you know their intent anyway? Are they actually your grandparent?


I think you mis-read my original post. The last sentence had this structure: "He's saying ... and ..." (i.e., he said both those things [including the one about reality]).

And, I knew his intent based on reading comprehension and context.


At this point it's hard for me to consider another one of these attempts worth reading

Then your commentary on the subject is utterly worthless.


How many papers on perpetual motion machines have you wanted to read in the last year?

Does that make any commentary that you might make on the subject utterly worthless?


To dismiss a paper on perpetual motion machines you just need to know the implications of the first or second law of thermodynamics, about which there is, for all intents and purposes, unanimous consensus.

To dismiss attempted translations of the Voynich manuscript, you'd need to read the attempted translation because there is no consensus whatsoever about the manuscript, except for the fact that it is untranslated.


That would only be true if every attempted translation were unique.

In other words, once you've dismissed one paper barking up the wrong tree you can confidently dismiss all the others that appear to be barking up the same tree.


Perpetual motion machines are known to be metaphysically impossible. Decoding this is not.


Because interpreting an unknown language equals describing how you made something physically impossible, right?


Ouch.


I probably shouldn't post when I'm feeling ill and grumpy.


It's pretty dubious that he says "I hit on the idea of identifying proper names in the text", like it was a novel idea.

One recent attempt: same method, drastically different results: http://cms.herbalgram.org/herbalgram/issue100/hg100-feat-voy...


Remember, journalism. If he even said that, his next sentence might have been "Of course, that had been tried a thousand times and wasn't very new by itself."


Hm, if a portion has substantially low entropy compared to natural language, maybe it was records of something?
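For anyone who wants to check a claim like this on a transcription, here is a minimal sketch of one way to measure it: Shannon entropy of the word-frequency distribution, in bits per word. The function name `word_entropy` and the sample strings are made up for illustration (they are not Voynich text), and this unigram measure is only an assumption about what "entropy" means here; published analyses may use character-level or conditional entropy instead.

```python
import math
from collections import Counter


def word_entropy(text: str) -> float:
    """Shannon entropy, in bits per word, of the unigram word distribution."""
    words = text.lower().split()
    if not words:
        return 0.0
    counts = Counter(words)
    total = len(words)
    # H = -sum(p * log2(p)) over the observed word frequencies
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical samples: a highly repetitive string and a varied one.
repetitive = "aa aa bb aa aa"
varied = "the quick brown fox jumps"

print(word_entropy(repetitive))
print(word_entropy(varied))
```

A list-like or tabular record (the same few tokens over and over) would score low on this measure, which is consistent with the "records of something" idea, though a low score alone can't distinguish records from filler.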


Is it possible that the author intentionally duplicated words for the sake of obfuscation, thus leading to the information density problem?


It's more likely (if we take on the assumption that this is natural language) that the author is not writing in a written language style. It can be hard to appreciate how much spoken language in a literate society is influenced by writing; much of that is due to the fact that we largely keep our memories outside of our skulls these days. Oral language, especially that which is meant to be passed along, is full of repetition, alliteration, rhyme and pun. If the script is something more like an abjad than an alphabet, the repetition may not necessarily be repetition. That may pose additional serious problems if the language (if it is a real language) is an effective isolate (having no documented surviving relatives). An abjad is hard enough to deal with when you have a Semitic-style non-concatenative morphology (words of related meanings have consonants in common, with the vowels changing), but if there is punning without semantically common roots, then all you know is that you have words close to each other that use the same consonants.
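To make the abjad point concrete, here is a rough sketch using English as a stand-in. It assumes a simplified abjad that writes no vowels at all (real abjads often mark some), and the helper name `consonant_skeleton` and the word list are purely illustrative.

```python
VOWELS = set("aeiou")


def consonant_skeleton(word: str) -> str:
    """Drop vowels, keeping only the consonant letters (a crude abjad spelling)."""
    return "".join(ch for ch in word.lower() if ch.isalpha() and ch not in VOWELS)


# Four different English words that share one consonant skeleton.
words = ["bat", "bit", "but", "boat"]
skeletons = [consonant_skeleton(w) for w in words]

print(skeletons)
```

Written this way, a phrase made of four distinct words would look like the same token repeated four times, which is the sense in which apparent repetition may not be repetition.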


Given how hard it is for even brilliant linguists to find academic posts, I wouldn't discount the author on the grounds of his home university alone.


Or, you know, you can see what he says and judge for yourself.

The Swiss patent office didn't have much scientific output either, until it had.


What is the relation with Luton? Can't see any mention here: http://stephenbax.net/


University of Bedfordshire is a result of a merger of a Bedfordshire campus of De Montfort University and University of Luton.

http://www.beds.ac.uk/aboutus/history

The linked story is from before the merger (2005), and before the center Bax works at was founded (at a third campus).

I wouldn't say that story has much relevance here. You'd want to look into the credentials of his language center and Bax himself (which look decent to me, though I don't know anything in particular.)


However, if you give an appropriate weighting for The Telegraph and consider the influence of Betteridge, any reasonable Bayesian analysis would conclude from this evidence that Luton can't be all that bad.



