If I'm reading his essay correctly, it pretty directly proposes the kind of Bayesian inference called "naive Bayes": it assumes the features (in this case words) are independent, and calculates the total probability of an email being spam by simply multiplying the per-feature probabilities together.
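In code, the core of that is something like this (a minimal sketch, assuming the per-word spam probabilities have already been estimated from training corpora; the normalization at the end is the standard two-class combination, not anything he invented):

    from math import prod

    def spam_probability(word_probs):
        # word_probs: per-word estimates of P(spam | word), treated
        # as independent under the naive Bayes assumption.
        p_spam = prod(word_probs)                 # joint "spamminess"
        p_ham = prod(1 - p for p in word_probs)   # joint "hamminess"
        return p_spam / (p_spam + p_ham)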
I was thinking more of his unusual pre- and post-processing, e.g. >>> When new mail arrives, it is scanned into tokens, and the most interesting fifteen tokens, where interesting is measured by how far their spam probability is from a neutral .5, are used to calculate the probability that the mail is spam. <<<
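My reading of that selection step, as a sketch (not his actual implementation, which was in Lisp):

    def most_interesting(token_probs, n=15):
        # token_probs: {token: estimated spam probability}.
        # "Interesting" = distance of a token's probability from a
        # neutral 0.5; keep the n most interesting tokens.
        ranked = sorted(token_probs, key=lambda t: abs(token_probs[t] - 0.5),
                        reverse=True)
        return ranked[:n]

Only those fifteen probabilities then go into the combination step, rather than every token in the mail.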
Yeah, I agree he has some interesting pragmatic tweaks on top of it. I suppose he was proposing "[naive Bayesian] filtering" but not necessarily "naive [Bayesian filtering]".