"Bayesian" in this context most likely means naive Bayes, which assumes that the occurrences of all words are independent of each other given the label. The "score" of each label ends up being something like the product of the relative frequencies of all the words under that label, multiplied by the probability of the label itself: p(L) * p(W|L) / p(W).
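To make the scoring concrete, here is a minimal sketch in Python. It is not taken from any particular mail client; the function names (`train`, `score`, `classify`) and the add-one smoothing are assumptions made for illustration. The p(W) term is left out because it is the same for every label and so does not change which label wins, and the product is computed as a sum of logs to avoid underflow.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """examples: iterable of (label, list_of_words) pairs."""
    label_counts = Counter()            # number of messages seen per label
    word_counts = defaultdict(Counter)  # word frequencies per label
    vocab = set()
    for label, words in examples:
        label_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return label_counts, word_counts, vocab

def score(label, words, label_counts, word_counts, vocab):
    """log p(L) + sum of log p(w|L) over the words in the message."""
    total_docs = sum(label_counts.values())
    total_words = sum(word_counts[label].values())
    log_score = math.log(label_counts[label] / total_docs)
    for w in words:
        # Add-one smoothing (an assumption) so unseen words do not zero the score.
        p_w = (word_counts[label][w] + 1) / (total_words + len(vocab))
        log_score += math.log(p_w)
    return log_score

def classify(words, label_counts, word_counts, vocab):
    """Return the label with the highest score for the given words."""
    return max(label_counts,
               key=lambda l: score(l, words, label_counts, word_counts, vocab))
```

Calling `classify(words, *train(examples))` returns the best-scoring label for a new message.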
But this would be very effective if header fields like sender addresses and subject lines were somehow used as separate features, and you wanted to label based on those.
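One way header fields could become separate features is to prefix each token with the field it came from, so that a sender address never gets mixed up with the same text in the body. The sketch below uses Python's standard `email` module; the `header_features` name and the `from:` / `subject:` / `body:` prefixes are made up for illustration, and whether a given mail client does anything like this is an assumption.

```python
from email import message_from_string
from email.utils import parseaddr

def header_features(raw_message):
    """Turn a raw message into a list of field-prefixed tokens."""
    msg = message_from_string(raw_message)
    features = []

    # The sender address becomes a single feature of its own.
    _, address = parseaddr(msg.get("From", ""))
    if address:
        features.append("from:" + address.lower())

    # Subject words are kept apart from body words by their prefix.
    for word in msg.get("Subject", "").lower().split():
        features.append("subject:" + word)

    # Only simple, non-multipart bodies are handled in this sketch.
    body = msg.get_payload()
    if isinstance(body, str):
        for word in body.lower().split():
            features.append("body:" + word)

    return features
```

The resulting list can be fed to a naive Bayes classifier in place of the plain word list, so that a feature like `from:alice@example.com` gets its own frequency per label.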