Bayesian Spam Filtering

Bayesian spam filtering is a statistical technique for filtering email. It uses a naive Bayes classifier to identify spam.

Bayesian classifiers work by correlating the occurrence of tokens (typically words, but sometimes other features of a message) with spam and non-spam emails, and then using Bayesian inference to calculate the probability that a given email is spam.
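
As a rough illustration of the idea, the following Python sketch trains on a handful of labelled messages and then estimates the probability that a new message is spam. It is a minimal sketch, not a production filter: the function names (tokenize, train, spam_probability), the whitespace tokenizer, and the use of add-one smoothing are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: tokens are lowercase, whitespace-separated words.
    return text.lower().split()


def train(messages):
    """Count tokens per class from (text, is_spam) pairs."""
    counts = {True: Counter(), False: Counter()}
    totals = {True: 0, False: 0}
    for text, is_spam in messages:
        counts[is_spam].update(tokenize(text))
        totals[is_spam] += 1
    return counts, totals


def spam_probability(text, counts, totals):
    """Estimate P(spam | tokens) under the naive independence assumption.

    Assumes the training data contained at least one message of each class.
    """
    vocab = set(counts[True]) | set(counts[False])
    log_score = {}
    for label in (True, False):
        # Prior: fraction of training messages carrying this label.
        score = math.log(totals[label] / (totals[True] + totals[False]))
        n_tokens = sum(counts[label].values())
        for token in tokenize(text):
            # Add-one smoothing keeps unseen tokens from zeroing the product.
            score += math.log((counts[label][token] + 1) / (n_tokens + len(vocab)))
        log_score[label] = score
    # Normalise the two log scores into a probability.
    top = max(log_score.values())
    spam = math.exp(log_score[True] - top)
    ham = math.exp(log_score[False] - top)
    return spam / (spam + ham)


# Hypothetical training data, for illustration only.
training = [
    ("buy cheap viagra now", True),
    ("cheap pills viagra", True),
    ("meeting at noon tomorrow", False),
    ("lunch tomorrow at noon", False),
]
counts, totals = train(training)
```

With this toy data, spam_probability("cheap viagra", counts, totals) comes out above 0.5 and spam_probability("meeting tomorrow", counts, totals) comes out below it. Working with sums of logarithms rather than products of probabilities avoids numerical underflow on long messages.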

Particular words have characteristic probabilities of occurring in spam email and in legitimate email. For instance, most email users will frequently encounter the word "Viagra" in spam email, but will seldom see it in other email.
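
To make the notion of a per-word probability concrete, the sketch below applies Bayes' rule to message counts for a single word. The function name word_spam_probability, the equal prior for spam and legitimate mail, and the counts in the usage note are assumptions chosen for illustration, not measured figures.

```python
def word_spam_probability(spam_with_word, spam_total, ham_with_word, ham_total):
    """Estimate P(spam | word) from message counts.

    Assumes equal priors, P(spam) = P(ham) = 0.5, so the priors cancel.
    Assumes the word appears in at least one training message.
    """
    # Fraction of spam and of legitimate messages that contain the word.
    p_word_given_spam = spam_with_word / spam_total
    p_word_given_ham = ham_with_word / ham_total
    # Bayes' rule with the equal priors cancelled out.
    return p_word_given_spam / (p_word_given_spam + p_word_given_ham)
```

For example, if a word appeared in 300 of 1000 spam messages and in 2 of 1000 legitimate messages (hypothetical numbers), word_spam_probability(300, 1000, 2, 1000) would be about 0.99, while a word equally common in both kinds of mail would score 0.5.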