Spam Filtering

Bayesian Spam Filtering Still Proves Reliable

by Christopher on June 27, 2011

The first anti-spam filter to use Bayesian filtering techniques, Jason Rennie’s iFile program, was released in 1996. In 2002, programmer and venture capitalist Paul Graham tweaked the technology to greatly reduce the false positive rate, making Bayesian methods capable of standing on their own as an anti-spam filtration system. Now, 15 years after its debut, Bayesian filtering technology is still in use by leading anti-spam vendors.

Bayesian filters are based upon Bayes’ theorem, devised by an 18th-century British minister and mathematician named Thomas Bayes. In short, Bayes’ theorem is a rule for updating the probability of a hypothesis as new evidence comes in. In terms of spam filtering technology, the theorem’s application involves reading the content of an email message and comparing its words and phrases to the content of known spam and known legitimate mail. If a significant percentage of the words in an incoming message are common in spam, the new message is likely to be spam. If only a statistically insignificant number of words are typical of spam, the new email is probably legitimate.
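To make the idea concrete, here is a minimal sketch of a word-based Bayesian scorer in Python. It is an illustration only, not the method used by any particular product: the function names `train` and `spam_probability`, the whitespace tokenizer, and the add-one smoothing are all assumptions made for the example.

```python
import math
from collections import Counter

def train(messages):
    """Count words per class. `messages` is a list of (text, is_spam) pairs."""
    counts = {True: Counter(), False: Counter()}
    totals = {True: 0, False: 0}
    for text, is_spam in messages:
        counts[is_spam].update(text.lower().split())
        totals[is_spam] += 1
    return counts, totals

def spam_probability(text, counts, totals):
    """Estimate P(spam | words) with naive Bayes and add-one smoothing."""
    vocab = set(counts[True]) | set(counts[False])
    n_spam = sum(counts[True].values())
    n_ham = sum(counts[False].values())
    # Start from the (smoothed) prior odds of spam versus legitimate mail.
    log_odds = math.log((totals[True] + 1) / (totals[False] + 1))
    for word in text.lower().split():
        if word not in vocab:
            continue  # words never seen in training carry no evidence
        p_word_spam = (counts[True][word] + 1) / (n_spam + len(vocab))
        p_word_ham = (counts[False][word] + 1) / (n_ham + len(vocab))
        log_odds += math.log(p_word_spam / p_word_ham)
    # Convert log-odds to a probability without overflowing math.exp.
    if log_odds >= 0:
        return 1 / (1 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1 + e)
```

Trained on a handful of labeled messages, `spam_probability` returns a value above 0.5 for text dominated by words seen mostly in spam and below 0.5 for text dominated by words seen mostly in legitimate mail. Working in log-odds avoids multiplying many tiny probabilities together.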

What gives Bayesian filters an edge, even in light of the endless ruses spammers devise to trick the system, is that they learn. The more spam and legitimate correspondence the filters see over time, the more data they have with which to make statistical probability determinations. Also, when email users correct the filter by identifying false positives or flagging spam that manages to get into the inbox, Bayesian filters pick up new relevant information.
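The learning step can be sketched in a few lines of Python. This is a hypothetical illustration: the class name `LearningFilter` and its `learn` and `correct` methods are invented for the example and do not come from any real filter.

```python
from collections import Counter

class LearningFilter:
    """Keeps per-class word counts that grow as mail is seen and corrected."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def learn(self, text, label):
        """Record a message under `label` ("spam" or "ham")."""
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def correct(self, text, wrong_label):
        """Undo a wrong classification and relearn under the other label."""
        right_label = "ham" if wrong_label == "spam" else "spam"
        self.word_counts[wrong_label].subtract(text.lower().split())
        # Adding an empty Counter drops zero and negative entries.
        self.word_counts[wrong_label] += Counter()
        self.message_counts[wrong_label] = max(
            0, self.message_counts[wrong_label] - 1
        )
        self.learn(text, right_label)
```

When a user flags a message that slipped into the inbox, calling `correct(text, "ham")` moves its words from the legitimate counts to the spam counts, so the next similar message scores higher.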

Most importantly, Bayesian anti-spam filtering technology learns on an individual basis. Over time, a user’s email account receives a significant amount of correspondence, much of it from the same people and about the same topics. Similarly, individuals are prone to receiving particularly high quantities of certain types of unwanted bulk email as a side effect of their daily online activities. A Bayesian-based email filter continually obtains new statistical data specifically relevant to the user.

Of course, no system is perfect on its own. Bayesian filters, like other text-based methods, are susceptible to “poisoning.” For example, spammers attempt to fool the filter by pasting in a significant number of words completely unrelated to their spam message. The text may be a copied portion of an online article, or even words inserted at random from an online dictionary. This dilutes the common spam terms and phrases, making them statistically insignificant to the filter.
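The dilution effect is easy to see with a small Python sketch. The per-word spam probabilities below are made-up numbers chosen for illustration, and `combined_score` is a hypothetical helper that combines them with the usual naive Bayes product rule.

```python
import math

# Hypothetical per-word spam probabilities, as a trained filter might hold.
WORD_SPAM_PROB = {
    "viagra": 0.99, "free": 0.90, "click": 0.85,
    "meeting": 0.10, "report": 0.10, "schedule": 0.15, "budget": 0.10,
}

def combined_score(words, default=0.5):
    """Combine per-word probabilities; unknown words count as neutral (0.5)."""
    log_spam = sum(math.log(WORD_SPAM_PROB.get(w, default)) for w in words)
    log_ham = sum(math.log(1 - WORD_SPAM_PROB.get(w, default)) for w in words)
    return 1 / (1 + math.exp(log_ham - log_spam))

spam_words = ["viagra", "free", "click"]
padding = ["meeting", "report", "schedule", "budget"]

plain = combined_score(spam_words)
poisoned = combined_score(spam_words + padding)
```

With these numbers, `plain` is above 0.99 while `poisoned` falls to roughly 0.55, because each innocent-looking word pulls the combined score toward legitimate. Words the filter has never seen are neutral in this sketch, so padding works best for the spammer when it uses words common in the recipient’s real mail.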

Spammers have also had luck bypassing Bayesian anti-spam filters by misspelling words, inserting characters or spaces in the middle of words, and replacing letters with digits, as with a 1 in place of an I or a 3 in place of an E. Bayesian filters are continually adjusted to account for such tricks. The battle between spammers and spam filtering technology will undoubtedly go on indefinitely. It is a perpetual series of one side outmaneuvering the other.
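One common countermeasure is to normalize text before scoring it. The Python sketch below is a simplified illustration under stated assumptions: the substitution table goes beyond the two examples above (1 for I, 3 for E) with a few guessed extras, and the function name `normalize` is invented here.

```python
import re

# 1 -> i and 3 -> e come from the examples above; the rest are assumed extras.
LEET = str.maketrans({"1": "i", "3": "e", "0": "o", "4": "a",
                      "5": "s", "@": "a", "$": "s"})

def normalize(text):
    """Return a list of cleaned-up words from possibly obfuscated text."""
    text = text.lower()
    # Rejoin words spelled out one character at a time: "v i a g r a".
    text = re.sub(r"\b(?:\w ){2,}\w\b",
                  lambda m: m.group(0).replace(" ", ""), text)
    words = []
    for token in text.split():
        # Leave purely numeric tokens (years, amounts) alone.
        if any(ch.isalpha() for ch in token):
            token = token.translate(LEET)          # undo digit swaps
            token = re.sub(r"[^a-z]", "", token)   # drop inserted characters
        if token:
            words.append(token)
    return words
```

For example, `normalize("V1agra fr33 m.o.n.e.y")` yields `["viagra", "free", "money"]`. The trade-off is that aggressive normalization can also mangle legitimate text such as product codes, which is one reason filters keep adjusting these rules.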

The more effective of today’s anti-spam filters rely on multiple filtering technologies that, when used in conjunction, make correct determinations about whether an incoming message is spam with astounding accuracy. Bayesian-based methods are still earning their keep though, and are incorporated into many of the leading anti-spam filters available today. Particular success has been found by combining Bayesian methods with IP address reputation filtering. Products based on this particular combination are widely regarded as the strongest, smartest, most reliable anti-spam filters available.
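As a rough illustration of how the two layers might fit together, the Python sketch below checks the sender’s IP address against a reputation list first and only then consults a content score. Everything here is hypothetical: the blocked network is a reserved documentation range, and `classify`, its threshold, and its return values are assumptions for the example, not any vendor’s design.

```python
import ipaddress

# Hypothetical reputation list; 203.0.113.0/24 is a reserved documentation range.
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]

def classify(sender_ip, content_score, threshold=0.9):
    """Reject known-bad senders outright, then fall back to the content score.

    `content_score` is assumed to be a Bayesian spam probability in [0, 1].
    """
    addr = ipaddress.ip_address(sender_ip)
    if any(addr in net for net in BLOCKED_NETWORKS):
        return "reject"
    return "spam" if content_score >= threshold else "deliver"
```

Checking reputation first means mail from known-bad senders never reaches the more expensive content analysis, while the Bayesian score still catches spam from addresses with no history.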

Advantages of Outsourcing Your Anti-Spam Protection

by Christopher on April 21, 2011

You’re probably well aware of the basic benefits of spam protection for your business’s email accounts. Effective spam filtration saves employee time, which in turn means boosted productivity. And of course, it just makes checking your email much less annoying. A top-notch spam filter also keeps your email server from clogging and keeps legitimate messages from being buried under mounds of junk mail.

But many companies opt to have their anti-spam protection managed in-house by the IT department. Some think it’s cheaper, others think it’s more reliable, and still others simply don’t think of bringing in outside help for the battle against unsolicited bulk email. There are several key advantages to outsourcing your business’s spam filtering needs, though.

Smart businesses realize that anti-spam protection isn’t an overhead expense; it’s an investment. The returns on this investment can be maximized by outsourcing spam filtering.

The personnel in the IT department won’t have to devote time to tasks that could easily be handled elsewhere, which means they can focus on more important jobs that directly relate to the company’s daily operations and success. In addition, a company can avoid tying up the IT department on anti-spam and other email-related troubleshooting matters.

By using a provider for spam filtering, your business will save money and help optimize the efficiency of its internal systems. Taking on the responsibility of your own spam filtering means more bandwidth use and expenditure, more storage space occupied, and a much heavier CPU load. The CPU load in particular can seriously slow computer functions, as all of the processing required to filter spam is a significant, round-the-clock burden. Your company is also probably paying annual subscription fees as part of managing its own email security.

Outsourced anti-spam protection is typically far more efficient than in-house solutions. Leading spam filters also require no administration on your part, meaning you avoid the hassles involved with keeping up-to-date on the latest threats, blacklists, wordlists, and other aspects of the ever-evolving fight against spam.

As an added bonus, outsourced spam filtering readily scales as needed, unlike most self-managed systems. A company’s own spam filters are generally not built to handle floods of incoming email, such as those seen in directory harvest and distributed denial of service attacks. Those that are built for such floods tend to be overbuilt, carrying capacity that is rarely or never used, which wastes money and resources. Externally managed email security systems are always prepared to stop onslaughts of spam without interfering with regular email delivery or bringing internal systems to a crawl or a dead stop.

Perhaps the most critical benefit of outsourced spam filtering, however, is the increased email security provided by professionals. A sophisticated spam filter blocks even the newest viruses and other malware threats in ways most in-house systems cannot come close to matching. Malware threats delivered via spam messages bring a variety of dangers into internal systems. To name just a few examples, computers can be hooked into botnets, or a company’s proprietary information can be compromised, as can private information of customers and others the company deals with. Externally managed email security can also better protect against the latest targeted phishing ploys, which can have similarly dire implications.

While a business is certainly correct in making anti-spam protection a priority and in viewing it as an investment rather than a cost, it is shortsighted to view in-house efforts as the best way to fight spam. By outsourcing spam-filtering needs, a company can reap greater returns on anti-spam investments. The business will also keep important resources freed up for more pressing matters, avoid slowing down systems unnecessarily, prevent the wasted resources and costs associated with worst-case scenario designs, and, most importantly, enjoy much stronger email security.

Essential Features to Look for in Spam Filtering Technology

November 12, 2010

IP ADDRESS REPUTATION-BASED FILTERING

Traditionally, spam filters depend largely on text analysis. While effective much of the time, content analysis filtering is not nearly reliable enough to stand on its own. Spammers are smart enough to alter their words in ways that bypass most text analysis. IP address reputation filters, when used as a first […]
