Friday, December 23, 2005

Preventing Blog Spam

Blog spam (also called link spam or comment spam) is a form of spamming that has recently drawn the most attention when it targets blogs, but it also affects wikis (where it is often called wikispam), guestbooks, and online discussion boards.

Any web application that displays hyperlinks submitted by visitors or the referring URLs of web visitors may be a target. In the case of blogs, spam arrives if you leave the comments functionality open to all visitors without restrictions.

As with email spam, there is a clear business incentive that motivates spammers to take advantage of unprotected blogs. Adding links that point to the spammer's web site increases the page rankings for the site in search engines such as Google. An increased page rank means the spammer's commercial site would be listed ahead of other sites for certain Google searches, increasing the number of potential visitors and paying customers. As most personal blogs are still unprotected, it is likely that blog spamming will expand and that its sophistication level will continue to increase.

Blog spam can have a very damaging effect on your blog. Not only can the content be offensive, but it also pollutes the healthy debate you may want to trigger by making comments available. In addition, a blog full of spam reflects poorly on your technical ability to prevent it and can damage your image (this is especially true if your blog is somewhat technical, as most savvy readers know that solutions exist and will not be very forgiving).

Link spamming originally appeared in internet guestbooks, where spammers repeatedly filled a guestbook with links to their own site, and no relevant comment, to increase search engine rankings. If an actual comment was given, it was often just "cool page", "nice website", or keywords of the spammed link.

In 2003, spammers began to take advantage of the open nature of comments in blogging software like Movable Type by repeatedly placing comments on various blog posts that provided nothing more than a link to the spammer's commercial web site. Jay Allen created a free plugin, called MT-BlackList, for the Movable Type weblog tool that attempts to alleviate this problem. Many current blog software packages now have methods of preventing or reducing the effect of blog spam.

Blogger (the Google blog network I am using) recently launched a word verification anti-blog-spam solution. It is based on the concept of challenge-response and is very similar to what Register.com has implemented for access to the WHOIS database, for example. If you select this option, people leaving comments on your Blogger blog will be required to complete a word verification step (see image below). This prevents automated systems from adding comments to your blog, since it takes a human being to read the word and pass this step. If you've ever received a comment that looked like an advertisement or a random link to an unrelated site, then you've encountered comment spam. A lot of this is done automatically by software that can't pass the word verification, so enabling this option is a good way to prevent many such unwanted comments.
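
To make the challenge-response idea concrete, here is a minimal sketch in Python. It is not Blogger's implementation: the function names, the in-memory store, and the six-letter word length are all assumptions for illustration. A real system would also render the word as a distorted image so that software cannot simply read it, which this sketch leaves out.

```python
import hmac
import secrets
import string

# Hypothetical in-memory store of pending challenges.
# A real blog would keep this in the visitor's session or a database.
_pending = {}


def issue_challenge(length=6):
    """Create a random word and return (challenge_id, word).

    The word would be shown to the visitor as an image; the id travels
    with the comment form so the answer can be checked later.
    """
    word = "".join(secrets.choice(string.ascii_lowercase) for _ in range(length))
    challenge_id = secrets.token_hex(8)
    _pending[challenge_id] = word
    return challenge_id, word


def verify_response(challenge_id, typed):
    """Check the visitor's answer. Each challenge can be used only once."""
    expected = _pending.pop(challenge_id, None)
    if expected is None:
        return False  # unknown or already used challenge
    answer = typed.strip().lower()
    # Compare as bytes so non-ASCII input is handled without errors.
    return hmac.compare_digest(expected.encode(), answer.encode())
```

The comment is accepted only when verify_response returns True. Removing the challenge on the first attempt means a spam script cannot keep guessing against the same word.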

On the Mindjet blog, we are using a plugin called Referrer Bouncer. (See:
http://blog.taragana.com/index.php/archive/word-press-1-5-plugin-referer-bouncer/)
Also, any comment that contains more than 10 links automatically gets put in a moderation queue, and an e-mail is sent to the post author. WordPress also blocks all comments from open and insecure proxies.
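
The link-count rule is simple enough to sketch in a few lines of Python. This is only an illustration of the idea, not WordPress's actual code: the function names and the URL pattern are assumptions, and the pattern is a rough approximation that counts a URL twice if it appears both in an href and in the link text.

```python
import re

# Threshold from the rule described above: more than 10 links.
MAX_LINKS = 10

# Rough pattern for http/https URLs; an approximation, not a full URL parser.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def count_links(comment_text):
    """Return the number of http(s) URLs found in the comment text."""
    return len(URL_PATTERN.findall(comment_text))


def needs_moderation(comment_text, max_links=MAX_LINKS):
    """True when the comment has more links than the allowed maximum."""
    return count_links(comment_text) > max_links
```

A comment for which needs_moderation returns True would be held back for the post author to review instead of being published right away.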

In addition, there is a joint effort between Google, WordPress and other blogging software makers to use a new attribute for links in comments. The attribute is rel='external nofollow'. It means that every time a hyperlink is inserted into a blog comment, search engines will know not to use or follow the link. This makes spam in comments useless, since it will no longer raise the spammer's search engine ranking. For a complete description, go to:
http://googleblog.blogspot.com/2005/01/preventing-comment-spam.html
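
As a rough sketch of what blog software does with this attribute, the Python below rewrites every link in a comment so that it carries rel="external nofollow". The function name and the regular expressions are assumptions for illustration. A production system would more likely use a proper HTML parser, since this simple pattern does not handle unusual markup such as a ">" inside an attribute value.

```python
import re

REL_VALUE = "external nofollow"

# Matches an opening <a ...> tag; group 1 is the attribute string.
A_TAG = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)

# Matches an existing rel attribute so it can be replaced rather than duplicated.
REL_ATTR = re.compile(r"""\srel\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)""", re.IGNORECASE)


def add_nofollow(comment_html):
    """Return the comment HTML with rel="external nofollow" on every link."""
    def rewrite(match):
        # Drop any rel the commenter supplied, then append our own.
        attrs = REL_ATTR.sub("", match.group(1))
        return '<a%s rel="%s">' % (attrs, REL_VALUE)

    return A_TAG.sub(rewrite, comment_html)
```

For example, add_nofollow('<a href="http://example.com">hi</a>') gives '<a href="http://example.com" rel="external nofollow">hi</a>'. Any rel value the commenter supplied is discarded, so a spammer cannot override the setting.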

Because of prevention improvements in blog software, link spam is now increasingly concentrated on wikis, including Wikipedia (see:
http://en.wikipedia.org/wiki/Wikipedia:Spam).
