Battle comment spam with an extra form field
Posted on August 01, 2007. 37 comments.
I was introduced to an ingenious solution to the problem that is comment spam.
All you have to do is add an extra input field to your comment form. The field should have a name that you won’t use or save, but one that a spambot is almost certain to fill in. Example:
<input type="text" name="lastname" id="lastname" />
I don’t store a user’s last name with a comment, so that should do just fine. Now comes the really clever part. Add a CSS rule to your stylesheet which hides this field from regular users, while still leaving it in the HTML for every spambot to see:
input#lastname { display:none; }
Lastly, change the part of your site that adds a comment to your database so that a comment is not added if the field “lastname” is filled in.
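The server-side check described above can be sketched roughly as follows. This is a minimal illustration in Python, not the code running on this site: the function names, the `author` and `body` keys, and the in-memory list standing in for the database are all assumptions made for the example. Only the `lastname` field name comes from the form above.

```python
from typing import Mapping

# Name of the hidden input from the form above.
HONEYPOT_FIELD = "lastname"


def is_probably_spam(form: Mapping[str, str]) -> bool:
    """Return True when the hidden honeypot field contains any text."""
    # A regular visitor never sees the field, so it should arrive empty or missing.
    return bool(form.get(HONEYPOT_FIELD, "").strip())


def handle_comment(form: Mapping[str, str], saved_comments: list) -> bool:
    """Save the comment unless the honeypot was filled in. Return True if saved."""
    if is_probably_spam(form):
        # Drop it quietly; the bot gets no hint about why it failed.
        return False
    # Hypothetical storage: a plain list stands in for the real database.
    saved_comments.append(
        {"author": form.get("author", ""), "body": form.get("body", "")}
    )
    return True
```

Note that the `lastname` value itself is never stored; it is only inspected and then thrown away.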
And there you go: nearly perfect comment spam protection. No need for Bayesian filters, API keys or (gasp) CAPTCHAs. I can’t understand why this approach isn’t common, so what have I overlooked? :)
Update: I just got spammed again, but I’m not sure what happened. The spam comments were actually left on an entry where comments are closed. Alas, I must dig deeper to unravel this mystery.
I still think the technique outlined in this post will work, and there have been some interesting additions made in the comments.
37 comments
Oliver says:
The only caveat I would add is that a person using a screen reader will be presented with the form field, as the screen reader will disregard the CSS rule. Therefore, they will be unable to make any posts, if they fill in their last name.
Kristin K. Wangen says:
Do you have an example of PHP code that says “Don’t post this to the database if”? I just thought this could be a brilliant plugin for WordPress. But all I can find are plugins that add things to the comment form (like captchas), so I’m not sure how to do this.
Anyway, this won’t protect against trackback spam, but it’ll be better for people who come by a blog to write a comment.
Oliver says:
The only issue I can see is that a person using a screen reader will be presented with the ‘lastname’ text field as the screen reader will disregard the CSS rule. Therefore, if they fill in the text field their post will be treated as spam by the system. Also, people using text-only browsers will have the same problem.
Siep says:
That’s pretty sweet! Never thought of it before.
Tor Løvskogen says:
Clever!/Smart!
Martin Bekkelund says:
Excellent idea! However, you must be able to change the comment parser of your publishing tool, so that it will filter the comments. Not everyone is capable of doing this, so it would be interesting to see a plugin for this purpose.
Olav says:
Thanks for the replies, guys (and gal).
@Oliver: That’s a very good point. I’m not sure how you could counter that issue though. Maybe there’s a standardized class you could give an element so that screen readers will ignore it.
@Kristin: I guess it would be easy to make a WP plugin for this. I’m no PHP guy, but in quasi/pseudo code, it would be something like:
@Martin: That’s true, but it is also easier than many other approaches. But, as mentioned above, it should be a plugin. You should write one for WP. ;)
Alex says:
That’s an awesome idea. I’m gonna try to add that to mine ASAP.
Olav says:
True story: Since yesterday, a herd of spambots have been attacking this site, giving me hundreds of shifty comments to go through.
After adding this simple fix, not one has gotten through.
Kristin K. Wangen says:
As for screen readers, isn’t it possible just to add a note in the code that says ‘don’t fill this in’? The bots don’t care about it or read it, so that might work.
Kim Joar Bekkelund says:
A little addition to this can be removing the element with JavaScript. This way it doesn’t matter if the spam bots also parse the CSS. By having it hidden in the CSS, and not just removing it with JavaScript, this still works when JavaScript is turned off.
Also, by putting the display: none on a div or any other tag that encloses the hidden form field, you force the spam bots to be really good: they have to parse the CSS (and the JavaScript, if you add the above functionality) and check the enclosing elements, not just the visibility of the exact element they are typing text into.
Olav says:
@Kim: Yeah, those are some good ways to get more protection. But I should really learn more about how these bots work. I’ve always assumed that they don’t parse CSS or JS, but who knows...
And if any PHP-literati are reading along, this technique would make for a great WP plugin. ;)
Oliver says:
First off, sorry for posting the same thing twice. I assumed the system was automatic, so after the post not being published after a few hours I tried again.
@Kristin: If we fill the field in with ‘don’t fill this in’, both humans and bots will ignore it, so everyone is ignoring it and we end up back where we began. It would be different if we typed ‘delete this text’.
@Kim: The problem with using JS is the same as with using CSS: if the user has it turned off, this method will not work. I realise that JS and CSS will be disabled in only a minority of cases; however, the system should be able to cope with these users.
An alternative idea is to use a method similar to the one Zeldman uses. He adds a question field (Is ice hot or cold?) and only posts the comments with the correct answer.
Really this is an issue of accessibility as with a modern, fully-functioning web browser (i.e. CSS and JS enabled) there is no problem in creating a solution. However, we have to cater for those who have certain technologies unavailable to them for whatever reasons. It would be a major step backward for web design to only cater for those with modern browsers / technologies.
Olav says:
@Oliver: Yeah, I had to turn on comment approving because of spam, but this method of an extra field solved that. :)
I think this is a better method than the “ask the user a simple question” technique, as this is totally hidden from the average user, thereby lowering the bar for leaving a comment. It also seems to work just as well against spam.
As far as the screen reader goes, isn’t the easiest solution to just add text next to the input field that explains that the user should not enter their last name, as this is an anti-spam measure, and then hide that text with CSS as well? I don’t think spambots parse and understand the English language. ;)
Kim Joar says:
@Olav: Think about a bot that just reads the CSS as text, rather than parsing and rendering it the way a browser does: it is still possible to write a reasonably simple regex that checks the CSS to see whether this element is hidden. This is what I wanted to change by adding the display: none on an enclosing element. This way the bot has to check the elements enclosing the field it is checking.
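To make the regex point concrete, here is a rough Python sketch of what such a naive bot-side check might look like, and why hiding an enclosing element slips past it. Everything here is hypothetical: the function name, the pattern, and the `div#extra` wrapper are invented for illustration, and real bots may well behave differently.

```python
import re


def naive_bot_thinks_hidden(css: str, field_id: str) -> bool:
    """Mimic a simple bot: look for a rule that names the field's id
    directly and sets display:none, without building any document tree."""
    pattern = re.compile(
        r"#" + re.escape(field_id) + r"\b[^{}]*\{[^}]*display\s*:\s*none",
        re.IGNORECASE,
    )
    return pattern.search(css) is not None


# The rule from the original post targets the field itself.
direct_css = "input#lastname { display:none; }"
# Hypothetical alternative: hide a wrapper element that contains the field.
wrapper_css = "div#extra { display:none; }"

print(naive_bot_thinks_hidden(direct_css, "lastname"))   # the bot spots the trap
print(naive_bot_thinks_hidden(wrapper_css, "lastname"))  # the bot misses it
```

With the rule on the wrapper, the text-matching bot would also need to read the HTML and work out which elements enclose the field.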
@Oliver: You just gave me a much better idea: what about mixing this idea with Zeldman’s system? You have a field with a question, you hide it with CSS and you remove it with JavaScript. If the input field does not contain text, you accept the comment as not spam. If the input field contains text, you check whether or not it is the correct answer. This way you are secure as hell, and the user does not have to enter text if he/she has either CSS or JavaScript on! And this way we also solve the accessibility problem.
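The decision rule for this combined approach might look something like the sketch below. It is only an illustration in Python: the function name is made up, and the default answer “cold” assumes the “Is ice hot or cold?” question mentioned earlier in the thread.

```python
def accept_comment(answer_field: str, correct_answer: str = "cold") -> bool:
    """Decide whether a comment passes the hidden-question check."""
    submitted = answer_field.strip().lower()
    if not submitted:
        # Field was hidden by CSS or removed by JavaScript, so the visitor
        # never saw the question: accept.
        return True
    # Field was visible (screen reader, text-only browser) or a bot filled
    # it in: accept only the correct answer.
    return submitted == correct_answer.lower()
```

A bot that blindly fills every field fails the check, while a visitor who actually sees the question can still get through by answering it.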
Olav says:
@Kim: Haha, well now there’s no turning back. You HAVE to make this plugin after that. ;p
Kim Joar says:
Never liked WP, so I doubt there will be a WP plugin from me. But I will write a post with the needed PHP, CSS and JavaScript on this as soon as my blog is up and running (hopefully today, most likely tomorrow).
Oliver says:
@Kim: That is the most ridiculous idea I have ever heard… I love it!
I think we can all agree that there is no easy answer, but personally I think Olav’s original idea is the best. However, I would add a comment by the side indicating that it is an anti-spam measure to protect users of screen readers. (Obviously the message would be hidden in normal mode as well).
Niclas says:
Awesome! You HAVE to release this as a simplelog plugin :)
Will Morgan says:
Why not just use Akismet? All you need is a WordPress API key (free) and it eradicates all spam for you.
hcabarcas says:
I’ve been using this technique for almost 5 months now. It cut down the amount of spam I was receiving through my site and my client’s contact form to zero…zilch…nada. Works perfectly for my needs. I will have to consider the issue brought up about screen readers, though.
Mike Schinkel says:
Hmm. Sounds like a really good idea. Except.
I’m pretty sure Google is penalizing your blog for your hidden content. See these links:
http://www.google.com/support/webmasters/bin/answer.py?answer=66353 http://www.seologic.com/faq/hidden-text.php
Even though you are using it for legitimate reasons, I’m pretty sure the borg (that would be Google, not Microsoft, in this case) doesn’t differentiate and is omitting your blog post URLs with the hidden lastname field from its index. I googled for your domain plus the word “spam” and only found pages that did not have a comment form, such as your home page, tag pages, and monthly aggregation pages. Try it for yourself:
http://www.google.com/search?q=site:bjorkoy.com+spam&filter=0
That said, I think your idea still has some merit. What if you simply added a visible field and put a label that says “Do NOT fill in this field (spam protection)”? If you wanted, you could also use JavaScript to hide this, but you’d really have to obfuscate the approach, otherwise Google might figure it out.
Alternately, you could inject the comment form into your webpage using JavaScript. That way the spam bots would never even see a comment form! If you do that, you might want to generate a code on the server side that you’d put into a hidden form field to match before you accepted the comment, to guard against someone actually manually figuring out your setup, but that would only be important if lots of people started using an identical technique (i.e. via a WordPress plug-in or similar).
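One possible way to generate and check such a server-side code is a signed timestamp, sketched below in Python. This is an assumption about how it could be done, not something described in the comment: the secret key, the one-hour lifetime, the `post_id` parameter and both function names are hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical values: use your own long random secret and lifetime.
SECRET_KEY = b"replace-with-a-long-random-secret"
MAX_AGE_SECONDS = 3600


def _sign(post_id: str, issued: int) -> str:
    message = f"{post_id}:{issued}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_form_token(post_id: str, now: Optional[int] = None) -> str:
    """Build the value to place in the hidden form field."""
    issued = int(time.time()) if now is None else now
    return f"{issued}:{_sign(post_id, issued)}"


def is_valid_form_token(post_id: str, token: str, now: Optional[int] = None) -> bool:
    """Accept the comment only if the token was issued by us, for this post, recently."""
    current = int(time.time()) if now is None else now
    issued_text, _, signature = token.partition(":")
    try:
        issued = int(issued_text)
    except ValueError:
        return False
    if issued > current or current - issued > MAX_AGE_SECONDS:
        return False
    expected = _sign(post_id, issued)
    # Constant-time comparison on bytes avoids leaking how much matched.
    return hmac.compare_digest(signature.encode(), expected.encode())
```

The token goes into the form when the page is rendered and is checked when the comment is submitted; a form that was never served by the site has no valid token to send.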
Mike Schinkel says:
I just noticed your comments don’t show a date the comment is made. That’s a shame; it would be nice to see when the comments were made.
Damon Haidary says:
Well, in theory screenreaders should ignore content that’s set to display: none; because the W3C states that it should completely suppress the rendering of the element, including audio rendering. Indeed, some screenreaders do ignore it but some don’t, so it’s not safe to use it in this way. As others have mentioned, you also have the problem of text-only browsers. The best thing you can do is what Kristin said and just label the field with “Leave this field blank” or something.
What I’ve been using successfully for a while is “Kitten Auth” (http://www.thepcspy.com/kittenauth). The idea behind it is you give the user a grid of images and have them pick all the ones that meet certain criteria. For animals this could be “Choose all the kittens” or “Select all animals with fur” for example. This is very much like the text based one Oliver mentioned for Zeldman’s comments.
I prefer it over the plain text captcha because I can envision spam bots getting smart enough to use search engines to determine the answer to simple questions like “Is fire hot or cold?”. Kitten Auth also obviously has more severe accessibility problems than the other two methods, so that’s something that must be worked around.
The key to beating the bots is making it impossible to fingerprint a captcha style. If everyone made slight customizations to their captcha systems spammers would have a much harder time writing generic code to beat them. So whatever you use, just make sure it’s unique.
says:
test
ccheney says:
Neat, I will see how this works out.
Adelore says:
But is it autofill proof?
Sally, design guru says:
I experienced the same problem. Your technique for filtering spam doesn’t work properly.
徴信社 says:
Well, in theory screenreaders should ignore the content that’s set to display: none; because the W3C states that should completely suppress the rendering of the element including audio rendering.
Heal says:
Whoa, I’m researching for my weblog, which I’m about to start. Spamming is a big problem! I’ve heard that the Akismet WordPress plugin works well. Your opinion?
Traveler says:
We are getting a lot of spam. It’s so much that we have to clean dozens of trashmails every day, selecting correct submissions from spam. It’s hard work.
I am going to look deeper into what you propose here, maybe it works…
徵信社 says:
good~
dıs cephe says:
But is it autofill proof?
Joe says:
I have found Akismet and Spam Karma to be pretty effective too. I’ll give this a test also. I don’t see Google looking negatively on hiding an empty input box and a couple of words, but that is just my opinion on that :-)
hekimboard says:
good job.
fibercement says:
thanks a lot.
turksiding says:
All you need is a Wordpress API key (free) and it eradicates all spam for you