
September 16th, 2009

Google buys reCAPTCHA: Digitize old books and fight spam

Posted by Andrew Mager @ 9:50 am

Categories: Commentary

Tags: CAPTCHA, Google Inc., Optical Character Recognition, reCAPTCHA, Andrew Mager

CAPTCHAs are annoying, but necessary.

They try to distinguish humans from robots when form data is submitted. One of the most terrifying problems with hosting your own content on the web is spam. Spammers will do anything to get you to click on something, and much of it seeps through into blog comments.

reCAPTCHA does the best job of preventing this kind of spam. It takes words from scanned books that OCR software could not read and displays them to the user. Once a few users agree on what a word says, reCAPTCHA confirms the word, and the book is one step closer to being digitized. From their website:

reCAPTCHA improves the process of digitizing books by sending words that cannot be read by computers to the Web in the form of CAPTCHAs for humans to decipher. More specifically, each word that cannot be read correctly by OCR is placed on an image and used as a CAPTCHA. This is possible because most OCR programs alert you when a word cannot be read correctly.
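As a rough illustration of the voting idea described above, here is a minimal Python sketch. It assumes each challenge pairs the unreadable word with a control word whose answer is already known, and that three matching answers are enough to confirm a word. The class name, method names, and threshold are hypothetical and are not taken from reCAPTCHA's actual code.

```python
from collections import Counter, defaultdict

VOTES_NEEDED = 3  # assumption: how many matching answers confirm a word


def normalize(text):
    """Compare answers case-insensitively and ignore surrounding spaces."""
    return text.strip().lower()


class WordPool:
    """Hypothetical store of words that OCR could not read."""

    def __init__(self, votes_needed=VOTES_NEEDED):
        self.votes_needed = votes_needed
        self.votes = defaultdict(Counter)  # word id -> answer -> vote count
        self.confirmed = {}                # word id -> agreed transcription

    def submit(self, control_answer, control_truth, unknown_id, unknown_answer):
        """Return True if the user passes the challenge.

        The control word decides pass or fail. Only answers from users
        who got the control word right count as votes for the unknown word.
        """
        if normalize(control_answer) != normalize(control_truth):
            return False
        if unknown_id not in self.confirmed:
            guess = normalize(unknown_answer)
            self.votes[unknown_id][guess] += 1
            if self.votes[unknown_id][guess] >= self.votes_needed:
                self.confirmed[unknown_id] = guess
        return True
```

In this sketch a wrong answer to the control word fails the challenge and leaves the vote counts untouched, so a bot that guesses randomly cannot pollute the transcription of the unknown word.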

Read more here. I am really impressed that Google bought these guys (Techmeme). Google’s own CAPTCHAs aren’t very good, but now it will use something more standard, so I’m excited. And with Google’s huge reach on the web, it will probably digitize books a lot faster than other sites could.

Facebook is known for using reCAPTCHA as well.

What do you think of this? Will it help prevent spam, or will the hackers find another way? I’m interested to hear your thoughts.

Andrew Mager is a web developer at Ning, Inc. in Palo Alto. See his full profile and disclosure of his industry affiliations.


  • Talkback
  • Most Recent of 12 Talkback(s)
Re: Armor vs Artillery
If captcha's had been defeated on any wide scale, I doubt they'd be used as much as they are right now. And even if captcha's are done away with in the future, recaptcha is a great way to actually put this anti-spam system to a productive use for digitizing books....
Posted by: raynebc@... Posted on: 09/17/09
Too Bad ...  wkulecz | 09/16/09
RE: Google buys reCAPTCHA: Digitize old books and fight spam  ahawkinson | 09/16/09
RE: Google buys reCAPTCHA: Digitize old books and fight spam  ps.zdnet@... | 09/17/09
Uhh... It's not being further distorted. It's presented as is. nt  T1Oracle | 09/17/09
How is the input verified if they couldn't read the OCR?  eddyfaris | 09/17/09
Re: How is the input verified if they couldn't read the OCR?  hugh@... | 09/17/09
nice technique !!  damon.mac88@... | 09/17/09
RE: Google buys reCAPTCHA: Digitize old books and fight spam  satyamurti@... | 09/17/09
RE: Google buys reCAPTCHA: Digitize old books and fight spam  JD_Mortal | 09/17/09
RE: Google buys reCAPTCHA: Digitize old books and fight spam  Andrew MagerZDNet Moderator | 09/17/09
Armor vs Artillery  joetechsupport | 09/17/09
Re: Armor vs Artillery  raynebc@... | 09/17/09
