December 8th, 2008

Audio CAPTCHAs easy to crack, according to researchers

Posted by Andrew Nusca @ 10:59 am

Categories: Security

Tags: CAPTCHA, Audio, Andrew Nusca

A great story over at Ars Technica today details the efforts of the Carnegie Mellon University team behind the reCAPTCHA service, which has turned its attention to the audio CAPTCHAs used by the visually impaired.

These audio CAPTCHAs consist of a string of spoken characters, typically masked and distorted by a form of background noise.

The scientists looked into the security of existing audio CAPTCHAs used by Google and Digg, and found that they are relatively easy to crack. Ars Technica’s John Timmer describes the process in detail:

The work involved gathering 1,000 audio CAPTCHAs from Google, Digg, and the reCAPTCHA service. 900 of these were used as a training set and the remaining 100 were set aside to test the system when done. The software first did a rough audio analysis, dividing each item into equal-sized chunks, each sufficiently long to fit any spoken character. Those segments with the highest energy peaks, which are considered most likely to contain actual letters, were set aside for analysis.
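The segmentation step described above can be sketched roughly as follows. This is only an illustration, not the researchers' code: the function names, the window size, and the choice of "sum of squared samples" as the energy measure are all assumptions made for the example.

```python
def window_energies(samples, window_size):
    """Split samples into equal-sized, non-overlapping windows.

    Returns a list of (start_index, energy) pairs, where energy is
    assumed here to be the sum of squared sample values.
    """
    energies = []
    for start in range(0, len(samples) - window_size + 1, window_size):
        window = samples[start:start + window_size]
        energies.append((start, sum(s * s for s in window)))
    return energies


def top_energy_windows(samples, window_size, count):
    """Return the start indices of the `count` highest-energy windows,
    in the order they occur in the recording."""
    ranked = sorted(window_energies(samples, window_size),
                    key=lambda pair: pair[1], reverse=True)
    return sorted(start for start, _ in ranked[:count])


# Synthetic stand-in for a recording: quiet background with two loud bursts.
signal = [0.01] * 500
for i in list(range(100, 200)) + list(range(300, 400)):
    signal[i] = 1.0

print(top_energy_windows(signal, window_size=100, count=2))  # [100, 300]
```

In a real recording the windows kept this way would then be handed to the feature-extraction stage; here they are simply reported by their starting sample.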

The authors tested a number of methods used to extract features from recordings of speech (for the curious, these are mel-frequency cepstral coefficients and two forms each of perceptual linear prediction and relative spectral transform-PLP). These features were then subjected to analysis using machine learning programs, which were trained on the identification of individual characters. Three methods—AdaBoost, support vector machines (SVM), and k-nearest neighbor (k-NN)—were trained using the 900 audio CAPTCHAs that had been processed manually. The result of this pairing of processing and analysis methods was a total of 15 different attempts at cracking each of the 100 test audio CAPTCHAs.
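To make the "15 different attempts" concrete: five feature sets paired with three classifiers gives 5 x 3 = 15 combinations. The sketch below enumerates those pairings and adds a toy k-nearest-neighbor classifier, the simplest of the three methods named. The variant labels, the function name knn_predict, and the two-dimensional training points are placeholders invented for illustration; the article does not specify the two forms of each feature set or any details of the researchers' implementation.

```python
import math
from collections import Counter
from itertools import product

# Placeholder labels: the article does not name the two forms of each.
FEATURES = ["MFCC", "PLP (form 1)", "PLP (form 2)",
            "RASTA-PLP (form 1)", "RASTA-PLP (form 2)"]
CLASSIFIERS = ["AdaBoost", "SVM", "k-NN"]

# Every feature set is tried with every classifier.
PAIRINGS = list(product(FEATURES, CLASSIFIERS))
print(len(PAIRINGS))  # 15


def knn_predict(training, query, k=3):
    """Label `query` by majority vote among its k nearest training points.

    `training` is a list of (feature_vector, label) pairs; distance is
    plain Euclidean distance.
    """
    nearest = sorted(training, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up 2-D feature vectors standing in for two spoken digits.
training = [((0.0, 0.0), "3"), ((0.1, 0.2), "3"), ((0.2, 0.1), "3"),
            ((5.0, 5.0), "7"), ((5.1, 4.9), "7"), ((4.8, 5.2), "7")]

print(knn_predict(training, (0.15, 0.1)))  # 3
print(knn_predict(training, (5.0, 5.1)))   # 7
```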

Apparently, Google’s audio CAPTCHAs, which consist of a series of the digits 0 through 9 recited over background noise of speech played backwards, were nowhere near consistent enough to fool the researchers’ software: the SVM technique got the CAPTCHA right about two-thirds of the time, and AdaBoost wasn’t far behind, with k-NN performing poorly in the test. Digg’s audio CAPTCHA uses both digits and letters, but plays them over “a less complex background that sounds like flowing water.” AdaBoost failed the test, but SVM was able to clear 70 percent accuracy, with k-NN trailing by a significant margin.

There’s more detail in the article, but the bottom line is this: Based on the results, audio CAPTCHAs need more of just about everything: more speakers, more characters, more distortion, and longer strings of tokens.

As a result, reCAPTCHA has expanded its own service to include all numbers from 0 to 99.
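A back-of-the-envelope sketch shows why longer strings and a larger vocabulary should help. It assumes, purely for illustration, that an attacker recognizes each token independently with the same accuracy; the function names and the example figures (90 percent per-token accuracy, five tokens) are made up and are not results from the study.

```python
def whole_string_success(per_token_accuracy, length):
    """Chance of getting every token right, assuming each token is
    recognized independently with the same accuracy."""
    return per_token_accuracy ** length


def blind_guess_success(vocabulary_size, length):
    """Chance of guessing the whole string with no audio analysis at all."""
    return (1.0 / vocabulary_size) ** length


# Hypothetical attacker that is 90% accurate per token.
for length in (3, 5, 8):
    print(length, round(whole_string_success(0.9, length), 3))

# Digits 0-9 versus numbers 0-99, five tokens each.
print(blind_guess_success(10, 5))
print(blind_guess_success(100, 5))
```

Under these assumptions each extra token multiplies the attacker's success rate by the per-token accuracy, and moving from 10 possible tokens to 100 makes blind guessing far less likely to succeed.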

Andrew J. Nusca is an associate editor for ZDNet and SmartPlanet. See his full profile and disclosure of his industry affiliations.



