
Anyone who has been surfing the web for a while is probably used to clicking through a CAPTCHA grid of street images, identifying everyday objects to prove that they're a human and not an automated bot. Now, though, new research claims that locally run bots using specially trained image-recognition models can match human-level performance in this style of CAPTCHA, achieving a 100 percent success rate despite being decidedly not human.

New research by ETH Zurich PhD student Andreas Plesner and his colleagues, available as a pre-print paper, focuses on Google's reCAPTCHA v2, which challenges users to identify which street images in a grid contain items like bicycles, crosswalks, mountains, stairs, or traffic lights. Google began phasing that system out years ago in favor of an "invisible" reCAPTCHA v3 that analyzes user interactions rather than offering an explicit challenge.

Despite this, the older reCAPTCHA v2 is still used by millions of websites. And even sites that use the updated reCAPTCHA v3 will sometimes use reCAPTCHA v2 as a fallback when the updated system gives a user a low "human" confidence rating.

[-] Blackmist@feddit.uk 43 points 2 months ago

Aren't these Captchas designed to get training data for AI models anyway?

"System does what it was designed to do" doesn't feel that surprising...

[-] aidan@lemmy.world 4 points 2 months ago

> Aren’t these Captchas designed to get training data for AI models anyway?

Yes and no. The captchas are just meant to be hard for computers to solve but easy for humans. People saw that and thought, "if we're making people do this, we might as well have them do something useful." It's not meant to be malevolent, and the purpose is still stopping bots; training models is a side effect.

[-] finitebanjo@lemmy.world 3 points 2 months ago

No, you're wrong, the Traffic Light examples ARE specifically to gather data to train models. Being a good Captcha was just a byproduct of that. If people just wanted a good captcha they wouldn't need hundreds of millions of photos of street lights and bicycles.

[-] aidan@lemmy.world -2 points 2 months ago

> No, you’re wrong, the Traffic Light examples ARE specifically to gather data to train models.

No, you're wrong, because the sites that embed those captchas on their pages are not doing that to help Google.

> If people just wanted a good captcha they wouldn’t need hundreds of millions of photos of street lights and bicycles.

Yes, they are getting something productive out of human labor that would be done anyway. Trust me: as a web developer and web scraper, I can tell you some kind of captcha is necessary for many free services to be useful and economically viable. The core of a good captcha is just making it marginally more expensive for the scraper or bot than it is for you.

[-] finitebanjo@lemmy.world 2 points 2 months ago

The sites don't create the captcha; you yourself just said it was embedded there.

[-] aidan@lemmy.world -2 points 2 months ago

They embed them for a reason... And the captchas wouldn't exist if they weren't embedded anywhere.

[-] Guest_User@lemmy.world 1 points 2 months ago

Finitebanjo is right. Yes, they are used to fight spam and bots, but the way they do it is picked intentionally to train AI.

https://medium.com/@yennhi95zz/how-google-trains-ai-with-your-help-through-captcha-876cb4eb4d01

Also from the Wikipedia article "Google profits from reCAPTCHA users as free workers to improve its AI research." https://en.m.wikipedia.org/wiki/ReCAPTCHA

[-] aidan@lemmy.world 0 points 2 months ago

> the way they do it is picked intentionally to train AI.

Yes, like I said, the challenges were picked to be useful. But some form of challenge would've been chosen regardless.

this post was submitted on 27 Sep 2024
778 points (98.4% liked)
