This repository was archived by the owner on Jun 19, 2025. It is now read-only.

Feature Request: Free captcha service to improve verification of machine learning content #2089

@sandreas

There is already an excellent web application for collecting new language data. Would it be possible to improve the quality of the verification tasks by implementing a free captcha service like reCAPTCHA? In my opinion, Google relies heavily on the user feedback from reCAPTCHA to improve its machine learning models.

I would prefer to use an open source product instead of reCAPTCHA on my web pages. The whole app should be localized so that every language can be improved.

I thought about the following possibilities:

  • Type in the words you hear: cut unqualified Creative Commons audio data and make it usable for machine learning tasks via user feedback
  • Select the words you hear: for clips that already have some feedback but not enough verification
  • Order the following audio samples by quality: to collect a set of high-quality samples
  • Please choose the pictures that match your language: to improve dictionaries and provide a visual alternative to the audio tasks
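
To make the "type in the words you hear" idea a bit more concrete, here is a minimal sketch of how several users' answers for one clip could be merged into a single transcript. Everything in it is an assumption for illustration only: the function names `normalize` and `consensus`, and the thresholds `min_votes` and `min_agreement`, are hypothetical and not part of any existing project code.

```python
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different answers match."""
    return " ".join(text.lower().split())


def consensus(transcripts, min_votes=3, min_agreement=0.6):
    """Return the agreed transcript, or None if the clip needs more answers.

    min_votes and min_agreement are illustrative thresholds, not tuned values.
    """
    # Count identical answers after normalization, ignoring empty submissions.
    votes = Counter(normalize(t) for t in transcripts if t.strip())
    total = sum(votes.values())
    if total < min_votes:
        return None  # not enough feedback yet
    best, count = votes.most_common(1)[0]
    # Accept only if a clear majority typed the same words.
    return best if count / total >= min_agreement else None
```

A clip for which `consensus` returns `None` could then be shown again, for example as a "select the words you hear" task built from the answers collected so far.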

Problems:

  • DeepSpeech could be used to break the captcha itself
  • The data quality would decrease because of bots trying to solve the captchas
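
One possible mitigation for the data quality problem, sketched here as an assumption rather than a proposal for a specific implementation, is to pair every unverified clip with a control clip whose transcript is already known, and to count the answer for the unverified clip only when the control clip was solved. The name `grade_submission` and its parameters are hypothetical. This would not stop a capable speech recognizer from passing the captcha; it would only keep random or low-effort answers out of the collected data.

```python
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace before comparing answers."""
    return " ".join(text.lower().split())


def grade_submission(control_answer: str, control_truth: str, unknown_answer: str) -> dict:
    """Check the control clip and decide whether the other answer counts as a vote.

    control_truth is the already verified transcript of the control clip.
    unknown_answer is what the user typed for the still unverified clip.
    """
    passed = normalize(control_answer) == normalize(control_truth)
    # Only a user who solved the known clip contributes a vote for the unknown one.
    return {"passed": passed, "vote": unknown_answer if passed else None}
```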

What do you think? Is it worth the effort?
