ReCaptcha: the whole world is digitizing books

Most of us have had to fill in those annoying little boxes of distorted letters to prove that we are not machines. This system, created to prevent abuse by automated spam-generating programs, is called Captcha.

Today, one of its most popular versions is ReCaptcha, which asks surfers to identify two words before registering on a website, changing user accounts or downloading material. An estimated 200 million ReCaptchas are completed every day around the world. Given that solving each one takes approximately ten seconds, more than 500,000 hours a day are spent on the operation.
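The daily total can be checked with a quick back-of-the-envelope calculation. The sketch below uses only the two figures quoted above (200 million solves a day, about ten seconds each); the function and constant names are illustrative, not part of any real tool.

```python
# Rough estimate of the human time spent on ReCaptchas each day.
# Both inputs are the approximate figures quoted in the article.
SOLVES_PER_DAY = 200_000_000   # ReCaptchas completed per day
SECONDS_PER_SOLVE = 10         # approximate time to solve one

SECONDS_PER_HOUR = 3600


def hours_spent_per_day(solves: int, seconds_each: float) -> float:
    """Convert a number of solves and a per-solve time into total hours."""
    return solves * seconds_each / SECONDS_PER_HOUR


hours = hours_spent_per_day(SOLVES_PER_DAY, SECONDS_PER_SOLVE)
print(f"{hours:,.0f} hours per day")  # prints "555,556 hours per day"
```

Since both inputs are rounded estimates, the result is best read as an order of magnitude: somewhat more than half a million hours a day.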

But, of course, somebody has thought of a way to turn this lost time into something useful. That somebody is Luis von Ahn, one of the creators of Captcha and now of ReCaptcha. Among the objectives of Internet giants such as Google and Amazon is the challenge of digitizing all the books and documents created before the virtual era. This means scanning documents and processing each page to identify its content. The problem is that machines are not always capable of interpreting every word, especially in very old books that have been damaged by the passing of time. This is where user participation comes into play.

People are the source

It is all based on the concept of crowdsourcing. What does it mean? Basically, many tasks that in the past were carried out by one specific person or by a small group of individuals can now be carried out through mass collaboration, with the participation of thousands or millions of people. This way of working is especially common on the web, where public and non-profit organizations use it to obtain the collaboration of surfers around the world.

ReCaptcha combines the concept of crowdsourcing with the idea of making the most of an available resource. In other words, the time each surfer spends solving Captchas is also used to digitize books. How does it work? The tool presents users with two words: one that the system has already identified and one that it has not. If the surfer correctly enters the known word, the system moves on to the second word, a term the tool has not yet deciphered, and takes the user's reading of it. In this way the online community works on the digitization of texts without even having been asked to. The Internet is giving rise to new collaborative working methods, which are steadily improving our quality of life.
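The two-word mechanism can be sketched in a few lines of Python. This is a minimal illustration, not ReCaptcha's actual implementation: the class names, the image identifier and, above all, the rule that a reading is accepted once three users agree on it are assumptions made for the example. The article only says that the second word is considered when the first is answered correctly.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Challenge:
    control_word: str      # word the system has already identified
    unknown_image_id: str  # hypothetical id of the scanned word not yet deciphered


@dataclass
class Digitizer:
    votes_needed: int = 3  # assumed agreement threshold, not taken from the article
    votes: dict = field(default_factory=dict)     # image id -> Counter of readings
    resolved: dict = field(default_factory=dict)  # image id -> accepted reading

    def submit(self, challenge: Challenge, control_answer: str, unknown_answer: str) -> bool:
        """Return True if the user passes the test on the known word."""
        if control_answer.strip().lower() != challenge.control_word.lower():
            return False  # known word is wrong: discard the second answer too

        # Known word is right, so the reading of the unknown word counts as a vote.
        reading = unknown_answer.strip().lower()
        counter = self.votes.setdefault(challenge.unknown_image_id, Counter())
        counter[reading] += 1
        if counter[reading] >= self.votes_needed:
            self.resolved[challenge.unknown_image_id] = reading
        return True


digitizer = Digitizer()
challenge = Challenge(control_word="morning", unknown_image_id="scan-001")

for _ in range(3):
    digitizer.submit(challenge, "morning", "overlooks")   # three users agree
digitizer.submit(challenge, "mourning", "something")      # fails the known word

print(digitizer.resolved)  # prints {'scan-001': 'overlooks'}
```

Requiring agreement between several users is one simple way to guard against typos and deliberate wrong answers; a real system might weigh votes differently.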

From here on

But, as always in the digital world, new projects are constantly appearing. The next step is to revolutionize the concept of translation. How? Through Duolingo, a system that enables users to learn and practice a language for free and, at the same time, help translate texts into other languages.

This kind of “social translation” platform is based on the collaboration of its users, who can, for example, give a score to a translation made by another surfer. The result is a self-regulating system in which users obtain benefits and at the same time contribute to a collective work. The creators of Duolingo say that, if they manage to attract a million users, they will have the capacity to translate the English version of Wikipedia into Spanish in just 80 hours.

The web has become the reference par excellence for managing collective work. It is a collaborative tool that makes the fullest possible use of available resources. Users based in different parts of the world, without knowing each other and often without even realizing it, are collaborating on tasks that benefit the community. Surfers are pooling their spare energy, almost without being conscious of the effort, and they are benefiting from the results.