CAPTCHA & reCAPTCHA

If you’ve filled out a form online, you’ve likely seen a box asking you to prove that you are not a robot. Many people wonder why they must go through this process. After all, why does a website care whether someone is a human or not? The reason is simpler and more important than you may think.

First things first, though: that box you see is called a CAPTCHA or reCAPTCHA. CAPTCHA stands for Completely Automated Public Turing Test To Tell Computers and Humans Apart, and it was invented to stop ‘spam bots’, or just ‘bots’, from pretending to be humans. These bots disguised themselves as humans in order to flood website forms with spam data or to catch people out with scam emails. CAPTCHA was created in 1997 by two groups of developers. The idea was to ask the user to read and type out some distorted text to prove that they were not a bot. Since text recognition wasn’t very good yet, showing images of text with lines through them was enough to confuse the bots and stop them from proceeding. The early CAPTCHAs looked like this:

Over time, one of the original developers realized that CAPTCHA could be used for more than just telling bots and humans apart. Since people had to type in this information anyway, their answers could also help digitize old books and newspapers that text recognition software was struggling to read. This new version, called reCAPTCHA, looked like this:

The first word was there to check that the person was a human, while the second was an image of a word from a book or newspaper that was being digitized. The thinking was that if a person could correctly identify the letters of the first word, they would most likely identify the second correctly too. This helped convert more than 13 million articles from as far back as 1851 into digital versions, so thanks for helping with that!

Eventually this version had to be replaced because the bots got advanced enough to solve the CAPTCHA 99% of the time. That’s why in 2014 Google introduced reCAPTCHA v2, which looks like this:

This version is much more stable and secure than its predecessors. It is also easier for the user, who simply needs to click the checkbox to proceed. It works because of what the CAPTCHA is doing in the background. It looks at what you did from the moment you arrived on the site to the moment you clicked the checkbox: how many times you clicked, how long you spent on one part of the page, and so on. It does this because a bot doesn’t behave the way a human does; bots are built to navigate straight to the website form and fill in the information, whereas a human will take time to look at things and click on links. This works most of the time, but there are times when the CAPTCHA can’t tell whether you are a person or not. In that case it will ask you to identify pictures similar to this:
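
For the curious, the background check described above can be pictured with a small sketch. This is only a toy illustration and not how reCAPTCHA actually works: the real signals and scoring are not public, so the `VisitSignals` fields, the point values, and the thresholds below are all made-up assumptions. It simply shows the general idea of scoring a visit and falling back to a picture challenge when the score is inconclusive.

```python
from dataclasses import dataclass


@dataclass
class VisitSignals:
    """Hypothetical measurements for one visit (not reCAPTCHA's real inputs)."""
    seconds_on_page: float  # time between arriving and clicking the checkbox
    clicks: int             # clicks made anywhere on the page
    pointer_moves: int      # recorded mouse or touch movements


def human_score(signals: VisitSignals) -> int:
    """Return an illustrative score from 0 (bot-like) to 100 (human-like)."""
    score = 0
    if signals.seconds_on_page >= 3.0:  # assumption: bots submit almost instantly
        score += 40
    if signals.clicks >= 2:             # assumption: humans click around a little
        score += 30
    if signals.pointer_moves >= 20:     # assumption: humans move the pointer a lot
        score += 30
    return score


def decide(signals: VisitSignals) -> str:
    """Turn the score into an outcome; both thresholds are arbitrary guesses."""
    score = human_score(signals)
    if score >= 70:
        return "pass"             # confident enough: the checkbox just ticks
    if score >= 30:
        return "image_challenge"  # unsure: ask the visitor to identify pictures
    return "block"
```

With these made-up numbers, a visit of `VisitSignals(12.0, 4, 150)` passes straight away, `VisitSignals(5.0, 0, 3)` gets a picture challenge, and `VisitSignals(0.2, 0, 0)` is blocked.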

Much like the original reCAPTCHA, these picture challenges also help improve technology. Instead of turning paper articles into digital ones, this version helps teach Google’s image recognition software to better identify objects such as street signs or cars. Those improvements then find their way into applications like Google Maps or Google Photos, and you get to benefit from them.

So to review, CAPTCHA was invented to tell humans and robots apart, with the goal of stopping spam bots from flooding website forms and preventing millions of scam emails from being sent. It also has the added benefit of improving recognition software and adding to the ever-growing store of digitized information.

Watch one of our developers explain CAPTCHA & reCAPTCHA here: https://youtu.be/0IN7u-KWan8