Millions Engage in Online Turing Test: Human or Not

Artificial intelligence (AI) has already proven it can outdo humans at those online “I am not a robot” CAPTCHA tests. But can it pass the Turing test?

The Turing test, originally known as the imitation game, challenges an evaluator to determine whether they are conversing with a human or a machine. In the traditional setup, a human evaluator holds text conversations with both a human and an AI and must work out which is which.

In recent variations of the experiment, the evaluator instead chats directly with a single partner and must judge whether that partner is human or AI. In one such experiment, conducted after the rise of large language model (LLM) chatbots like ChatGPT and Google Bard, over a million humans participated.

In this experiment, detailed in a preprint paper yet to be peer-reviewed, the researchers created an online game called “Human or Not.” Each volunteer was randomly paired with either a fellow human or an AI bot and had to work out which one they were talking to.

The team developed unique backstories for the AI bots, using various chat AIs. One character, Adam, was instructed to convince other users that he was a future version of themselves. While this may not seem like something a human would do, it added to the challenge and fun of the game.

Within the first month, the experiment received over 10 million guesses from 1.5 million users, providing a substantial dataset for analysis.

Participants in the experiment paid close attention to typos and slang, assuming that these would not come from a machine. However, the AI chatbots were programmed to simulate such linguistic quirks. When humans tried to prove their own humanity, they often used more slang, made more typos, and answered personal and emotional questions.

“A more distinctive approach involved using foul language, expressing controversial views, and posing questions that AI bots tend to avoid answering,” the researchers wrote. They found that the use of rude language correctly identified a human player 86.7 percent of the time.

Overall, humans correctly identified their partners’ identities in only 68 percent of the games. When playing against a chatbot, they guessed correctly 60 percent of the time, while they accurately identified a human companion 73 percent of the time.

The team acknowledges the test’s limitations, including that the game’s premise primed participants to be suspicious, which may have shaped their conversation strategies. Nevertheless, the experiment provided valuable insights into the strategies humans use to tell humans and machines apart.

The paper is available on the preprint server arXiv.
