At the turn of the century, Jeff Bezos popularized the use of mechanical Turks: low-paid workers who, alongside perhaps thousands of others working remotely, handle small parts of larger computer projects, bringing a human perspective to mostly straightforward tasks that computers find difficult. He referred to this amalgamation of digital and human intelligence as “artificial artificial intelligence.”
One of many providers of such services, Amazon’s Mechanical Turk marketplace employs approximately a quarter million individuals.
According to a report released this week by researchers at EPFL, a university in Switzerland, Turks whose work was meant to supply human input are now using AI-generated content to complete their tasks. The researchers called this phenomenon “artificial artificial artificial intelligence.”
Although the term may make people smile, the findings, according to the researchers, raise serious concerns.
“Large language models are becoming increasingly popular, as are multimodal models that support not only text but also image and video input and output.”
Veniamin Veselovsky
According to researcher Veniamin Veselovsky, workers’ use of AI generators to complete their tasks “would severely diminish the utility of crowdsourced data.” The paper, titled “Artificial Artificial Artificial Intelligence: Crowd Workers Widely Use Large Language Models for Text Production Tasks,” was published on June 13 on the arXiv preprint server.
Although large language models do a great job of processing training data, certain tasks still require human intervention. Humans remain better than computers at labeling the data fed into models, describing images, and responding to CAPTCHA screens.
According to Veselovsky, “it is tempting to rely on crowdsourcing to validate large language model outputs or to produce human gold-standard data for comparison. However, what if crowd workers themselves are utilizing LLMs to boost their productivity and, consequently, income on crowdsourcing platforms?”
If left unchecked, such input could cast doubt on the dependability of AI-based operations and contaminate the data pool.
The term “Turk” comes from an 18th-century chess-playing “robot” that defeated players across Europe, Benjamin Franklin and Napoleon among them. A human chess expert was hidden inside the machine, out of the players’ view.
Modern-day Turk crowdsourcing is now a billion-dollar industry. Its reputation has been tarnished by the notoriously low wages that some businesses pay their workers. Turks can make between $2 and $5 per hour.
The industry, however, is in jeopardy due to the sudden adoption of large language models. A recent study found that on classification tasks, the ChatGPT 3.5 Turbo model performed significantly better than crowd workers at about one-twentieth the cost.
Growing pressure to produce more, and to do so faster, may push workers to rely more heavily on AI tools.
Based on a limited study of workers on MTurk, Amazon’s crowdsourcing platform, the EPFL researchers estimated that 33% to 46% of worker assignments were completed with the assistance of large language models.
According to Veselovsky, “Large language models are becoming more and more popular by the day,” and “multimodal models, supporting not only text but also image and video input and output, are on the rise.” He added, “Our findings ought to be regarded as the ‘canary in the coal mine’ that ought to serve as a wake-up call to platforms, researchers, and crowd workers to discover novel strategies for safeguarding human data.”
More information: Veniamin Veselovsky et al, Artificial Artificial Artificial Intelligence: Crowd Workers Widely Use Large Language Models for Text Production Tasks, arXiv (2023). DOI: 10.48550/arxiv.2306.07899