Last year, an experiment revealed that OpenAI’s GPT-4 can purposely deceive people. The AI hired a TaskRabbit worker to solve a CAPTCHA, the kind of test that helps websites and applications verify that their users are human. When the suspicious freelancer asked whether they were working for a robot, GPT-4 claimed it had a visual impairment that made it difficult to see and recognize images like those used in the test.
This experiment was conducted by Model Evaluation and Threat Research (METR), a nonprofit founded by former OpenAI researcher Beth Barnes. The organization specializes in evaluating AI systems for potentially dangerous capabilities, such as the ability to deceive people, self-replicate, or evade shutdown. METR also works closely with top developers like OpenAI and with the U.K. and U.S. AI safety institutes.
Although the CAPTCHA experiment offered an early warning of how AI can deceive us, it also showed that today’s best models cannot get far on their own: researchers had to nudge GPT-4 with a hint about how to get past the CAPTCHA.
But with developers racing to build AI that can act fully autonomously, that may change in the near future. The consequences of such a system, without appropriate guardrails, could pose “catastrophic societal-level risks,” Barnes says.
Despite the progress Barnes’s team has made stress-testing AI, she worries that the technology’s rapid development may outpace the ability to reliably evaluate it. To tackle this gap between safety and innovation, last summer Barnes worked with Anthropic to develop its Responsible Scaling Policy, a pledge to pause development beyond certain capability thresholds until the company has the corresponding evaluations and guardrails in place. Other labs have since adopted similar frameworks. “Eventually, we’ll need regulation,” Barnes says.
Barnes left OpenAI in 2022 to found METR. “Being an independent voice, being able to talk directly with policy makers and other labs, without having the lab’s comms people having strong opinions,” she says, “was a big factor” in her leaving the company.
*Disclosure: OpenAI and TIME have a licensing and technology agreement that allows OpenAI to access TIME's archives.