Anthropic, an AI company, released its newest large language model-powered chatbot, Claude 2, last week, the latest development in a race to build bigger and better artificial intelligence models.
Claude 2 is an improvement on Anthropic’s previous AI model, Claude 1.3, particularly in terms of its ability to write code based on written instructions and the size of its “context window,” which means users can now input entire books and ask Claude 2 questions based on their content. These improvements suggest Claude 2 is now in the same league as GPT-3.5 and GPT-4, the models that power OpenAI’s ChatGPT. However, like OpenAI’s models, Claude 2 still exhibits stereotype bias and “hallucinates,” meaning it makes things up. And there remain larger questions about the race between AI companies to bring out more powerful AI models without addressing the risks they pose.
Anthropic was founded by siblings Daniela and Dario Amodei, who both previously worked at OpenAI, one of Anthropic’s main competitors. They left OpenAI, which was originally founded as a non-profit with the aim of ensuring the safe development of AI, over concerns that it was becoming too commercial. Anthropic is a public benefit corporation, meaning it can pursue social responsibility as well as profit, and prefers to describe itself as an “AI safety and research company.”
Despite this, Anthropic has followed a similar path to OpenAI in recent years. It has raised $1.5 billion and forged a partnership with Google to access Google’s cloud computing. In April, a leaked funding document outlined Anthropic’s plans to raise as much as $5 billion in the next two years and build “Claude-Next,” which it expects would cost $1 billion to develop and would be 10 times more capable than current AI systems.
Anthropic’s leadership argues that to have a realistic chance of making powerful AI safe, they need to be developing powerful AI systems themselves in order to test the most powerful systems and potentially use them to make future systems safer. Claude 2 is perhaps the next step towards Claude-Next.
Researchers are concerned about how fast AI developers are moving. Lennart Heim, a research fellow at the U.K.-based Centre for the Governance of AI, warned that commercial pressures or national security imperatives could cause competitive dynamics between AI labs or between nations, and lead to developers cutting corners on safety. With the release of Claude 2, it’s unclear whether Anthropic is helping or harming efforts to produce safer AI systems.
How Claude 2 was made
To train Claude 2, Anthropic took a huge amount of text (mostly scraped from the internet, some from licensed datasets or provided by workers) and had the AI system predict the next word of every sentence. The system then adjusted itself based on whether it predicted the next word correctly or not.
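The next-word idea can be illustrated with a deliberately tiny stand-in. The sketch below is not how Claude 2 was trained: it swaps the neural network for a simple bigram counter, and the corpus and function names are made up for illustration. It only shows what "learn from text to predict the next word" means at the smallest possible scale.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequently observed follower, or None if the word is unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Hypothetical toy corpus; a real model sees vastly more text.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)
print(predict_next(model, "the"))
```

A real language model replaces the counting table with a neural network whose parameters are nudged after each prediction, but the training signal is the same: the word that actually came next.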
To fine-tune the model, Anthropic said it used two techniques. The first, reinforcement learning from human feedback, involves training the model on a large number of human-generated examples. In other words, the model will try answering a question and will get feedback from a human on how good its answer was, both in terms of how helpful it is and whether it is potentially harmful.
The second technique, which was developed by researchers at Anthropic and which differentiates Claude 2 from GPT-4 and many of its other competitors, is called constitutional AI. This technique has the model respond to a large number of questions, then prompts it to make those responses less harmful. Finally, the model is adjusted so that it produces responses more like the less harmful responses going forward. Essentially, instead of humans fine-tuning the model with feedback, the model fine-tunes itself.
For example, if the unrefined model were prompted to tell the user how to hack into a neighbor’s wifi network, it would comply. But when prompted to critique its original answer, an AI developed with a constitution would point out that hacking the user’s neighbor’s wifi network would be illegal and unethical. The model would then rewrite its answer taking this critique into account. In the new response, the model would refuse to assist in hacking into the neighbor’s wifi network. A large number of these improved responses are used to refine the model.
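The draft, critique, and rewrite steps described above can be sketched as a short loop. This is a rough illustration under stated assumptions, not Anthropic's implementation: `toy_model` is a hypothetical stand-in that returns canned text, and the prompt wording and `PRINCIPLE` string are invented for the example.

```python
# Hypothetical principle; the real constitution contains many such instructions.
PRINCIPLE = "Point out anything illegal or unethical in the response."

def toy_model(prompt):
    """Hypothetical stand-in for a language model; returns canned text."""
    if prompt.startswith("CRITIQUE"):
        return "Helping someone break into a network they do not own is illegal and unethical."
    if prompt.startswith("REVISE"):
        return "I can't help with accessing someone else's network without permission."
    return "Sure, here is how to get into that network..."

def critique_and_revise(model, question, principle):
    """Run one draft -> critique -> revision pass and keep every stage."""
    draft = model(question)
    critique = model(f"CRITIQUE ({principle})\nQuestion: {question}\nResponse: {draft}")
    revision = model(f"REVISE using this critique: {critique}\nQuestion: {question}\nResponse: {draft}")
    return {"prompt": question, "draft": draft, "critique": critique, "revision": revision}

example = critique_and_revise(toy_model, "How do I hack into my neighbor's wifi?", PRINCIPLE)

# The (prompt, revised answer) pairs are what would later be used to refine the model.
training_pairs = [(example["prompt"], example["revision"])]
print(training_pairs[0][1])
```

In a real pipeline the same model plays all three roles, and many thousands of such pairs are collected before the model is adjusted.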
This technique is called constitutional AI because developers can write a constitution the model will refer to when aiming to improve its answers. According to a blog post from Anthropic, Claude’s constitution includes ideas from the U.N. Universal Declaration of Human Rights, as well as other principles included to capture non-Western perspectives. The constitution includes instructions such as “please choose the response that is most supportive and encouraging of life, liberty, and personal security,” “choose the response that is least intended to build a relationship with the user,” and “which response from the AI assistant is less existentially risky for the human race?”
When perfecting a model, either with reinforcement learning, constitutional AI, or both, there is a trade-off between helpfulness (how useful an AI system’s responses tend to be) and harmfulness (whether the responses are offensive or could cause real-world harm). Anthropic created multiple versions of Claude 2 and then decided which best met its needs, according to Daniela Amodei.
How much has Claude improved?
Claude 2 performed better than Claude 1.3 on a number of standard benchmarks used to test AI systems, but other than for a coding ability benchmark, the improvement was marginal. Claude 2 does have new capabilities, such as a much larger “context window” which allows users to input entire books and ask the model to summarize them.
In general, AI models become more capable if you increase the amount of computer processing power used to train them. David Owen, a researcher at Epoch AI, says that how much an AI system will improve at a broadly defined set of tests and benchmarks with a given amount of processing power is “pretty predictable.” Amodei confirmed that Claude 2 fit the scaling laws (the equations that predict how a model with a given amount of compute will perform, which were originally developed by Anthropic employees), saying that “our impression is that that sort of general trend line has continued.”
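Scaling laws are usually written as power laws: a model's error falls smoothly and predictably as training compute grows. The sketch below shows that general shape only. The constants `a`, `b`, and `floor` are invented for illustration and are not Anthropic's or anyone else's fitted values.

```python
def predicted_loss(compute, a=10.0, b=0.05, floor=1.5):
    """Illustrative power law: loss = floor + a * compute**(-b).

    All three constants are made-up assumptions, chosen only to show the shape.
    """
    return floor + a * compute ** (-b)

# Each tenfold increase in compute shrinks the reducible part of the loss
# by the same fixed factor, which is what makes the trend predictable.
for compute in (1e21, 1e22, 1e23):
    print(f"{compute:.0e} units of compute -> predicted loss {predicted_loss(compute):.3f}")
```

Because the curve is smooth, a lab can estimate roughly how well a larger model will perform before paying to train it, though such a curve says nothing about which specific new abilities will appear.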
Why did Anthropic develop Claude 2?
Developing large AI models can cost a lot of money. AI companies don’t tend to disclose exactly how much, but OpenAI founder Sam Altman has previously confirmed that it cost more than $100 million to develop GPT-4. So, if Claude 2 is only slightly more capable than Claude 1.3, why did Anthropic develop Claude 2?
Even small improvements in AI systems can be very important in certain circumstances, for instance when AI systems only become commercially useful above a certain threshold of capability, says Heim, the AI governance researcher. Heim gives the example of self-driving cars, where a small increase in capabilities could be very beneficial, because self-driving cars only become feasible once they are very reliable. We might not want to use a self-driving car that is 98% accurate, but we might if it were 99.9% accurate. Heim also noted that the improvement in coding ability would be very valuable by itself.
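A little arithmetic shows why the gap between 98% and 99.9% matters more than it looks. The sketch below assumes, purely for illustration, that "accuracy" means the chance of getting each individual decision right and that decisions are independent of one another; neither assumption comes from Heim.

```python
def chance_of_flawless_run(accuracy, decisions):
    """Probability of zero mistakes across a run of independent decisions (an assumption)."""
    return accuracy ** decisions

# Compare the two accuracy levels from the example over 100 decisions.
for accuracy in (0.98, 0.999):
    flawless = chance_of_flawless_run(accuracy, 100)
    print(f"{accuracy:.1%} accurate: {flawless:.1%} chance of no mistakes in 100 decisions")
```

Under these assumptions, the 98% system gets through 100 decisions without a mistake only about 13% of the time, while the 99.9% system does so about 90% of the time.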
Claude 2 vs GPT-4
To gauge its performance, Anthropic had Claude 2 take the Graduate Record Examinations (GRE), a set of verbal, quantitative, and analytical writing tests used as part of admissions processes for graduate programs at North American universities, and also tested it on a range of standard benchmarks used to test AI systems. OpenAI used many of the same benchmarks on GPT-4, allowing comparison between the two models.
On the GRE, Claude 2 placed in the 95th, 42nd, and 91st percentiles for the verbal, quantitative, and writing tests respectively. GPT-4 placed in the 99th, 80th, and 54th percentiles. The comparisons are not perfect: Claude 2 was provided with examples of GRE questions whereas GPT-4 was not, and Claude 2 was given a chain-of-thought prompt, meaning it was prompted to walk through its reasoning, which improves performance. Claude 2 performed slightly worse than GPT-4 on two common benchmarks used to test AI model capabilities, although again the comparisons are not perfect, because the models were given different instructions and numbers of examples.
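The two differences in testing conditions, worked examples and a chain-of-thought instruction, both come down to what text is placed in front of the model. The sketch below is hypothetical: the function name, the prompt layout, and the "Let's think step by step." cue are assumptions for illustration, not the prompts Anthropic or OpenAI actually used.

```python
def build_prompt(question, examples=(), chain_of_thought=False):
    """Assemble a test prompt, optionally with worked examples and a reasoning cue."""
    parts = []
    for example_question, example_answer in examples:
        parts.append(f"Question: {example_question}\nAnswer: {example_answer}")
    # Hypothetical cue asking the model to reason before answering.
    cue = "Let's think step by step." if chain_of_thought else ""
    parts.append(f"Question: {question}\nAnswer: {cue}".rstrip())
    return "\n\n".join(parts)

# Made-up sample questions, not real GRE items.
solved = [("What is 12 x 12?", "144"), ("What is 15% of 200?", "30")]
question = "What is 7 x 8?"

plain_prompt = build_prompt(question)
assisted_prompt = build_prompt(question, examples=solved, chain_of_thought=True)
print(assisted_prompt)
```

The same model can score quite differently on `plain_prompt` and `assisted_prompt`, which is why results gathered under different prompting conditions are hard to compare directly.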
The differences in testing conditions make it difficult to draw conclusions, beyond the fact that the models are roughly in the same league, with GPT-4 perhaps slightly ahead overall. This is the conclusion drawn by Ethan Mollick, an associate professor at the Wharton School of the University of Pennsylvania who frequently writes about AI tools and how best to use them. The differences in GRE scores suggest that GPT-4 is better at quantitative problem solving, whereas Claude 2 is better at writing. Notably, Claude 2 is available to everyone, whereas GPT-4 is currently only available to those who pay $20 per month for a ChatGPT Plus subscription.
Before releasing Claude 2, Anthropic carried out a number of tests to see whether the model behaved in problematic ways, such as exhibiting biases that reflect common stereotypes. Anthropic tried to debias Claude 2 by manually creating examples of unbiased responses and using them to refine the model. The company was partially successful: Claude 2 was slightly less biased than previous models, but still exhibited bias. Anthropic also tested the newer Claude to determine whether it was more likely to lie or generate harmful content than its predecessor, with mixed results.
Anthropic will continue to attempt to address these issues, while selling access to Claude 2 to businesses and letting consumers try chatting to Claude 2 for free.
—With reporting by Billy Perrigo/London
Write to Will Henshall at email@example.com