Artificial intelligence (AI) company Anthropic is launching a new program to improve how we measure the abilities of AI systems. Announced on Monday, this program will provide funding to outside groups that develop new ways to assess AI performance.
These new methods, according to Anthropic, should be able to effectively measure the “advanced capabilities” of AI models. Organizations interested in participating can submit applications for review at any time.
Anthropic wrote on its official blog:
“Our investment in these evaluations is intended to elevate the entire field of AI safety, providing valuable tools that benefit the whole ecosystem. Developing high-quality, safety-relevant evaluations remains challenging, and the demand is outpacing the supply.”
There’s a problem with how we currently test AI systems. Most tests don’t reflect how people use AI in the real world. Some tests, especially older ones designed before powerful generative AI existed, might not even measure what they’re supposed to.
Anthropic’s solution is to develop entirely new testing methods. These tests will be more challenging and will focus on AI security and on how AI might affect society. To achieve this, Anthropic plans to create new tools, resources, and methods.
Anthropic is concerned that current AI testing methods overlook some of the risks posed by powerful AI. Its new program will focus on developing tests that can identify these risks.
For example, some of the new tests will check whether an AI model could be used for malicious purposes, like cyberattacks or creating weapons. They will also assess whether AI could manipulate people through fake news or other deceptive tactics.
National security is another area of concern. Although it has not revealed exactly how it would work, Anthropic wants to develop an “early warning system” to identify potential threats from AI before they materialize.
The program will also fund research into using AI for good. This includes exploring AI’s ability to help with scientific research, translate languages, and reduce bias in decision-making. Additionally, the company wants to develop AI that can identify and avoid generating harmful or offensive content.