Anthropic Lays Out Strategy To Curb Evil AI – The Messenger

Taking cues from how the U.S. government handles bioweapons, artificial intelligence startup Anthropic has laid out a risk assessment and mitigation strategy designed to identify and curb AI before it causes catastrophe.

The Responsible Scaling Policy offers a four-tier system to help judge an AI's risk level: on the low end are models that can't cause any harm whatsoever, and on the high end are models that don't yet exist but could hypothetically achieve a malignant superintelligence capable of acting autonomously and with intent.

"As AI models become more capable, we believe that they will create major economic and social value, but will also present increasingly severe risks," the company said in a blog post on Tuesday. The policy, they clarified, is focused on "catastrophic risks those where an AI model directly causes large scale devastation."

The four tiers start with AI like that which powers a gaming app, for example, a computer that plays chess. The second tier contains models that can be used by a human to cause harm, like using ChatGPT to create and spread disinformation. The third tier escalates the risk posed by second-tier models: these models might offer users information not found on the open internet, and they could become autonomous to some degree.

The highest tier, though, is hypothetical. But Anthropic speculated such models could eventually produce "qualitative escalations in catastrophic misuse potential and autonomy."

Anthropic's policy gives users a blueprint for containing the models once they've diagnosed the extent of the risk.

Models in the first tier are so benign that they require no extra strategy or planning, Anthropic said.

For models that fall into the second tier, Anthropic recommends safety guidelines similar to those adopted as part of a White House-led commitment in July: AI programs should be tested thoroughly before they are released into the wild, and AI companies need to tell governments and the public about the risks inherent to their models. They also need to remain vigilant against cyberattacks and manipulation or misuse.

Containing third-tier AI takes this further: it requires companies to store their AI models on secure servers and to maintain strict need-to-know protocols for employees working on different facets of the models. Anthropic also recommends that the models be kept in secure locations and that whatever hardware was used to design the programs also be kept secure.

Perhaps because such a system is still hypothetical, Anthropic offers no guidance for the advent of an evil, fourth-tier AI.

"We want to emphasize that these commitments are our current best guess, and an early iteration that we will build on," Anthropic said in the post.

Anthropic was founded in 2021 by former members of ChatGPT creator OpenAI. The company has raised more than $1.6 billion in funding and is perhaps best known for its Claude chatbot.

A spokesperson for Anthropic did not immediately reply to a request for comment.
