Anthropic Flags AI's Potential to Automate Sophisticated Cyber Attacks

SeniorTechInfo
2 Min Read
(Image: Just_Super/Getty Images)

Anthropic, the creator of the Claude family of large language models, recently updated the safety controls policy for its software, acknowledging the potential for malicious actors to exploit AI models in cyber attacks.

The company’s “responsible scaling policy” document outlines procedural changes for monitoring the risk of AI model misuse. It introduces a tiered system of AI Safety Level (ASL) Standards, with technical safeguards that escalate as the risk level rises.

During routine testing, Anthropic discovered a capability within its AI models that could automate sophisticated cyber attacks, necessitating further investigation and stronger safeguards.

The policy outlines measures to address this capability, including consulting experts in cyber operations and potentially imposing tiered access controls on advanced models with cyber capabilities.

Currently, all of Anthropic’s AI models must meet ASL level 2 (“ASL-2”) requirements, which include security systems designed to thwart most opportunistic attackers.

These policy updates reflect the broader commitment by Anthropic and OpenAI to self-regulate AI technology: both companies have agreed to collaborate with the US Artificial Intelligence Safety Institute on AI research and testing.

The threat of AI-automated cyber attacks is already being recognized, with reported incidents of state-backed actors attempting to exploit AI models for malicious purposes.
