Responsible AI

Every story we’ve tagged Responsible AI.

Security vulnerability reports have exploded since AI models started hunting for bugs
Security

Anthropic's Claude Mythos Preview model has led to a surge in reported security vulnerabilities, with over 1,500 high-severity and critical vulnerabilities reported in June 2026. This follows Anthropic's announcement in April 2026 that its model can find software vulnerabilities on its own.

The Decoder, 3 min read, 12h ago
UK parents warned over posting images of children amid AI sexual abuse fears
Safety & Policy

The UK National Crime Agency and Internet Watch Foundation have warned parents about posting images of their children online due to the increasing risk of AI-generated child sexual abuse material. The guidance advises parents to make their social media accounts private and be cautious about sharing images of their children online.

The Guardian — AI, 13 min read, 17h ago
AI prey: why watchdogs are telling parents to protect children from nudification apps
Safety & Policy

The UK's National Crime Agency and the Internet Watch Foundation have warned parents to protect their children's images online due to the rise of AI-generated child sexual abuse material. The amount of such material found online surged by 14% in 2025. Guidance has been issued to help parents mitigate the risk.

The Guardian — AI, 13 min read, 17h ago
More details on Fable 5’s cyber safeguards and our jailbreak framework (Announcements, Jul 2, 2026)
Safety & Policy

Anthropic has provided more information on Fable 5's cybersecurity safeguards and proposed a framework for evaluating the severity of AI jailbreaks. The company aims to balance preventing misuse with allowing defensive uses of the technology. This move is part of a broader effort to establish industry standards for AI safety.

Anthropic News, 26 min read, 1d ago
‘Don’t kill music’: Anthony Albanese’s favourite bands beg PM to stop AI companies from stealing their work
Safety & Policy

The Australian government is considering a proposal to weaken copyright laws, allowing tech companies to use Australian musicians' work to improve AI models. Musicians, including those favored by Prime Minister Anthony Albanese, are speaking out against the proposal, citing concerns about fair compensation and the impact on the music industry.

The Guardian — AI, 12 min read, 1d ago
Anthropic says it cut 80 percent of Claude Code's system prompt because Fable 5 models "want a smaller system prompt"
Launches

Anthropic reduced Claude Code's system prompt by 80 percent due to the new Fable 5 models' preference for smaller prompts. This change marks a shift in how AI models are steered, with a focus on context over hard rules. The update allows for more imaginative responses.

The Decoder, 2 min read, 1d ago
Best practices for multi-turn reinforcement learning in Amazon SageMaker AI
Research

Amazon SageMaker AI shares best practices for multi-turn reinforcement learning, including environment design and reward alignment. The goal is to improve the reliability of agentic RL training. These practices draw from the SOP-Bench dataset and focus on trustworthy environments and evaluation.

AWS Machine Learning Blog, 28 min read, 1d ago
More compute, more capability: Why AI agent evaluations need to account for test-time compute
Safety & Policy

AISI's research highlights the importance of accounting for test-time compute in AI agent evaluations, as fixed budgets can underestimate capabilities, especially for newer models. Increasing compute can improve performance, and the benefits are more significant for more advanced models.

UK AI Safety Institute, 13 min read, 2d ago
You Can Now Sound the Alarm on AI Behaving Badly
Safety & Policy

A new platform called FLARE-AI has been launched to allow users to report harmful AI behavior and model flaws, aiming to improve transparency and accountability in AI development. The platform was discussed in an article published by WIRED, with input from CSET's Jessica Ji. This launch is seen as a positive step towards making AI more transparent.

CSET (Georgetown), 4 min read, 2d ago