OpenAI outlines its commitment to community safety, detailing how ChatGPT is trained to detect and mitigate risks of violence and harm through refined safeguards and expert input.
OpenAI publishes a comprehensive approach to managing dual-use risks from advanced AI models in biology, outlining how it aims to enable beneficial scientific discovery while preventing misuse for bioweapons development through expert collaboration, model training, detection systems, and security controls.
DeepMind publishes an updated Frontier Safety Framework (v2.0) with stronger security protocols for frontier AI models, including new Critical Capability Level (CCL) security recommendations and an enhanced approach to deceptive alignment risks. The framework aims to prevent unauthorized exfiltration of model weights and to manage risks as AI systems become more powerful.
OpenAI publishes details on its approach to frontier AI risks and reports progress on the voluntary safety commitments it made in July 2023, including the release of the DALL-E 3 system card and the development of a new Preparedness Framework for managing catastrophic risks from advanced AI systems.