Tag
The thread critiques AI leadership for centralizing control under safety rhetoric, drawing parallels to 1990s encryption export restrictions. It argues that sanctions against China have accelerated its domestic chip and AI development, potentially leading to geopolitical escalation and fragmentation of the global software ecosystem.
OpenAI introduces 'safe completions,' a new safety-training approach in GPT-5 that replaces binary refusal-based training with output-centric rewards, improving both safety and helpfulness, especially for dual-use prompts. The method penalizes unsafe outputs and rewards helpful responses, resulting in fewer and less severe safety violations than refusal-trained models such as o3.
OpenAI publishes a comprehensive approach to managing dual-use risks from advanced AI models in biology, outlining strategies for enabling beneficial scientific discovery while preventing misuse for bioweapons development through expert collaboration, model training, detection systems, and security controls.
OpenAI co-authors a comprehensive paper forecasting malicious uses of AI and proposing mitigation strategies, developed in collaboration with leading research institutions. The work emphasizes acknowledging AI's dual-use nature, learning from cybersecurity practices, and broadening stakeholder discussions around AI security risks.