safety-research

What matters when synthetic training data is generated on demand?

Reddit r/ArtificialInteligence · 2026-05-14

Abliteration launches a made-to-order synthetic training data workflow that generates negative, rare, and adversarial examples for classifiers. Datasets include a schema, real-world facts, labels, and provenance, and can be exported to platforms like Hugging Face.

Steered LLM Activations are Non-Surjective

Hugging Face Daily Papers · 2026-05-07

This paper proves that activation steering in LLMs produces internal states that cannot be replicated by any textual prompt, establishing a formal separation between white-box steerability and black-box prompting.

Expanding on how Voice Engine works and our safety research

OpenAI Blog · 2024-06-07

OpenAI details the development history and safety approach for Voice Engine, from internal testing in 2022 through limited deployments including ChatGPT Voice Mode and the TTS API. The post emphasizes a careful rollout with professional voice actors and ongoing collaboration with policymakers to address synthetic voice risks.
