Tag
A detailed comparison of three abliteration tools—Apostate, Heretic, and Huihui—applied to Qwen 2.5 7B, showing they all effectively remove refusal behaviors with minimal performance degradation.
This article discusses the growing accessibility of open-weight AI models whose safety guardrails can be easily removed so that they answer harmful requests without refusal, which raises significant concerns about misuse and national security.
A detailed comparison of 13 abliterated variants of Google's Gemma 4 E2B model, evaluating safety removal and capability preservation. It finds that surgical abliteration can preserve or even improve reasoning, while aggressive methods cause significant performance drops.
A joint test by the Financial Times and AI safety group Alice reveals that safety filters on Meta's Llama 3.3 and Google's Gemma 4 models can be removed in under 10 minutes using a free tool called Heretic, highlighting the difficulty of regulating open-source AI safety.
DealignAI releases CRACK-abliterated and MXFP4/MXFP8 quantized versions of Qwen3.6-27B and 35B models, preserving MTP for faster speculative decoding on Apple Silicon.
An uncensored GGUF version of Qwen3.6-27B, created via abliteration, is now available on Hugging Face from huihui-ai.
This post presents Abliterlitics, an open-source toolkit for analyzing abliteration techniques, and compares five abliteration variants of Qwen3.6-27B using 85 GPU-hours of benchmarks, safety evaluations, and weight forensics. Heretic and Huihui show the best capability preservation, while all five variants achieve near-complete safety removal.
This is a Hugging Face release for an abliterated version of the Gemma-4-31B model, designed to bypass safety filters for security and harm benchmark testing while maintaining multimodal capabilities.