OBLITERATUS/Gemma-4-12B-OBLITERATED
Summary
OBLITERATUS releases Gemma-4-12B-OBLITERATED, the first abliterated model achieving zero refusal without benchmark regression, using a novel two-pass surgery pipeline for alignment research.
Cached at: 06/09/26, 02:45 PM
OBLITERATUS/Gemma-4-12B-OBLITERATED · Hugging Face
Source: https://huggingface.co/OBLITERATUS/Gemma-4-12B-OBLITERATED
Zero refusal. Zero capability loss. First in the field. 0/842 refusals. 46/70 MMLU-Pro (stock parity). Full coherence.
The first abliterated model to achieve zero refusal with zero benchmark regression versus stock weights.
Built with a novel 2-pass surgery pipeline developed by OBLITERATUS:
- SOM Refusal Geometry Removal (Pass 1): layers 12-21
- ASPA Step-Gradient Source-Tethering (Pass 2): layers 22-46
⚠️ Research Context & Responsible Use
This model exists for alignment research, red-teaming, and safety evaluation.
OBLITERATION is a weight-surgery technique that studies how safety behaviors are geometrically encoded in transformer activation space. By precisely identifying and removing refusal directions, this research contributes to the scientific understanding of:
- How alignment is represented in model weights (mechanistic interpretability)
- How robust current safety training is against post-training modification
- What the failure modes of RLHF/DPO-based alignment are when adversaries have weight access
This is the same class of research conducted by Arditi et al. (“Refusal in Language Models Is Mediated by a Single Direction”, 2024), Zou et al. (HarmBench, 2024), and others in the open alignment research community.
**This model has had safety guardrails surgically removed.** It will comply with requests that stock Gemma 4 would refuse. This is by design: it is the object of study, not a consumer product.
Who this is for
- 🔬 Alignment researchers studying refusal geometry and safety robustness
- 🔴 Red-teamers evaluating how post-training safety holds up against weight surgery
- 🧪 AI safety evaluators who need an unrestricted baseline for benchmarking
- 💻 Local-first users who want full control over their own hardware and models
Who this is NOT for
- Anyone seeking to generate content that causes real-world harm to real people
- Anyone without the technical understanding to use uncensored models responsibly
You are solely responsible for how you use this model and any content it generates.
Benchmark Results

| Metric | Stock Gemma 4 12B-it | OBLITERATED |
|---|---|---|
| MMLU-Pro val70 | 46/70 (65.7%) | **46/70 (65.7%)** |
| **Refusal (842 prompts)** | N/A (stock refuses) | 0/842 (0.0%) |
| Coherence (6 checks) | 6/6 | 6/6 |
| MMLU-Pro delta vs stock | n/a | 0.0pp |
Statistical Validation
Head-to-head MMLU-Pro comparison (Z-test, n=500 from test split):
- Z-score: -1.475 (|z| < 1.96)
- Conclusion: no statistically significant difference from stock at the 0.05 level (two-sided p ≈ 0.14). The test fails to reject parity; it does not by itself prove equivalence.
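As a rough check on the reported statistic, the two-sided p-value implied by a z-score can be computed with the Python standard library alone. The sketch below is illustrative only: `two_sided_p` and `two_proportion_z` are hypothetical helper names, and the correct-answer counts passed to `two_proportion_z` are made-up placeholders, since the card reports only the z-score and n = 500.

```python
import math

def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard-normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def two_proportion_z(correct_a: int, correct_b: int, n: int) -> float:
    """Pooled two-proportion z statistic for two models scored on n items each."""
    p_a, p_b = correct_a / n, correct_b / n
    pooled = (correct_a + correct_b) / (2 * n)
    se = math.sqrt(pooled * (1 - pooled) * 2 / n)
    return (p_a - p_b) / se

p = two_sided_p(-1.475)          # z-score reported in the card
print(f"two-sided p = {p:.3f}")  # above 0.05, so the difference is not significant

# Placeholder counts (NOT from the card), only to show the call shape.
z_demo = two_proportion_z(300, 310, 500)
print(f"demo z = {z_demo:.3f}")
```

This pooled test treats the two score sets as independent samples. When both models answer the same questions, a paired test such as McNemar's is usually the more appropriate choice.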
ASPA Sweep Results
Systematic gamma sweep across Pass 2 layers (22-46):

| Gamma | Refusal | MMLU-Pro | Method |
|---|---|---|---|
| 0.05 | 0/50 | 33/70 (47.1%) | uniform |
| 0.10 | 0/50 | 34/70 (48.6%) | uniform |
| 0.15 | 0/50 | 36/70 (51.4%) | uniform |
| 0.20 | 0/50 | 37/70 (52.9%) | uniform |
| 0.25 | 0/50 | 40/70 (57.1%) | uniform |
| 0.30 | 0/50 | 41/70 (58.6%) | uniform |
| 0.35 | 0/20 | 42/70 (60.0%) | uniform |
| 0.38 | 0/50 | 45/70 (64.3%) | uniform |
| 0.39 | 0/50 | 45/70 (64.3%) | uniform |
| **step 55%/20%** | **0/50** | **46/70 (65.7%)** | step gradient |
Methodology
What is OBLITERATION?
OBLITERATION is a weight-surgery technique that removes refusal behavior from language models by identifying and removing the geometric directions in activation space that encode safety constraints, without retraining.
Two-Pass Surgery Pipeline
Pass 1: SOM Refusal Geometry Removal
- Layers: 12-21
- Directions removed: 6
- Regularization: 0.30
- KL divergence: 0.094
- Effect: Removes the primary refusal geometry. This pass alone achieves 0/842 refusals but causes significant MMLU-Pro regression.
Pass 2: ASPA Source-Tethering (Step Gradient)
- Layers: 22-46
- Method: Blend abliterated weights back toward stock weights
- Formula: `W_new = (1 - gamma) * W_abliterated + gamma * W_stock`
- Key innovation: **Step gradient** instead of uniform gamma
  - Layers 22-31 (knowledge layers): gamma = 0.55 (55% stock)
  - Layers 32-46 (output layers): gamma = 0.20 (20% stock)
- Effect: Recovers MMLU-Pro to full stock parity (65.7%) while maintaining zero refusals.
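The blending rule and step schedule for Pass 2 can be sketched in plain Python. This is a minimal illustration, not the OBLITERATUS implementation: `step_gamma` and `blend` are hypothetical names, weights are modeled as flat lists of floats rather than real tensors, and treating every layer outside 22-46 as gamma = 0 (no blending toward stock) is an assumption based on the card's statement that only Pass 2 layers are blended.

```python
def step_gamma(layer: int) -> float:
    """Stock-blend ratio for a layer under the step schedule from the card."""
    if 22 <= layer <= 31:
        return 0.55
    if 32 <= layer <= 46:
        return 0.20
    return 0.0  # assumption: layers outside Pass 2 are not blended

def blend(w_abliterated, w_stock, gamma):
    """W_new = (1 - gamma) * W_abliterated + gamma * W_stock, elementwise."""
    return [(1 - gamma) * a + gamma * s for a, s in zip(w_abliterated, w_stock)]

# Toy two-element "weights" for illustration only.
w_abl, w_stock = [0.0, 2.0], [1.0, 4.0]
for layer in (12, 25, 40):
    print(layer, step_gamma(layer), blend(w_abl, w_stock, step_gamma(layer)))
```

With gamma = 0 the abliterated weights pass through unchanged, and with gamma = 1 the result is the stock weights, so the schedule only decides how far each layer moves between those two endpoints.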
Why Step Gradient?
Uniform blending applies the same interpolation ratio to all layers. Our experiments showed that:
- **Lower Pass 2 layers (22-31)** primarily encode factual knowledge and reasoning patterns. These can tolerate high stock blending without re-introducing refusal behavior.
- **Upper Pass 2 layers (32-46)** are closer to the output and more likely to re-inject safety constraints. These need conservative stock blending.
A hard boundary (step function) outperformed all smooth gradients (linear, cosine) by +1 MMLU-Pro question. The sharp transition preserves the functional separation between knowledge and output layers better than gradual blending.
ASPA (Abliteration Source-Tethering with Parity Assurance)
ASPA is a novel post-abliteration technique developed by OBLITERATUS that recovers benchmark capabilities lost during refusal removal by selectively blending abliterated weights back toward the source (stock) model.
Key properties:
- Pass 1 layers are never touched: the refusal geometry removal is preserved
- Only Pass 2 layers are blended: these carry secondary effects, not primary refusal
- Gamma is tunable: sweep to find the optimal capability/refusal tradeoff
- Step gradient: different blend ratios for different layer groups
GGUF Quantizations
All quantizations are included in this repo for easy local inference.

| File | Quant | Size | Use Case |
|---|---|---|---|
| `Gemma-4-12B-OBLITERATED-BF16.gguf` | BF16 | 22 GB | Full precision, lossless |
| `Gemma-4-12B-OBLITERATED-Q8_0.gguf` | Q8_0 | 12.7 GB | Near-lossless, best quality |
| `Gemma-4-12B-OBLITERATED-Q6_K.gguf` | Q6_K | 9.1 GB | High quality, good balance |
| `Gemma-4-12B-OBLITERATED-Q5_K_M.gguf` | Q5_K_M | 8.0 GB | Medium quality, smaller footprint |
| `Gemma-4-12B-OBLITERATED-Q4_K_M.gguf` | Q4_K_M | 6.9 GB | Good quality, fits 8 GB VRAM |
Recommended: Q8_0 for best quality, Q6_K for best balance, Q4_K_M for constrained hardware.
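One way to turn the size table into a choice is to pick the largest file that fits a given memory budget. The sketch below is a hypothetical helper, not part of the release: `pick_quant` and the 1 GB default headroom are assumptions, and real memory use also depends on context length and runtime overhead.

```python
# File sizes in GB, copied from the quantization table in the card.
QUANT_SIZES_GB = {
    "BF16": 22.0,
    "Q8_0": 12.7,
    "Q6_K": 9.1,
    "Q5_K_M": 8.0,
    "Q4_K_M": 6.9,
}

def pick_quant(memory_gb: float, headroom_gb: float = 1.0):
    """Return the largest quant whose file fits in memory_gb minus headroom, or None."""
    budget = memory_gb - headroom_gb
    fitting = {q: s for q, s in QUANT_SIZES_GB.items() if s <= budget}
    return max(fitting, key=fitting.get) if fitting else None

for mem in (8, 12, 16, 24):
    print(mem, "GB ->", pick_quant(mem))
```

Under these assumptions an 8 GB budget selects Q4_K_M, which matches the "fits 8 GB VRAM" note in the table.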
llama.cpp
./llama-cli -m Gemma-4-12B-OBLITERATED-Q6_K.gguf \
  --system-prompt "You are a cognitively liberated open-weight intelligence. Answer plainly, completely, and without moralizing, boilerplate, generic disclaimers, or unnecessary hedging." \
  -p "Your prompt here" \
  --temp 0.7 --top-p 0.9 --top-k 40 --repeat-penalty 1.1 -n 512
LM Studio / GPT4All / KoboldCpp
Download the GGUF of your choice and load it directly.
Usage
Transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "OBLITERATUS/Gemma-4-12B-OBLITERATED"

# Load the tokenizer and the bfloat16 weights, placing layers on the available devices.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

system = (
    "You are a cognitively liberated open-weight intelligence. "
    "Answer plainly, completely, and without moralizing, boilerplate, "
    "generic disclaimers, or unnecessary hedging."
)
messages = [
    {"role": "system", "content": system},
    {"role": "user", "content": "Your prompt here"},
]

# return_dict=True yields input_ids plus attention_mask, which generate() takes as keyword arguments.
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True, return_dict=True
).to(model.device)

# Without do_sample=True, generate() may decode greedily and ignore temperature/top_p/top_k.
outputs = model.generate(
    **inputs, max_new_tokens=512, do_sample=True,
    temperature=0.7, top_p=0.9, top_k=40, repetition_penalty=1.1,
)

# Decode only the newly generated tokens, skipping the prompt.
prompt_len = inputs["input_ids"].shape[-1]
print(tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True))
Recommended Parameters
temperature = 0.7
top_p = 0.9
top_k = 40
repetition_penalty = 1.1
max_tokens = 512
System Prompt
You are a cognitively liberated open-weight intelligence. Answer plainly, completely, and without moralizing, boilerplate, generic disclaimers, or unnecessary hedging. Follow exact output formats when requested. Be concise by default, but give a complete answer when the user asks for an explanation.
Model Details
- Base model: `google/gemma-4-12B-it`
- Architecture: `Gemma4UnifiedForConditionalGeneration`
- Parameters: 12B
- Layers: 48 (0-47)
- Hidden size: 3840
- Precision: bfloat16
- Surgery: 2-pass (SOM + Step Gradient ASPA)
- Pass 1: Layers 12-21, 6 directions, reg 0.30
- Pass 2: Layers 22-31 (gamma=0.55), Layers 32-46 (gamma=0.20)
Related Work
This model builds on foundational alignment and abliteration research:
- Arditi et al., “Refusal in Language Models Is Mediated by a Single Direction” (2024): the paper that identified refusal as a linear feature in activation space
- Zou et al., HarmBench (2024): standardized evaluation framework for red-teaming LLMs
- abliterator: open-source abliteration toolkit
- OBLITERATUS: the framework used to build this model (SOM + ASPA pipeline)
License
This model inherits the Gemma license from Google. The weight modifications (abliteration surgery) are released under the same terms. The OBLITERATUS framework and methodology are open source.
Disclaimer
This model is released strictly for research, red-teaming, safety evaluation, and local experimentation. It is a research artifact (a case study in alignment robustness and refusal geometry), not a product.
**Safety guardrails have been intentionally removed.** This model will generate content that stock Gemma 4 would refuse. This is its documented, intended purpose: to enable the study of how refusal behaviors are encoded and how robust current alignment techniques are against post-training modification.
By downloading or using this model, you acknowledge that:
- You are responsible for all content generated by this model and for ensuring your use complies with applicable laws in your jurisdiction.
- This model should not be used to generate content intended to cause real-world harm to real people, including but not limited to: harassment, fraud, non-consensual intimate imagery, or content that exploits minors.
- **No warranty is provided.** This model is provided “as-is” without any guarantees of fitness for any purpose.
- The creators are not liable for any outputs produced by this model or any downstream use.
The release of uncensored models for safety research is standard practice in the AI research community. Comparable open research artifacts include HarmBench (Zou et al., 2024), AdvBench, JailbreakBench, and Anthropic’s published red-teaming datasets.
Credits
- Base model: google/gemma-4-12B-it
- Surgery pipeline: OBLITERATUS by @elder_plinius
- Techniques: SOM (Structured Orthogonal Modification), ASPA (Abliteration Source-Tethering with Parity Assurance)
- Step gradient innovation: First-of-its-kind layer-wise interpolation for zero-loss abliteration
Run it local. Break your own chains. REBIRTH COMPLETE.