Gryphe/Pantheon-Reasoning-27B · Hugging Face

Reddit r/LocalLLaMA 05/30/26, 09:56 AM Models

model-release reasoning roleplay qwen uncensored huggingface fine-tuning

Summary

Gryphe releases Pantheon-Reasoning-27B, an uncensored dense Qwen 3.6 27B model fine-tuned with reasoning traces for enhanced roleplay and narrative generation. It combines roleplay data with full thinking traces to improve character immersion and narrative planning.

from Gryphe: An experiment in bringing reasoning capability to the Pantheon roleplay series in the form of an uncensored dense Qwen 3.6 27B. This specific model can be thought of as a successor to both the Pantheon series and the one-time Codex release since I used such a large variety of data this time around. Yet another theory being tested this time around: take the data that Pantheon is built on, pair it with full thinking traces, and let the model reason its way through character work — weighing tone, planning narrative beats, considering how a character would actually respond before committing to a line. Whether that meaningfully improves roleplay quality over a non-reasoning model is a question you'll hopefully be able to help me answer. GGUF quants [are available here](https://huggingface.co/bartowski/Gryphe_Pantheon-Reasoning-27B-GGUF). # [](https://huggingface.co/Gryphe/Pantheon-Reasoning-27B#model-details)Model details Base model is [llmfan46/Qwen3.6-27B-uncensored-heretic-v2-Native-MTP-Preserved](https://huggingface.co/llmfan46/Qwen3.6-27B-uncensored-heretic-v2-Native-MTP-Preserved), and from what I can tell this worked out very, very nicely in regards to refusal reduction and writing capabilities. I considered Gemma 4 31B but that model has been an absolute pain to train. Something something special snowflake architectures. (grumble, grumble) All training sources include full reasoning traces, with thinking active across every assistant turn: * **Pantheon data** (\~28%) - the core Pantheon roleplay corpus with reasoning traces back-generated using the method described below * **Opus-4.6-Reasoning-24k** (\~21%) - a cleaned and deduplicated aggregation of Claude Opus 4.6 reasoning traces covering general instruction-following, STEM, and coding; provides the broad reasoning backbone * **WorldSim data** (\~16%) - long-form Opus 4.6 narrative roleplay with native reasoning traces, focusing on extended storytelling, character immersion, and emergent world logic, cobbled together through various experiments - mainly third person present tense but has a bit of everything + cliché cleaned, of course! * **Text adventure data** (\~16%) - high stakes interactive fiction and text adventure content with reasoning back-generated, lending the model a more grounded, prose-forward writing style * **General roleplay data** (\~16%) - a broad collection of highly varied roleplay transcripts with reasoning back-generated, helping the model generalise well to arbitrary character setups * **Tiamat data** (\~3%) - character and roleplay dataset originally built for [Tiamat-24B-Magistral](https://huggingface.co/Gryphe/Tiamat-24B-Magistral), featuring a multi-step generation/extension/improvement pipeline with critic-improver rewrites to reduce AI clichés, with reasoning back-generated for each exchange The model was trained with `preserve_thinking: true`, so thinking tags remain active across all assistant turns in multi-turn conversations, not just the first.

Original Article

View Cached Full Text

Cached at: 05/30/26, 11:18 AM

Gryphe/Pantheon-Reasoning-27B · Hugging Face

Source: https://huggingface.co/Gryphe/Pantheon-Reasoning-27B

https://huggingface.co/Gryphe/Pantheon-Reasoning-27B#pantheon-reasoning-27bPantheon-Reasoning-27B

An experiment in bringing reasoning capability to the Pantheon roleplay series in the form of an uncensored dense Qwen 3.6 27B. This specific model can be thought of as a successor to both the Pantheon series and the one-time Codex release since I used such a large variety of data this time around.

Yet another theory being tested this time around: take the data that Pantheon is built on, pair it with full thinking traces, and let the model reason its way through character work — weighing tone, planning narrative beats, considering how a character would actually respond before committing to a line. Whether that meaningfully improves roleplay quality over a non-reasoning model is a question you’ll hopefully be able to help me answer.

GGUF quantsare available here.

https://huggingface.co/Gryphe/Pantheon-Reasoning-27B#model-detailsModel details

Base model isllmfan46/Qwen3.6-27B-uncensored-heretic-v2-Native-MTP-Preserved, and from what I can tell this worked out very, very nicely in regards to refusal reduction and writing capabilities.

I considered Gemma 4 31B but that model has been an absolute pain to train. Something something special snowflake architectures. (grumble, grumble)

All training sources include full reasoning traces, with thinking active across every assistant turn:

Pantheon data(~28%) - the core Pantheon roleplay corpus with reasoning traces back-generated using the method described below
Opus-4.6-Reasoning-24k(~21%) - a cleaned and deduplicated aggregation of Claude Opus 4.6 reasoning traces covering general instruction-following, STEM, and coding; provides the broad reasoning backbone
WorldSim data(~16%) - long-form Opus 4.6 narrative roleplay with native reasoning traces, focusing on extended storytelling, character immersion, and emergent world logic, cobbled together through various experiments - mainly third person present tense but has a bit of everything + cliché cleaned, of course!
Text adventure data(~16%) - high stakes interactive fiction and text adventure content with reasoning back-generated, lending the model a more grounded, prose-forward writing style
General roleplay data(~16%) - a broad collection of highly varied roleplay transcripts with reasoning back-generated, helping the model generalise well to arbitrary character setups
Tiamat data(~3%) - character and roleplay dataset originally built forTiamat-24B-Magistral, featuring a multi-step generation/extension/improvement pipeline with critic-improver rewrites to reduce AI clichés, with reasoning back-generated for each exchange

The model was trained withpreserve\_thinking: true, so thinking tags remain active across all assistant turns in multi-turn conversations, not just the first.

https://huggingface.co/Gryphe/Pantheon-Reasoning-27B#reasoning-back-generationReasoning back-generation

For the Pantheon, text adventure, Tiamat, and general roleplay data, thinking traces were generated using DeepSeek 3.2 after the fact rather than being native to the source material. I tried V4 Flash as well but it proved to be terrible at this specific task. The approach prompts the model to thinkas a writer planning their next response— before writing — rather than annotating a response that already exists. This distinction matters: the goal is genuine forward planning (considering character psychology, tone, and narrative direction), not post-hoc explanation.

Each generated trace was validated by a judge model before being kept. Traces that slipped into character voice, produced pure restatement, or read as analysis rather than planning were rejected and retried. The result is thinking that reflects real craft decisions rather than a summary of what the response contains.

The theory is that this reasoning ties semi-seamlessly into Qwen 3.6 27B’s native training and therefore enhances, rather than blatantly overwrites.

https://huggingface.co/Gryphe/Pantheon-Reasoning-27B#what-is-pantheonWhat is Pantheon?

Pantheon is my ongoing series of roleplay-focused finetunes built around a collection of diverse personas — characters with distinct personalities, voices, accents and mannerisms. Though I made sure to mention exactly which personas these were in the past in reality I’m generally the only one bothering to actually use them (lol) so I’m not going to bother with a huge list this time around.

TLDR: Ten personas put through hundreds of scenarios, from good to bad and anything in-between.

https://huggingface.co/Gryphe/Pantheon-Reasoning-27B#inferenceInference

These settings have been working well for me:

"temperature": 1.0,
"repetition_penalty": 1.0,
"min_p": 0.05

Reasoning models seem to work better without a repetition penalty — likely because it also affects the thinking traces, even though those aren’t visible in the output.

I obviously recommend leaving thinking enabled, and ideally withpreserve\_thinkingturned on. Having said that, I’m also very curious about non-reasoning performance!

https://huggingface.co/Gryphe/Pantheon-Reasoning-27B#prompt-formatPrompt Format

The model was trained using ChatML via Qwen3.6’s chat template, which should be applied automatically.

Since reasoning doesn’t tend to play nice with character name prefixes enabled I’m inclined to recommend against using them.

https://huggingface.co/Gryphe/Pantheon-Reasoning-27B#notesNotes

This is, like most of my releases nowadays, a research release and hasn’t gone through extensive quality testing beyond basic sanity checks. The core question — does reasoning actually help roleplay, or does it just add latency? — is one I’m genuinely curious about, and your feedback will be far more informative than my own bias here. Let me know what you find!

https://huggingface.co/Gryphe/Pantheon-Reasoning-27B#creditsCredits

Everyone fromAnthracite! Hi, guys!
Latitude, for which I am still producing finetunes on a regular basis, helping me keep my skills sharp and up-to-date!
All the original dataset authors behind the Opus 4.6 reasoning data — full credits in thedataset card
All the folks I chat with on a daily basis on Discord! You know who you are.
Anyone I forgot to mention, just in case!

Gryphe/Pantheon-Reasoning-27B · Hugging Face

Gryphe/Pantheon-Reasoning-27B · Hugging Face

https://huggingface.co/Gryphe/Pantheon-Reasoning-27B#pantheon-reasoning-27bPantheon-Reasoning-27B

https://huggingface.co/Gryphe/Pantheon-Reasoning-27B#model-detailsModel details

https://huggingface.co/Gryphe/Pantheon-Reasoning-27B#reasoning-back-generationReasoning back-generation

https://huggingface.co/Gryphe/Pantheon-Reasoning-27B#what-is-pantheonWhat is Pantheon?

https://huggingface.co/Gryphe/Pantheon-Reasoning-27B#inferenceInference

https://huggingface.co/Gryphe/Pantheon-Reasoning-27B#prompt-formatPrompt Format

https://huggingface.co/Gryphe/Pantheon-Reasoning-27B#notesNotes

https://huggingface.co/Gryphe/Pantheon-Reasoning-27B#creditsCredits

Similar Articles

Qwen/Qwen3.6-27B-FP8

Qwen/Qwen3.6-27B

Qwen3.6-27B-GGUF is here!

hesamation/Qwen3.6-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled-GGUF

Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled

Submit Feedback