Jackrong/Qwopus3.6-27B-Coder-MTP-GGUF


Summary

A GGUF quantized version of the Qwopus3.6-27B-Coder-MTP model is released on Hugging Face, optimized for local inference and compatible with Transformers, vLLM, SGLang, and Unsloth Studio.

Task: image-text-to-text

Tags: transformers, gguf, llama.cpp, image-text-to-text, vision, multimodal, text-generation-inference, unsloth, conversational, qwen3_6, reasoning, chain-of-thought, lora, sft, agent, tool-use, function-calling, coder, en, zh, es, ru, ja, dataset:Jackrong/Claude-opus-4.6-TraceInversion-9000x, dataset:Jackrong/Claude-opus-4.7-TraceInversion-5000x, dataset:lambda/hermes-agent-reasoning-traces, base_model:Jackrong/Qwopus3.6-27B-v2, base_model:adapter:Jackrong/Qwopus3.6-27B-v2, license:apache-2.0, endpoints_compatible, region:us

Cached at: 06/12/26, 02:52 PM


Source: https://huggingface.co/Jackrong/Qwopus3.6-27B-Coder-MTP-GGUF

Instructions for using Jackrong/Qwopus3.6-27B-Coder-MTP-GGUF with libraries, inference providers, notebooks, and local apps:

  • Libraries
  • Transformers: How to use Jackrong/Qwopus3.6-27B-Coder-MTP-GGUF with Transformers:

        # Use a pipeline as a high-level helper
        from transformers import pipeline

        pipe = pipeline("image-text-to-text", model="Jackrong/Qwopus3.6-27B-Coder-MTP-GGUF")
        messages = [
            {
                "role": "user",
                "content": [
                    {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
                    {"type": "text", "text": "What animal is on the candy?"},
                ],
            },
        ]
        pipe(text=messages)

        # Load model directly
        from transformers import AutoModel

        model = AutoModel.from_pretrained("Jackrong/Qwopus3.6-27B-Coder-MTP-GGUF", dtype="auto")
  • Notebooks
  • Google Colab
  • Kaggle
  • Local Apps
  • vLLM: How to use Jackrong/Qwopus3.6-27B-Coder-MTP-GGUF with vLLM:

    ##### Install from pip and serve model

        # Install vLLM from pip:
        pip install vllm

        # Start the vLLM server:
        vllm serve "Jackrong/Qwopus3.6-27B-Coder-MTP-GGUF"

        # Call the server using curl (OpenAI-compatible API):
        curl -X POST "http://localhost:8000/v1/chat/completions" \
            -H "Content-Type: application/json" \
            --data '{
                "model": "Jackrong/Qwopus3.6-27B-Coder-MTP-GGUF",
                "messages": [
                    {
                        "role": "user",
                        "content": [
                            { "type": "text", "text": "Describe this image in one sentence." },
                            { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } }
                        ]
                    }
                ]
            }'

    ##### Use Docker

        docker model run hf.co/Jackrong/Qwopus3.6-27B-Coder-MTP-GGUF
  • SGLang: How to use Jackrong/Qwopus3.6-27B-Coder-MTP-GGUF with SGLang:

    ##### Install from pip and serve model

        # Install SGLang from pip:
        pip install sglang

        # Start the SGLang server:
        python3 -m sglang.launch_server \
            --model-path "Jackrong/Qwopus3.6-27B-Coder-MTP-GGUF" \
            --host 0.0.0.0 \
            --port 30000

        # Call the server using curl (OpenAI-compatible API):
        curl -X POST "http://localhost:30000/v1/chat/completions" \
            -H "Content-Type: application/json" \
            --data '{
                "model": "Jackrong/Qwopus3.6-27B-Coder-MTP-GGUF",
                "messages": [
                    {
                        "role": "user",
                        "content": [
                            { "type": "text", "text": "Describe this image in one sentence." },
                            { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } }
                        ]
                    }
                ]
            }'

    ##### Use Docker images

        docker run --gpus all \
            --shm-size 32g \
            -p 30000:30000 \
            -v ~/.cache/huggingface:/root/.cache/huggingface \
            --env "HF_TOKEN=<secret>" \
            --ipc=host \
            lmsysorg/sglang:latest \
            python3 -m sglang.launch_server \
                --model-path "Jackrong/Qwopus3.6-27B-Coder-MTP-GGUF" \
                --host 0.0.0.0 \
                --port 30000

        # Call the server using curl (OpenAI-compatible API):
        # same request as in the pip section above, sent to http://localhost:30000/v1/chat/completions
  • Unsloth Studio: How to use Jackrong/Qwopus3.6-27B-Coder-MTP-GGUF with Unsloth Studio:

    ##### Install Unsloth Studio (macOS, Linux, WSL)

        curl -fsSL https://unsloth.ai/install.sh | sh

        # Run unsloth studio
        unsloth studio -H 0.0.0.0 -p 8888
        # Then open http://localhost:8888 in your browser
        # Search for Jackrong/Qwopus3.6-27B-Coder-MTP-GGUF to start chatting

    ##### Install Unsloth Studio (Windows)

        irm https://unsloth.ai/install.ps1 | iex

        # Run unsloth studio
        unsloth studio -H 0.0.0.0 -p 8888
        # Then open http://localhost:8888 in your browser
        # Search for Jackrong/Qwopus3.6-27B-Coder-MTP-GGUF to start chatting

    ##### Using HuggingFace Spaces for Unsloth

        # No setup required
        # Open https://huggingface.co/spaces/unsloth/studio in your browser
        # Search for Jackrong/Qwopus3.6-27B-Coder-MTP-GGUF to start chatting

    ##### Load model with FastModel

        pip install unsloth

        from unsloth import FastModel

        model, tokenizer = FastModel.from_pretrained(
            model_name="Jackrong/Qwopus3.6-27B-Coder-MTP-GGUF",
            max_seq_length=2048,
        )
  • Docker Model Runner: How to use Jackrong/Qwopus3.6-27B-Coder-MTP-GGUF with Docker Model Runner:

        docker model run hf.co/Jackrong/Qwopus3.6-27B-Coder-MTP-GGUF

  • Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

🪐 Qwopus-3.6-27B-Coder

Coder SFT Release

Agentic Coding & Tool-Use Reasoning Model Fine-Tuned on Qwopus3.6-27B-v2

🧬 Trace Inversion & Negentropy · 🧠 27B Dense Model · ⚡ Agentic Coding · 🛠️ Tool Calling & Agent · 🏆 SWE-bench Verified: 67.0% (thinking off)

💡 What is Qwopus-3.6-27B-Coder?

🪐 Qwopus-3.6-27B-Coder is a reasoning-enhanced agentic coding model built on top of Qwopus3.6-27B-v2. It inherits the reasoning foundation of the v2 base, which achieved 87.43% on MMLU-Pro (300ex) and 75.25% on SWE-bench Verified, and further specializes it for agentic code generation, structured tool calling, debugging, and instruction-following in developer workflows. The model is designed for repository-level coding tasks, multi-turn tool orchestration, and complex logical reasoning under realistic agent environments.

🧩 Agentic Coding: Optimized for repository-level coding, debugging, patch generation, and structured multi-step development workflows.

🛠️ Tool Calling: Learns from real agent trajectories with tool definitions, tool calls, and environment feedback for robust multi-turn execution.

🧬 Trace Inversion: Inherits the full Qwopus training recipe with reconstructed step-by-step reasoning trajectories from Claude Opus.

🚀 27B Scale: Dense 27B parameters with native long-context support, delivering deep reasoning with practical single-GPU deployability.

Community Release Notice: Qwopus-3.6-27B-Coder is an experimental community release intended for research, evaluation, and agent workflow exploration. It has not undergone full safety evaluation or broad general-domain benchmarking.

Benchmark Status: The first completed benchmark is the full 500-task SWE-bench Verified split in thinking-off / no-thinking mode, where the Q5_K_M 27B GGUF run resolved 335/500 = 67.0%. Other benchmark suites remain pending and will be updated as testing completes.


💡 1. Base Model, Training Stack & Collaboration

🧠 1.1 Base Model: Qwopus3.6-27B-v2

Qwopus3.6-27B-v2 is a reasoning-enhanced dense language model built on Qwen3.6-27B. Through a multi-stage curriculum learning pipeline and Trace Inversion augmentation, it achieves strong performance across knowledge, coding, and reasoning benchmarks. This coder variant inherits that foundation and extends it with specialized coding and tool-use data.

| Attribute | Specifications & Details |
|---|---|
| 🧠 Architecture | Dense Transformer / 27 Billion Parameters |
| 🏢 Base Developer | Alibaba Cloud (DAMO Academy): Qwen3.6-27B |
| 🎯 Primary Focus | Agentic coding, tool-use stability, code debugging, structured instruction following, repository-level tasks |
| 🧬 Distillation Strategy | Trace Inversion + high-quality agent trajectories + curriculum SFT |
| 📄 Context Window | Native support up to 32K tokens (fine-tuning target); compatible with longer contexts via RoPE/YaRN scaling |

🧪 1.2 Hardware Cooperation & Joint Collaboration

This project was built in close collaboration with engineer Kyle Hessling, whose hardware infrastructure and training support made stable 27B-scale fine-tuning and evaluation possible.

👉 You can follow him for hardware and model training updates on X / Twitter: @KyleHessling1

🦥 1.3 Fine-Tuning Framework (Unsloth)

The model training workflow is accelerated and memory-optimized with Unsloth. Special thanks to the Unsloth team for making efficient large-model fine-tuning accessible.

⚡ 1.4 MTP Variant: Faster Speculative Decoding

A Multi-Token Prediction (MTP) variant of this model is also available, featuring auxiliary prediction heads (draft=2) for speculative decoding. Based on the Qwopus3.6-27B-v2-MTP benchmark, the MTP variant achieved a ~1.66x speedup over standard decoding with preserved accuracy. See the Qwopus3.6-27B-v2-MTP model card for detailed MTP performance analysis.
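To give a feel for why two draft tokens can produce a speedup of this order, here is a minimal sketch of the usual idealized speculative-decoding estimate. It assumes each drafted token is accepted independently with probability `accept_rate`, a hypothetical parameter that the model card does not report, and it ignores the cost of running the draft heads. Treat the output as an optimistic tokens-per-pass estimate, not as a reconstruction of the measured ~1.66x.

```python
def expected_tokens_per_step(accept_rate: float, draft: int = 2) -> float:
    """Expected tokens emitted per verification pass.

    Idealized model (an assumption, not from the model card): each drafted
    token is accepted independently with probability `accept_rate`. Every
    pass yields one token from the main head, plus each drafted token up to
    the first rejection.
    """
    if not 0.0 <= accept_rate <= 1.0:
        raise ValueError("accept_rate must be in [0, 1]")
    # Geometric partial sum: 1 + a + a^2 + ... + a^draft
    return sum(accept_rate ** i for i in range(draft + 1))


if __name__ == "__main__":
    for rate in (0.3, 0.5, 0.7):
        tokens = expected_tokens_per_step(rate, draft=2)
        print(f"accept_rate={rate:.1f} -> {tokens:.2f} tokens per pass")
```

With `draft=2` the estimate ranges from 1 token per pass (nothing accepted) to 3 (everything accepted), so real-world gains depend heavily on how often the MTP heads agree with the main model.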

🌟 The custom MTP heads processing pipeline is open-sourced in qwen-mtp-gguf. If you find this toolkit helpful, please consider leaving a star on GitHub!


📖 2. Background & Motivation

🎯 2.1 Why a 27B Coder Model?

The Qwopus coder line has demonstrated strong results at the 4B and 9B scales. The 27B coder variant represents a significant leap in reasoning depth, code generation quality, and tool-use robustness. At 27B parameters, the model has sufficient capacity to internalize complex repository structures, multi-file dependencies, and nuanced tool-calling patterns — while remaining deployable on a single GPU (e.g., RTX 5090). This scale bridges the gap between compact local models and expensive API-based solutions, making it suitable for production agentic coding workflows.
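As a rough check on the single-GPU claim, the sketch below estimates weight storage for a quantized 27B model. The figure of about 5.5 bits per weight for Q5_K_M is an assumption for illustration and is not stated in the model card; actual GGUF file sizes vary with the tensor mix. The 32 GB figure is the RTX 5090's memory size. KV cache, activations, and any vision components are not counted.

```python
def quantized_weight_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 27e9      # 27B dense parameters (from the model card)
BITS_Q5_K_M = 5.5    # ASSUMPTION: rough average for Q5_K_M, not from the source
VRAM_GB = 32         # RTX 5090 memory size

weights_gb = quantized_weight_gb(N_PARAMS, BITS_Q5_K_M)
headroom_gb = VRAM_GB - weights_gb

print(f"weights: ~{weights_gb:.1f} GB")
print(f"left for KV cache and buffers: ~{headroom_gb:.1f} GB")
```

Under these assumptions the weights take well under the card's memory, which is consistent with the deployment described above. Long contexts grow the KV cache, so the remaining headroom is the number to watch.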

🧬 2.2 Trace Inversion & Agent Behavior

Commercial and frontier models often expose only compressed reasoning summaries. Qwopus-style training uses Trace Inversion to reconstruct these compressed “Reasoning Bubbles” into fuller learnable reasoning traces. For coding, this is paired with agent trajectories that include tool definitions, tool calls, and real feedback, teaching the model to reason through interactive work rather than only produce static answers.

This model integrates:

  • Jackrong/Claude-opus-4.6-TraceInversion-9000x: 9,000 high-value, fully reconstructed step-by-step reasoning trajectories.
  • Jackrong/Claude-opus-4.7-TraceInversion-5000x: 5,000 complex multi-turn logic and mathematics samples optimized for negative entropy reconstruction.
  • lambda/hermes-agent-reasoning-traces: ~10,000 high-quality multi-turn tool-calling trajectories from GLM-5.1 and kimi-4.6 models.

📦 2.3 Special Dataset: Trace Inversion & Agent Traces

**Trace Inversion:** Uses a specialized logical reconstructor, Trace-Inverter-4B, to reverse-engineer compressed reasoning bubbles into complete, step-by-step learnable CoT chains. This approach addresses the “Information Entropy Trap” (where direct imitation of compressed summaries leads to reasoning fractures) by ensuring the model learns continuous, rigorous logical derivations.

**Agent Traces (lambda/hermes-agent-reasoning-traces):** Each sample contains real multi-turn tool execution results (not fabricated outputs), with step-by-step reasoning inside <think> tags. Coverage includes:

  • **Terminal & Coding:** Script writing, debugging, environment configuration
  • **Repository Tasks:** Bug fixing, refactoring, code review
  • **Browser Automation:** Web navigation, scraping, form filling
  • **Agent Tools:** Memory persistence, task delegation, skill management

📊 3. Performance Benchmarks

📊 Evaluation & Performance Metrics

First completed result: SWE-bench Verified full 500, evaluated in no-thinking mode for fast local agentic coding.

No-Thinking SWE-bench Result: This benchmark was intentionally run with thinking disabled. The goal is to show the model’s practical coding ability when used as a fast local agent, without relying on long visible reasoning traces. On an RTX 5090 with MTP enabled, the model runs at approximately 100 tokens/sec, making this result especially relevant for interactive development workflows.

  • SWE-bench Verified: 67.0% (335 / 500 resolved)
  • Inference Mode: Thinking Off (no visible CoT required)
  • Local Throughput: ~100 t/s (RTX 5090 + MTP)
  • Evaluation Build: Q5_K_M (27B GGUF quant)

Evaluation setup: SWE-bench Verified full 500, Qwopus-3.6-27B-Coder Q5_K_M GGUF, thinking-off / no-thinking mode. Final score: 335/500 = 67.0%.

💻 3.1 SWE-bench Verified: Full 500 No-Thinking Result

SWE-bench Verified measures whether a model can solve real GitHub issues by editing repository code and passing the hidden tests. In this run, Qwopus-3.6-27B-Coder solved 335 out of 500 verified tasks while running in no-thinking mode, prioritizing direct action quality and local speed over long explicit reasoning.

| Metric | Result | Notes |
|---|---|---|
| Final score | 335/500 = 67.0% | Full SWE-bench Verified 500-task split |
| Mode | Thinking off | No long visible chain-of-thought during evaluation |
| Quantization | Q5_K_M GGUF | Local 27B quantized deployment |
| Throughput | ~100 tokens/sec | Observed on RTX 5090 with MTP enabled |

🧩 3.2 Repository-Level Breakdown

The result is strongest on practical library-maintenance tasks such as scikit-learn, xarray, requests, and Django, while also showing solid coverage on symbolic mathematics, test infrastructure, documentation tooling, and plotting libraries.

| Repository | Resolved | Rate |
|---|---|---|
| scikit-learn | 27/32 | 84% |
| pydata/xarray | 18/22 | 82% |
| psf/requests | 6/8 | 75% |
| django | 166/231 | 72% |
| sympy | 48/75 | 64% |
| pytest | 12/19 | 63% |
| sphinx-doc | 26/44 | 59% |
| matplotlib | 20/34 | 59% |
| astropy | 9/22 | 41% |
| pylint | 2/10 | 20% |
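The rates in this breakdown can be recomputed from the resolved and total counts. The short script below does that and also sums the listed rows. The rows add up to 334 resolved out of 497 tasks, so the remaining 3 tasks of the 500-task split (1 of them resolved, given the 335 total) belong to repositories this breakdown does not list.

```python
# (repository, resolved, total) exactly as listed in the breakdown
RESULTS = [
    ("scikit-learn", 27, 32),
    ("pydata/xarray", 18, 22),
    ("psf/requests", 6, 8),
    ("django", 166, 231),
    ("sympy", 48, 75),
    ("pytest", 12, 19),
    ("sphinx-doc", 26, 44),
    ("matplotlib", 20, 34),
    ("astropy", 9, 22),
    ("pylint", 2, 10),
]


def rate(resolved: int, total: int) -> int:
    """Resolve rate as a whole-number percentage."""
    return round(100 * resolved / total)


for name, resolved, total in RESULTS:
    print(f"{name:<14} {resolved:>3}/{total:<3} {rate(resolved, total):>3}%")

listed_resolved = sum(r for _, r, _ in RESULTS)
listed_total = sum(t for _, _, t in RESULTS)
print(f"listed rows: {listed_resolved}/{listed_total}")
print(f"headline:    335/500 = {100 * 335 / 500:.1f}%")
```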

⚖️ 3.3 SWE-bench Verified Reference Comparison

Important comparison note: the reference scores below are from external model reports and are generally thinking-enabled or harness-specific where noted. Qwopus-3.6-27B-Coder is shown here as a no-thinking, quantized local run, so this table should be read as positioning context rather than a strict same-mode leaderboard.

| Model | Thinking Mode | SWE-bench Verified | Context |
|---|---|---|---|
| Qwopus-3.6-27B-Coder | Off / No-thinking | 67.0 | Q5_K_M, RTX 5090 + MTP, ~100 t/s |
| OpenAI GPT-5 | On | 70.1 | Thinking-on reference |
| OpenAI GPT-5 mini | On | 59.8 | Thinking-on reference |
| OpenAI GPT-5 nano | On | 34.8 | Thinking-on reference |
| GLM-4.7 | On | 70.6 | OpenHands reference |
| GLM-4.5-Air | On | 57.6 | OpenHands reference |
| Qwen3-Coder-30B-A3B-Instruct (2025-07) | Off / No-thinking | 70.3 | No-thinking reference |
| Claude 4.0 Opus | On | 67.6 | Thinking-on reference |
| Claude 4.5 Opus | On | 80.9 | Thinking-on reference |
| Qwen3.6-27B | On | 77.2 | Thinking-on reference |
| Qwen3.5-397B-A17B | On | 76.2 | Thinking-on reference |
| Qwen3.5-27B | On | 75.0 | Thinking-on reference |
| Qwen3.6-35B-A3B | On | 73.4 | Thinking-on reference |
| Gemma4-31B | On | 52.0 | Thinking-on reference |
| Gemma4-26B-A4B | On | 17.4 | Thinking-on reference |

🎮 3.4 Live Thinking-Disabled Demo: Boat Survival

Kyle Hessling also tested Qwopus-3.6-27B-Coder in a small interactive game environment with thinking disabled. The demo is a practical smoke test for fast decision-making, instruction adherence, and local responsiveness beyond static benchmark tables.

[Figure: Boat Survival thinking-disabled Qwopus-3.6-27B-Coder demo screenshot]

**Takeaway:** The headline is not that this no-thinking local run beats every thinking-enabled frontier reference. The important result is that a quantized 27B local coder can reach **67.0%** on the full SWE-bench Verified split while staying fast enough for interactive agent loops. This makes Qwopus-3.6-27B-Coder a practical option for developers who want strong repository-level repair performance without paying the latency cost of long reasoning mode.


🗺️ 4. Training & Data Pipeline Overview

The training process fuses Trace Inversion data augmentation with a Three-Stage Curriculum Learning pipeline. The core engineering focuses on expanding context length gradually while training on reconstructed reasoning traces and real agent trajectories to keep the output format stable.

[ 🗺️ Trace Inversion: Reconstructing Distillation Workflow ]

  A. Surrogate Model Training (Trace Inverter)
     Open-source Model (GLM-5.1 / DS-V4) ──► Complete Reasoning Chain ──► [ Qwen3-235B Compression ] ──► Reasoning Bubbles
                                              │                                   │
                                              └──────────► [ Training ] ◄─────────┘
                                                   (Base: Qwen3-4B-Instruct)
                                                   (Result: Trace-Inverter-4B)

  B. Inversion Phase: Reconstructing Claude-4.7-Max
     _______________________________________________________
    |                                                       |
    |  Claude-4.7-Max API ──► Compressed Bubbles + Answer   |
    |_______________________________________________________|
                      │
                      ▼
    [ 🧠 Trace-Inverter-4B (Logic Reconstructor) ] ──► Synthetic Deep Reasoning Trace (Learnable CoT)
                      │
                      ▼
    [ 🧩 Data Splicing ] ◄────────── (Original Prompt + Response)
    (Embed reconstructed CoT in <think> tags, splicing with original prompt/response)
                      │
                      ▼
             (Result: claude-opus-4.6/4.7 inverted sets)

  C. Final Coder SFT Curriculum Pipeline
     ___________________________________________
    |                                           |
    |       Base Model (Qwopus3.6-27B-v2)       |
    |___________________________________________|
                      │
                      ▼
    [ 📦 Phase 1: Format Inception ] ──► [ 🛠️ Phase 2: Agent/Coding Expansion ] ──► [ 🚀 Phase 3: Long-Context SFT ]
      ( < 4096 tokens )                     ( 4096 - 8192 tokens )                     ( 8192 - 32K tokens )
      (Stable <think> format)               (Tool traces + coding tasks)               (Long / multi-turn / replay)
                      │                                                                            │
                      └─────────────────────────────┬──────────────────────────────────────────────┘
                                                    ▼
                                   _______________________________________________
                                  |                                               |
                                  |   🌟 Final Model: Qwopus-3.6-27B-Coder        |
                                  |_______________________________________________|

Due to the complex and diverse format of agent trajectory datasets, rigorous cleaning and format standardization were applied to ensure data quality.
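The "Data Splicing" step in part B of the diagram can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `splice_sample`, the two-message chat layout, and the exact whitespace around the tags are hypothetical, since the model card only says the reconstructed CoT is embedded in <think> tags and spliced with the original prompt and response.

```python
def splice_sample(prompt: str, reconstructed_cot: str, response: str) -> list[dict]:
    """Build one SFT sample from a prompt, a reconstructed trace, and the original answer.

    The reconstructed reasoning is wrapped in <think> tags and placed ahead
    of the original response in the assistant turn. The message schema and
    whitespace layout are assumptions, not the author's confirmed format.
    """
    assistant = f"<think>\n{reconstructed_cot.strip()}\n</think>\n\n{response.strip()}"
    return [
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": assistant},
    ]


if __name__ == "__main__":
    sample = splice_sample(
        prompt="What is 2 + 2?",
        reconstructed_cot="Add the two numbers: 2 + 2 = 4.",
        response="4",
    )
    for message in sample:
        print(message["role"], "->", repr(message["content"]))
```

Keeping the original response untouched and only adding the reasoning block means the final answers in the training set still come from the source model, while the trace comes from the reconstructor.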


📚 5. Three-Stage Curriculum Learning

To steadily scale reasoning quality under long-context inference, Qwopus-3.6-27B-Coder uses a curriculum-style data mixture building on the approach proven in the Qwopus coder line. The model is first stabilized on short, clean reasoning samples, then exposed to complex coding and agent traces, and finally reinforced with longer contexts plus replay data.

| Curriculum Stage | Focus & Sample Characteristics | Strategy Details |
|---|---|---|
| 📦 Stage 1: Format Inception | • Limit context within 4,096 tokens • Emphasize stable reasoning templates | Focuses on short-to-medium length, cleanly formatted reasoning samples. The primary goal is to establish reliable structured reasoning output, including stable <think> boundaries, before exposing the model to longer chains. |
| 🛠️ Stage 2: Complexity Expansion | • Extend length to 4,096 - 8,192 tokens • Introduce higher-difficulty coding and agent samples | Gradually increases the ratio of complex reasoning chains, code debugging tasks, and multi-turn tool traces. The model learns to connect reasoning, action selection, and environment feedback. |
| 🚀 Stage 3: Long-Context SFT | • Progressively scale samples up to 32K tokens • Use short-sample replay | Pushes the model toward long-context and multi-turn reasoning while replaying high-quality short samples to reduce instruction-following drift. The 32K figure describes the fine-tuning sequence/data mixture target, not a hard architectural limit. |
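The length thresholds above suggest a simple way to route samples into stages. The sketch below is one possible reading, not the author's pipeline: the function name is hypothetical, 32K is taken as 32,768 tokens, and the choice of which stage the exact boundary values (4,096 and 8,192) fall into is an assumption. It also ignores the Stage 3 replay of short samples, which would have to be mixed in separately.

```python
from typing import Optional


def curriculum_stage(n_tokens: int) -> Optional[int]:
    """Map a sample's token count to a curriculum stage (1, 2, or 3).

    Returns None for samples longer than the 32K fine-tuning target.
    Boundary handling is an assumption for illustration.
    """
    if n_tokens < 0:
        raise ValueError("n_tokens must be non-negative")
    if n_tokens <= 4096:
        return 1  # Format Inception: short, cleanly formatted samples
    if n_tokens <= 8192:
        return 2  # Complexity Expansion: harder coding and agent traces
    if n_tokens <= 32768:
        return 3  # Long-Context SFT: long and multi-turn samples
    return None   # beyond the stated fine-tuning target


if __name__ == "__main__":
    for length in (1200, 6000, 20000, 50000):
        print(length, "->", curriculum_stage(length))
```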

🎯 6. Recommended Use Cases & Known Limits

✅ Good Fits

Agentic code generation and repository-level debugging, complex tool-call orchestration, structured multi-step reasoning, code review and patch generation, DevOps scripting and automation, and any workflow requiring deep logical reasoning combined with tool execution.

❌ Known Limits

As a specialized coder model, it has not undergone comprehensive general-domain safety evaluation. Capability decay may occur in non-coding or non-agent tasks. Tool-call behavior depends strongly on prompt format and tool schema consistency. Long-context performance beyond 32K may require RoPE/YaRN scaling.
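Because tool-call behavior depends on schema consistency, one practical habit is to define each tool once and attach the same definition to every request. The sketch below builds a chat-completions request body of the kind accepted by OpenAI-compatible servers. The tool `run_tests` and its parameters are hypothetical, and the model card does not specify the exact tool schema used in training, so the OpenAI-style `tools` layout here is an assumption.

```python
import json

# Hypothetical tool definition in the OpenAI-style "tools" layout.
RUN_TESTS_TOOL = {
    "type": "function",
    "function": {
        "name": "run_tests",  # hypothetical tool name
        "description": "Run the project's test suite and return the output.",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Test file or directory."}
            },
            "required": ["path"],
        },
    },
}


def build_request(user_prompt: str, tools: list[dict]) -> dict:
    """Build a request body that always carries the same tool schema."""
    return {
        "model": "Jackrong/Qwopus3.6-27B-Coder-MTP-GGUF",
        "messages": [{"role": "user", "content": user_prompt}],
        "tools": tools,
    }


body = build_request("Fix the failing test in tests/test_io.py", [RUN_TESTS_TOOL])
print(json.dumps(body, indent=2))
```

Whether a given server renders these definitions into the prompt in the way the model expects depends on its chat template, so it is worth inspecting the rendered prompt when tool calls misfire.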

Deployment note: The model may emit reasoning inside <think> and </think> tags. Front-end applications and agent frameworks should parse or hide these sections where appropriate. For tool calling, ensure the prompt format and system prompt match the training data configuration to activate agent capabilities.
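A front end can separate the reasoning from the visible answer with a small helper like the one below. This is a minimal sketch: the function name is hypothetical, and it only handles complete <think>...</think> pairs, so a response truncated before the closing tag would need extra handling.

```python
import re

# Non-greedy match so several separate <think> blocks are handled one by one.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, visible_answer) for a model response."""
    reasoning = "\n".join(block.strip() for block in THINK_RE.findall(text))
    visible = THINK_RE.sub("", text).strip()
    return reasoning, visible


if __name__ == "__main__":
    raw = "<think>\nThe traceback points at utils.py.\n</think>\nThe bug is in utils.py."
    reasoning, visible = split_reasoning(raw)
    print("hidden:", reasoning)
    print("shown: ", visible)
```

Responses without any tags pass through unchanged, which keeps the helper safe to apply when thinking is disabled.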


⚠️ 7. Training & Deployment Notes

Compatibility Notes

  • Tool Calling Format: To activate the model’s agent capabilities, ensure the prompt format and system prompt include appropriate tool definitions and match the training data format.
  • Reasoning Output Extraction: The model’s thinking process is wrapped in <think> and </think> tags. Front-end applications may need to parse and hide these tags.
  • Long-Context Usage: For contexts beyond 32K, consider enabling RoPE/YaRN scaling (e.g., `--rope-scaling yarn --rope-scale 4 --yarn-orig-ctx 32768` in llama.cpp).
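The example flags correspond to a 4x extension of a 32,768-token base, that is, a 131,072-token context. The helper below derives the scale factor from a target context size and assembles a `llama-server` command line. It is a sketch under assumptions: `model.gguf` is a placeholder path, rounding the scale up to a whole number is a simplification, and whether quality holds at the extended length is not something the model card reports.

```python
import shlex


def yarn_args(model_path: str, target_ctx: int, orig_ctx: int = 32768) -> list[str]:
    """Build llama-server arguments, adding YaRN scaling only when needed.

    The scale is the smallest whole number with orig_ctx * scale >= target_ctx.
    """
    args = ["llama-server", "-m", model_path, "-c", str(target_ctx)]
    if target_ctx <= orig_ctx:
        return args  # fits in the original context, no scaling flags
    scale = -(-target_ctx // orig_ctx)  # ceiling division
    return args + [
        "--rope-scaling", "yarn",
        "--rope-scale", str(scale),
        "--yarn-orig-ctx", str(orig_ctx),
    ]


if __name__ == "__main__":
    # "model.gguf" is a placeholder for a downloaded quantization.
    print(shlex.join(yarn_args("model.gguf", 131072)))
    print(shlex.join(yarn_args("model.gguf", 16384)))
```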


📋 8. Benchmark Progress

The first completed evaluation is the no-thinking SWE-bench Verified run reported above. Additional local agentic benchmarks remain pending and will be added after testing.

| Benchmark | Status | Result / Reference |
|---|---|---|
| SWE-bench Verified | ✅ Completed | 335/500 = 67.0% (thinking-off, Q5_K_M, RTX 5090 + MTP) |
| BugFind-15 | 📋 Pending | 9B reference: 79 |
| HermesAgent-20 | 📋 Pending | 9B reference: 85 |
| ToolCall-15 | 📋 Pending | 9B reference: 100 |
| InstructFollow-15 | 📋 Pending | 9B reference: 93 |


📚 9. Resources & Guides

👉 **GitHub Repository: Jackrong-llm-finetuning-guide** Access the repository to dive into the codebase and reproduce our results.

👉 **Qwen MTP GGUF Processing Workflow** A custom splitting and merging methodology designed specifically for Qwen series Multi-Token Prediction (MTP) heads.

👉 **benchlocal Evaluation Framework** The evaluation framework used to run the local agentic and coding benchmarks.

👉 **Qwopus3.6-27B-v2 Model Card** Base model card with full MMLU-Pro, SWE-bench, and throughput benchmarks.


🙏 10. Acknowledgements

Special thanks to:

  • The Qwen team for providing the powerful Qwen3.6-27B base model.
  • Unsloth for providing the highly efficient fine-tuning framework.
  • Kyle Hessling for the close collaboration on hardware, training infrastructure, and evaluation support.
  • Open-source datasets and community contributors, particularly **lambda/hermes-agent-reasoning-traces** for the high-quality agent trajectory data.

📖 11. Citation

@misc{jackrong_qwopus36_27b_coder,
  title        = {Qwopus-3.6-27B-Coder},
  author       = {Jackrong},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Jackrong/Qwopus-3.6-27B-Coder}}
}


Datasets used to train Jackrong/Qwopus3.6-27B-Coder-MTP-GGUF

  • lambda/hermes-agent-reasoning-traces
  • Jackrong/Claude-opus-4.6-TraceInversion-9000x
  • Jackrong/Claude-opus-4.7-TraceInversion-5000x
