@0xSero: Just added 2 new model compressions: Hy3-FP8 & NVFP4 I recommend trying this model it's very strong and fits on 256gb o…


Summary

0xSero has released new FP8 and NVFP4 quantized versions of the Tencent Hy3-preview model, enabling it to run on 256GB VRAM with full context.


Cached at: 05/10/26, 08:23 AM

Just added 2 new model compressions:

Hy3-FP8 & NVFP4

I recommend trying this model it’s very strong and fits on 256gb of vram with full context

https://t.co/UQI63BCFiJ


0xSero/Hy3-preview-NVFP4 · Hugging Face

Source: https://huggingface.co/0xSero/Hy3-preview-NVFP4

Hy3-preview NVFP4A16

This is a checkpoint-only NVFP4A16 quantization of tencent/Hy3-preview, produced with llmcompressor.entrypoints.model_free.model_free_ptq.

  • Base model: tencent/Hy3-preview
  • Quantization scheme: NVFP4A16
  • Ignored modules/patterns: lm_head, model.embed_tokens, re:.*router.gate$, re:.*expert_bias$
  • Source snapshot: recorded in QUANTIZATION_MANIFEST.json
  • License: inherits Tencent Hy Community License Agreement from the base model; original LICENSE is included.
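
The ignore list above mixes plain names with entries prefixed by re:. As a rough illustration of how such a list could select the tensors left unquantized, here is a minimal sketch. It assumes that a re: entry is a regular expression matched from the start of the module name and that any other entry must equal the module name exactly. That convention is an assumption and is not confirmed by the model card. The helper is_ignored and the layer names in the usage note are hypothetical; the real HYV3 tensor names are not given in the source.

```python
import re

# Ignore entries copied from the model card.
IGNORE = ["lm_head", "model.embed_tokens", "re:.*router.gate$", "re:.*expert_bias$"]


def is_ignored(module_name: str, patterns=IGNORE) -> bool:
    """Return True if module_name matches any ignore entry.

    Assumed convention (not confirmed by the source): entries starting
    with "re:" are regular expressions matched from the start of the
    name; all other entries must equal the name exactly.
    """
    for pattern in patterns:
        if pattern.startswith("re:"):
            if re.match(pattern[len("re:"):], module_name):
                return True
        elif module_name == pattern:
            return True
    return False
```

Under these assumptions, a hypothetical name such as model.layers.0.mlp.router.gate would be skipped, while model.layers.0.self_attn.q_proj would be quantized.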

Notes

This release quantizes safetensors weights without importing the custom HYV3 model class. Router gates, expert bias tensors, embeddings, and lm_head are preserved unquantized for compatibility/conservatism.
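
Working on safetensors weights without the model class is possible because a safetensors file describes itself: it starts with an 8-byte little-endian header length, followed by a JSON header listing each tensor's dtype, shape, and byte offsets. The sketch below shows this with the standard library only. It is not the llmcompressor implementation; read_safetensors_header and write_toy_safetensors are hypothetical helper names introduced for illustration.

```python
import json
import struct


def read_safetensors_header(path: str) -> dict:
    """Read only the JSON header of a safetensors file (no tensor data)."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header size as an unsigned little-endian integer.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))


def write_toy_safetensors(path: str, tensors: dict) -> None:
    """Write a tiny safetensors file for experimentation.

    tensors maps a name to (dtype string, shape list, raw bytes).
    """
    header, blobs, offset = {}, [], 0
    for name, (dtype, shape, data) in tensors.items():
        header[name] = {
            "dtype": dtype,
            "shape": shape,
            "data_offsets": [offset, offset + len(data)],
        }
        offset += len(data)
        blobs.append(data)
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))
        f.write(header_bytes)
        for blob in blobs:
            f.write(blob)
```

Calling read_safetensors_header on a checkpoint shard returns the tensor names and shapes, which is enough to decide per tensor whether to quantize it or copy it through unchanged.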
