UnslothAI announced that its 4-bit Qwen3.6 MTP GGUF model can search more than 70 websites from a single prompt while running locally in 20 GB of RAM via Unsloth Studio. The update also adds automatic support for MTP and speculative decoding.
Cyankiwi introduced an updated version of their AWQ 4-bit quantization method that jointly optimizes scales and quantization ranges, achieving lower KL divergence than existing methods on Llama-3 models.
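For readers unfamiliar with the idea, here is a toy sketch of what "jointly optimizing scales and quantization ranges" against a KL-divergence objective can look like. It is not Cyankiwi's implementation: the grid search, the per-channel scale rule `act_mag ** alpha`, the clipping ratio, and every function name below are illustrative assumptions, applied to one small random linear layer rather than a Llama-3 model.

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) between two discrete distributions.
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def quantize_row(row, clip_ratio, bits=4):
    # Symmetric round-to-nearest quantization of one output row.
    # clip_ratio < 1 shrinks the quantization range and clips outliers.
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit
    max_abs = max(abs(w) for w in row) * clip_ratio
    if max_abs == 0.0:
        return list(row)
    step = max_abs / qmax
    return [max(-qmax - 1, min(qmax, round(w / step))) * step for w in row]

def mean_kl(W, calib, act_mag, alpha, clip_ratio):
    # Assumed AWQ-style scaling: multiply weight column j by s_j and divide
    # activation j by s_j, so the full-precision product is unchanged.
    s = [m ** alpha for m in act_mag]
    W_q = [quantize_row([w * sj for w, sj in zip(row, s)], clip_ratio) for row in W]
    total = 0.0
    for x in calib:
        ref = softmax(matvec(W, x))
        approx = softmax(matvec(W_q, [xi / sj for xi, sj in zip(x, s)]))
        total += kl_divergence(ref, approx)
    return total / len(calib)

def joint_search(W, calib, alphas, clips):
    # Search scale exponent and clipping ratio together, not one after the other.
    n_in = len(W[0])
    act_mag = [sum(abs(x[j]) for x in calib) / len(calib) + 1e-8 for j in range(n_in)]
    results = {(a, c): mean_kl(W, calib, act_mag, a, c) for a in alphas for c in clips}
    best = min(results, key=results.get)
    return best, results

random.seed(0)
n_out, n_in = 8, 16
W = [[random.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
# A few input channels carry much larger activations (hypothetical toy data).
channel_gain = [10.0 if j < 2 else 1.0 for j in range(n_in)]
calib = [[random.gauss(0.0, 1.0) * g for g in channel_gain] for _ in range(16)]

alphas = [0.0, 0.25, 0.5, 0.75, 1.0]
clips = [0.6, 0.7, 0.8, 0.9, 1.0]
(best_alpha, best_clip), results = joint_search(W, calib, alphas, clips)
baseline_kl = results[(0.0, 1.0)]  # no scaling, no clipping
best_kl = results[(best_alpha, best_clip)]
print(f"baseline KL={baseline_kl:.6f}  best KL={best_kl:.6f} "
      f"(alpha={best_alpha}, clip={best_clip})")
```

Because the grid includes `(alpha=0.0, clip=1.0)`, which is plain round-to-nearest with no scaling or clipping, the search can never score worse than that baseline on the calibration data. A real method would evaluate on held-out text and over a full model's output distribution.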