llm-compression

#llm-compression

Weight Pruning Amplifies Bias: A Multi-Method Study of Compressed LLMs for Edge AI

arXiv cs.LG · 2d ago

This study reports a "Smart Pruning Paradox": activation-aware pruning methods such as Wanda preserve perplexity but significantly amplify bias in large language models deployed on edge devices.

#llm-compression

AngelSlim/Hy-MT1.5-1.8B-1.25bit

Hugging Face Models Trending · 2026-04-28

Tencent's AngelSlim team released Hy-MT1.5-1.8B-1.25bit, a 1.25-bit quantized machine translation model that supports 33 languages and fits in 440MB for on-device use. It uses the Sherry quantization algorithm and is described as delivering translation quality comparable to much larger models.
