model-pruning

#model-pruning

@dealignai: MiniMax m3, made for 128gb Mac’s Thank you to @hornsby_andrew for preparing the pruning calibration dataset and doing e…

X AI KOLs Timeline · 2026-06-18

A pruned and quantized version of MiniMax-M3 (MiniMax-M3-Medium-JANG_2L), optimized to run on 128 GB Macs using vMLX. It combines 32% expert pruning with JANG_2L mixed-precision quantization to fit within ~105 GB.

Rethinking Layer Relevance in Large Language Models Beyond Cosine Similarity

arXiv cs.LG · 2026-05-15

This paper demonstrates that cosine similarity is a poor proxy for assessing layer importance in LLMs, and proposes using the actual accuracy drop from layer removal as a more robust metric.
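The contrast in that summary can be illustrated with a toy sketch. This is not the paper's code or experimental setup: the two-layer "model", the data, and every function name below are hypothetical, chosen only to show how a layer that barely changes the hidden state (high input/output cosine similarity) can still matter most for accuracy.

```python
import math
from typing import Callable, List, Optional

Vector = List[float]
Layer = Callable[[Vector], Vector]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def forward(layers: List[Layer], x: Vector, skip: Optional[int] = None) -> Vector:
    """Run the layer stack, optionally skipping (removing) one layer."""
    h = list(x)
    for i, layer in enumerate(layers):
        if i != skip:
            h = layer(h)
    return h


def accuracy(layers, inputs, labels, skip=None) -> float:
    """Toy classifier: predict 1 when the first coordinate is positive."""
    hits = sum(int(forward(layers, x, skip)[0] > 0) == y
               for x, y in zip(inputs, labels))
    return hits / len(inputs)


def cosine_scores(layers, inputs) -> List[float]:
    """Mean cosine similarity between each layer's input and output."""
    totals = [0.0] * len(layers)
    for x in inputs:
        h = list(x)
        for i, layer in enumerate(layers):
            out = layer(h)
            totals[i] += cosine(h, out)
            h = out
    return [t / len(inputs) for t in totals]


def accuracy_drops(layers, inputs, labels) -> List[float]:
    """Accuracy lost when each layer is removed, one at a time."""
    base = accuracy(layers, inputs, labels)
    return [base - accuracy(layers, inputs, labels, skip=i)
            for i in range(len(layers))]


# Hypothetical layers: layer 0 flips a coordinate the classifier ignores,
# layer 1 nudges the coordinate the classifier depends on.
layers: List[Layer] = [
    lambda h: [h[0], -h[1]],
    lambda h: [h[0] + 0.2, h[1]],
]
inputs = [[-0.1, 5.0], [-0.1, -5.0], [0.3, 5.0], [-0.5, 5.0]]
labels = [1, 1, 1, 0]

cos = cosine_scores(layers, inputs)
drops = accuracy_drops(layers, inputs, labels)
for i, (c, d) in enumerate(zip(cos, drops)):
    print(f"layer {i}: mean cosine = {c:.3f}, accuracy drop = {d:.2f}")
```

In this toy setup a cosine-based ranking would mark layer 1 as the most redundant, while removing it is the only change that costs accuracy. Real evaluations would measure the drop on held-out benchmarks rather than on four hand-picked points.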

Pruning Unsafe Tickets: A Resource-Efficient Framework for Safer and More Robust LLMs

arXiv cs.CL · 2026-04-20

This paper introduces a resource-efficient pruning framework that identifies and removes parameters associated with unsafe behaviors in large language models while preserving utility. Using gradient-free attribution and a Lottery Ticket Hypothesis perspective, the method reduces unsafe generations and improves robustness against jailbreak attacks with minimal performance loss.

