Tag
The paper identifies 'Load-Bearing Wall' dimensions in pre-trained models that retain task-specific knowledge not fully captured by task vectors in model merging, and proposes PACT (PreserveAnchoredCores) to preserve these cores, achieving state-of-the-art performance across benchmarks.
ProfiLLM introduces an agentic LLM pipeline that generates utility-aligned user profiles from platform-scale behavioral logs for industrial ride-hailing dispatch, achieving significant improvements in outcome prediction and GMV in production at DiDi.
This paper presents a self-supervised transfer learning approach for parking spot occupancy recognition that achieves high accuracy (up to 97.8%) with minimal labeled data using a two-stage training strategy with SimCLR and ResNet-50.
Explores whether LoRA is the best parameter-efficient fine-tuning technique and introduces the PEFT library's tools to compare methods.
Fine-tuning open models like Alibaba's Qwen with LoRA can match or exceed frontier model performance on error classification tasks.
A LoRA that adapts Ideogram 4 to generate high-quality images in as few as 2 steps without CFG, using a novel continuous turbo training method.
VibeThinker, a 3B-parameter model fine-tuned from Qwen 2.5, achieves performance comparable to Claude Opus 4.5 and to much larger models such as DeepSeek v3 through post-training that includes multi-path thinking and staged training on math, coding, and science.
Daniel Han built Unsloth, a tool that rewrites GPU kernels to make fine-tuning 2-3 times faster on a single GPU, enabling many open-source users to train models without a cluster.
A community fine-tune of Qwen3.6-27B improves real bug-fixing on SWE-bench while maintaining quality, unlike synthetic distillations that regress.
A curated list of X/Twitter accounts that explain AI concepts and teach how to build tools, agents, and frameworks, covering retrieval, testing, fine-tuning, and more.
Mia-AiLab releases Qwable-3.6-27b-MTP, a fully fine-tuned checkpoint of Qwen3.6-27B trained on a cleaned Fable 5 reasoning and instruction dataset, focused on code, structured reasoning, and local inference with MTP layers.
This paper presents a method for fine-tuning LLMs to predict PHQ-9 depression severity scores directly from transcripts of conversations with an AI mental health application, achieving strong correlation with clinical thresholds using an augmented dataset of 6,283 users.
This study evaluates bilingual fine-tuning with language identification tokens for improving ASR in low-resource languages across nine diverse language pairs, finding that high LID accuracy is beneficial and that providing the LID token at inference can boost performance when LID accuracy is low.
Introduces SuCo, a two-stage training framework for Large Reasoning Models that uses the concept of Minimal Sufficient CoT to reduce reasoning tokens while improving accuracy across math, code, and science benchmarks.
This paper addresses the problem of spoken language adherence in multimodal LLMs for ASR, proposing a soft prompting approach and a novel metric to quantify language violations. It evaluates three mitigation strategies (zero-shot prompting, supervised fine-tuning, and chain-of-thought reasoning) across multiple languages to improve transcription fidelity.
This paper proposes a distribution-aware training approach for modeling next-event predictions in concurrent Go programs, treating scheduler nondeterminism as a signal. Fine-tuning a 7B model on fewer than a thousand traces achieves 36.2% accuracy on production bugs, outperforming Gemini 3.5 Flash zero-shot.
RepSelect introduces a method for robust LLM unlearning that isolates forget-set-specific representations by collapsing top principal components of weight gradients, achieving 4-50× better robustness against relearning attacks compared to existing baselines across multiple model families.
A Microsoft and SJTU research paper introduces FastContext, a dedicated exploration subagent for coding agents that separates repository navigation from task solving, reducing orchestrator token usage by up to 60% and improving resolution rates by 5.5% on SWE-bench benchmarks.
Introduces OmniAgent, an omni-modal agent that uses an iterative Observation-Thought-Action cycle with active perception to achieve superior long video understanding, outperforming larger models like Qwen2.5-VL-72B on benchmarks.
GLM 5.2, an open-weight AI model comparable to top closed models, has been released and is now running on MLX on two Mac Studios (M3 Ultra).