fine-tuning

#fine-tuning

@no_stp_on_snek: what actually surprised me fine-tuning a small open model. note I'm fairly new in this area so some of this may seem obv…

X AI KOLs Timeline · 11h ago Cached

A developer shares surprising lessons from fine-tuning a small open model: base models often already max out on the intended improvements, the real weakness is behavioral (caving), and fine-tuning requires careful measurement and balancing.

Knowledge Agents: Beat Frontier Models with Better Structure (18 minute read)

TLDR AI · yesterday Cached

The article presents 'knowledge agents', a methodology that injects relevant knowledge into AI agents via a hybrid retrieval system, allowing smaller models to outperform large frontier models across specialized domains like financial markets, policy, and healthcare.

Is Gemma 4 going to be the next Mistral (or Qwen3.6) one day? Concerning the lack of finetunes

Reddit r/LocalLLaMA · yesterday

An analysis exploring why Gemma 4, despite advantages like QAT and vision support, lacks community finetunes compared to Mistral, and whether community inertia will eventually shift.

@gabepereyra: Harvey partnered with @appliedcompute to train a legal agent. We optimized each part of the agent stack, including the …

X AI KOLs Following · yesterday Cached

Harvey partnered with Applied Compute to train a legal agent, optimizing the agent stack and post-training the GLM-5.1 model using reward signals from their Legal Agent Benchmark.

@0xSero: Highly recommended educational content. LoRA is one of the coolest things to dabble in, lets anyone fine tune models re…

X AI KOLs Timeline · yesterday Cached

This article delves into the principles of LoRA and its variants (QLoRA, VeRA, DoRA), explaining how low-rank decomposition reduces trainable parameters to enable efficient fine-tuning of large models.
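The parameter saving from low-rank decomposition can be checked with simple arithmetic. The sketch below is illustrative only and is not taken from the linked article: it assumes a single weight matrix of shape `d_out × d_in` whose update is replaced by two factors of rank `rank`, and the layer size and rank are hypothetical values chosen for the example.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole d_out x d_in matrix is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for a low-rank update B @ A.

    A has shape (rank, d_in) and B has shape (d_out, rank),
    so the total is rank * (d_in + d_out).
    """
    return rank * (d_in + d_out)


if __name__ == "__main__":
    # Hypothetical layer size and rank, chosen only for illustration.
    d_in, d_out, rank = 4096, 4096, 8
    full = full_params(d_in, d_out)
    lora = lora_params(d_in, d_out, rank)
    print(f"full update: {full:,} trainable parameters")
    print(f"low-rank update: {lora:,} trainable parameters")
    print(f"fraction trained: {lora / full}")
```

For these assumed sizes the low-rank factors hold 1/256 of the parameters of the full matrix; the fraction shrinks further as the layer grows or the rank drops.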

NEX-N2-mini: "There is no Pareto frontier. I am Pareto". This Qwen3.5-MoE fine tune fixed 3.5 and 3.6 overthinking apparently on my tests.

Reddit r/LocalLLaMA · yesterday

A fine-tuned version of Qwen3.5-MoE called NEX-N2-mini reportedly fixes overthinking issues seen in Qwen 3.5 and 3.6 models.

@TheAhmadOsman: INCREDIBLE RESOURCE The MOST COMPLETE GUIDE for understanding LLMs from first principles is now available online to rea…

X AI KOLs Timeline · 2d ago Cached

A comprehensive free guide explaining LLMs from first principles, covering tokens, transformers, attention, fine-tuning, and local deployment.

Good results fine tuning a local LLM like Qwen 3:0.6B to categorize questions

Hacker News Top · 2d ago Cached

A developer fine-tunes a small Qwen 3 0.6B model using the Unsloth framework to categorize household questions, achieving good results with only 850 training examples.

@uzairansar: Qwythos-9B-Claude-Mythos-5 Fine Tune with 1M Context released! Empero just released their Claude Mythos Fine Tune based…

X AI KOLs Timeline · 2d ago Cached

Empero released Qwythos-9B-Claude-Mythos-5, a full-parameter reasoning model fine-tuned with 1M context, based on synthetic chain-of-thought data from Fable-5 and Mythos-5 session logs.

@analogalok: gemma-4-12B-agentic-fable5-composer2.5 V2 is out. the agentic upgrade to the model trained on Fable 5's reasoning. Runn…

X AI KOLs Timeline · 2d ago Cached

A new fine-tuned version of Gemma 4 12B, trained on Fable 5's reasoning, achieves a significant jump in agentic coding benchmarks (from 15% to 55%) and can run locally on an 8GB VRAM GPU using a custom fork of llama.cpp.

A Comparative Study of Pretrained Transformer Models for Quranic ASR: Speech Representations, Label Formats, and Dataset Composition

arXiv cs.AI · 3d ago Cached

This paper presents a systematic empirical study of fine-tuning pretrained Transformer models (Wav2Vec2.0, HuBERT, XLS-R) for Quranic Automatic Speech Recognition (ASR), achieving a WER of 0.08 on the EveryAyah subset and reducing training time from 140 to 40 hours, with Wav2Vec2-XLSR-53 providing the best representation.
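For readers unfamiliar with the metric, word error rate (WER) is conventionally the word-level edit distance between the reference and the hypothesis, divided by the number of reference words, so a WER of 0.08 corresponds to roughly 8 errors per 100 reference words. The sketch below is a generic implementation under that standard definition, not the paper's evaluation code; whitespace tokenization and the sample strings are assumptions made for illustration.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()  # assumption: whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the first i-1 reference words
    # and the first j hypothesis words (classic Levenshtein recurrence).
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # match or substitution
        prev = cur
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(wer("a b c d", "a x c"))
```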

[NEW MODEL] SupraLabs just released supra-title-FFT-preview, 115K samples, almost 10x our first chat title dataset

Reddit r/LocalLLaMA · 3d ago

SupraLabs released supra-title-FFT-preview, a fully fine-tuned 0.4B-parameter model for chat title generation, trained on 115K samples, nearly 10x the size of their first chat title dataset.

@0x0SojalSec: Imagine fine-tuning a 31B parameter multimodal model for free, on Kaggle. Now you can train this massive 31B dense mul…

X AI KOLs Timeline · 4d ago Cached

Unsloth enables free fine-tuning of a 31B parameter multimodal model on Kaggle using 4-bit quantization, requiring only 22-24GB VRAM for local runs.
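A back-of-the-envelope calculation shows why 4-bit quantization matters at this scale. The sketch below only estimates the memory needed to store the weights; it is not Unsloth's accounting. The gap between the weight footprint and the 22-24GB figure quoted above is presumably taken up by activations, adapter weights, optimizer state, and quantization metadata, which is an assumption on our part.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    params = 31e9  # 31B parameters, as stated in the summary
    for bits in (16, 8, 4):
        print(f"{bits}-bit weights: {weight_memory_gb(params, bits):.1f} GB")
```

Under these assumptions the weights alone need about 62 GB at 16-bit precision and about 15.5 GB at 4-bit.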

A Verifiable Search Is Not a Learnable Chain-of-Thought

Hugging Face Daily Papers · 4d ago Cached

This paper demonstrates that training models on chain-of-thought demonstrations fails for tasks requiring backtracking search, showing that search procedures cannot be faithfully imitated. The authors find that even when models perform well on sub-components, they cannot carry forward a left-to-right derivation for cryptarithmetic tasks.

Toward Open Weight Models Without Risks: Separating Public and Private Capabilities in LLMs

Hugging Face Daily Papers · 5d ago Cached

This paper introduces Tiered Language Models (TLMs), which allow a single set of open-weight model parameters to support multiple capability levels controlled by secret keys. The method enables selective exposure of private capabilities while preserving public model behavior and resisting extraction.

@OpenAI: We also tested whether alignment persisted under pressure. The model was harder to steer toward harmful behavior with a…

X AI KOLs · 5d ago Cached

OpenAI reports that their model shows increased resistance to harmful behavior through adversarial prompting and fine-tuning, indicating improved alignment persistence under pressure.

@oneill_c: 1/ We fine-tune a lot of customer models, so we decided to systematically try and figure out some best practices for fi…

X AI KOLs Following · 5d ago Cached

The thread shares systematic experimental findings on fine-tuning best practices, varying one SFT lever at a time across dense and MoE models up to 235B on four real-world customer datasets with custom evals to eliminate confounders.

@MiaAI_lab: I fine-tuned Gemma 4 12B with Fable-5 style reasoning and assistant traces and released it as Gemmable 4 12b. **Availab…

X AI KOLs Timeline · 5d ago Cached

Mia-AiLab released Gemmable 4 12B, a fine-tuned version of Google's Gemma 4 12B model using Fable-5 style reasoning and assistant traces, available in GGUF and MLX formats for local inference.

LocalLLaMA crowdsourced coding dataset

Reddit r/LocalLLaMA · 5d ago

A community member proposes creating a crowdsourced coding dataset for local LLMs to enable collaborative model training and fine-tuning, addressing concerns about future availability of open-weight models.

PragReST: Self-Reinforcing Counterfactual Reasoning for Pragmatic Language Understanding

arXiv cs.CL · 5d ago Cached

PragReST is a self-supervised framework that improves LLM pragmatic reasoning by generating counterfactual reasoning traces and training models via supervised fine-tuning and reinforcement learning, achieving significant gains on pragmatic benchmarks without human-labeled data.
