training

#training

AstraFlow: Dataflow-Oriented Reinforcement Learning for Agentic LLMs

Hugging Face Daily Papers · 2026-05-15

AstraFlow is a dataflow-oriented RL system that enables efficient multi-policy collaborative training and elastic scaling for agentic LLMs, achieving a 2.7x training speedup over existing systems.

#training

@SOURADIPCHAKR18: Two things make this work. 1. Spike-aware pedagogy rewards: only reward the model for being correct AND plausible. Puni…

X AI KOLs Following · 2026-05-14

Describes a training technique involving spike-aware pedagogy rewards that penalize implausible jumps, and surprisal-gated imitation where the student learns easy tokens quickly and hard ones slowly.

#training

@stanfordnlp: There are two paths to learning the details (aka “tricks” or “secrets”) of successfully training state-of-the-art langu…

X AI KOLs Following · 2026-05-13

Stanford NLP promotes the CS336 course as a path to learning the tricks of successfully training state-of-the-art language models.

#training

MinT: Managed Infrastructure for Training and Serving Millions of LLMs

Hugging Face Daily Papers · 2026-05-13

MinT is a managed infrastructure system that enables efficient training and serving of millions of LLMs by keeping base models resident and moving lightweight LoRA adapters, scaling across model architectures, storage, and policy management.

#training

@eglyman: we trained a .35b-parameter model to navigate spreadsheets better than opus 4.6. normal corporate card company stuff.

X AI KOLs Following · 2026-05-07

The author reports training a 350M-parameter (0.35B) model that navigates spreadsheets better than Anthropic's Opus 4.6.

#training

ROCm Status in mid 2026 [D]

Reddit r/MachineLearning · 2026-05-07

The author asks about the current viability of AMD's ROCm ecosystem for AI training in mid-2026, comparing it to NVIDIA's CUDA and asking if it has reached a 'just works' stage for PyTorch.

#training

How ChatGPT learns about the world while protecting privacy

OpenAI Blog · 2026-05-06

OpenAI explains how ChatGPT learns from public data and user interactions while protecting privacy through filtering and user controls.

#training

Transition

Product Hunt · 2026-05-05

Transition is an AI-powered coaching platform designed to optimize athletic training routines and improve race performance for runners.

#training

Our eighth generation TPUs: two chips for the agentic era

Hacker News Top · 2026-04-22

Google unveils 8th-gen TPUs: TPU 8t for training and TPU 8i for inference, purpose-built for power-efficient, large-scale AI agent workloads and arriving later this year.

#training

What should i do to have a good OD model?[P]

Reddit r/MachineLearning · 2026-04-20

A user is seeking advice on improving their object detection model trained with YOLO11n for deployment on a Raspberry Pi 5, struggling with the gap between theoretical mAP50 metrics and practical detection performance.

#training

kaizen

Product Hunt · 2026-04-16

Kaizen is a training platform that dynamically adapts running workouts based on user performance and activity data.

#training

Ulysses Sequence Parallelism: Training with Million-Token Contexts

Hugging Face Blog · 2026-03-09

Ulysses Sequence Parallelism is a technique for training LLMs with million-token contexts by distributing sequence chunks across GPUs, reducing memory requirements and enabling efficient long-context training. It integrates with Hugging Face Accelerate, Transformers Trainer, and TRL, with support for Flash Attention and DeepSpeed ZeRO.

#training

Introducing OpenAI Academy for News Organizations

OpenAI Blog · 2025-12-17

OpenAI has launched the OpenAI Academy for News Organizations, a learning hub offering on-demand training, playbooks, and practical AI use cases for journalists and publishers, developed in partnership with the American Journalism Project and The Lenfest Institute.

#training

Training Agents: Live tutorial on how to fine-tune a coding agent for continual learning

YouTube AI Channels · 3d ago

This live tutorial demonstrates how to fine-tune a small code agent (Gemma 4 2B) on an agent trace dataset using supervised fine-tuning (SFT), and how to automate hyperparameter sweeps and evaluation using HF Jobs and Trackio, embodying the concept of "using agents to train agents."

#training

Training to cycle across Antarctica | with ChatGPT

YouTube AI Channels · 2026-06-12

An explorer uses ChatGPT as a virtual assistant to plan a solo, unsupported cycling trip to the South Pole, and refines preparation through weight reduction, troubleshooting, and weight training.

#training

FareedKhan-dev/train-llm-from-scratch

GitHub Trending (daily) · 2026-05-30

A GitHub repository providing code to train large language models from scratch using PyTorch, based on the "Attention Is All You Need" paper, with support for billion-parameter models on a single GPU.

#training

May 8, 2026 · Alignment · Teaching Claude why

Anthropic Research · 2026-05-08

Anthropic shares lessons from improving Claude's alignment training, achieving perfect scores on agentic misalignment evaluations by teaching underlying principles rather than just demonstrations.
