Tag
AstraFlow is a dataflow-oriented RL system that enables efficient multi-policy collaborative training and elastic scaling for agentic LLMs, achieving a 2.7x training speedup over existing systems.
A training technique that combines spike-aware pedagogy rewards, which penalize implausible jumps, with surprisal-gated imitation, in which the student learns easy tokens quickly and hard ones slowly.
Stanford NLP promotes the CS336 course as a path to learning the tricks of successfully training state-of-the-art language models.
MinT is a managed infrastructure system that enables efficient training and serving of millions of LLMs by keeping base models resident and moving lightweight LoRA adapters, scaling across model architectures, storage, and policy management.
A developer trained a 350M-parameter model capable of navigating spreadsheets better than Anthropic's Opus 4.6.
The author asks about the current viability of AMD's ROCm ecosystem for AI training in mid-2026, comparing it to NVIDIA's CUDA and asking if it has reached a 'just works' stage for PyTorch.
OpenAI explains how ChatGPT learns from public data and user interactions while protecting privacy through filtering and user controls.
Transition is an AI-powered coaching platform designed to optimize athletic training routines and improve race performance for runners.
Google unveils 8th-gen TPUs: TPU 8t for training and TPU 8i for inference, purpose-built for power-efficient, large-scale AI agent workloads and arriving later this year.
A user seeks advice on improving a YOLO11n object detection model for deployment on a Raspberry Pi 5, citing a gap between its mAP50 scores and its detection performance in practice.
Kaizen is a training platform that dynamically adapts running workouts based on user performance and activity data.
Ulysses Sequence Parallelism is a technique for training LLMs with million-token contexts by distributing sequence chunks across GPUs, reducing memory requirements and enabling efficient long-context training. It integrates with HuggingFace Accelerate, Transformers Trainer, and TRL, with support for Flash Attention and DeepSpeed ZeRO.
OpenAI has launched the OpenAI Academy for News Organizations, a learning hub offering on-demand training, playbooks, and practical AI use cases for journalists and publishers, developed in partnership with the American Journalism Project and The Lenfest Institute.
This live tutorial demonstrates how to fine-tune a small code agent (Gemma 4 2B) on an agent trace dataset with supervised fine-tuning (SFT), and how to automate hyperparameter sweeps and evaluation with HF Jobs and Trackio, illustrating the idea of "using agents to train agents."
An explorer uses ChatGPT as a virtual assistant to plan a solo, unsupported cycling trip to the South Pole, and refines preparation through weight reduction, troubleshooting, and weight training.
A GitHub repository providing code to train large language models from scratch using PyTorch, based on the "Attention Is All You Need" paper, with support for billion-parameter models on a single GPU.
Anthropic shares lessons from improving Claude's alignment training, achieving perfect scores on agentic misalignment evaluations by teaching underlying principles rather than just demonstrations.