language-model-training


@ChengleiSi: Excited to share these preliminary results on our internal autoresearch system @Recursive_SI, where we achieve SOTA on …

X AI KOLs Following · 21h ago Cached

Recursive's automated AI research system reports preliminary state-of-the-art results on NanoChat, NanoGPT Speedrun, and GPU kernel benchmarks. It automates the research loop without task-specific adaptations, and the artifacts are being open-sourced for further inspection.


@Recursive_SI: https://x.com/Recursive_SI/status/2064980090702962699

X AI KOLs Timeline · yesterday Cached

Recursive releases early results from its automated AI research system, reporting state-of-the-art performance in fixed-budget language model training, small-model training speed, and GPU kernel optimization, and open-sources the accompanying artifacts.


Decoupling the Benefits of Subword Tokenization for Language Model Training via Byte-level Simulation

Hugging Face Daily Papers · 2026-05-14 Cached

This paper investigates how subword tokenization affects LLM training efficiency and performance through controlled byte-level pretraining experiments. It identifies training throughput and the use of subword boundaries as linguistic priors as key factors behind the benefits of subword tokenization.
