continued-pre-training

#continued-pre-training

Teaching LLMs a Low-Resource Language: Enhancing Code Completion in Pharo

Hugging Face Daily Papers · 2026-07-06

This paper presents an end-to-end pipeline for adapting LLMs to low-resource programming languages, using Pharo as a case study, achieving superior code completion with small models suitable for real-time IDE support.

Predictable Scaling Laws of Optimal Hyperparameters for LLM Continued Pre-training

arXiv cs.CL · 2026-06-05

This paper discovers predictable scaling laws for optimal hyperparameters (learning rate, batch size) in LLM continued pre-training, proposing a two-stage framework that reduces hyperparameter search overhead by up to 90% while maintaining performance.

Freeze Deep, Train Shallow: Interpretable Layer Allocation for Continued Pre-Training

arXiv cs.CL · 2026-05-13

This paper proposes LayerTracer, an interpretable framework for layer allocation in continued pre-training, demonstrating that freezing deep layers while training shallow ones outperforms full-parameter fine-tuning. It offers a low-cost, actionable strategy for resource-constrained teams optimizing LLMs.
