large-scale

#large-scale

TabLoRA: Parameter-Efficient Low-Rank Ensemble Learning for Large-Scale Tabular Data

arXiv cs.LG · 5d ago

TabLoRA proposes a parameter-efficient neural ensemble method for large-scale tabular data by sharing a common backbone with predictor-specific low-rank adaptations, achieving competitive performance against GBDTs and deep learning baselines.

@samsja19: https://x.com/samsja19/status/2076846033922035818

X AI KOLs Following · 5d ago

PRIME-RL is a framework for large-scale asynchronous reinforcement learning, designed to be hackable and scale to 1000+ GPUs with support for various models and environments.

Robust Feasible Route Construction through Collaborative Partition Optimization

arXiv cs.AI · 2026-07-07

This paper introduces Collaborative Routing Constructors (CoRC), a framework that enables independently solved subproblems to exchange customers and vehicles during optimization, improving feasibility and scalability for large-scale Capacitated Vehicle Routing Problems.

@h100envy: Ex-Berkeley PhD who leads SGLang at xAI explained how they serve Grok on 100K GPUs in 23 minutes - better than $2000 in…

X AI KOLs Timeline · 2026-07-06

A former Berkeley PhD who leads SGLang at xAI explains how they serve Grok on 100K GPUs using split prefill/decode, expert sharding, and communication/computation overlap, reaching prices claimed to undercut the DeepSeek API.

Miles: A PyTorch-Native Stack for Large-Scale LLM RL Post-Training (14 minute read)

TLDR AI · 2026-07-01

Miles is an open-source PyTorch-native framework from RadixArk for large-scale LLM reinforcement learning post-training, integrating SGLang, Megatron-LM, and Ray for high-throughput rollout and distributed training.

LongCat-2.0, a large-scale MoE model with 1.6T total and 48B Active

Hacker News Top · 2026-06-30

LongCat-2.0 is a large-scale Mixture-of-Experts (MoE) model with 1.6 trillion total parameters and 48 billion active parameters.

One-Step Gradient Delay is Not a Barrier for Large-Scale Asynchronous Pipeline Parallel LLM Pretraining

Hugging Face Daily Papers · 2026-06-29

This paper challenges the assumption that one-step gradient delay in asynchronous pipeline parallelism is inherently unstable, showing that degradation depends on optimizer choice. It demonstrates that optimizers like Muon are robust to one-step delay and introduces an error-feedback correction to further mitigate staleness, achieving near-synchronous performance in LLM pretraining up to 10B parameters.

DF3DV-1K: A Large-Scale Dataset and Benchmark for Distractor-Free Novel View Synthesis

Hugging Face Daily Papers · 2026-06-18

Introduces DF3DV-1K, a large-scale real-world dataset with 1,048 scenes and 89,924 images for distractor-free novel view synthesis, along with a benchmark of nine methods and an application improving radiance field methods via fine-tuning a diffusion-based 2D enhancer.

100 Trillion+ Pretraining data??? This is the largest data I've seen a model being trained on.

Reddit r/LocalLLaMA · 2026-06-01

A new AI model is being trained on over 100 trillion tokens, at least double the 27-50 trillion tokens typically used to pretrain other models such as Kimi, Mimo, and DeepSeek.

@jcjohnss: GPIC should be the new standard benchmark for generative modeling. Training 1 epoch on GPIC is the same cost as 100 epo…

X AI KOLs Following · 2026-05-29

GPIC is a new large-scale image-text dataset and benchmark for generative modeling, claimed to be much more efficient than ImageNet and a better proxy for real-world problems, with fully permissive licensing for research and commercial use.

@josefchen: Launching our new paper on arXiv: we trained the largest multilingual food model ever built. 4.1M recipes. 7 languages.…

X AI KOLs Timeline · 2026-05-26

A new arXiv paper announces what its authors call the largest multilingual food model ever built, trained on 4.1M recipes across 7 languages with 1,790 ingredients and compressed into 2MB.

SciAtlas: A Large-Scale Knowledge Graph for Automated Scientific Research

arXiv cs.AI · 2026-05-25

SciAtlas is a large-scale, multi-disciplinary academic knowledge graph containing over 43 million papers and 3 billion triplets, designed to provide structured knowledge for AI-driven automated scientific research with a neuro-symbolic retrieval algorithm.

Ring-2.6-1T is putting up SOTA-level numbers for real-world agents

Reddit r/ArtificialInteligence · 2026-05-18

Ant Group released Ring-2.6-1T, a 1-trillion-parameter reasoning model for agent workflows, with an MIT license, extended context, and Async RL + IcePop training; the post reports SOTA-level results on real-world agent tasks.

@kevin_x_li: Introducing SWE-ZERO-12M-trajectories: the largest agentic trace dataset in the open, 5.7x larger than the previous lar…

X AI KOLs Following · 2026-05-13

SWE-ZERO-12M-trajectories is the largest open agentic trace dataset for coding, with 112B tokens across 12M trajectories from 122K pull requests and 3K repositories, enabling scalable training of agentic coding models without requiring containerized execution.

Urban-ImageNet: A Large-Scale Multi-Modal Dataset and Evaluation Framework for Urban Space Perception

Hugging Face Daily Papers · 2026-05-11

Urban-ImageNet is a large-scale multi-modal dataset and evaluation benchmark for urban space perception from social media imagery, supporting scene classification, cross-modal retrieval, and instance segmentation tasks across 61 urban sites in 24 Chinese cities.
