system-optimization

#system-optimization

EMA: Efficient Model Adaptation for Learning-based Systems

arXiv cs.LG · 2026-05-15

This paper presents EMA, a model adaptation system for learning-based systems that reduces training and labeling costs while improving system performance in evolving environments.


DisagMoE: Computation-Communication overlapped MoE Training via Disaggregated AF-Pipe Parallelism

arXiv cs.LG · 2026-05-13

This paper introduces DisagMoE, a system for MoE training that optimizes computation-communication overlap by disaggregating attention and FFN layers across GPU groups. Implemented on Megatron-LM, it achieves up to 1.8x speedup on H800 clusters by addressing inter-node communication bottlenecks.


@Suryanshti777: NVIDIA just revealed the hidden tricks they’re using to make LLM fine-tuning dramatically faster. Not new GPUs. Not big…

X AI KOLs Timeline · 2026-05-07

NVIDIA and Unsloth have published a technical guide detailing three low-level optimizations that can accelerate LLM fine-tuning by up to 25%, including packed-sequence caching, double-buffered checkpointing, and optimized MoE routing. The guide provides deep systems-level explanations and benchmarks aimed at ML engineers and developers.


Running OpenClaw 24/7 "Always Free" (Non-Oracle VPS options?)

Reddit r/openclaw · 2026-05-07

The author seeks alternatives to Oracle Cloud for hosting a 24/7 OpenClaw instance on an "Always Free" tier, discusses options such as Google Cloud e2-micro and Fly.io, and asks for optimization tips for running within 1 GB of RAM.


TokenSpeed: A Speed-of-Light LLM Inference Engine for Agentic Workloads (5 minute read)

TLDR AI · 2026-05-07

Lightseek releases TokenSpeed, a high-performance LLM inference engine optimized for agentic workloads, featuring compiler-backed parallelism and advanced kernel optimizations that have been adopted by vLLM.


AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning

Papers with Code Trending · 2025-05-30

AReaL is a fully asynchronous reinforcement learning system for LLM reasoning, achieving up to 2.57x training speedup over synchronous systems while maintaining or improving performance. It decouples generation and training to improve GPU utilization and includes optimizations like staleness-enhanced PPO.


Greedeks/GTweak

GitHub Trending (daily) · 2026-05-13

GTweak is an open-source Windows system optimization and privacy tool that lets users disable telemetry, updates, and unnecessary services; it also includes a Windows activation feature.
