Tag
A user benchmarks thread count for hybrid CPU-GPU inference with Gemma 4 in llama.cpp, discovering an 80% performance uplift by using 16 threads instead of 6 on a hybrid-core CPU, and shares the optimal command configuration.
Raymond Chen explains that COM STA threads need to pump messages only when idle: code that is always busy doesn't need an explicit message loop, but COM still creates a hidden window, so the thread must pump messages whenever it becomes idle to avoid jamming window broadcasts.