Why our #1 LightGBM feature by importance made predictions worse [D]
Summary
A blog post from Flyback demonstrates how a LightGBM feature that ranked #1 in importance actually worsened predictions due to target encoding leakage, highlighting the danger of relying solely on feature importance metrics.
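To make the leakage mechanism concrete, here is a minimal synthetic sketch. It is not the Flyback post's code or data. It assumes a high-cardinality categorical column that carries no real signal, and compares a naive target encoding (category mean computed over all rows, including the row being encoded) with an out-of-fold encoding. The function names, fold count, and dataset sizes are illustrative assumptions, and correlation with the target on training rows stands in for what a tree model would see when it ranks features.

```python
import numpy as np


def naive_target_encode(cats, y):
    """Encode each row by its category's mean target over ALL rows.

    Each row's own label is included in the mean, which is the leak.
    `cats` is assumed to hold non-negative integer category codes.
    """
    n_cats = int(cats.max()) + 1
    sums = np.bincount(cats, weights=y, minlength=n_cats)
    counts = np.bincount(cats, minlength=n_cats)
    means = sums / np.maximum(counts, 1)
    return means[cats]


def out_of_fold_target_encode(cats, y, n_folds=5, seed=0):
    """Encode each row using category means from the other folds only."""
    rng = np.random.default_rng(seed)
    fold = rng.permutation(len(y)) % n_folds
    n_cats = int(cats.max()) + 1
    enc = np.empty(len(y), dtype=float)
    for k in range(n_folds):
        train = fold != k
        sums = np.bincount(cats[train], weights=y[train], minlength=n_cats)
        counts = np.bincount(cats[train], minlength=n_cats)
        # Categories unseen in the training folds fall back to the fold prior.
        prior = y[train].mean()
        means = np.where(counts > 0, sums / np.maximum(counts, 1), prior)
        enc[~train] = means[cats[~train]]
    return enc


def leakage_demo(n_rows=2000, n_categories=500, seed=42):
    """Return (naive_corr, oof_corr) for a categorical feature with no signal."""
    rng = np.random.default_rng(seed)
    cats = rng.integers(0, n_categories, size=n_rows)
    # The target is drawn independently of the category: nothing to learn.
    y = rng.integers(0, 2, size=n_rows).astype(float)
    naive = naive_target_encode(cats, y)
    oof = out_of_fold_target_encode(cats, y)
    naive_corr = np.corrcoef(naive, y)[0, 1]
    oof_corr = np.corrcoef(oof, y)[0, 1]
    return naive_corr, oof_corr


naive_corr, oof_corr = leakage_demo()
print(f"naive encoding vs. target (train rows): {naive_corr:.2f}")
print(f"out-of-fold encoding vs. target:        {oof_corr:.2f}")
```

Under these assumptions the naive encoding looks strongly predictive on the training rows even though the category is pure noise, which is the kind of signal that can push a feature to the top of an importance ranking without helping on unseen data. The out-of-fold version should show a correlation close to zero.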
Similar Articles
Help interpreting metrics: a strong target text appears to induce a measurable latent-state shift in Gemma 3 12B IT
A researcher presents evidence that strong target text can induce a measurable latent-state shift in Gemma 3 12B IT before final output, distinct from lexical or content overlaps, and discusses implications for AI safety beyond output-only evaluation.
Quantization Undoes Alignment: Bias Emergence in Compressed LLMs Across Models and Precision Levels
This paper studies how post-training quantization introduces new biases in instruction-tuned LLMs, finding that 3-bit precision causes 6–21% of previously unbiased items to develop stereotypes, while standard metrics like perplexity fail to detect this degradation.
Weight Pruning Amplifies Bias: A Multi-Method Study of Compressed LLMs for Edge AI
This study reveals a 'Smart Pruning Paradox': activation-aware pruning methods like Wanda preserve perplexity but significantly amplify bias in large language models deployed on edge devices.
Don't Collapse Your Features: Why CenterLoss Hurts OOD Detection and Multi-Scale Mahalanobis Wins
This paper introduces GOEN, a pipeline combining multi-scale features, L2 normalization, and Mahalanobis distance for OOD detection, and finds that CenterLoss regularization actually degrades OOD performance despite improving classification accuracy.
The Implicit Bias of Depth: From Neural Collapse to Softmax Codes
This paper studies how depth alone induces an implicit low-rank bias in deep unconstrained feature models trained without regularization, shifting the optimal solution from neural collapse to softmax codes, and provides the first asymptotic and dynamic characterization of this bias under gradient descent with cross-entropy loss.