Why our #1 LightGBM feature by importance made predictions worse [D]

Reddit r/MachineLearning 06/01/26, 06:20 PM News

gradient-boosting lightgbm feature-importance target-encoding machine-learning pricing ablation

Summary

A post from the Flyback team shows how a target-encoded feature that ranked #1 in LightGBM feature importance made hold-out predictions worse: the encoder fit irreducible label variance rather than generalizable signal, a reminder not to rely on feature importance alone.

We recently hit a classic gradient boosting trap with our pricing engine (Flyback), and I wanted to share the ablation data.

We run LightGBM quantile regression to forecast secondary-market watch prices. We engineered a variant-conditioned Bayesian target encoder to isolate within-reference pricing dynamics. LightGBM absolutely loved it: it ranked #1 in feature importance at q90 by a wide margin, with gain several times that of the next-highest feature, across all our multi-seed runs.

But when we ran a strict 4-seed × 3-variant ablation on the hold-out set, the results inverted. Test MAPE got worse by 0.28pp, and the between-variant delta was 7x the within-variant standard deviation, so the regression sits well outside seed-to-seed noise.

The encoder was finding splits that looked effective in training but completely failed to generalize, because the signal it was learning was driven by irreducible label variance: unobserved factors like condition nuance, seller behavior, and timing that no feature can capture.

I wrote a full post breaking down the architecture, the ablation methodology, and the mechanism behind the divergence. Happy to discuss LightGBM split mechanics, target encoding leakage, or the ablation setup.

Full post and ablation results: [https://flyback.ai/engineering/target-encoding-divergence](https://flyback.ai/engineering/target-encoding-divergence)
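For anyone who wants to run the same kind of check, here is a minimal sketch of the comparison described above: per-seed test MAPE for each variant, the between-variant delta, and that delta expressed as a multiple of the within-variant standard deviation. The function names (`mape`, `ablation_summary`), the pooled-standard-deviation choice, and the numbers in `scores` are assumptions for illustration only. They are not Flyback's code or results.

```python
from statistics import mean, stdev


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 * mean(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred))


def ablation_summary(scores_by_variant, baseline, candidate):
    """Compare two variants given their per-seed test scores.

    Returns (delta, noise, ratio):
      delta: mean(candidate) - mean(baseline); positive means worse for MAPE
      noise: pooled within-variant standard deviation across seeds
      ratio: |delta| / noise, i.e. how many "seed sigmas" the delta is
    """
    base = scores_by_variant[baseline]
    cand = scores_by_variant[candidate]
    delta = mean(cand) - mean(base)
    # Pool the two within-variant variances (equal seed counts assumed).
    noise = mean([stdev(base) ** 2, stdev(cand) ** 2]) ** 0.5
    ratio = abs(delta) / noise if noise > 0 else float("inf")
    return delta, noise, ratio


# Placeholder per-seed test MAPE values in percent (4 seeds per variant).
# These are made-up numbers, NOT the results from the post.
scores = {
    "baseline": [10.00, 10.04, 9.96, 10.00],
    "with_encoder": [10.30, 10.26, 10.34, 10.30],
}

delta, noise, ratio = ablation_summary(scores, "baseline", "with_encoder")
print(f"delta = {delta:+.2f}pp, within-variant std = {noise:.3f}pp, ratio = {ratio:.1f}x")
```

The point of the ratio is that a feature can top the gain-based importance ranking and still be judged on this number alone: if the hold-out delta is positive and several times the seed noise, the feature is hurting, whatever the importance plot says.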

Similar Articles

Help interpreting metrics: a strong target text appears to induce a measurable latent-state shift in Gemma 3 12B IT

Reddit r/AI_Agents

A researcher presents evidence that strong target text can induce a measurable latent-state shift in Gemma 3 12B IT before final output, distinct from lexical or content overlaps, and discusses implications for AI safety beyond output-only evaluation.

Quantization Undoes Alignment: Bias Emergence in Compressed LLMs Across Models and Precision Levels

arXiv cs.LG

This paper studies how post-training quantization introduces new biases in instruction-tuned LLMs, finding that 3-bit precision causes 6–21% of previously unbiased items to develop stereotypes, while standard metrics like perplexity fail to detect this degradation.

Weight Pruning Amplifies Bias: A Multi-Method Study of Compressed LLMs for Edge AI

arXiv cs.LG

This study reveals a 'Smart Pruning Paradox' where activation-aware pruning methods like Wanda preserve perplexity but significantly amplify bias in Large Language Models deployed on edge devices.

Don't Collapse Your Features: Why CenterLoss Hurts OOD Detection and Multi-Scale Mahalanobis Wins

arXiv cs.LG

This paper introduces GOEN, a pipeline combining multi-scale features, L2 normalization, and Mahalanobis distance for OOD detection, and finds that CenterLoss regularization actually degrades OOD performance despite improving classification accuracy.

The Implicit Bias of Depth: From Neural Collapse to Softmax Codes

arXiv cs.LG

This paper studies how depth alone induces an implicit low-rank bias in deep unconstrained feature models trained without regularization, shifting the optimal solution from neural collapse to softmax codes, and provides the first asymptotic and dynamic characterization of this bias under gradient descent with cross-entropy loss.
