Modeling Sparse and Bursty Vulnerability Sightings: Forecasting Under Data Constraints

Hugging Face Daily Papers 04/17/26, 12:00 AM Papers

Summary

Academic study compares SARIMAX and Poisson regression for forecasting sparse, bursty vulnerability-sighting time-series, finding count-based models more stable.

Understanding and anticipating vulnerability-related activity is a major challenge in cyber threat intelligence. This work investigates whether vulnerability sightings, such as proof-of-concept releases, detection templates, or online discussions, can be forecast over time. Building on our earlier work on VLAI, a transformer-based model that predicts vulnerability severity from textual descriptions, we examine whether severity scores can improve time-series forecasting as exogenous variables. We evaluate several approaches for short-term forecasting of sightings per vulnerability. First, we test SARIMAX models with and without log(x+1) transformations and VLAI-derived severity inputs. Although these adjustments provide limited improvements, SARIMAX remains poorly suited to sparse, short, and bursty vulnerability data. In practice, forecasts often produce overly wide confidence intervals and sometimes unrealistic negative values. To better capture the discrete and event-driven nature of sightings, we then explore count-based methods such as Poisson regression. Early results show that these models produce more stable and interpretable forecasts, especially when sightings are aggregated weekly. We also discuss simpler operational alternatives, including exponential decay functions for short forecasting horizons, to estimate future activity without requiring long historical series. Overall, this study highlights both the potential and the limitations of forecasting rare and bursty cyber events, and provides practical guidance for integrating predictive analytics into vulnerability intelligence workflows.
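The count-based approach described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Poisson model with a log link and a single linear time trend, fitted by Newton-Raphson in plain Python. The helper names (`aggregate_weekly`, `fit_poisson_trend`, `forecast`) and the weekly counts are hypothetical and chosen only for illustration.

```python
import math

def aggregate_weekly(daily_counts):
    """Sum consecutive 7-day blocks; a trailing partial week is dropped."""
    n_weeks = len(daily_counts) // 7
    return [sum(daily_counts[7 * w:7 * (w + 1)]) for w in range(n_weeks)]

def fit_poisson_trend(counts, iterations=50, tol=1e-10):
    """Fit log(mu_t) = b0 + b1 * t by Newton-Raphson on the Poisson log-likelihood."""
    t = list(range(len(counts)))
    # Start from the intercept-only solution: log of the mean count.
    b0 = math.log(max(sum(counts) / len(counts), 1e-8))
    b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * ti) for ti in t]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(y - m for y, m in zip(counts, mu))
        g1 = sum(ti * (y - m) for ti, y, m in zip(t, counts, mu))
        # Fisher information (negative Hessian), a 2x2 matrix.
        h00 = sum(mu)
        h01 = sum(ti * m for ti, m in zip(t, mu))
        h11 = sum(ti * ti * m for ti, m in zip(t, mu))
        det = h00 * h11 - h01 * h01
        if abs(det) < 1e-12:
            break
        # Newton step: solve the 2x2 system in closed form.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

def forecast(b0, b1, start, steps):
    """Expected counts for time indices start .. start + steps - 1."""
    return [math.exp(b0 + b1 * t) for t in range(start, start + steps)]

# Illustrative weekly sighting counts for one vulnerability (not data from the paper).
weekly = [12, 7, 3, 2, 0, 1]
b0, b1 = fit_poisson_trend(weekly)
next_weeks = forecast(b0, b1, start=len(weekly), steps=2)
print(next_weeks)
```

Because the predicted mean is the exponential of a linear term, the forecast cannot go below zero, which avoids the negative values reported for SARIMAX. A severity score could in principle enter as an additional covariate in the linear term, but that extension is not shown here.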

Cached at: 04/21/26, 11:27 AM

Source: https://huggingface.co/papers/2604.16038 Published on Apr 17

Submitted by Cédric (https://huggingface.co/cedricbonhomme) on Apr 21

Abstract

Forecasting vulnerability-related activities using time-series models reveals challenges with sparse, bursty data, favoring count-based methods like Poisson regression for more stable predictions.

Understanding and anticipating vulnerability-related activity is a major challenge in cyber threat intelligence. This work investigates whether vulnerability sightings, such as proof-of-concept releases, detection templates, or online discussions, can be forecast over time. Building on our earlier work on VLAI, a transformer-based model that predicts vulnerability severity from textual descriptions, we examine whether severity scores can improve time-series forecasting as exogenous variables. We evaluate several approaches for short-term forecasting of sightings per vulnerability. First, we test SARIMAX models with and without log(x+1) transformations and VLAI-derived severity inputs. Although these adjustments provide limited improvements, SARIMAX remains poorly suited to sparse, short, and bursty vulnerability data. In practice, forecasts often produce overly wide confidence intervals and sometimes unrealistic negative values. To better capture the discrete and event-driven nature of sightings, we then explore count-based methods such as Poisson regression. Early results show that these models produce more stable and interpretable forecasts, especially when sightings are aggregated weekly. We also discuss simpler operational alternatives, including exponential decay functions for short forecasting horizons, to estimate future activity without requiring long historical series. Overall, this study highlights both the potential and the limitations of forecasting rare and bursty cyber events, and provides practical guidance for integrating predictive analytics into vulnerability intelligence workflows.
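For short horizons, the exponential-decay alternative can be sketched as follows. This is a minimal illustration under assumptions not stated in the abstract: activity is taken to decay from the most recent observed count with a fixed half-life. The function names, the half-life value, and the starting count are all hypothetical.

```python
import math

def decay_forecast(last_count, half_life_days, horizon_days):
    """Expected sightings for days 1..horizon_days under exponential decay.

    Assumes activity halves every `half_life_days` days, starting from
    `last_count` on day 0. Both inputs are illustrative assumptions.
    """
    rate = math.log(2) / half_life_days
    return [last_count * math.exp(-rate * d) for d in range(1, horizon_days + 1)]

def expected_total(last_count, half_life_days, horizon_days):
    """Sum of the expected sightings over the forecast horizon."""
    return sum(decay_forecast(last_count, half_life_days, horizon_days))

# Hypothetical example: 8 sightings today, activity halving every 2 days.
daily = decay_forecast(last_count=8, half_life_days=2, horizon_days=4)
total = expected_total(last_count=8, half_life_days=2, horizon_days=4)
print(daily, total)
```

The appeal of this form is that it needs only the latest count and one decay parameter, so it can be applied to a vulnerability with very little history. The half-life would have to be chosen or estimated separately, for example from past vulnerabilities with similar sighting patterns.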


Similar Articles

Do Time Series Foundation Model Benchmarks Hide Regime-Dependent Failures? Evidence from Traffic Speed Forecasting

Stationarity-Aware Retrieval-Augmented Time Series Forecasting

TS-Fault: Benchmarking Time Series Forecasters Against Structural Faults

Nested Spatio-Temporal Time Series Forecasting

EnergyMamba: An Uncertainty-Aware Graph-Enhanced Selective State Space Model for Energy Consumption Prediction
