drift-detection

PRISM: Prompt Reliability via Iterative Simulation and Monitoring for Enterprise Conversational AI

arXiv cs.AI · 2026-05-18

PRISM is a closed-loop framework that treats prompt engineering as a continuous reliability problem for enterprise conversational AI. It automates test generation, simulation, evaluation, and repair, with a reported 99% reliability and a reduction in authoring time from days to minutes.

Pitfalls of Unlabeled Disagreement-Based Drift Detection in Streaming Tree Ensembles

arXiv cs.LG · 2026-05-14

This paper investigates disagreement-based drift detection in ensembles of incremental decision trees. It finds that the method, although effective for neural networks, underperforms loss-based detectors on tree ensembles because of their limited model plasticity.

We started measuring "undeclared-intent spend" in agent workflows

Reddit r/AI_Agents · 2026-05-11

The post describes measuring "undeclared-intent spend" in agent workflows: the tokens an agent spends outside its declared intent, quantified to reveal behavioral costs such as drift and off-task execution.

The Geometric Canary: Predicting Steerability and Detecting Drift via Representational Stability

Hugging Face Daily Papers · 2026-04-20

This paper introduces geometric stability measures, based on pairwise distance consistency in representations, to predict language model steerability and detect structural drift. Supervised variants achieve near-perfect correlation (ρ=0.89-0.97) with linear steerability across 35-69 embedding models, while unsupervised variants outperform CKA and Procrustes for post-deployment drift detection.
