This paper introduces LogMILP, a weakly supervised framework for localizing anomalous log instances. It combines prototype-guided structural modeling with counterfactual perturbation consistency regularization to improve detection and interpretability using only bag-level labels.
This paper argues that log analysis is essential for credible AI agent evaluation, as outcome-only benchmarks often fail to reveal underlying capabilities, safety risks, or failure modes.