Tag
Discusses how AI systems often trust sensor inputs without validation, citing a logistics company where spoofed temperature-sensor data led to cargo damage, and asks whether AI can detect such spoofing.
This paper introduces FragileFlow, a plug-in regularizer that improves the robustness of LLMs and VLMs by controlling 'correct-but-fragile' predictions through spectral analysis and PAC-Bayes bounds.
This paper introduces PASA, a robust watermarking algorithm for LLM-generated text that operates at the semantic level, using latent embedding spaces to resist semantic-invariant attacks such as paraphrasing.
This article explains how incorporating Shannon entropy into reinforcement learning objectives creates more robust agents capable of handling unexpected or adversarial changes in rewards and dynamics.