Tag
Discussion about whether ML teams are actually testing model security risks like extraction and poisoning in production, noting that security review for models lags behind regular software.
POISE is a stealthy skill-poisoning attack that embeds malicious triggers within benign-looking instructions, achieving high attack success rates while evading detection by LLM scanners.