hidden-state

Don't let the LLM speak, just probe it (8 minute read)

TLDR AI · 22h ago

The article introduces a technique that classifies without generating any text. It extracts the LLM's hidden state at the last prompt token and uses a small MLP to read the model's internal decision, yielding fast, cheap zero-shot classifiers.
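
The probing idea summarized above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the article's implementation: the hidden states are synthetic vectors standing in for a real model's activations, and every name here (`last_token_state`, `TinyProbe`, `fake_example`, `train_and_score`) is hypothetical. The probe width, learning rate, and training loop are arbitrary choices made for the sketch.

```python
import math
import random


def last_token_state(hidden, mask):
    """Return the hidden vector at the last non-padding position.

    hidden: list of per-token vectors; mask: 1 for real tokens, 0 for padding.
    """
    last = max(i for i, m in enumerate(mask) if m == 1)
    return hidden[last]


class TinyProbe:
    """One-hidden-layer MLP (tanh, sigmoid output) trained with plain SGD."""

    def __init__(self, dim, width=8, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(width)]
        self.b1 = [0.0] * width
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(width)]
        self.b2 = 0.0

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        z = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2
        return h, 1.0 / (1.0 + math.exp(-z))

    def sgd_step(self, x, y, lr=0.1):
        h, p = self.forward(x)
        g = p - y  # gradient of binary cross-entropy w.r.t. the logit
        for j, hj in enumerate(h):
            gh = g * self.w2[j] * (1.0 - hj * hj)  # backprop through tanh
            self.w2[j] -= lr * g * hj
            self.b1[j] -= lr * gh
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * gh * xi
        self.b2 -= lr * g

    def predict(self, x):
        return 1 if self.forward(x)[1] >= 0.5 else 0


def fake_example(rng, label, dim=4, max_len=6):
    """Synthetic stand-in for model activations (an assumption of this sketch):
    the class signal lives only in the last real token's vector."""
    n_real = rng.randint(2, max_len)
    hidden = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_real - 1)]
    signal = 1.0 if label else -1.0
    hidden.append([signal + rng.gauss(0, 0.1)]
                  + [rng.gauss(0, 0.1) for _ in range(dim - 1)])
    pad = max_len - n_real
    hidden += [[0.0] * dim for _ in range(pad)]
    mask = [1] * n_real + [0] * pad
    return hidden, mask


def train_and_score(n=40, epochs=30, seed=0):
    """Train the probe on last-token vectors and return training accuracy."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        hidden, mask = fake_example(rng, label)
        data.append((last_token_state(hidden, mask), label))
    probe = TinyProbe(dim=4, seed=seed)
    for _ in range(epochs):
        for x, y in data:
            probe.sgd_step(x, y)
    return sum(probe.predict(x) == y for x, y in data) / n


if __name__ == "__main__":
    print("training accuracy:", train_and_score())
```

With a real model, the vectors would come from a single forward pass over the prompt rather than from `fake_example`; in Hugging Face Transformers, for instance, passing `output_hidden_states=True` exposes per-layer hidden states. Which layer to read and how the article trains its probe are not stated in the summary, so those remain open choices here.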

AERIC: Anticipatory Hidden-State Monitoring for Implicit Harmful Dialogue

arXiv cs.CL · 2026-05-26

Introduces AERIC, a lightweight hidden-state monitoring method for detecting implicit harmful content in LLM dialogue without extra forward passes, achieving improved AUROC over strong baselines with minimal latency overhead.

Where Should Diffusion Enter a Language Model? Geometry-Guided Hidden-State Replacement

arXiv cs.CL · 2026-05-15

This paper introduces DiHAL, a diffusion-transformer hybrid that uses geometry-based proxies to select a layer in a pretrained language model for hidden-state replacement with a diffusion bridge, improving continuous diffusion language modeling by avoiding direct token recovery.
