probing

#probing

Probing in the Wild: A Case Study of Self-Supervised Speech Representations on Mandarin Sub-dialects with Unsupervised Articulatory Analysis

arXiv cs.CL · 2026-06-25

This paper presents a case study using unsupervised articulatory probing to examine how self-supervised speech models encode phonetic features across Mandarin sub-dialects, finding that salient features like labiality remain stable while finer spectral distinctions show dialect-dependent variation.

MemTrace: Probing What Final Accuracy Misses in Long-Term Memory

arXiv cs.AI · 2026-06-17

MemTrace is a benchmark that evaluates LLM agent memory at the knowledge point level, probing how facts behave under varying memory age, question type, and evidence conditions. It reveals that pooled accuracy hides distinct failure modes, and that the main bottleneck is evidence use rather than retrieval.

Vernier: Probing Representational Misalignment Behind Lexical Gaps in Causal Reasoning

arXiv cs.CL · 2026-06-16

This paper investigates why instruction-tuned language models give different answers to causal reasoning questions when variable names are replaced with placeholders, finding that the issue stems from representational misalignment rather than information loss. The authors introduce Vernier, a method using paired-view weight updates and mechanism inspection to reveal that answer-relevant content is still present in the placeholder view but misaligned.

When Probing Accuracy Saturates, Fragility Resolves: A Complementary Metric for LLM Pre-Training Analysis

arXiv cs.CL · 2026-06-11

This paper introduces 'fragility', a metric complementary to probe accuracy that measures the activation-noise level at which probe accuracy collapses. It enables analysis of how representations evolve during LLM pre-training even after accuracy saturates.
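To make the idea of a collapse noise level concrete, here is a minimal sketch. It is not the paper's definition: the Gaussian noise model, the fixed accuracy threshold, the toy activations, and the names `probe_accuracy` and `collapse_noise_level` are all assumptions made for illustration.

```python
import random

def probe_accuracy(activations, labels, weights, sigma, rng):
    """Accuracy of a fixed linear probe after adding Gaussian noise of scale sigma."""
    correct = 0
    for x, y in zip(activations, labels):
        noisy = [v + rng.gauss(0.0, sigma) for v in x]
        score = sum(w * v for w, v in zip(weights, noisy))
        correct += int((score > 0) == (y == 1))
    return correct / len(labels)

def collapse_noise_level(activations, labels, weights, sigmas, threshold=0.75, seed=0):
    """Smallest tested sigma at which accuracy drops below threshold (assumed definition)."""
    rng = random.Random(seed)
    for sigma in sorted(sigmas):
        if probe_accuracy(activations, labels, weights, sigma, rng) < threshold:
            return sigma
    return None  # the probe never collapsed on the tested grid

# Toy stand-in for model activations: the class is encoded in the sign of dimension 0.
activations = [[1.0, 0.0]] * 100 + [[-1.0, 0.0]] * 100
labels = [1] * 100 + [0] * 100
weights = [1.0, 0.0]  # hypothetical probe that reads dimension 0
sigmas = [0.0, 0.5, 1.0, 2.0, 4.0, 100.0]

fragility = collapse_noise_level(activations, labels, weights, sigmas)
print("accuracy collapses at sigma =", fragility)
```

Two probes that both reach 100% accuracy without noise can still differ in how much noise they tolerate, which is the kind of distinction a saturated accuracy score cannot show.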

Don't let the LLM speak, just probe it (8 minute read)

TLDR AI · 2026-06-11

The article introduces a technique that extracts an LLM's hidden state at the last prompt token and classifies it with a small MLP, reading the model's internal decision without generating any text. The result is a fast, cheap zero-shot classifier.

Hallucination Is Linearly Decodable from Mid-Layer Hidden States in Quantized LLMs

arXiv cs.LG · 2026-06-03

This paper investigates whether open-source quantized LLMs encode a linearly separable truthfulness signal in their hidden states. Across three 7B-8B instruction-tuned models, a linear probe on a single mid-network layer achieves 0.904-1.000 AUROC on hallucination detection benchmarks, outperforming sampling-based methods.
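For readers unfamiliar with the metric, the sketch below shows how an AUROC can be computed from linear-probe scores. The hidden states, labels, and probe weights are made-up toy values, not data or parameters from the paper, and `probe_score` and `auroc` are illustrative names.

```python
def probe_score(hidden_state, weights, bias=0.0):
    """Linear probe: a dot product between a hidden state and learned weights."""
    return sum(w * h for w, h in zip(weights, hidden_state)) + bias

def auroc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy 2-dimensional "hidden states"; label 1 marks a hallucinated answer.
hidden_states = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.7], [0.1, 0.9]]
labels = [1, 1, 0, 0]
weights = [1.0, -1.0]  # hypothetical probe direction

scores = [probe_score(h, weights) for h in hidden_states]
probe_auroc = auroc(scores, labels)
print(probe_auroc)
```

An AUROC of 1.0 means every positive example scores above every negative one, and 0.5 corresponds to chance-level ranking.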

When and How Long? The Readout-Mediator Angle in Temporal Reasoning

arXiv cs.LG · 2026-05-29

This paper introduces the readout-mediator angle to demonstrate that linear probes can decode information from language model activations that is orthogonal to the model's actual causal computation, undermining probe-based interpretability. The finding replicates across model scales and families, revealing a fundamental failure mode in using probes for mechanistic understanding or safety monitoring.
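The summary does not give the paper's exact definition of the angle. As a rough illustration only, the sketch below assumes it is the ordinary angle between a probe's readout direction and a direction the model causally uses; `readout_direction` and `mediator_direction` are hypothetical toy vectors.

```python
import math

def angle_degrees(u, v):
    """Angle between two vectors in degrees, via the cosine formula."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("angle is undefined for a zero vector")
    # Clamp to [-1, 1] to guard against floating-point overshoot before acos.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))

# Hypothetical directions in a 3-dimensional activation space.
readout_direction = [1.0, 0.0, 0.0]   # what a linear probe reads
mediator_direction = [0.0, 1.0, 0.0]  # what the computation is assumed to use

angle = angle_degrees(readout_direction, mediator_direction)
print(angle)
```

Under this assumed definition, an angle near 90 degrees would mean the probe decodes information along a direction the computation does not rely on, which is the failure mode the summary describes.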

Probing Minimalist Phase Structure in LLMs: What Universal Dependencies Cannot Represent

arXiv cs.CL · 2026-05-27

This paper investigates whether large language models encode syntactic abstractions, such as phase boundaries, that Universal Dependencies does not capture. Using structural probes on wh-movement stimuli with invariant UD distances, it finds evidence across 13 LLMs for phase-structure representations that are causally active.

Which Pretraining Paradigm Better Serves Spatial Intelligence? An Empirical Comparison of Vision-Language and Video Generation Models

Hugging Face Daily Papers · 2026-05-27

This paper presents a systematic frozen-feature probing study comparing vision-language models (VLMs) and video generation models (VGMs) on spatial intelligence tasks. It finds that VLMs excel at semantic tagging and instance grouping, while VGMs provide better dense geometry and camera motion signals, and a naive fusion of both yields strong performance across all axes.

Polar probe linearly decodes semantic structures from LLMs

arXiv cs.CL · 2026-05-15

This paper proposes a Polar Probe that linearly recovers semantic structures from LLM activations by representing entity relations through distance and direction in a learned subspace. Testing across arithmetic, visual scenes, family trees, metro maps, and social interactions shows the code emerges in middle layers, generalizes to new entities, and causally influences model predictions.

LLM Agents Already Know When to Call Tools -- Even Without Reasoning

Hugging Face Daily Papers · 2026-05-10

This paper introduces When2Tool, a benchmark for studying when LLM agents actually need to call tools. It reveals that tool necessity is already decodable from models' hidden states, yet the models fail to act on it. The proposed Probe&Prefill method reduces unnecessary tool calls by 48% with minimal accuracy loss.

Probing for Reading Times

arXiv cs.CL · 2026-04-22

Researchers probe language model representations to predict human reading times across five languages, finding early layers outperform surprisal for early-pass measures while surprisal remains superior for late-pass measures.
