fisher-information

#fisher-information

Token Geometry

arXiv cs.LG · 22h ago

The paper introduces Ember, a lightweight optimizer for embedding and LM-head matrices that exploits gradient geometry to improve efficiency and performance across supervised finetuning, RL, and pretraining, while using far less optimizer state than Adam.

Measuring Dead Directions: Decomposing and Classifying Singular Structure off Canonical Alignment

arXiv cs.LG · yesterday

This paper presents a descent-free and alignment-free method for measuring singular structure in trained neural networks. It recovers the order of dead directions from the directional Fisher rate, separates genuine singularities from flat gauge symmetries, and demonstrates the technique on transformer and convolutional layers.

The Degeneracy Distillery

arXiv cs.LG · 2026-06-24

This paper introduces the degeneracy distillery, a method that automatically detects and resolves degenerate parameter combinations in physical models by estimating and flattening the Fisher information matrix, reducing the simulation budget required for neural posterior estimation while providing physical insight.

Fisher Width: A Geometric Measure of Complexity on Statistical Manifolds

arXiv cs.LG · 2026-06-18

This paper introduces Fisher width, a Riemannian analogue of Gaussian width for statistical manifolds that captures local statistical curvature and is invariant under reparameterization. It develops the theory, proves generalization bounds for Fisher-Lipschitz classes, and demonstrates computable estimators on MNIST.

Measuring Model Robustness via Fisher Information: Spectral Bounds, Theoretical Guarantees, and Practical Algorithms

Hugging Face Daily Papers · 2026-06-03

The paper proposes an attack-agnostic robustness metric based on the spectral norm of the Fisher Information Matrix, providing theoretical bounds and scalable evaluation methods for deep neural networks.

FoRA: Fisher-orthogonal Rank Adaptation for Parameter-Efficient Fine-Tuning

arXiv cs.CL · 2026-05-29

FoRA introduces a parameter-efficient fine-tuning method that selects task-informative layers via Fisher scores and trains LoRA down-projections on the Stiefel manifold, reducing parameters while preserving accuracy.

Most injection detectors score each prompt in isolation. I built one that tracks the geometric trajectory of the full session. Here is a concrete result.

Reddit r/artificial · 2026-04-20

A developer built Arc Gate, a monitoring proxy for LLMs that uses Fisher information manifold geometry to detect session-level prompt injection attacks. Instead of per-turn phrase detection, it identifies Crescendo-style gradual manipulation by tracking t-values against a phase-transition threshold of t* = 1.2247.
