@ChrisInterno: Signals of physical plausibility are hiding in the geometry of frozen image encoders. No video training. No physics sup…
Summary
The tweet highlights a research finding that signals of physical plausibility can be extracted from the geometry of frozen image encoders without video training or physics supervision.
Cached at: 06/23/26, 01:50 PM
Signals of physical plausibility are hiding in the geometry of frozen image encoders. No video training. No physics supervision. https://t.co/NKmgD8g53f
Similar Articles
Building The Ph(ysical)AI Layer Of Machine Intelligence
Researchers at MIT Lincoln Laboratory propose 'principle-driven foundation models' that encode signal-theoretic physical principles (Fourier decomposition, energy conservation, symmetry) instead of learning statistical correlations from large paired datasets. Trained exclusively on RF data, their 1.99M-parameter frozen encoder achieves 77.7% average accuracy across 15 diverse tasks spanning audio, images, text, and video, without any fine-tuning on target domains.
@rohanpaul_ai: Frozen LLMs still carry readable behavior signals deep inside their hidden states. And Proprioceptive AI has created Cy…
Proprioceptive AI released Cygnus, a tool that equips frozen LLMs with self-sensing adapters reading internal hidden states via gl(4,R) Lie algebra to isolate dark modes, boosting Qwen-32B's ARC-Challenge score from 82.2% to 94.97% on a single RTX 3090 without retraining.
Physics in 2-Steps: Locking Motion Priors Before Visual Refinement Erases Them
PhaseLock is a training-free framework that preserves motion priors from early-step inference to improve physical consistency in image-to-video diffusion models, achieving a 6.2-point improvement with minimal overhead.
EgoPhys: Learning Generalizable Physics Models of Deformable Objects from Egocentric Video
EgoPhys introduces a framework to construct deformable physical digital twins from egocentric RGB video using generalizable priors and a compact codebook, enabling zero-shot generalization to unseen objects without per-spring optimization. The system is demonstrated on a real robot, showing that egocentric human play video can serve as an internal world representation for deformable-object planning.
HumanScale: Egocentric Human Video Can Outperform Real-Robot Data for Embodied Pretraining
This paper finds that egocentric human video, when processed with a filtering and labeling pipeline, can outperform teleoperated real-robot data for pretraining embodied foundation models, achieving lower validation loss and higher success rates on real-robot tasks.