vision-transformers

UniverSat: Resolution- and Modality-Agnostic Transformers for Earth Observation

Hugging Face Daily Papers · 2026-06-22

UniverSat introduces a Universal Patch Encoder for Vision Transformers that enables robust, sensor-agnostic spatial feature extraction across diverse Earth Observation data types, achieving strong results on classification and segmentation benchmarks.

ViT-Up: Faithful Feature Upsampling for Vision Transformers

Hugging Face Daily Papers · 2026-06-12

ViT-Up introduces a task-agnostic feature upsampler for Vision Transformers that predicts features at arbitrary continuous image coordinates, producing dense feature maps at any resolution. It improves results on dense prediction and semantic correspondence benchmarks and outperforms prior state-of-the-art upsamplers, with gains of up to +2.07 mIoU on Cityscapes and +4.17 PCK on SPair-71k.

Phase Marginalization for Patch-Grid Instability in Vision Transformers

Hugging Face Daily Papers · 2026-06-06

Phase Marginalization is a post-hoc method that addresses phase-dependent instability in Vision Transformers by evaluating structured patch-grid phases and aggregating outputs. It improves segmentation, depth, and local matching over the canonical baseline with minimal extra cost.
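
The idea of evaluating several patch-grid phases and aggregating the outputs can be illustrated with a small NumPy sketch. This is not the paper's implementation: the function names, the use of circular shifts to move the grid, the choice of phases, and plain averaging as the aggregation rule are all assumptions made for illustration.

```python
import numpy as np

def patch_mean_model(img, patch=4):
    """Toy stand-in for a ViT (hypothetical): replaces each patch by its mean.

    Like a real patch-based model, its output depends on where the patch
    grid falls on the image.
    """
    h, w, c = img.shape
    blocks = img.reshape(h // patch, patch, w // patch, patch, c).mean(axis=(1, 3))
    return np.repeat(np.repeat(blocks, patch, axis=0), patch, axis=1)

def phase_marginalize(model, image, phases=((0, 0), (2, 0), (0, 2), (2, 2))):
    """Average a dense (H, W, K) output over shifted patch-grid phases.

    Assumption: the grid is moved by circularly shifting the input, and each
    output is shifted back before averaging so all phases stay aligned.
    """
    acc = None
    for dy, dx in phases:
        shifted = np.roll(image, shift=(dy, dx), axis=(0, 1))
        out = np.roll(model(shifted), shift=(-dy, -dx), axis=(0, 1))
        acc = out if acc is None else acc + out
    return acc / len(phases)

image = np.arange(64, dtype=float).reshape(8, 8, 1)
smoothed = phase_marginalize(patch_mean_model, image)
```

Each extra phase costs one more forward pass in this sketch, so the number of phases trades stability against compute.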

Elastic Attention Cores for Scalable Vision Transformers [R]

Reddit r/MachineLearning · 2026-05-13

Elastic Attention Cores is a core-periphery block-sparse attention structure for Vision Transformers, presented in a new paper and reported to improve scalability and accuracy over dense self-attention models such as DINOv3.
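
One possible reading of a core-periphery block-sparse pattern is sketched below in NumPy. The summary does not specify the layout, so everything here is an assumption: a few "core" tokens attend to and are attended by all tokens, while "periphery" tokens otherwise attend only within their own block. The function names and parameters are hypothetical.

```python
import numpy as np

def core_periphery_mask(n_core, n_periphery, block):
    """Boolean attention mask; True means query row may attend to key column.

    Assumed layout: the first n_core tokens are the core, the rest are
    periphery tokens grouped into consecutive blocks of size `block`.
    """
    n = n_core + n_periphery
    mask = np.zeros((n, n), dtype=bool)
    mask[:n_core, :] = True          # core tokens attend to everything
    mask[:, :n_core] = True          # every token attends to the core
    for start in range(n_core, n, block):
        end = min(start + block, n)
        mask[start:end, start:end] = True  # periphery attends within its block
    return mask

def masked_attention(q, k, v, mask):
    """Single-head softmax attention restricted to the allowed pairs."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
mask = core_periphery_mask(n_core=2, n_periphery=6, block=3)
y = masked_attention(x, x, x, mask)
```

With a fixed core size and block size, the number of allowed pairs in this layout grows linearly with the number of periphery tokens rather than quadratically, which is the usual motivation for block-sparse attention.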
