RaysUp: Ultra-light Universal Feature Upsampling via Geometry-Aware Ray Representation

Hugging Face Daily Papers 06/22/26, 12:00 AM Papers

Summary

RaysUp is an ultra-lightweight, task-agnostic feature upsampling framework that uses geometry-aware ray domain techniques to reconstruct high-resolution features from low-resolution VFM outputs, achieving state-of-the-art performance with 84% fewer parameters than prior work and 7x faster inference.

Pre-trained Vision Foundation Models (VFMs) have become central to modern computer vision due to their powerful semantic representations and strong generalization ability. However, their patchified or pooled outputs are inherently low-resolution, limiting their effectiveness in tasks requiring fine-grained, pixel-level reasoning. Existing feature upsampling approaches either degrade semantic fidelity or rely on VFM-specific retraining and heavy architectures, hindering efficiency and scalability. To address these challenges, we propose RaysUp, an ultra-lightweight, task-agnostic, and VFM-agnostic feature upsampling framework that reconstructs high-resolution feature maps at arbitrary resolutions. Unlike conventional 2D interpolation or attention-based schemes, RaysUp lifts feature reconstruction into a geometry-aware ray domain. Specifically, we introduce a Spatially Decoupled Guidance Encoder for direction-aware guidance encoding, an Any-Resolution Cross-Attention mechanism for resolution-flexible reconstruction, and a novel Ray Positional Encoding (RayPE) that injects implicit 3D geometric priors via 6D Plucker ray coordinates. Finally, a Geometry-Aware Neighborhood Attention module further ensures content-adaptive bilateral aggregation while preserving geometric consistency. Extensive experiments across diverse dense prediction tasks demonstrate that RaysUp achieves state-of-the-art performance while using only 16% of the parameters of AnyUp and delivering approximately 7x faster inference. These results highlight a substantially improved accuracy-efficiency trade-off and establish RaysUp as a practical and scalable solution for universal feature upsampling. Code is available at https://github.com/MAP-RaysUp/RaysUp.
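The abstract names 6D Plucker ray coordinates as the basis of RayPE but does not say how the rays are built. The following is only a minimal sketch of what such coordinates look like, assuming a hypothetical pinhole camera whose centre sits at `origin` and whose image plane lies at distance `focal` along the z-axis. The names `plucker_ray`, `plucker_ray_grid`, `origin`, and `focal` are illustrative assumptions and are not taken from the RaysUp code. Each ray is written as its unit direction d followed by its moment m = o x d, giving six numbers per pixel.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def plucker_ray(u, v, origin=(0.0, 0.0, 1.0), focal=1.0):
    """Return the 6D Plucker coordinates (d, m) of one pixel ray.

    u, v are normalized image coordinates in (-1, 1).
    origin and focal describe an assumed pinhole camera (hypothetical values).
    """
    norm = math.sqrt(u * u + v * v + focal * focal)
    d = (u / norm, v / norm, focal / norm)  # unit ray direction
    m = cross(origin, d)                    # moment: origin x direction
    return d + m                            # tuple concatenation, length 6


def plucker_ray_grid(height, width, origin=(0.0, 0.0, 1.0), focal=1.0):
    """Build a height x width grid of Plucker rays through pixel centres."""
    return [
        [
            plucker_ray(
                (x + 0.5) / width * 2.0 - 1.0,   # pixel centre mapped to (-1, 1)
                (y + 0.5) / height * 2.0 - 1.0,
                origin,
                focal,
            )
            for x in range(width)
        ]
        for y in range(height)
    ]


# The same function serves a coarse feature grid and a fine output grid.
low_res = plucker_ray_grid(16, 16)
high_res = plucker_ray_grid(224, 224)
print(len(low_res), len(low_res[0]), len(low_res[0][0]))
print(len(high_res), len(high_res[0]), len(high_res[0][0]))
```

Because the grid is defined over normalized coordinates rather than pixel indices, it can be evaluated at any output size, which is one plausible reason a ray-based encoding suits arbitrary-resolution upsampling. How RaysUp actually turns these coordinates into a positional encoding is not described in the abstract.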

Original Article

Cached at: 06/30/26, 03:37 PM

Source: https://huggingface.co/papers/2606.22749

Abstract

RaysUp is a lightweight, task-agnostic feature upsampling framework that reconstructs high-resolution features using geometry-aware ray domain techniques with improved efficiency and accuracy.

Similar Articles

ViT-Up: Faithful Feature Upsampling for Vision Transformers

RayDer: Scalable Self-Supervised Novel View Synthesis from Real-World Video

Lite3R: A Model-Agnostic Framework for Efficient Feed-Forward 3D Reconstruction

UltraFlux: Data-Model Co-Design for High-quality Native 4K Text-to-Image Generation across Diverse Aspect Ratios

SurGe: Improved Surface Geometry in Point Maps
