SENSE: Satellite-based ENergy Synthesis for Sustainable Environment

Hugging Face Daily Papers

Summary

SENSE is a generative urban building energy modeling framework that synthesizes satellite imagery and energy data using diffusion models, achieving high-fidelity results with reduced labeled data requirements.

Urban Building Energy Modeling (UBEM) plays a critical role in achieving the United Nations' Sustainable Development Goals 7 and 11. Although existing studies based on satellite imagery and deep learning have achieved remarkable progress, three challenges remain. First, most existing studies are inherently predictive and fail to reflect the generative nature of urban planning. Second, although generative AI and diffusion models have seen explosive growth in satellite imagery applications, they do not generate urban functional layers (e.g., an energy layer). Third, high-quality, high-resolution building energy data aligned with satellite imagery is scarce. Here we propose SENSE (Satellite-based ENergy Synthesis for Sustainable Environment), a unified generative UBEM framework that jointly synthesizes realistic urban satellite imagery and aligned, high-quality building energy consumption and height maps. Built on a controllable diffusion model and conditioned on road networks and urban density metrics, SENSE leverages the knowledge learned by large vision models to generate urban building energy consumption and height information (annotations) in the latent space. Experiments across four cities (New York City, Boston, Lyon, Busan) demonstrate that SENSE achieves high visual fidelity and strong physical consistency, satisfying the ASHRAE standard metric. They also show that SENSE can generate sufficient annotated synthetic data using less than 20% of the labeled energy data, boosting downstream prediction performance by 10% IoU. Compared to state-of-the-art (SOTA) urban energy prediction methods, SENSE significantly reduces prediction error (by 3%-11% NMBE and 1%-9% CVRMSE). This study offers an energy-efficient urban planning and physical generation solution for urban science, energy science, and building science. The dataset and code are available at https://huggingface.co/datasets/skl24/MUSE and https://github.com/kailaisun/GenAI4Urban-Energy/.
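The abstract reports errors as NMBE (normalized mean bias error) and CVRMSE (coefficient of variation of the root mean square error), the two metrics commonly associated with ASHRAE calibration criteria. As a minimal sketch of how these are usually computed, the snippet below assumes one common convention: the residual is measured minus predicted, and the denominator uses `n - p` with `p = 1`. The function names, the sign convention, and the default `p` are illustrative assumptions, not taken from the SENSE code base, and the paper may use a different variant.

```python
import math


def nmbe(measured, predicted, p=1):
    """Normalized mean bias error, in percent.

    Assumed convention: residual = measured - predicted, so a positive
    value means the model under-predicts on average. `p` is the
    degrees-of-freedom adjustment (assumed to be 1 here).
    """
    n = len(measured)
    mean_measured = sum(measured) / n
    bias = sum(m - s for m, s in zip(measured, predicted))
    return 100.0 * bias / ((n - p) * mean_measured)


def cvrmse(measured, predicted, p=1):
    """Coefficient of variation of the RMSE, in percent."""
    n = len(measured)
    mean_measured = sum(measured) / n
    squared_error = sum((m - s) ** 2 for m, s in zip(measured, predicted))
    return 100.0 * math.sqrt(squared_error / (n - p)) / mean_measured


if __name__ == "__main__":
    # Hypothetical energy values, for illustration only.
    measured = [100.0, 200.0, 300.0]
    predicted = [90.0, 180.0, 270.0]
    print(f"NMBE:   {nmbe(measured, predicted):.2f}%")
    print(f"CVRMSE: {cvrmse(measured, predicted):.2f}%")
```

NMBE captures systematic over- or under-prediction, and errors of opposite sign can cancel, while CVRMSE captures scatter. The two are therefore typically reported together, as in the abstract.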

Cached at: 05/21/26, 06:20 AM


Source: https://huggingface.co/papers/2605.18101



Get this paper in your agent:

hf papers read 2605.18101

Don't have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper: 1

#### skl24/SENSE Updated about 5 hours ago

Datasets citing this paper: 1

#### skl24/MUSE Updated about 5 hours ago • 11


Similar Articles

Ensemble Score Filtering for Real-Data Energy Consumption Forecast Correction

arXiv cs.LG

This paper proposes using the Ensemble Score Filter (EnSF), a score-based diffusion data assimilation method, to correct forecasts from a pretrained spatio-temporal energy consumption model using noisy partial observations. Numerical experiments show EnSF significantly improves state estimation over open-loop propagation and outperforms the Ensemble Kalman Filter under nonlinear observations.

Sat3DGen: Comprehensive Street-Level 3D Scene Generation from Single Satellite Image

Hugging Face Daily Papers

Sat3DGen introduces a geometry-first approach for generating street-level 3D scenes from a single satellite image, achieving improved geometric accuracy and photorealism through novel constraints and training strategies. The method demonstrates significant improvements over prior work on the VIGOR-OOD benchmark.

Efficient Image Synthesis with Sphere Latent Encoder

Hugging Face Daily Papers

This paper proposes Sphere Latent Encoder, an efficient few-step image generation framework that performs denoising entirely in a spherical latent space, achieving high-quality 256×256 images with significantly reduced computational cost and improved FID scores on ImageNet-1K.