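3.4 Million Solar Panels

Hacker News Top Stories

Summary

Version 2 of the open GM-SEUS dataset now maps 3.4 million solar panels across the US, up from 2.9 million in version 1, and adds rooftop arrays.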


Cached: 2026/04/22 12:40

# 3.4 Million Solar Panels

Source: https://tech.marksblogg.com/american-solar-farms-v2.html

In October, I reviewed the Ground-Mounted Solar Energy in the United States (GM-SEUS) dataset. It attempts to map the vast majority of solar arrays and panels in the US. The first version contained 2.9 million panels.

Version 2, released this Monday, brings that figure to more than 3.4 million and adds a rooftop array sub-dataset.

In this post, I'll walk through GM-SEUS v2.

## My Workstation

- CPU: 5.7 GHz AMD Ryzen 9 9950X with 16 cores and 32 threads, 1.2 MB of L1, 16 MB of L2 and 64 MB of L3 cache, cooled by an all-in-one liquid cooler and housed in a full-tower Cooler Master HAF 700 case
- RAM: 96 GB of DDR5 running at 4,800 MT/s
- System disk: a fifth-generation Crucial T700 4 TB NVMe M.2 SSD with sustained reads of up to 12,400 MB/s and its own heatsink
- Motherboard: ASRock X870E Nova 90
- Power supply: 1,200-watt, fully modular Corsair unit
- OS: Ubuntu 24 LTS running via Microsoft's Ubuntu for Windows on Windows 11 Pro. I don't use a Linux desktop as my primary environment because my Nvidia GTX 1080 GPU has better driver support on Windows and ArcGIS Pro only supports Windows natively.

## Installing Prerequisites

I'll use GDAL 3.9.3 to help analyse the data in this post.

```bash
sudo add-apt-repository ppa:ubuntugis/ubuntugis-unstable
sudo apt update
sudo apt install gdal-bin
```

I'll also use DuckDB along with its H3, JSON, Lindel, Parquet and Spatial extensions. H3 and Lindel are installed from the community repository.

```bash
cd ~
wget -c https://github.com/duckdb/duckdb/releases/download/v1.5.1/duckdb_cli-linux-amd64.zip
unzip -j duckdb_cli-linux-amd64.zip
chmod +x duckdb
~/duckdb
```

```sql
INSTALL h3 FROM community;
INSTALL lindel FROM community;
INSTALL json;
INSTALL parquet;
INSTALL spatial;
```

I'll set up DuckDB to load every installed extension each time it launches by adding the following to the CLI's startup file, `~/.duckdbrc`:

```sql
.timer on
.width 180
LOAD h3;
LOAD lindel;
LOAD json;
LOAD parquet;
LOAD spatial;
```

The maps in this post were rendered with QGIS 4.0.1. QGIS is a desktop application that runs on Windows, macOS and Linux, and it sees around 15 million launches worldwide each month. The basemaps come from Esri services accessed through QGIS's HCMGIS plugin.

## Analysis-Ready Data

I'll first download the 3.4 GB ZIP file and then extract the GeoPackage (GPKG) files from it.

```bash
wget -O GMSEUS_v2.zip \
    'https://zenodo.org/records/19581821/files/GMSEUS.zip?download=1'
unzip -j GMSEUS_v2.zip "*.gpkg"
```

Below I'll check the projection used by the GPKG files.

```bash
gdalsrsinfo -o proj4 GMSEUS_RooftopArrays_2025_v2_0.gpkg
```

```
+proj=aea +lat_0=23 +lon_0=-96 +lat_1=29.5 +lat_2=45.5 +x_0=0 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs
```

I'll convert the rooftop arrays into Parquet. DuckDB v1.5.1 raised an exception on this step, so I used v1.4.4 instead.

```sql
COPY (
    WITH a AS (
        SELECT Source,
               grndCvr,
               modType,
               mount,
               nativeID,
               roofArrID,
               -- Treat the -9999 placeholder as NULL
               area:    IF(area::TEXT='-9999.0',    NULL, area::DOUBLE),
               azimuth: IF(azimuth::TEXT='-9999.0', NULL, azimuth::DOUBLE),
               capMWAC: IF(capMWAC::TEXT='-9999.0', NULL, capMWAC::DOUBLE),
               capMWDC: IF(capMWDC::TEXT='-9999.0', NULL, capMWDC::DOUBLE),
               tilt:    IF(tilt::TEXT='-9999.0',    NULL, tilt::DOUBLE),
               instYr:  CASE WHEN instYr::INT = -9999
                             THEN NULL
                             ELSE instYr
                        END,
               -- Reproject from the Albers projection reported above to EPSG:4326
               ST_FLIPCOORDINATES(
                   ST_TRANSFORM(
                       geom,
                       '+proj=aea +lat_0=23 +lon_0=-96 +lat_1=29.5 +lat_2=45.5 +x_0=0 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs',
                       'EPSG:4326')) geometry
        FROM ST_READ('GMSEUS_RooftopArrays_2025_v2_0.gpkg')
    )
    SELECT * EXCLUDE (geometry),
           {'xmin': ST_XMIN(ST_EXTENT(geometry)),
            'ymin': ST_YMIN(ST_EXTENT(geometry)),
            'xmax': ST_XMAX(ST_EXTENT(geometry)),
            'ymax': ST_YMAX(ST_EXTENT(geometry))} AS bbox,
           ST_ASWKB(geometry) geometry
    FROM a
    -- Sort by the Hilbert curve index of each centroid so nearby features sit together
    ORDER BY HILBERT_ENCODE([ST_Y(ST_CENTROID(geometry)),
                             ST_X(ST_CENTROID(geometry))]::double[2])
) TO 'GMSEUS_RooftopArrays_2025_v2_0.parquet' (
    FORMAT 'PARQUET',
    CODEC 'ZSTD',
    COMPRESSION_LEVEL 22,
    ROW_GROUP_SIZE 15000);
```

The rooftop arrays dataset contains 5,822 records.

```sql
SELECT COUNT(*)
FROM 'GMSEUS_RooftopArrays_2025_v2_0.parquet';
```

Below are the data type, percentage of NULLs, approximate number of unique values, and minimum and maximum values for each column.

| column_name | column_type | null_percentage | approx_unique | min | max |
|-------------|-------------|-----------------|---------------|-----|-----|
| area | DOUBLE | 2.77 | 5,180 | 15.0 | 487,111.0 |
| azimuth | DOUBLE | 89.63 | 156 | 0.0 | 530.02 |
| capMWAC | DOUBLE | 89.52 | 60 | 0.2 | 74.9 |
| capMWDC | DOUBLE | 87.12 | 166 | 0.00448 | 99.7 |
| grndCvr | VARCHAR | 97.61 | 2 | impervious | vegetation |
| instYr | BIGINT | 72.43 | 23 | 2003 | 2025 |
| modType | VARCHAR | 0.00 | 2 | c-si | thin-film |
| mount | VARCHAR | 87.53 | 5 | dual_axis | unknown |
| nativeID | VARCHAR | 0.00 | 4,540 | 1 | Xebec 1 solar farm |
| roofArrID | BIGINT | 0.00 | 5,830 | 1 | 5,822 |
| Source | VARCHAR | 0.00 | 15 | CCVPV | gspt |
| tilt | DOUBLE | 90.64 | 31 | 0.0 | 52.0 |

Next, I'll convert the panel-level data into Parquet.

```sql
COPY (
    WITH a AS (
        SELECT Source,
               arrayID:  arrayID::INT,
               panelID:  panelID::INT,
               pnlSource,
               rowArea,
               rowAzimuth,
               rowLength,
               rowMount,
               rowSpace: IF(rowSpace::TEXT='-9999.0', NULL, rowSpace::DOUBLE),
               rowWidth,
               ST_FLIPCOORDINATES(
                   ST_TRANSFORM(
                       geom,
                       '+proj=aea +lat_0=23 +lon_0=-96 +lat_1=29.5 +lat_2=45.5 +x_0=0 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs',
                       'EPSG:4326')) geometry
        FROM ST_READ('GMSEUS_Panels_Final_2025_v2_0.gpkg')
    )
    SELECT * EXCLUDE (geometry),
           {'xmin': ST_XMIN(ST_EXTENT(geometry)),
            'ymin': ST_YMIN(ST_EXTENT(geometry)),
            'xmax': ST_XMAX(ST_EXTENT(geometry)),
            'ymax': ST_YMAX(ST_EXTENT(geometry))} AS bbox,
           ST_ASWKB(geometry) geometry
    FROM a
    ORDER BY HILBERT_ENCODE([ST_Y(ST_CENTROID(geometry)),
                             ST_X(ST_CENTROID(geometry))]::double[2])
) TO 'GMSEUS_Panels_Final_2025_v2_0.parquet' (
    FORMAT 'PARQUET',
    CODEC 'ZSTD',
    COMPRESSION_LEVEL 22,
    ROW_GROUP_SIZE 15000);
```

The panels dataset contains 3,429,157 records.

```sql
SELECT COUNT(*)
FROM 'GMSEUS_Panels_Final_2025_v2_0.parquet';
```

Below is an overview of the panel fields.

| column_name | column_type | null_percentage | approx_unique | min | max |
|-------------|-------------|-----------------|---------------|-----|-----|
| arrayID | INTEGER | 0.03 | 12,653 | 1 | 18,980 |
| panelID | INTEGER | 0.00 | 3,323,765 | 1 | 3,429,157 |
| pnlSource | VARCHAR | 0.00 | 5 | CCVPV | OSM |
| rowArea | DOUBLE | 0.00 | 100,105 | 15.01 | 9,982.68 |
| rowAzimuth | DOUBLE | 0.00 | 22,029 | 90.0 | 540.0 |
| rowLength | DOUBLE | 0.00 | 25,531 | 3.96 | 737.38 |
| rowMount | VARCHAR | 0.00 | 3 | dual_axis | single_axis |
| rowSpace | DOUBLE | 1.27 | 1,836 | 0.01 | 20.0 |
| rowWidth | DOUBLE | 0.00 | 2,258 | 0.45 | 135.33 |
| Source | VARCHAR | 0.00 | 12 | CCVPV | USPVDB |

Finally, I'll convert the array-level data into Parquet as well.

```sql
COPY (
    WITH a AS (
        SELECT COUNTYFP,
               GCR1,
               GCR2,
               STATEFP,
               Source,
               arrayID,
               avgAzimuth: IF(avgAzimuth::TEXT='-9999.0', NULL, avgAzimuth::DOUBLE),
               avgLength:  IF(avgLength::TEXT='-9999.0',  NULL, avgLength::DOUBLE),
               avgSpace:   IF(avgSpace::TEXT='-9999.0',   NULL, avgSpace::DOUBLE),
               avgWidth:   IF(avgWidth::TEXT='-9999.0',   NULL, avgWidth::DOUBLE),
               capMWAC,
               capMWACest,
               capMWDC,
               capMWDCest,
               effInit:    IF(effInit::TEXT='-9999.0', NULL, effInit::DOUBLE),
               grndCvr,
               instYr:     CASE WHEN instYr::INT = -9999
                                THEN NULL
                                ELSE instYr
                           END,
               instYrEst:  CASE WHEN instYrEst::INT = -9999
                                THEN NULL
                                ELSE instYrEst
                           END,
               modType,
               mount,
               nativeID:   IF(nativeID::TEXT='unknown', NULL, nativeID),
               newBound,
               numRow,
               tilt:       IF(tilt::TEXT='-9999.0', NULL, tilt::DOUBLE),
               tiltEst:    CASE WHEN tiltEst::INT = -9999
                                THEN NULL
                                ELSE tiltEst
                           END,
               totArea,
               totRowArea,
               ST_FLIPCOORDINATES(
                   ST_TRANSFORM(
                       geom,
                       '+proj=aea +lat_0=23 +lon_0=-96 +lat_1=29.5 +lat_2=45.5 +x_0=0 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs',
                       'EPSG:4326')) geometry
        FROM ST_READ('GMSEUS_Arrays_Final_2025_v2_0.gpkg')
    )
    SELECT * EXCLUDE (geometry),
           {'xmin': ST_XMIN(ST_EXTENT(geometry)),
            'ymin': ST_YMIN(ST_EXTENT(geometry)),
            'xmax': ST_XMAX(ST_EXTENT(geometry)),
            'ymax': ST_YMAX(ST_EXTENT(geometry))} AS bbox,
           ST_ASWKB(geometry) geometry
    FROM a
    ORDER BY HILBERT_ENCODE([ST_Y(ST_CENTROID(geometry)),
                             ST_X(ST_CENTROID(geometry))]::double[2])
) TO 'GMSEUS_Arrays_Final_2025_v2_0.parquet' (
    FORMAT 'PARQUET',
    CODEC 'ZSTD',
    COMPRESSION_LEVEL 22,
    ROW_GROUP_SIZE 15000);
```

The arrays dataset contains 18,980 records.

```sql
SELECT COUNT(*)
FROM 'GMSEUS_Arrays_Final_2025_v2_0.parquet';
```

Below is an overview of the array fields (excerpt).

| column_name | column_type | null_percentage | approx_unique | min | max |
|-------------|-------------|-----------------|---------------|-----|-----|
| arrayID | BIGINT | 0.00 | 16,914 | 1 | 18,980 |
| avgAzimuth | DOUBLE | 32.88 | 5,865 | 90.22 | 540.0 |
| avgLength | DOUBLE | 32.88 | 12,674 | 4.02 | 360.128 |
| avgSpace | DOUBLE | 32.88 | 7,965 | 0.024 | 20.0 |
| avgWidth | DOUBLE | 32.88 | 6,528 | 0.863 | 83.13 |
| capMWAC | DOUBLE | 0.00 | 4,182 | 0.003 | 1,128.931 |
| capMWACest | DOUBLE | 0.00 | 6,620 | 0.003 | 1,352.693 |
| capMWDC | DOUBLE | 0.00 | 7,203 | 0.003 | 1,467.61 |
| capMWDCest | DOUBLE | 0.00 | 8,883 | 0.004 | 1,758.501 |
| COUNTYFP | VARCHAR | 0.00 | 237 | 001 | 840 |
| effInit | DOUBLE | 0.07 | 12 | 0.09 | 0.21 |
| GCR1 | DOUBLE | 0.00 | 5,096 | 0.1 | 1.0 |
| GCR2 | DOUBLE | 0.00 | 5,913 | 0.0943 | 0.9883 |
| grndCvr | VARCHAR | 0.00 | 3 | impervious | vegetation |
| instYr | BIGINT | 0.00 | 26 | 1985 | 2025 |
| instYrEst | BIGINT | 0.32 | 23 | 2003 | 2025 |
| modType | VARCHAR | 0.00 | 3 | c-si | thin-film |
| mount | VARCHAR | 0.00 | 9 | dual_axis | unknown |
| nativeID | VARCHAR | 0.23 | 14,342 | 1 | York Solar |
| newBound | BIGINT | 0.00 | 2 | 0 | 1 |
| numRow | INTEGER | 0.00 | 1,177 | 0 | 90,145 |
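
The three `COPY` statements above all turn the `-9999` placeholder into `NULL` before writing Parquet. As a rough illustration of that idea outside DuckDB, the sketch below applies the same rule to a CSV stream with `awk`. The `strip_sentinel` function name and the sample rows are hypothetical, and it assumes a simple CSV with no quoted commas. It isn't part of the GM-SEUS tooling.

```bash
# Blank out the -9999 / -9999.0 placeholder in every data field of a CSV
# stream, mirroring the IF(col::TEXT='-9999.0', NULL, ...) pattern above.
# Assumption: plain CSV on stdin, one header row, no quoted commas.
strip_sentinel() {
  awk 'BEGIN { FS = OFS = "," }
       NR == 1 { print; next }   # pass the header row through untouched
       {
         for (i = 1; i <= NF; i++)
           if ($i == "-9999" || $i == "-9999.0") $i = ""
         print
       }'
}

# Hypothetical sample rows using column names from the rooftop table.
printf '%s\n' \
  'roofArrID,azimuth,tilt,instYr' \
  '1,180.0,-9999.0,2019' \
  '2,-9999.0,25.0,-9999' | strip_sentinel
```

With these sample rows, the `tilt` value on the first record and the `azimuth` and `instYr` values on the second come out empty. The comparison is done on strings, so a legitimate value such as `-99990` is left alone.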
