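# 3.4 Million Solar Panels
Source: https://tech.marksblogg.com/american-solar-farms-v2.html
In October, I reviewed the Ground-Mounted Solar Energy in the United States (GM-SEUS) dataset. It attempts to map the vast majority of solar arrays and panels in the US. The first version contained 2.9 million panels.
Version 2, released this Monday, has grown to more than 3.4 million panels and adds a sub-dataset of rooftop arrays.
In this post, I'll walk through GM-SEUS v2.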
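## My Workstation
- CPU: 5.7 GHz AMD Ryzen 9 9950X with 16 cores and 32 threads, 1.2 MB of L1, 16 MB of L2 and 64 MB of L3 cache, liquid-cooled and housed in a full-tower Cooler Master HAF 700 case
- RAM: 96 GB of DDR5 clocked at 4,800 MT/s
- System disk: 5th-generation Crucial T700 4 TB NVMe M.2 SSD with a heatsink, rated for sustained reads of 12,400 MB/s
- Motherboard: ASRock X870E Nova 90
- Power supply: Corsair 1,200 W, fully modular
- OS: Ubuntu 24 LTS running via Microsoft's Ubuntu for Windows on Windows 11 Pro. I don't run a Linux desktop directly because my Nvidia GTX 1080 has more stable drivers on Windows and ArcGIS Pro only supports Windows natively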
## Installing Prerequisites
I'll use GDAL 3.9.3 to analyse the data in this post.
```bash
sudo add-apt-repository ppa:ubuntugis/ubuntugis-unstable
sudo apt update
sudo apt install gdal-bin
```
I'll also use DuckDB along with its H3, JSON, Lindel, Parquet and Spatial extensions. H3 and Lindel are installed from the community repository.
```bash
cd ~
wget -c https://github.com/duckdb/duckdb/releases/download/v1.5.1/duckdb_cli-linux-amd64.zip
unzip -j duckdb_cli-linux-amd64.zip
chmod +x duckdb
~/duckdb
```
```sql
INSTALL h3 FROM community;
INSTALL lindel FROM community;
INSTALL json;
INSTALL parquet;
INSTALL spatial;
```
To have DuckDB load every installed extension at launch, I put the following in `~/.duckdbrc`:
```sql
.timer on
.width 180
LOAD h3;
LOAD lindel;
LOAD json;
LOAD parquet;
LOAD spatial;
```
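The maps in this post were rendered with QGIS 4.0.1. QGIS is a desktop application that runs on Windows, macOS and Linux and is launched about 15 million times a month worldwide. The basemaps come from Esri and are loaded through QGIS' HCMGIS plugin.
## Analysis-Ready Data
First, download the 3.4 GB ZIP file, then extract the GeoPackage (GPKG) files from it.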
```bash
wget -O GMSEUS_v2.zip \
'https://zenodo.org/records/19581821/files/GMSEUS.zip?download=1'
unzip -j GMSEUS_v2.zip "*.gpkg"
```
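For readers who would rather script the extraction step, here is a minimal Python sketch of what `unzip -j ... "*.gpkg"` does: it pulls out only the members with a given suffix and drops their folder paths. The helper name `extract_flat` is hypothetical and it assumes the archive has already been downloaded.

```python
import shutil
import zipfile
from pathlib import Path


def extract_flat(zip_path, suffix=".gpkg", dest="."):
    """Extract members ending in `suffix`, discarding their folder paths."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.infolist():
            if member.is_dir() or not member.filename.endswith(suffix):
                continue
            # Keep only the base name, mirroring unzip's -j (junk paths) flag.
            target = dest_dir / Path(member.filename).name
            with archive.open(member) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target.name)
    return sorted(written)
```

Calling `extract_flat("GMSEUS_v2.zip")` would place the GPKG files in the current directory, like the shell command above.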
Check the projection used by one of the GPKG files:
```bash
gdalsrsinfo -o proj4 GMSEUS_RooftopArrays_2025_v2_0.gpkg
```
```
+proj=aea +lat_0=23 +lon_0=-96 +lat_1=29.5 +lat_2=45.5 +x_0=0 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs
```
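The string above describes an Albers Equal Area conic projection on the GRS80 ellipsoid, with its origin at 23°N, 96°W and standard parallels at 29.5°N and 45.5°N. As a rough illustration of what those parameters mean, the sketch below parses the string and projects a longitude/latitude pair forward into metres using the standard ellipsoidal Albers formulas. It is not how GDAL or DuckDB implement the transform, the function names are hypothetical, and the GRS80 constants are hard-coded rather than read from `+ellps`.

```python
import math

PROJ4 = ("+proj=aea +lat_0=23 +lon_0=-96 +lat_1=29.5 +lat_2=45.5 "
         "+x_0=0 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs")


def parse_proj4(text):
    """Turn '+key=value' tokens into a dict; bare flags map to True."""
    params = {}
    for token in text.split():
        key, _, value = token.lstrip("+").partition("=")
        params[key] = value if value else True
    return params


def albers_forward(lon, lat, p, a=6378137.0, inv_f=298.257222101):
    """Project degrees to Albers metres (GRS80 semi-major axis and flattening assumed)."""
    e2 = (2 - 1 / inv_f) / inv_f  # eccentricity squared: 2f - f^2
    e = math.sqrt(e2)

    def q(phi):
        s = math.sin(phi)
        return (1 - e2) * (s / (1 - e2 * s * s)
                           - math.log((1 - e * s) / (1 + e * s)) / (2 * e))

    def m(phi):
        return math.cos(phi) / math.sqrt(1 - e2 * math.sin(phi) ** 2)

    phi0, phi1, phi2 = (math.radians(float(p[k]))
                        for k in ("lat_0", "lat_1", "lat_2"))
    lam0 = math.radians(float(p["lon_0"]))
    n = (m(phi1) ** 2 - m(phi2) ** 2) / (q(phi2) - q(phi1))
    c = m(phi1) ** 2 + n * q(phi1)
    rho = a * math.sqrt(c - n * q(math.radians(lat))) / n
    rho0 = a * math.sqrt(c - n * q(phi0)) / n
    theta = n * (math.radians(lon) - lam0)
    x = float(p["x_0"]) + rho * math.sin(theta)
    y = float(p["y_0"]) + rho0 - rho * math.cos(theta)
    return x, y


params = parse_proj4(PROJ4)
origin = albers_forward(-96.0, 23.0, params)  # the projection origin maps to (0, 0)
```

Points east of 96°W get positive x values and points north of 23°N get positive y values, which is why the raw GPKG coordinates are in metres rather than degrees.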
Convert the rooftop arrays to Parquet. I'm using DuckDB v1.4.4 here because v1.5.1 throws an exception.
```sql
COPY (
    WITH a AS (
        SELECT Source,
               grndCvr,
               modType,
               mount,
               nativeID,
               roofArrID,
               -- -9999 is treated as a missing-value placeholder and stored as NULL.
               area: IF(area::TEXT='-9999.0', NULL, area::DOUBLE),
               azimuth: IF(azimuth::TEXT='-9999.0', NULL, azimuth::DOUBLE),
               capMWAC: IF(capMWAC::TEXT='-9999.0', NULL, capMWAC::DOUBLE),
               capMWDC: IF(capMWDC::TEXT='-9999.0', NULL, capMWDC::DOUBLE),
               tilt: IF(tilt::TEXT='-9999.0', NULL, tilt::DOUBLE),
               instYr: CASE WHEN instYr::INT = -9999 THEN NULL ELSE instYr END,
               -- Reproject from Albers Equal Area to EPSG:4326, then swap the axes.
               ST_FLIPCOORDINATES(
                   ST_TRANSFORM(
                       geom,
                       '+proj=aea +lat_0=23 +lon_0=-96 +lat_1=29.5 +lat_2=45.5 +x_0=0 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs',
                       'EPSG:4326')) geometry
        FROM ST_READ('GMSEUS_RooftopArrays_2025_v2_0.gpkg')
    )
    SELECT * EXCLUDE (geometry),
           {'xmin': ST_XMIN(ST_EXTENT(geometry)),
            'ymin': ST_YMIN(ST_EXTENT(geometry)),
            'xmax': ST_XMAX(ST_EXTENT(geometry)),
            'ymax': ST_YMAX(ST_EXTENT(geometry))} AS bbox,
           ST_ASWKB(geometry) geometry
    FROM a
    -- Sort along a Hilbert curve so that nearby geometries are stored together.
    ORDER BY HILBERT_ENCODE([ST_Y(ST_CENTROID(geometry)),
                             ST_X(ST_CENTROID(geometry))]::double[2])
) TO 'GMSEUS_RooftopArrays_2025_v2_0.parquet' (
    FORMAT 'PARQUET',
    CODEC 'ZSTD',
    COMPRESSION_LEVEL 22,
    ROW_GROUP_SIZE 15000);
```
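The query above replaces `-9999` placeholders with NULL in the numeric columns. As a language-neutral illustration of that cleaning rule, here is a small Python sketch. The helper `clean` and the sample row are made up for this example and are not part of the dataset's tooling.

```python
SENTINEL = -9999


def clean(value):
    """Return None for the -9999 placeholder, otherwise the value unchanged."""
    if value is None:
        return None
    try:
        if float(value) == SENTINEL:
            return None
    except (TypeError, ValueError):
        pass  # Non-numeric text such as 'c-si' is kept as-is.
    return value


# Hypothetical record shaped like a rooftop array row.
row = {"area": 15.0, "tilt": "-9999.0", "instYr": -9999, "modType": "c-si"}
cleaned = {key: clean(value) for key, value in row.items()}
```

Storing NULL rather than `-9999` keeps aggregates such as minimums and averages from being skewed by the placeholder.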
The rooftop arrays dataset contains 5,822 records.
```sql
SELECT COUNT(*) FROM 'GMSEUS_RooftopArrays_2025_v2_0.parquet';
```
Below are the share of NULLs, the approximate number of unique values and the value range for each field:
| column_name | column_type | null_percentage | approx_unique | min | max |
|-------------|-------------|-----------------|---------------|-----|-----|
| area | DOUBLE | 2.77 | 5,180 | 15.0 | 487,111.0 |
| azimuth | DOUBLE | 89.63 | 156 | 0.0 | 530.02 |
| capMWAC | DOUBLE | 89.52 | 60 | 0.2 | 74.9 |
| capMWDC | DOUBLE | 87.12 | 166 | 0.00448 | 99.7 |
| grndCvr | VARCHAR | 97.61 | 2 | impervious | vegetation |
| instYr | BIGINT | 72.43 | 23 | 2003 | 2025 |
| modType | VARCHAR | 0.00 | 2 | c-si | thin-film |
| mount | VARCHAR | 87.53 | 5 | dual_axis | unknown |
| nativeID | VARCHAR | 0.00 | 4,540 | 1 | Xebec 1 solar farm |
| roofArrID | BIGINT | 0.00 | 5,830 | 1 | 5,822 |
| Source | VARCHAR | 0.00 | 15 | CCVPV | gspt |
| tilt | DOUBLE | 90.64 | 31 | 0.0 | 52.0 |
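To make the columns in the table above concrete, the following Python sketch computes a NULL percentage, a distinct count and the minimum and maximum for each field in a handful of made-up records. It uses exact distinct counts, whereas the `approx_unique` figures above are approximations, which would explain why `roofArrID` shows slightly more unique values than there are rows. The function name `summarize` and the sample data are assumptions for illustration only.

```python
def summarize(rows):
    """Per-column NULL percentage, distinct count, minimum and maximum."""
    columns = sorted({key for row in rows for key in row})
    summary = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        summary[col] = {
            "null_percentage": round(100 * (len(values) - len(present)) / len(values), 2),
            "unique": len(set(present)),
            "min": min(present) if present else None,
            "max": max(present) if present else None,
        }
    return summary


# Made-up sample records, not taken from the dataset.
sample = [
    {"tilt": 10.0, "mount": "single_axis"},
    {"tilt": None, "mount": "unknown"},
    {"tilt": 20.0, "mount": "dual_axis"},
    {"tilt": None, "mount": "unknown"},
]
stats = summarize(sample)
```

Note that the minimum and maximum of a text column are lexicographic, which is why `mount` ranges from `dual_axis` to `unknown` in the table.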
Next, convert the panel-level data to Parquet:
```sql
COPY (
    WITH a AS (
        SELECT Source,
               arrayID: arrayID::INT,
               panelID: panelID::INT,
               pnlSource,
               rowArea,
               rowAzimuth,
               rowLength,
               rowMount,
               -- -9999 is treated as a missing-value placeholder and stored as NULL.
               rowSpace: IF(rowSpace::TEXT='-9999.0', NULL, rowSpace::DOUBLE),
               rowWidth,
               -- Reproject from Albers Equal Area to EPSG:4326, then swap the axes.
               ST_FLIPCOORDINATES(
                   ST_TRANSFORM(
                       geom,
                       '+proj=aea +lat_0=23 +lon_0=-96 +lat_1=29.5 +lat_2=45.5 +x_0=0 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs',
                       'EPSG:4326')) geometry
        FROM ST_READ('GMSEUS_Panels_Final_2025_v2_0.gpkg')
    )
    SELECT * EXCLUDE (geometry),
           {'xmin': ST_XMIN(ST_EXTENT(geometry)),
            'ymin': ST_YMIN(ST_EXTENT(geometry)),
            'xmax': ST_XMAX(ST_EXTENT(geometry)),
            'ymax': ST_YMAX(ST_EXTENT(geometry))} AS bbox,
           ST_ASWKB(geometry) geometry
    FROM a
    -- Sort along a Hilbert curve so that nearby geometries are stored together.
    ORDER BY HILBERT_ENCODE([ST_Y(ST_CENTROID(geometry)),
                             ST_X(ST_CENTROID(geometry))]::double[2])
) TO 'GMSEUS_Panels_Final_2025_v2_0.parquet' (
    FORMAT 'PARQUET',
    CODEC 'ZSTD',
    COMPRESSION_LEVEL 22,
    ROW_GROUP_SIZE 15000);
```
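The `ORDER BY HILBERT_ENCODE(...)` clause sorts records along a space-filling curve so that features close together on the map tend to end up close together in the file. The Python sketch below shows the idea with the textbook integer Hilbert curve. It is a simplified stand-in and not the Lindel extension's implementation, which encodes floating-point values directly. The grid resolution and the sample coordinates are assumptions.

```python
def hilbert_index(x, y, order=16):
    """Distance along a Hilbert curve of side 2**order for grid cell (x, y)."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the curve stays continuous.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def to_cell(lon, lat, order=16):
    """Snap a longitude/latitude pair in degrees onto the integer grid."""
    n = 1 << order
    x = min(n - 1, int((lon + 180.0) / 360.0 * n))
    y = min(n - 1, int((lat + 90.0) / 180.0 * n))
    return x, y


# Made-up centroids (lon, lat) sorted the way the export sorts its rows.
points = [(-80.2, 25.8), (-118.2, 34.1), (-118.3, 34.0), (-73.9, 40.7)]
ordered = sorted(points, key=lambda p: hilbert_index(*to_cell(*p)))
```

Keeping neighbours together means a query for a small area usually touches only a few row groups, and similar values sitting next to each other tend to compress better.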
The panels dataset contains 3,429,157 records.
```sql
SELECT COUNT(*) FROM 'GMSEUS_Panels_Final_2025_v2_0.parquet';
```
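The `ROW_GROUP_SIZE 15000` option in the exports caps how many rows go into each Parquet row group. As a back-of-the-envelope check, the sketch below estimates how many row groups each file would need given the record counts quoted in this post. It assumes every row group is filled to exactly 15,000 rows, which DuckDB does not guarantee, so treat the results as lower bounds.

```python
import math

ROW_GROUP_SIZE = 15_000

# Record counts quoted in this post.
record_counts = {
    "rooftop arrays": 5_822,
    "panels": 3_429_157,
    "arrays": 18_980,
}


def row_groups(rows, size=ROW_GROUP_SIZE):
    """Minimum number of row groups needed to hold `rows` records."""
    return math.ceil(rows / size)


estimates = {name: row_groups(count) for name, count in record_counts.items()}
# estimates == {"rooftop arrays": 1, "panels": 229, "arrays": 2}
```

Only the panels file is large enough for the row group size to matter much. With a couple of hundred groups, a reader can skip most of the file when filtering on the bounding box.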
An overview of the panel fields:
| column_name | column_type | null_percentage | approx_unique | min | max |
|-------------|-------------|-----------------|---------------|-----|-----|
| arrayID | INTEGER | 0.03 | 12,653 | 1 | 18,980 |
| panelID | INTEGER | 0.00 | 3,323,765 | 1 | 3,429,157 |
| pnlSource | VARCHAR | 0.00 | 5 | CCVPV | OSM |
| rowArea | DOUBLE | 0.00 | 100,105 | 15.01 | 9,982.68 |
| rowAzimuth | DOUBLE | 0.00 | 22,029 | 90.0 | 540.0 |
| rowLength | DOUBLE | 0.00 | 25,531 | 3.96 | 737.38 |
| rowMount | VARCHAR | 0.00 | 3 | dual_axis | single_axis |
| rowSpace | DOUBLE | 1.27 | 1,836 | 0.01 | 20.0 |
| rowWidth | DOUBLE | 0.00 | 2,258 | 0.45 | 135.33 |
| Source | VARCHAR | 0.00 | 12 | CCVPV | USPVDB |
Finally, convert the array-level data to Parquet as well:
```sql
COPY (
    WITH a AS (
        SELECT COUNTYFP,
               GCR1,
               GCR2,
               STATEFP,
               Source,
               arrayID,
               -- -9999 is treated as a missing-value placeholder and stored as NULL.
               avgAzimuth: IF(avgAzimuth::TEXT='-9999.0', NULL, avgAzimuth::DOUBLE),
               avgLength: IF(avgLength::TEXT='-9999.0', NULL, avgLength::DOUBLE),
               avgSpace: IF(avgSpace::TEXT='-9999.0', NULL, avgSpace::DOUBLE),
               avgWidth: IF(avgWidth::TEXT='-9999.0', NULL, avgWidth::DOUBLE),
               capMWAC,
               capMWACest,
               capMWDC,
               capMWDCest,
               effInit: IF(effInit::TEXT='-9999.0', NULL, effInit::DOUBLE),
               grndCvr,
               instYr: CASE WHEN instYr::INT = -9999 THEN NULL ELSE instYr END,
               instYrEst: CASE WHEN instYrEst::INT = -9999 THEN NULL ELSE instYrEst END,
               modType,
               mount,
               -- 'unknown' native IDs are stored as NULL too.
               nativeID: IF(nativeID::TEXT='unknown', NULL, nativeID),
               newBound,
               numRow,
               tilt: IF(tilt::TEXT='-9999.0', NULL, tilt::DOUBLE),
               tiltEst: CASE WHEN tiltEst::INT = -9999 THEN NULL ELSE tiltEst END,
               totArea,
               totRowArea,
               -- Reproject from Albers Equal Area to EPSG:4326, then swap the axes.
               ST_FLIPCOORDINATES(
                   ST_TRANSFORM(
                       geom,
                       '+proj=aea +lat_0=23 +lon_0=-96 +lat_1=29.5 +lat_2=45.5 +x_0=0 +y_0=0 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs',
                       'EPSG:4326')) geometry
        FROM ST_READ('GMSEUS_Arrays_Final_2025_v2_0.gpkg')
    )
    SELECT * EXCLUDE (geometry),
           {'xmin': ST_XMIN(ST_EXTENT(geometry)),
            'ymin': ST_YMIN(ST_EXTENT(geometry)),
            'xmax': ST_XMAX(ST_EXTENT(geometry)),
            'ymax': ST_YMAX(ST_EXTENT(geometry))} AS bbox,
           ST_ASWKB(geometry) geometry
    FROM a
    -- Sort along a Hilbert curve so that nearby geometries are stored together.
    ORDER BY HILBERT_ENCODE([ST_Y(ST_CENTROID(geometry)),
                             ST_X(ST_CENTROID(geometry))]::double[2])
) TO 'GMSEUS_Arrays_Final_2025_v2_0.parquet' (
    FORMAT 'PARQUET',
    CODEC 'ZSTD',
    COMPRESSION_LEVEL 22,
    ROW_GROUP_SIZE 15000);
```
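Each export writes a `bbox` struct with `xmin`, `ymin`, `xmax` and `ymax` next to the WKB geometry. The Python sketch below shows how such a box is derived from a polygon's vertices and how it allows a cheap overlap test before any geometry is decoded. The function names and coordinates are made up for illustration and this is not how DuckDB computes `ST_EXTENT` internally.

```python
def bbox(ring):
    """Axis-aligned bounding box of a list of (x, y) vertices."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return {"xmin": min(xs), "ymin": min(ys), "xmax": max(xs), "ymax": max(ys)}


def intersects(a, b):
    """True when two bounding boxes overlap or touch."""
    return (a["xmin"] <= b["xmax"] and a["xmax"] >= b["xmin"]
            and a["ymin"] <= b["ymax"] and a["ymax"] >= b["ymin"])


# A made-up array footprint as (lon, lat) vertices and a made-up query window.
footprint = [(-115.41, 32.68), (-115.39, 32.68), (-115.39, 32.70), (-115.41, 32.70)]
window = {"xmin": -116.0, "ymin": 32.0, "xmax": -115.0, "ymax": 33.0}
hit = intersects(bbox(footprint), window)
```

Because the four numbers live in ordinary columns, a Parquet reader can compare them against a query window using column statistics and skip row groups that cannot match.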
The arrays dataset contains 18,980 records.
```sql
SELECT COUNT(*) FROM 'GMSEUS_Arrays_Final_2025_v2_0.parquet';
```
An overview of the array fields (excerpt):
| column_name | column_type | null_percentage | approx_unique | min | max |
|-------------|-------------|-----------------|---------------|-----|-----|
| arrayID | BIGINT | 0.00 | 16,914 | 1 | 18,980 |
| avgAzimuth | DOUBLE | 32.88 | 5,865 | 90.22 | 540.0 |
| avgLength | DOUBLE | 32.88 | 12,674 | 4.02 | 360.128 |
| avgSpace | DOUBLE | 32.88 | 7,965 | 0.024 | 20.0 |
| avgWidth | DOUBLE | 32.88 | 6,528 | 0.863 | 83.13 |
| capMWAC | DOUBLE | 0.00 | 4,182 | 0.003 | 1,128.931 |
| capMWACest | DOUBLE | 0.00 | 6,620 | 0.003 | 1,352.693 |
| capMWDC | DOUBLE | 0.00 | 7,203 | 0.003 | 1,467.61 |
| capMWDCest | DOUBLE | 0.00 | 8,883 | 0.004 | 1,758.501 |
| COUNTYFP | VARCHAR | 0.00 | 237 | 001 | 840 |
| effInit | DOUBLE | 0.07 | 12 | 0.09 | 0.21 |
| GCR1 | DOUBLE | 0.00 | 5,096 | 0.1 | 1.0 |
| GCR2 | DOUBLE | 0.00 | 5,913 | 0.0943 | 0.9883 |
| grndCvr | VARCHAR | 0.00 | 3 | impervious | vegetation |
| instYr | BIGINT | 0.00 | 26 | 1985 | 2025 |
| instYrEst | BIGINT | 0.32 | 23 | 2003 | 2025 |
| modType | VARCHAR | 0.00 | 3 | c-si | thin-film |
| mount | VARCHAR | 0.00 | 9 | dual_axis | unknown |
| nativeID | VARCHAR | 0.23 | 14,342 | 1 | York Solar |
| newBound | BIGINT | 0.00 | 2 | 0 | 1 |
| numRow | INTEGER | 0.00 | 1,177 | 0 | 90,145 |