Using Changesets in a Polyglot Monorepo

Hacker News Top 2026/04/21 06:25 Tools

monorepo versioning changesets polyglot devops automation open-source


Cached at: 2026/04/21 07:07

# Using Changesets in a Polyglot Monorepo

Source: https://luke.hsiao.dev/blog/changesets-polyglot-monorepo/

One advantage of working at a small company is that you get to enjoy tools that do not need to scale to extremes to work well. A typical example in software is the [monorepo](https://en.wikipedia.org/wiki/Monorepo). A monorepo *can* scale very well (see Google, Facebook, and others), but doing so requires specialized infrastructure and tooling. With vanilla `git` alone, [your ceiling is lower](https://wellarchitected.github.com/library/architecture/recommendations/scaling-git-repositories/repository-architecture-strategy/). Even so, plain `git` is often enough, and a monorepo brings clear advantages, such as making an atomic change that touches multiple parts of the system in a single commit, which sidesteps a whole class of compatibility problems and integration pitfalls. You can always split the monorepo later (see [`git-filter-repo`](https://github.com/newren/git-filter-repo)).

So, suppose you are on a small-to-medium team and you use a monorepo. Suppose further that this monorepo holds all of your company's code, which means it spans multiple programming languages: it is a polyglot monorepo. In that situation, what tool should you use to manage versions consistently? I think [`changesets`](https://github.com/changesets/changesets) is a solid choice, even though its core design mainly targets the JavaScript/TypeScript ecosystem.

## Background

For any versioning tool, you generally care whether it can do the following:

- Define what ends up in the changelog/release notes
- Control how each package's version is bumped
- Automate the commits that bump metadata and create tags
- Automate triggering the corresponding builds

By default, `changesets` uses per-package [semantic versioning](https://semver.org/) (each package has its own version number). Each package also maintains its own `CHANGELOG.md`.

The `changesets` team also provides a companion GitHub Action, [`changesets/action`](https://github.com/changesets/action), whose key value is that it lets you supply custom scripts for the `version` and `publish` commands. This extensibility is what allows `changesets` to support polyglot repositories.

In the `changesets` workflow, engineers commit "changeset" files to the repository. These files determine what lands in the changelog and which packages get bumped, and by how much (major, minor, or patch). For more details, see the [`changesets` documentation](https://github.com/changesets/changesets/blob/main/docs/intro-to-using-changesets.md).

## Implementing an Automated Release Process on GitHub

I am a heavy user of [`just`](https://just.systems/). I am also a big fan of [`uv` scripts](https://docs.astral.sh/uv/guides/scripts/). The examples below use both. I also assume you are in a corporate setting where the whole monorepo is private rather than open source.

### Repository Setup

The layout I recommend (at least at the time of writing) looks roughly like this:

```
.
├── .changeset
│   ├── config.json
│   └── README.md
├── contrib
│   └── utils
├── docker
│   └── Dockerfile
├── docs
│   ├── package.json
│   ├── pnpm-lock.yaml
│   ├── ...
│   └── pnpm-workspace.yaml
├── Justfile
├── package-lock.json
├── package.json
├── packages
│   ├── python-one
│   │   ├── ...
│   │   └── package.json
│   ├── rust-one
│   │   ├── ...
│   │   └── package.json
│   └── rust-two
│       ├── ...
│       └── package.json
├── pnpm-workspace.yaml
└── third-party
```

Whatever language you use, put all packages under `packages/`. I am also a fan of [docs as code](https://www.writethedocs.org/guide/docs-as-code/), so assume you also maintain a `docs/` directory whose documentation is built with a JavaScript-based framework (for example, [Starlight](https://starlight.astro.build/)). It is included mainly to set up a nuance discussed later.

### Changeset Configuration

With this layout, you can configure `changesets` through a proxy `pnpm` workspace at the repository root that brings all packages under management:

```yaml
# pnpm-workspace.yaml
packages:
  - "packages/**"
```

Then declare your `changesets` dependencies:

```json
// package.json
{
  "name": "example-monorepo",
  "private": true,
  "devDependencies": {
    "@changesets/changelog-git": "^0.2.0",
    "@changesets/cli": "^2.29.0"
  }
}
```

You should also update `.gitignore` at this point:

```
node_modules/
```

Because `changesets` is designed for JavaScript, every non-JS package needs a "proxy" `package.json`; `changesets` relies on these files to determine and apply version bumps. They can be extremely simple:

```json
// packages/python-one/package.json
{
  "name": "python-one",
  "version": "0.1.0",
  "private": true
}
```

Note the intent of this layout: we *deliberately* exclude the internal `docs/` from the root pnpm workspace, because we only want to version the packages. To do that, declare `docs/` as *its own* `pnpm` workspace; otherwise `pnpm` will try to merge the dependencies of `docs/` into the root `package-lock.json`. This is also simple:

```yaml
# docs/pnpm-workspace.yaml
packages: []
```

Next, configure `.changeset/config.json`:

```json
// .changeset/config.json
{
  "$schema": "https://unpkg.com/@changesets/[email protected]/schema.json",
  "changelog": "@changesets/changelog-git",
  "commit": false,
  "fixed": [],
  "linked": [],
  "access": "restricted",
  "baseBranch": "main",
  "updateInternalDependencies": "patch",
  "ignore": [],
  "privatePackages": {
    "version": true,
    "tag": true
  },
  "___experimentalUnsafeOptions_WILL_CHANGE_IN_PATCH": {
    "onlyUpdatePeerDependentsWhenOutOfRange": true
  }
}
```

### Automating Releases with GitHub

#### The Glue to Create Polyglot Versioning PRs

Next, let's automate the release process: automatically creating a PR that contains the changelogs, bumping package metadata, pushing Git tags, and triggering builds for those tags. We start with the GitHub workflow definition and then walk through the scripts it calls:

```yaml
name: Release

on:
  push:
    branches:
      - main

concurrency: ${{ github.workflow }}-${{ github.ref }}

permissions:
  contents: write
  pull-requests: write

jobs:
  release:
    name: Release
    runs-on: ubuntu-latest
    outputs:
      published: ${{ steps.changesets.outputs.published }}
    steps:
      - uses: actions/checkout@v6
      - uses: actions/setup-node@v4
        with:
          cache: npm
      - uses: astral-sh/setup-uv@v7
      - uses: taiki-e/install-action@just
      - run: npm install
      - name: Create Release Pull Request or Tag
        id: changesets
        uses: changesets/action@v1
        with:
          version: just version
          publish: npx @changesets/cli publish
          # I like commit messages that follow conventional commits
          commit: "chore(release): version packages"
          title: "chore(release): version packages"
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

  docker:
    needs: [release]
    if: needs.release.outputs.published == 'true'
    uses: ./.github/workflows/docker.yml
    secrets: inherit
```

You may wonder why we call the downstream workflow explicitly instead of using `on.push.tags` as the trigger, which seems like the intuitive choice. It turns out that GitHub has two serious flaws with that approach (at the time of writing). First, if you push more than 3 tags at once, the workflow will [*not* trigger](https://github.com/changesets/changesets/issues/1545). Unfortunately, that is a very common situation in a monorepo. Second, triggering on `on.push.tags` is [very unreliable](https://github.com/orgs/community/discussions/27028), even when you use a Personal Access Token (PAT) [as the official guidance describes](https://docs.github.com/en/actions/how-tos/write-workflows/choose-when-workflows-run/trigger-a-workflow#triggering-a-workflow-from-a-workflow). The more robust approach is to use `workflow_call` explicitly, as shown above.

Setting `version: just version` is what makes polyglot support work:

```makefile
# Version packages based on changesets
[doc('Consume changesets: bump versions, update changelogs, sync native version files.')]
[group('release')]
version:
    npx @changesets/cli version
    uv run --script contrib/utils/sync-versions.py
```

The core glue for polyglot support depends entirely on how you write `sync-versions.py`. The key point: when we call `npx @changesets/cli version`, we rely on `changesets` to bump the versions in each `package.json`; afterwards, our own logic must sync those new versions into each language's native metadata files. Below is a Python example that uses fairly naive parsing. You can write something similar (or more robust!) for your own stack:

```python
#!/usr/bin/env -S uv run --script
#
# /// script
# requires-python = ">=3.12"
# dependencies = []
# ///
#
# Sync versions from package.json files (updated by changesets) to native
# package manifests (Cargo.toml, pyproject.toml, etc.).

import json
import re
import subprocess
from enum import Enum, auto
from pathlib import Path

PACKAGES_DIR = Path(__file__).resolve().parent.parent.parent / "packages"


class SyncResult(Enum):
    NOT_FOUND = auto()
    UP_TO_DATE = auto()
    UPDATED = auto()


def read_package_json(pkg_dir: Path) -> dict | None:
    """Read and parse a package.json file."""
    pkg_json = pkg_dir / "package.json"
    if not pkg_json.exists():
        return None
    return json.loads(pkg_json.read_text())


def update_cargo_toml(pkg_dir: Path, version: str) -> SyncResult:
    """Update version in [package] section of Cargo.toml."""
    cargo_toml = pkg_dir / "Cargo.toml"
    if not cargo_toml.exists():
        return SyncResult.NOT_FOUND

    lines = cargo_toml.read_text().splitlines(keepends=True)
    in_package_section = False

    for i, line in enumerate(lines):
        stripped = line.strip()
        # Track which TOML section we're in
        if stripped.startswith("["):
            in_package_section = stripped == "[package]"
            continue
        if in_package_section and stripped.startswith("version"):
            new_line = re.sub(
                r'^(\s*version\s*=\s*")([^"]+)(")',
                rf"\g<1>{version}\3",
                line,
            )
            if new_line != line:
                lines[i] = new_line
                cargo_toml.write_text("".join(lines))
                rel = cargo_toml.relative_to(PACKAGES_DIR.parent)
                print(f" Updated {rel}")
                return SyncResult.UPDATED
            return SyncResult.UP_TO_DATE

    return SyncResult.UP_TO_DATE


def update_pyproject_toml(pkg_dir: Path, version: str) -> SyncResult:
    """Update version in [project] section of pyproject.toml."""
    pyproject = pkg_dir / "pyproject.toml"
    if not pyproject.exists():
        return SyncResult.NOT_FOUND

    lines = pyproject.read_text().splitlines(keepends=True)
    in_project_section = False

    for i, line in enumerate(lines):
        stripped = line.strip()
        # Track which TOML section we're in
        if stripped.startswith("["):
            in_project_section = stripped == "[project]"
            continue
        if in_project_section and stripped.startswith("version"):
            new_line = re.sub(
                r'^(\s*version\s*=\s*")([^"]+)(")',
                rf"\g<1>{version}\3",
                line,
            )
            if new_line != line:
                lines[i] = new_line
                pyproject.write_text("".join(lines))
                rel = pyproject.relative_to(PACKAGES_DIR.parent)
                print(f" Updated {rel}")
                return SyncResult.UPDATED
            return SyncResult.UP_TO_DATE

    return SyncResult.UP_TO_DATE


def refresh_lockfiles() -> None:
    """Refresh all lockfiles under the repo to match updated versions."""
    repo_root = PACKAGES_DIR.parent
    print("Refreshing lockfiles...")

    # Cargo.lock: root workspace + any standalone crate lockfiles
    cargo_locks = sorted(
        set(repo_root.glob("Cargo.lock")) | set(PACKAGES_DIR.rglob("Cargo.lock"))
    )
    for cargo_lock in cargo_locks:
        lock_dir = cargo_lock.parent
        rel = lock_dir.relative_to(repo_root) or Path(".")
        print(f" cargo update --workspace in {rel}")
        subprocess.run(["cargo", "update", "--workspace"], cwd=lock_dir, check=True)

    # uv.lock: Python packages
    for uv_lock in sorted(PACKAGES_DIR.rglob("uv.lock")):
        lock_dir = uv_lock.parent
        print(f" uv lock in {lock_dir.relative_to(repo_root)}")
        subprocess.run(["uv", "lock"], cwd=lock_dir, check=True)


def main() -> None:
    print("Syncing versions from package.json to native manifests...")
    print()

    updated = 0
    for pkg_json in sorted(PACKAGES_DIR.rglob("package.json")):
        pkg_dir = pkg_json.parent
        pkg_data = read_package_json(pkg_dir)
        if pkg_data is None:
            continue
        version = pkg_data.get("version")
        if version is None:
            continue
        name = pkg_data.get("name", pkg_dir.name)
        print(f"{name} @ {version}")

        results = [
            update_cargo_toml(pkg_dir, version),
            update_pyproject_toml(pkg_dir, version),
        ]
        if any(r == SyncResult.UPDATED for r in results):
            updated += 1
        elif all(r == SyncResult.NOT_FOUND for r in results):
            print(" (no native manifest found)")
        else:
            print(" (already up to date)")

    print()
    print(f"Synced {updated} package(s).")
    print()
    refresh_lockfiles()
    print()
    print("Done.")


if __name__ == "__main__":
    main()
```

#### Reacting to Package Tags

With the standard `changesets` flow, you will now see a new pull request on GitHub that contains the updated `CHANGELOG.md` files along with the metadata changes for every affected package. Once that PR is merged, the same workflow runs again. It detects that all `.changeset` files have been consumed and pushes the tags. With our example configuration, `changesets` only pushes tags and does *not* publish packages, because `.changeset/config.json` contains:

```json
"privatePackages": {
  "version": true,
  "tag": true
}
```

and every package declares `"private": true`.

Usually, you then want to react to those pushed tags, for example by building a new Docker image. For the reasons given earlier in this post, keep using `workflow_call` rather than listening on `on.push.tags`:

```yaml
on:
  workflow_call: {}
  workflow_dispatch:
    inputs:
      dry_run:
        description: 'Build images without pushing to GHCR'
        required: false
        type: boolean
        default: false
      no_cache:
        description: 'Force a build without using the cache'
        required: false
        type: boolean
        default: false
```

## Summary

Even without native polyglot support, `changesets` can manage per-package semantic versioning and changelogs in a polyglot monorepo. The core idea is to treat the JavaScript package manifests as the single source of truth for version bumps, then use a custom script to sync those versions into each language's native manifest. There are a few pitfalls along the way (for example, explicitly creating a separate `pnpm-workspace.yaml` for subdirectories you want to keep independent, or using a separate personal access token for pushing tags), but none of them are blockers, and you still get the smooth `changesets` workflow. In the past, I recommended handling monorepo versioning with ...
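
The two `update_*` functions in `sync-versions.py` differ only in the manifest file name and the TOML section header, so the section-aware rewrite can be factored into one pure function that is easy to test without touching the filesystem. The following is a minimal sketch of that idea, not the author's implementation: `set_section_version` is a hypothetical name, and it keeps the same naive assumptions as the script (a double-quoted `version = "..."` line directly inside the named section).

```python
import re

# Hypothetical helper (not from the original post): rewrite the first
# `version = "..."` line inside one TOML section and leave all others alone.
_VERSION_RE = re.compile(r'^(\s*version\s*=\s*")([^"]+)(")')


def set_section_version(text: str, section: str, version: str) -> tuple[str, bool]:
    """Return (new_text, changed) after setting `version` inside `[section]`."""
    lines = text.splitlines(keepends=True)
    in_section = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("["):
            # A table header: only an exact match on the target section counts,
            # so `[dependencies.serde]` never matches `[package]`.
            in_section = stripped == f"[{section}]"
            continue
        if in_section and _VERSION_RE.match(line):
            # A callable replacement avoids any backreference ambiguity
            # when the new version string starts with a digit.
            new_line = _VERSION_RE.sub(
                lambda m: f"{m.group(1)}{version}{m.group(3)}", line, count=1
            )
            lines[i] = new_line
            return "".join(lines), new_line != line
    # Section or version line not found: return the input unchanged.
    return text, False


cargo = (
    '[package]\n'
    'name = "rust-one"\n'
    'version = "0.1.0"\n'
    '\n'
    '[dependencies.serde]\n'
    'version = "1.0.0"\n'
)
updated, changed = set_section_version(cargo, "package", "0.2.0")
print(updated)
```

With a helper like this, a Cargo manifest would be handled with `section="package"` and a `pyproject.toml` with `section="project"`; reading and writing the file stays in the caller. For anything beyond simple manifests (single-quoted strings, `version.workspace = true`, dynamic versions), a real TOML parser that preserves formatting would be a safer choice than line matching.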
