调试挂起的Go程序的技巧

Michael Stapelberg 工具

摘要

一份实用指南,涵盖了调试挂起的Go程序的三种方法:使用SIGQUIT打印堆栈跟踪、附加delve调试器以及保存核心转储供后续分析。

<p>我正帮助某人设置我的 <a href="https://github.com/gokrazy/rsync">gokrazy/rsync</a> 实现,以同步 <a href="https://en.wikipedia.org/wiki/Resource_Public_Key_Infrastructure">RPKI 数据</a>(用于保护 BGP 路由基础设施),这时我们发现,在正确的调用下,我的 rsync 接收器会无限期挂起。</p> <p>这个问题解决起来很快,但在这个过程中,我意识到我应该写下一些多年来我逐渐欣赏的 Go 调试技巧!</p> <h2 id="scenario-hanging-go-program">场景:挂起的 Go 程序</h2> <p>如果你想跟着操作,可以通过构建 bug 修复提交之前的旧版本 gokrazy/rsync 来重现问题(你需要 <a href="https://go.dev/dl/">Go 1.22 或更新版本</a>):</p> <pre tabindex="0"><code>git clone https://github.com/gokrazy/rsync cd rsync git reset --hard 6c89d4dda3be055f19684c0ed56d623da458194e^ go install ./cmd/... </code></pre><p>现在我们可以尝试同步仓库:</p> <pre tabindex="0"><code>% gokr-rsync \ -rtO \ --delete \ rsync://rsync.paas.rpki.ripe.net/repository/ \ /tmp/rpki-repo […] 2025/02/08 09:35:10 Opening TCP connection to rsync.paas.rpki.ripe.net:873 2025/02/08 09:35:10 rsync module &#34;repo&#34;, path &#34;repo/&#34; 2025/02/08 09:35:10 (Client) Protocol versions: remote=31, negotiated=27 2025/02/08 09:35:10 Client checksum: md4 2025/02/08 09:35:10 sending daemon args: [--server --sender -tr . repo/] 2025/02/08 09:35:10 exclusion list sent 2025/02/08 09:35:10 receiving file list 2025/02/08 09:35:11 [Receiver] i=0 ? . mode=40755 len=4096 uid=0 gid=0 flags=? […] 2025/02/08 09:35:11 [Receiver] i=89 ? clonoth/1/3139332e33322e3130302e302f32342d3234203d3e203537313936.roa mode=100644 len=1747 uid=0 gid=0 flags=? </code></pre><p>……然后程序就停在那里了。</p> <h2 id="sigquit-stack-trace">技巧 1:按 Ctrl+\(SIGQUIT)打印堆栈跟踪</h2> <p>查看 Go 程序挂起位置的最简单方法是按 <code>Ctrl+\</code>(反斜杠)让终端向其发送 <a href="https://en.wikipedia.org/wiki/Signal_(IPC)#SIGQUIT">SIGQUIT 信号</a>。当 Go 运行时收到 <code>SIGQUIT</code> 时,它会在退出进程之前向终端打印堆栈跟踪。此行为默认启用,可通过 <code>GOTRACEBACK</code> 环境变量自定义,请参阅 <a href="https://pkg.go.dev/runtime"><code>runtime</code> 包文档</a>。</p> <p>以下是我们案例中的输出样子。我把字体调小了,这样你可以识别输出的形状(细节不重要,继续阅读下文):</p> <div style="font-size: 60%"> <pre tabindex="0"><code>^\SIGQUIT: quit PC=0x47664e m=0 sigcode=128 goroutine 0 gp=0x6e6020 m=0 mp=0x6e6ec0 [idle]: internal/runtime/syscall.Syscall6() /home/michael/sdk/go1.23.0/src/internal/runtime/syscall/asm_linux_amd64.s:36 +0xe fp=0x7ffc58665090 sp=0x7ffc58665088 pc=0x47664e internal/runtime/syscall.EpollWait(0x586651e0?, {0x7ffc5866511c?, 0x3000000018?, 0x7ffc586651f0?}, 0x58665110?, 0x7ffc?) /home/michael/sdk/go1.23.0/src/internal/runtime/syscall/syscall_linux.go:32 +0x45 fp=0x7ffc586650e0 sp=0x7ffc58665090 pc=0x4765e5 runtime.netpoll(0xc0000000c0?) /home/michael/sdk/go1.23.0/src/runtime/netpoll_epoll.go:116 +0xd2 fp=0x7ffc58665768 sp=0x7ffc586650e0 pc=0x432332 runtime.findRunnable() /home/michael/sdk/go1.23.0/src/runtime/proc.go:3580 +0x8c5 fp=0x7ffc586658e0 sp=0x7ffc58665768 pc=0x43f045 runtime.schedule() /home/michael/sdk/go1.23.0/src/runtime/proc.go:3995 +0xb1 fp=0x7ffc58665918 sp=0x7ffc586658e0 pc=0x4405b1 runtime.park_m(0xc0000061c0) /home/michael/sdk/go1.23.0/src/runtime/proc.go:4102 +0x1eb fp=0x7ffc58665970 sp=0x7ffc58665918 pc=0x4409cb runtime.mcall() /home/michael/sdk/go1.23.0/src/runtime/asm_amd64.s:459 +0x4e fp=0x7ffc58665988 sp=0x7ffc58665970 pc=0x470e2e goroutine 1 gp=0xc0000061c0 m=nil [IO wait]: runtime.gopark(0x452658?, 0x0?, 0x98?, 0xb3?, 0xb?) /home/michael/sdk/go1.23.0/src/runtime/proc.go:424 +0xce fp=0xc0000eb358 sp=0xc0000eb338 pc=0x46bc0e runtime.netpollblock(0x4a01b8?, 0x4058e6?, 0x0?) /home/michael/sdk/go1.23.0/src/runtime/netpoll.go:575 +0xf7 fp=0xc0000eb390 sp=0xc0000eb358 pc=0x4318f7 internal/poll.runtime_pollWait(0x7ef586628808, 0x72) /home/michael/sdk/go1.23.0/src/runtime/netpoll.go:351 +0x85 fp=0xc0000eb3b0 sp=0xc0000eb390 pc=0x46af05 internal/poll.(*pollDesc).wait(0xc0000ce180?, 0xc00020e99c?, 0x0) /home/michael/sdk/go1.23.0/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc0000eb3d8 sp=0xc0000eb3b0 pc=0x4b0ce7 internal/poll.(*pollDesc).waitRead(...) /home/michael/sdk/go1.23.0/src/internal/poll/fd_poll_runtime.go:89 internal/poll.(*FD).Read(0xc0000ce180, {0xc00020e99c, 0x4, 0x4}) /home/michael/sdk/go1.23.0/src/internal/poll/fd_unix.go:165 +0x27a fp=0xc0000eb470 sp=0xc0000eb3d8 pc=0x4b17da net.(*netFD).Read(0xc0000ce180, {0xc00020e99c?, 0x6eeea0?, 0x1?}) /home/michael/sdk/go1.23.0/src/net/fd_posix.go:55 +0x25 fp=0xc0000eb4b8 sp=0xc0000eb470 pc=0x4f7e85 net.(*conn).Read(0xc000206000, {0xc00020e99c?, 0xc000212000?, 0x6e6ec0?}) /home/michael/sdk/go1.23.0/src/net/net.go:189 +0x45 fp=0xc0000eb500 sp=0xc0000eb4b8 pc=0x5001a5 net.(*TCPConn).Read(0x0?, {0xc00020e99c?, 0xc0000eb568?, 0x46d449?}) &lt;autogenerated&gt;:1 +0x25 fp=0xc0000eb530 sp=0xc0000eb500 pc=0x50bb25 io.ReadAtLeast({0x5d9640, 0xc000206000}, {0xc00020e99c, 0x4, 0x4}, 0x4) /home/michael/sdk/go1.23.0/src/io/io.go:335 +0x90 fp=0xc0000eb578 sp=0xc0000eb530 pc=0x4957d0 io.ReadFull(...) /home/michael/sdk/go1.23.0/src/io/io.go:354 encoding/binary.Read({0x5d9640, 0xc000206000}, {0x5da8b0, 0x7059a0}, {0x55e7c0, 0xc0000eb6a0}) /home/michael/sdk/go1.23.0/src/encoding/binary/binary.go:244 +0xa5 fp=0xc0000eb670 sp=0xc0000eb578 pc=0x5102a5 github.com/gokrazy/rsync/internal/rsyncwire.(*MultiplexReader).ReadMsg(0xc00020a100) /home/michael/kr/rsync/internal/rsyncwire/wire.go:50 +0x48 fp=0xc0000eb6e8 sp=0xc0000eb670 pc=0x514428 github.com/gokrazy/rsync/internal/rsyncwire.(*MultiplexReader).Read(0x7ef5869b9a68?, {0xc000280000, 0x40000, 0x4dd4fb?}) /home/michael/kr/rsync/internal/rsyncwire/wire.go:72 +0x2f fp=0xc0000eb788 sp=0xc0000eb6e8 pc=0x5145af bufio.(*Reader).Read(0xc0002020c0, {0xc00020e998, 0x4, 0x40ece5?}) /home/michael/sdk/go1.23.0/src/bufio/bufio.go:241 +0x197 fp=0xc0000eb7c0 sp=0xc0000eb788 pc=0x4d5a57 io.ReadAtLeast({0x5d93e0, 0xc0002020c0}, {0xc00020e998, 0x4, 0x4}, 0x4) /home/michael/sdk/go1.23.0/src/io/io.go:335 +0x90 fp=0xc0000eb808 sp=0xc0000eb7c0 pc=0x4957d0 io.ReadFull(...) /home/michael/sdk/go1.23.0/src/io/io.go:354 github.com/gokrazy/rsync/internal/rsyncwire.(*Conn).ReadInt32(0xc000208060) /home/michael/kr/rsync/internal/rsyncwire/wire.go:163 +0x4a fp=0xc0000eb850 sp=0xc0000eb808 pc=0x51490a github.com/gokrazy/rsync/internal/receiver.(*Transfer).recvIdMapping1(0xc000202120, 0x5a9b58) /home/michael/kr/rsync/internal/receiver/uidlist.go:16 +0x3d fp=0xc0000eb8c0 sp=0xc0000eb850 pc=0x51fc7d github.com/gokrazy/rsync/internal/receiver.(*Transfer).RecvIdList(0xc000202120) /home/michael/kr/rsync/internal/receiver/uidlist.go:52 +0x1dd fp=0xc0000eba08 sp=0xc0000eb8c0 pc=0x51ffbd github.com/gokrazy/rsync/internal/receiver.(*Transfer).ReceiveFileList(0xc000202120) /home/michael/kr/rsync/internal/receiver/flist.go:229 +0x378 fp=0xc0000ebb10 sp=0xc0000eba08 pc=0x51c5b8 github.com/gokrazy/rsync/internal/receivermaincmd.clientRun({{0x5d9280, 0xc000078058}, {0x5d92a0, 0xc000078060}, {0x5d92a0, 0xc000078068}}, 0xc0000d0d90, {0x7ef53d47efc8, 0xc000206000}, {0x7ffc5866600e, ...}, ...) /home/michael/kr/rsync/internal/receivermaincmd/receivermaincmd.go:341 +0x5cd fp=0xc0000ebc10 sp=0xc0000ebb10 pc=0x550c2d github.com/gokrazy/rsync/internal/receivermaincmd.socketClient({{0x5d9280, 0xc000078058}, {0x5d92a0, 0xc000078060}, {0x5d92a0, 0xc000078068}}, 0xc0000d0d90, {0x7ffc58665ff4?, 0x1?}, {0x7ffc5866600e, ...}) /home/michael/kr/rsync/internal/receivermaincmd/clientserver.g
查看原文
查看缓存全文

缓存时间: 2026/05/16 03:34

# 调试挂起的Go程序的技巧 来源:https://michael.stapelberg.ch/posts/2025-02-27-debug-hanging-go-programs/ 目录 - [场景:挂起的Go程序](#场景挂起的go程序) - [技巧1:按Ctrl+\(SIGQUIT)打印堆栈跟踪](#技巧1按ctrl-sigquit打印堆栈跟踪) - [技巧2:将delve调试器附加到进程](#技巧2将delve调试器附加到进程) - [技巧3:保存核心转储以供以后分析](#技巧3保存核心转储以供以后分析) - [结论](#结论) 我之前在帮别人配置我的[gokrazy/rsync](https://github.com/gokrazy/rsync)实现,用于同步[RPKI数据](https://en.wikipedia.org/wiki/Resource_Public_Key_Infrastructure)(用于保护BGP路由基础设施),这时我们发现,使用正确的调用方式时,我的rsync接收端会无限期挂起。这个问题解决得很快,但在此过程中,我意识到应该记录下一些多年来积累的Go调试技巧! ## 场景:挂起的Go程序 如果你想跟着操作,可以构建一个旧版本的gokrazy/rsync,刚好在bug修复提交之前(你需要[Go 1.22或更新版本](https://go.dev/dl/)): ``` git clone https://github.com/gokrazy/rsync cd rsync git reset --hard 6c89d4dda3be055f19684c0ed56d623da458194e^ go install ./cmd/... ``` 现在我们可以尝试同步该仓库: ``` % gokr-rsync \ -rtO \ --delete \ rsync://rsync.paas.rpki.ripe.net/repository/ \ /tmp/rpki-repo [...] 2025/02/08 09:35:10 Opening TCP connection to rsync.paas.rpki.ripe.net:873 2025/02/08 09:35:10 rsync module "repo", path "repo/" 2025/02/08 09:35:10 (Client) Protocol versions: remote=31, negotiated=27 2025/02/08 09:35:10 Client checksum: md4 2025/02/08 09:35:10 sending daemon args: [--server --sender -tr . repo/] 2025/02/08 09:35:10 exclusion list sent 2025/02/08 09:35:10 receiving file list 2025/02/08 09:35:11 [Receiver] i=0 ? . mode=40755 len=4096 uid=0 gid=0 flags=? [...] 2025/02/08 09:35:11 [Receiver] i=89 ? clonoth/1/3139332e33322e3130302e302f32342d3234203d3e203537313936.roa mode=100644 len=1747 uid=0 gid=0 flags=? ``` ……然后程序就停在那里了。 ## 技巧1:按Ctrl+\(SIGQUIT)打印堆栈跟踪 查看Go程序挂起位置最简单的方法是按下`Ctrl+\`(反斜杠),让终端向其发送`SIGQUIT`信号。当Go运行时收到`SIGQUIT`后,会在退出进程前向终端打印堆栈跟踪。该行为默认启用,可通过`GOTRACEBACK`环境变量定制,详见[`runtime`包文档](https://pkg.go.dev/runtime)。 以下是本例的输出(我将字体调小以便你识别输出的形状,细节不重要,继续往下看): ``` ^\SIGQUIT: quit PC=0x47664e m=0 sigcode=128 goroutine 0 gp=0x6e6020 m=0 mp=0x6e6ec0 [idle]: internal/runtime/syscall.Syscall6() /home/michael/sdk/go1.23.0/src/internal/runtime/syscall/asm_linux_amd64.s:36 +0xe fp=0x7ffc58665090 sp=0x7ffc58665088 pc=0x47664e internal/runtime/syscall.EpollWait(0x586651e0?, {0x7ffc5866511c?, 0x3000000018?, 0x7ffc586651f0?}, 0x58665110?, 0x7ffc?) /home/michael/sdk/go1.23.0/src/internal/runtime/syscall/syscall_linux.go:32 +0x45 fp=0x7ffc586650e0 sp=0x7ffc58665090 pc=0x4765e5 runtime.netpoll(0xc0000000c0?) /home/michael/sdk/go1.23.0/src/runtime/netpoll_epoll.go:116 +0xd2 fp=0x7ffc58665768 sp=0x7ffc586650e0 pc=0x432332 runtime.findRunnable() /home/michael/sdk/go1.23.0/src/runtime/proc.go:3580 +0x8c5 fp=0x7ffc586658e0 sp=0x7ffc58665768 pc=0x43f045 runtime.schedule() /home/michael/sdk/go1.23.0/src/runtime/proc.go:3995 +0xb1 fp=0x7ffc58665918 sp=0x7ffc586658e0 pc=0x4405b1 runtime.park_m(0xc0000061c0) /home/michael/sdk/go1.23.0/src/runtime/proc.go:4102 +0x1eb fp=0x7ffc58665970 sp=0x7ffc58665918 pc=0x4409cb runtime.mcall() /home/michael/sdk/go1.23.0/src/runtime/asm_amd64.s:459 +0x4e fp=0x7ffc58665988 sp=0x7ffc58665970 pc=0x470e2e goroutine 1 gp=0xc0000061c0 m=nil [IO wait]: runtime.gopark(0x452658?, 0x0?, 0x98?, 0xb3?, 0xb?) /home/michael/sdk/go1.23.0/src/runtime/proc.go:424 +0xce fp=0xc0000eb358 sp=0xc0000eb338 pc=0x46bc0e runtime.netpollblock(0x4a01b8?, 0x4058e6?, 0x0?) /home/michael/sdk/go1.23.0/src/runtime/netpoll.go:575 +0xf7 fp=0xc0000eb390 sp=0xc0000eb358 pc=0x4318f7 internal/poll.runtime_pollWait(0x7ef586628808, 0x72) /home/michael/sdk/go1.23.0/src/runtime/netpoll.go:351 +0x85 fp=0xc0000eb3b0 sp=0xc0000eb390 pc=0x46af05 internal/poll.(*pollDesc).wait(0xc0000ce180?, 0xc00020e99c?, 0x0) /home/michael/sdk/go1.23.0/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc0000eb3d8 sp=0xc0000eb3b0 pc=0x4b0ce7 internal/poll.(*pollDesc).waitRead(...) /home/michael/sdk/go1.23.0/src/internal/poll/fd_poll_runtime.go:89 internal/poll.(*FD).Read(0xc0000ce180, {0xc00020e99c, 0x4, 0x4}) /home/michael/sdk/go1.23.0/src/internal/poll/fd_unix.go:165 +0x27a fp=0xc0000eb470 sp=0xc0000eb3d8 pc=0x4b17da net.(*netFD).Read(0xc0000ce180, {0xc00020e99c?, 0x6eeea0?, 0x1?}) /home/michael/sdk/go1.23.0/src/net/fd_posix.go:55 +0x25 fp=0xc0000eb4b8 sp=0xc0000eb470 pc=0x4f7e85 net.(*conn).Read(0xc000206000, {0xc00020e99c?, 0xc000212000?, 0x6e6ec0?}) /home/michael/sdk/go1.23.0/src/net/net.go:189 +0x45 fp=0xc0000eb500 sp=0xc0000eb4b8 pc=0x5001a5 net.(*TCPConn).Read(0x0?, {0xc00020e99c?, 0xc0000eb568?, 0x46d449?}) <autogenerated>:1 +0x25 fp=0xc0000eb530 sp=0xc0000eb500 pc=0x50bb25 io.ReadAtLeast({0x5d9640, 0xc000206000}, {0xc00020e99c, 0x4, 0x4}, 0x4) /home/michael/sdk/go1.23.0/src/io/io.go:335 +0x90 fp=0xc0000eb578 sp=0xc0000eb530 pc=0x4957d0 io.ReadFull(...) /home/michael/sdk/go1.23.0/src/io/io.go:354 encoding/binary.Read({0x5d9640, 0xc000206000}, {0x5da8b0, 0x7059a0}, {0x55e7c0, 0xc0000eb6a0}) /home/michael/sdk/go1.23.0/src/encoding/binary/binary.go:244 +0xa5 fp=0xc0000eb670 sp=0xc0000eb578 pc=0x5102a5 github.com/gokrazy/rsync/internal/rsyncwire.(*MultiplexReader).ReadMsg(0xc00020a100) /home/michael/kr/rsync/internal/rsyncwire/wire.go:50 +0x48 fp=0xc0000eb6e8 sp=0xc0000eb670 pc=0x514428 github.com/gokrazy/rsync/internal/rsyncwire.(*MultiplexReader).Read(0x7ef5869b9a68?, {0xc000280000, 0x40000, 0x4dd4fb?}) /home/michael/kr/rsync/internal/rsyncwire/wire.go:72 +0x2f fp=0xc0000eb788 sp=0xc0000eb6e8 pc=0x5145af bufio.(*Reader).Read(0xc0002020c0, {0xc00020e998, 0x4, 0x40ece5?}) /home/michael/sdk/go1.23.0/src/bufio/bufio.go:241 +0x197 fp=0xc0000eb7c0 sp=0xc0000eb788 pc=0x4d5a57 io.ReadAtLeast({0x5d93e0, 0xc0002020c0}, {0xc00020e998, 0x4, 0x4}, 0x4) /home/michael/sdk/go1.23.0/src/io/io.go:335 +0x90 fp=0xc0000eb808 sp=0xc0000eb7c0 pc=0x4957d0 io.ReadFull(...) /home/michael/sdk/go1.23.0/src/io/io.go:354 github.com/gokrazy/rsync/internal/rsyncwire.(*Conn).ReadInt32(0xc000208060) /home/michael/kr/rsync/internal/rsyncwire/wire.go:163 +0x4a fp=0xc0000eb850 sp=0xc0000eb808 pc=0x51490a github.com/gokrazy/rsync/internal/receiver.(*Transfer).recvIdMapping1(0xc000202120, 0x5a9b58) /home/michael/kr/rsync/internal/receiver/uidlist.go:16 +0x3d fp=0xc0000eb8c0 sp=0xc0000eb850 pc=0x51fc7d github.com/gokrazy/rsync/internal/receiver.(*Transfer).RecvIdList(0xc000202120) /home/michael/kr/rsync/internal/receiver/uidlist.go:52 +0x1dd fp=0xc0000eba08 sp=0xc0000eb8c0 pc=0x51ffbd github.com/gokrazy/rsync/internal/receiver.(*Transfer).ReceiveFileList(0xc000202120) /home/michael/kr/rsync/internal/receiver/flist.go:229 +0x378 fp=0xc0000ebb10 sp=0xc0000eba08 pc=0x51c5b8 github.com/gokrazy/rsync/internal/receivermaincmd.clientRun({{0x5d9280, 0xc000078058}, {0x5d92a0, 0xc000078060}, {0x5d92a0, 0xc000078068}}, 0xc0000d0d90, {0x7ef53d47efc8, 0xc000206000}, {0x7ffc5866600e, ...}, ...) /home/michael/kr/rsync/internal/receivermaincmd/receivermaincmd.go:341 +0x5cd fp=0xc0000ebc10 sp=0xc0000ebb10 pc=0x550c2d github.com/gokrazy/rsync/internal/receivermaincmd.socketClient({{0x5d9280, 0xc000078058}, {0x5d92a0, 0xc000078060}, {0x5d92a0, 0xc000078068}}, 0xc0000d0d90, {0x7ffc58665ff4?, 0x1?}, {0x7ffc5866600e, ...}) /home/michael/kr/rsync/internal/receivermaincmd/clientserver.go:44 +0x425 fp=0xc0000ebcd0 sp=0xc0000ebc10 pc=0x54c205 github.com/gokrazy/rsync/internal/receivermaincmd.rsyncMain({{0x5d9280, 0xc000078058}, {0x5d92a0, 0xc000078060}, {0x5d92a0, 0xc000078068}}, 0xc0000d0d90, {0xc00007e440, 0x1, 0x2}, ...) /home/michael/kr/rsync/internal/receivermaincmd/receivermaincmd.go:160 +0x5d7 fp=0xc0000ebdf0 sp=0xc0000ebcd0 pc=0x54f697 github.com/gokrazy/rsync/internal/receivermaincmd.Main({0xc0000160a0, 0x5, 0x5}, {0x5d9280?, 0xc000078058?}, {0x5d92a0?, 0xc000078060?}, {0x5d92a0?, 0xc000078068?}) /home/michael/kr/rsync/internal/receivermaincmd/receivermaincmd.go:394 +0x272 fp=0xc0000ebee8 sp=0xc0000ebdf0 pc=0x5510d2 main.main() /home/michael/kr/rsync/cmd/gokr-rsync/rsync.go:12 +0x4e fp=0xc0000ebf50 sp=0xc0000ebee8 pc=0x5515ae runtime.main() /home/michael/sdk/go1.23.0/src/runtime/proc.go:272 +0x28b fp=0xc0000ebfe0 sp=0xc0000ebf50 pc=0x438d4b runtime.goexit({}) /home/michael/sdk/go1.23.0/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000ebfe8 sp=0xc0000ebfe0 pc=0x472e61 goroutine 2 gp=0xc000006c40 m=nil [force gc (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) /home/michael/sdk/go1.23.0/src/runtime/proc.go:424 +0xce fp=0xc000074fa8 sp=0xc000074f88 pc=0x46bc0e runtime.goparkunlock(...) /home/michael/sdk/go1.23.0/src/runtime/proc.go:430 runtime.forcegchelper() /home/michael/sdk/go1.23.0/src/runtime/proc.go:337 +0xb3 fp=0xc000074fe0 sp=0xc000074fa8 pc=0x439093 runtime.goexit({}) /home/michael/sdk/go1.23.0/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000074fe8 sp=0xc000074fe0 pc=0x472e61 created by runtime.init.7 in goroutine 1 /home/michael/sdk/go1.23.0/src/runtime/proc.go:325 +0x1a ``` 呼!这个输出相当密集。我们可以使用 [https://github.com/maruel/panicparse](https://github.com/maruel/panicparse) 程序来以更丰富多彩且更简短的方式呈现这个堆栈跟踪: [](https://michael.stapelberg.ch/posts/2025-02-27-debug-hanging-go-programs/2025-02-08-panicparse.jpg) 红色高亮显示的函数正是问题所在:我的rsync接收端实现错误地期望服务器发送uid/gid列表,尽管PreserveUid和PreserveGid选项并未启用。 提交 [`6c89d4d`](https://github.com/gokrazy/rsync/commit/6c89d4dda3be055f19684c0ed56d623da458194e) 修复了这个问题。 ## 技巧2:将delve调试器附加到进程 如果仅仅打印堆栈跟踪不足以诊断问题,你可以进一步使用交互式调试器。最著名的Linux调试器大概是GDB,但在处理Go程序时,我推荐使用 [delve 调试器](https://github.com/go-delve/delve),因为它通常工作得更好。 如果还没安装,请先安装 delve: ``` % go install github.com/go-delve/delve/cmd/dlv@latest ``` 本文使用 delve v1.24.0。你可以用调试器运行新的子进程(使用 `dlv exec`),无需特殊权限;但在 Linux 中,为了安全原因,默认禁止附加到现有进程([参见文档](https://www.kernel.org/doc/Documentation/security/Yama.txt))。我们可以允许这个特性(记得之后关闭!)通过: ``` % sudo sysctl -w kernel.yama.ptrace_scope=0 kernel.yama.ptrace_scope = 0 ``` 然后我们可以直接用 `dlv attach` 附加到挂起的 `gokr-rsync` 进程: ``` % dlv attach $(pidof gokr-rsync) Type 'help' for list of commands. (dlv) ``` 很好。但如果只是打印堆栈跟踪,我们只能看到 `runtime` 包的函数: ``` (dlv) bt 0 0x000000000047bb83 in runtime.futex at /home/michael/sdk/go1.23.6/src/runtime/sys_linux_amd64.s:558 1 0x00000000004374d0 in runtime.futexsleep at /home/michael/sdk/go1.23.6/src/runtime/os_linux.go:69 2 0x000000000040d89d in runtime.notesleep at /home/michael/sdk/go1.23.6/src/runtime/lock_futex.go:170 3 0x000000000044123e in runtime.mPark at /home/michael/sdk/go1.23.6/src/runtime/proc.go:1866 4 0x000000000044290d in runtime.stopm at /home/michael/sdk/go1.23.6/src/runtime/proc.go:2886 5 0x00000000004433d0 in runtime.findRunnable at /home/michael/sdk/go1.23.6/src/runtime/proc.go:3623 6 0x0000000000444e1d in runtime.schedule at /home/michael/sdk/go1.23.6/src/runtime/proc.go:3996 7 0x00000000004451cb in runtime.park_m at /home/michael/sdk/go1.23.6/src/runtime/proc.go:4103 8 0x0000000000477eee in runtime.mcall at /home/michael/sdk/go1.23.6/src/runtime/asm_amd64.s:459 ``` 原因是当前没有 goroutine 在运行(程序在无限等待从服务器接收数据),所以我们看到的是 Go 调度器中的一个 OS 线程在等待。我们首先需要切换到感兴趣的 goroutine(`grs` 打印所有 goroutine),然后堆栈跟踪看起来就会是我们期望的样子: ``` (dlv) gr 1 Switched from 0 to 1 (thread 414327) (dlv) bt 0 0x0000000000474ebc in runtime.gopark at /home/michael/sdk/go1.23.6/src/runtime/proc.go:425 1 0x000000000043819e in runtime.netpollblock at /home/michael/sdk/go1.23.6/src/runtime/netpoll.go:575 2 0x000000000047435c in internal/poll.runtime_pollWait at /home/michael/sdk/go1.23.6/src/runtime/netpoll.go:351 3 0x00000000004ed15a in internal/poll.(*pollDesc).wait at /home/michael/sdk/go1.23.6/src/internal/poll/fd_poll_runtime.go:84 4 0x00000000004ed1f1 in internal/poll.(*pollDesc).waitRead at /home/michael/sdk/go1.23.6/src/internal/poll/fd_poll_runtime.go:89 5 0x00000000004ee351 in internal/poll.(*FD).Read at /home/michael/sdk/go1.23.6/src/internal/poll/fd_unix.go:165 6 0x0000000000569bb3 in net.(*netFD).Read at /home/michael/sdk/go1.23.6/src/net/fd_posix.go:55 7 0x000000000057a025 in net.(*conn).Read at /home/michael/sdk/go1.23.6/src/net/net.go:189 8 0x000000000058fcc5 in net.(*TCPConn).Read at <autogenerated>:1 9 0x00000000004b72e8 in io.ReadAtLeast at /home/michael/sdk/go1.23.6/src/io/io.go:335 10 0x00000000004b74d3 in io.ReadFull at /home/michael/sdk/go1.23.6/src/io/io.go:354 11 0x0000000000598d5f in encoding/binary.Read at /home/michael/sdk/go1.23.6/src/encoding/binary/binary.go:244 12 0x00000000005a0b7a in github.com/gokrazy/rsync/internal/rsyncwire.(*MultiplexReader).ReadMsg at /home/michael/kr/rsync/internal/rsyncwire/wire.go:50 13 0x00000000005a0f17 in github.com/gokrazy/rsync/internal/rsyncwire.(*MultiplexReader).Read at /home/michael/kr/rsync/internal/rsyncwire/wire.go:72 14 0x0000000000528de8 in bufio.(*Reader).Read at /home/michael/sdk/go1.23.6/src/bufio/bufio.go:241 15 0x00000000004b72e8 in io.ReadAtLeast at /home/michael/sdk/go1.23.6/src/io/io.go:335 16 0x00000000004b74d3 in io.ReadFull at /home/michael/sdk/go1.23.6/src/io/io.go:354 17 0x00000000005a19ef in github.com/gokrazy/rsync/internal/rsyncwire.(*Conn).ReadInt32 at /home/michael/kr/rsync/internal/rsyncwire/wire.go:163 18 0x00000000005b77d2 in github.com/gokrazy/rsync/internal/receiver.(*Transfer).recvIdMapping1 at /home/michael/kr/rsync/internal/receiver/uidlist.go:16 19 0x00000000005b7ea8 in github.com/gokrazy/rsync/internal/receiver.(*Transfer).RecvIdList at /home/michael/kr/rsync/internal/receiver/uidlist.go:52 20 0x00000000005b18db in github.com/gokrazy/rsync/internal/receiver.(*Transfer).ReceiveFileList at /home/michael/kr/rsync/internal/receiver/flist.go:229 21 0x0000000000605390 in github.com/gokrazy/rsync/internal/receivermaincmd.clientRun at /home/michael/kr/rsync/internal/receivermaincmd/receivermaincmd.go:341 22 0x00000000005fe572 in github.com/gokrazy/rsync/internal/receivermaincmd.socketClient at /home/michael/kr/rsync/internal/receivermaincmd/clientserver.go:44 23 0x0000000000602f10 in github.com/gokrazy/rsync/internal/receivermaincmd.rsyncMain at /home/michael/kr/rsync/internal/receivermaincmd/receivermaincmd.go:160 24 0x0000000000605e7e in github.com/gokrazy/rsync/internal/receivermaincmd.Main at /home/michael/kr/rsync/internal/receivermaincmd/receivermaincmd.go:394 25 0x0000000000605c85 in main.main at /home/michael/kr/rsync/cmd/gokr-rsync/rsync.go:12 26 0x000000000043d5db in runtime.main at /home/michael/sdk/go1.23.6/src/runtime/proc.go:272 ``` 现在我们可以检查当前上下文中的变量: ``` (dlv) args t = (*github.com/gokrazy/rsync/internal/receiver.Transfer)(0xc000202120) rf = (*bufio.Reader)(0xc0002020c0) ... (dlv) print t *github.com/gokrazy/rsync/internal/receiver.Transfer { client: (*github.com/gokrazy/rsync/internal/rsyncwire.Conn)(0xc000208060), ... uidMap: []github.com/gokrazy/rsync/internal/receiver.uidMapEntry nil, gidMap: []github.com/gokrazy/rsync/internal/receiver.uidMapEntry nil, ...} ``` `uidMap` 和 `gidMap` 为 nil,但在函数内部,代码试图从连接中读取这些值。调用栈显示我们从 `recvIdMapping1` 开始,它期望读取一个 uid 映射条目,但我们却卡在从 TCP 套接字读取数据上。服务器显然没有发送我们期望的数据。 Delve 提供了许多其他有用的特性,比如逐步执行代码、设置断点、修改变量等。在处理挂起程序时,能检查任意运行时状态已经非常有帮助了。 ## 技巧3:保存核心转储以供以后分析 如果问题间歇性发生,或者你无法立即连接到生产机器,你可以保存核心转储并在之后分析。Go 1.23 引入了通过 `GOTRACEBACK=crash` 环境变量生成核心转储的支持(参阅 [Go 1.23 发布说明](https://go.dev/doc/go1.23#runtime))。此外,Go 运行时还提供了一个实验性的 `runtime/debug.WriteHeapDump()` 函数。 不过,对于大多数挂起场景,最灵活的跨版本方法是使用 `gcore`(来自 gdb 包)来生成核心转储文件。你需要安装 gdb(仅用于生成转储,之后仍然可以用 delve 分析): ``` % gcore -o /tmp/hang.core $(pidof gokr-rsync) ``` 之后你可以将核心文件复制到另一台机器,并在那里用 delve 加载它: ``` % dlv core /path/to/gokr-rsync /tmp/hang.core ``` 接着你就可以像在实时调试中一样检查 goroutine 和变量。 ## 结论 调试挂起的 Go 程序时,你可以使用以下技巧: 1. 按下 `Ctrl+\` 快速获取堆栈跟踪。 2. 使用 delve 进行交互式检查。 3. 保存核心转储供以后分析。 这些技巧多年来为我节省了大量时间。希望它们也能帮到你!

相似文章

优化CPU密集型Go热路径的笔记

Hacker News Top

本文讨论了CPU密集型Go代码的性能优化技术,指出了泛型和接口抽象因无法内联而产生的局限性,并主张在热路径中使用代码复制。文章通过一个Brotli移植示例和深入基准测试进行了说明。

就用Go

Lobsters Hottest

一篇带有强烈观点的开发者文章倡导使用Go编程语言,强调其简洁的语法、强大的标准库、高效的并发模型以及单二进制部署,作为对过于复杂的现代技术栈的实用替代方案。

用户更改键盘布局时程序挂起的问题

The Old New Thing (Raymond Chen)

一个调试故事,讲述了当用户更改键盘布局(例如使用 Win+Space 快捷键)时,Windows 程序挂起的原因,是由于一个后台线程创建了窗口但没有泵送消息。修复方法是要么泵送消息,要么销毁窗口。

Go 工具链的 Fuzzing 分支

Lobsters Hottest

Trail of Bits 推出了 gosentry,这是一个面向模糊测试的 Go 工具链分支,集成 LibAFL 以增强路径约束求解、结构化模糊测试及漏洞检测能力,同时保留标准测试工作流程。