CPPL: A Circuit Prompt Programming Language
Summary
CPPL is a compiler-mediated framework that bridges LLMs and hardware design by using a Python DSL and JSON-based intermediate representation to enable statically checkable, optimizable RTL generation.
# CPPL: A Circuit Prompt Programming Language
Source: [https://arxiv.org/html/2605.17892](https://arxiv.org/html/2605.17892)
Shuo Yin¹, Yihe Wang², Lancheng Zou¹, Xufeng Yao¹, Tinghuan Chen², Chen Bai³,†, Zhengrong Wang¹, Tsung\-Yi Ho¹, Bei Yu¹,†
¹The Chinese University of Hong Kong, ²The Chinese University of Hong Kong \(Shenzhen\), ³Fudan University
###### Abstract
Large language models \(LLMs\) have shown promise in register\-transfer level \(RTL\) design automation, but direct RTL generation remains difficult to validate, optimize, and integrate with compiler\-based hardware design flows\. Hardware compiler infrastructures such as CIRCT provide typed intermediate representations, legality checks, and optimization passes, yet current LLMs struggle to emit raw compiler IR because of MLIR syntax, SSA discipline, dialect\-specific operations, and strict width constraints\. This paper presents CPPL, a compiler\-mediated design framework that turns LLM\-assisted hardware generation into a statically checkable frontend problem rather than an unconstrained RTL text\-generation task\. CPPL combines a Python frontend DSL for declaring module interfaces and hierarchy with CPPL IR, a JSON\-based circuit IR designed to expose compiler\-visible structure while remaining accessible to LLMs\. The compiler infers operation widths from declared module ports, validates generated IR, checks hierarchy and port bindings, and deterministically lowers the result to CIRCT for synthesizable Verilog generation\. On the RTLLM benchmark, CPPL improves functional correctness over direct Verilog and direct CIRCT IR generation, while CIRCT optimization reduces post\-synthesis AIG node counts\. These results show that a compiler\-mediated interface can make LLM\-assisted hardware design more reliable, analyzable, and amenable to backend optimization\. CPPL is available at [https://github\.com/SawyDust1228/CPPL](https://github.com/SawyDust1228/CPPL)\.
## I Introduction
Large language models \(LLMs\) \[[1](https://arxiv.org/html/2605.17892#bib.bib1),[2](https://arxiv.org/html/2605.17892#bib.bib2),[3](https://arxiv.org/html/2605.17892#bib.bib3)\] are increasingly used for register\-transfer level \(RTL\) design automation, including Verilog generation, repair, and verification\-oriented coding \[[4](https://arxiv.org/html/2605.17892#bib.bib4),[5](https://arxiv.org/html/2605.17892#bib.bib5),[6](https://arxiv.org/html/2605.17892#bib.bib6),[7](https://arxiv.org/html/2605.17892#bib.bib7),[8](https://arxiv.org/html/2605.17892#bib.bib8)\]\. These systems lower the barrier for hardware design by translating natural\-language specifications into executable hardware descriptions\. However, most existing LLM4RTL flows follow an end\-to\-end generation paradigm: the model emits final RTL text, which is then checked by simulators or synthesis tools\. This paradigm is convenient, but from a design automation perspective it leaves three recurring problems unresolved\. First, generated RTL can be syntactically invalid or incompatible with downstream tools\. Second, syntactically valid RTL often fails functional tests because the model must simultaneously reason about behavior, structure, widths, and corner cases\. Third, end\-to\-end RTL generation exposes little intermediate structure to hardware compiler infrastructures that already provide typed representations, legality checks, canonicalization, and optimization\.
Hardware compiler infrastructures such as CIRCT \[[9](https://arxiv.org/html/2605.17892#bib.bib9)\] provide a natural way to address these issues\. CIRCT represents circuits in MLIR\-based dialects and can lower well\-formed intermediate representations into synthesizable Verilog while applying standard compiler transformations\. In principle, an LLM could generate CIRCT IR directly and thereby benefit from compiler verification and optimization\. In practice, this is difficult\. CIRCT IR exposes MLIR syntax, SSA naming, dialect\-specific operation constraints, and strict type requirements\. Our profiling results in [Section II](https://arxiv.org/html/2605.17892#S2) show that even strong commercial models generate CIRCT IR with much lower correctness than Verilog, despite receiving format\-specific prompts\. This gap suggests that compiler\-backed LLM hardware generation needs a frontend interface that is more structured than natural language or raw RTL, yet easier for LLMs to produce than low\-level compiler IR\.
Figure 1: CPPL combines LLM\-based generation with compiler\-mediated circuit construction and optimization\.
This paper presents CPPL, a compiler\-mediated framework for LLM\-assisted hardware generation\. CPPL introduces a Python\-based frontend DSL that captures module interfaces and structural hierarchy explicitly, while leaving implementation intent in an LLM\-friendly form\. The frontend elaborates fixed ports, module instances, and connection structure before LLM generation, preventing the model from freely inventing incompatible interfaces or hierarchy\. For the model\-generated part, CPPL uses CPPL IR, a JSON\-based intermediate representation that encodes circuit operations in a regular schema\. CPPL IR is designed to retain compiler\-visible circuit structure while avoiding raw MLIR syntax; the compiler performs syntax validation, width inference, structural checks, and deterministic lowering to CIRCT IR\. The resulting CIRCT program is then compiled and optimized to Verilog\. In this way, CPPL shifts LLM generation from unstructured RTL or raw CIRCT IR toward a typed, structurally constrained frontend that can be checked before backend code generation\.
[Fig\. 1](https://arxiv.org/html/2605.17892#S1.F1) summarizes this shift\. Unlike prior LLM4RTL flows that rely on natural\-language prompts and direct model inference to produce RTL, CPPL asks users to describe hardware through a programming\-language frontend and delegates RTL construction to a compiler backend\. This separation keeps interface and hierarchy information explicit before LLM generation, and moves legality checking, refinement, and optimization into the compilation path\.
Our evaluation on the RTLLM benchmark \[[5](https://arxiv.org/html/2605.17892#bib.bib5)\] shows that direct Verilog generation achieves high syntax correctness but still suffers from a substantial functionality gap, while direct CIRCT IR generation is significantly less reliable\. CPPL closes this gap by assigning interface construction, hierarchy elaboration, legality checking, and type recovery to the compiler while leaving the model to generate a constrained circuit representation\. Across the evaluated models, CPPL improves functional correctness over both direct Verilog generation and direct CIRCT IR generation\. We also evaluate synthesis quality using post\-`aigmap` node counts and show that CIRCT optimization passes can reduce synthesized circuit size\.
This paper makes the following contributions:
- We identify the mismatch between LLM generation capabilities and raw CIRCT IR generation, showing that compiler IR syntax and semantics remain difficult for current LLMs\.
- We propose CPPL, an open\-source compiler\-mediated hardware generation framework that combines a Python frontend DSL, a JSON\-based circuit IR, static checking, and CIRCT\-based lowering\.
- We formalize the structural preservation and width inference principles used by CPPL to keep module hierarchy, port bindings, and operation types consistent during generation\.
- We evaluate CPPL on RTLLM and demonstrate improved functional correctness and synthesis\-level compactness compared with direct generation baselines\.
## II Background & Motivation
### II\-A Related Works on LLM4RTL
Recent LLMs have shown promising capabilities in generating RTL code from high\-level descriptions, attracting significant attention in the research community\. Several approaches have been proposed to improve LLM\-based RTL generation: \[[8](https://arxiv.org/html/2605.17892#bib.bib8),[10](https://arxiv.org/html/2605.17892#bib.bib10)\] explore LLM fine\-tuning and graph embedding techniques to enhance generation performance, while \[[6](https://arxiv.org/html/2605.17892#bib.bib6),[7](https://arxiv.org/html/2605.17892#bib.bib7)\] introduce code\-to\-code alignment methods to improve the quality of generated RTL code\. \[[11](https://arxiv.org/html/2605.17892#bib.bib11),[12](https://arxiv.org/html/2605.17892#bib.bib12)\] leverage agent systems to decompose complex hardware generation tasks into manageable subtasks, improving RTL generation accuracy\. Beyond code generation, LLMs also play crucial roles in other RTL\-related tasks\. \[[13](https://arxiv.org/html/2605.17892#bib.bib13),[14](https://arxiv.org/html/2605.17892#bib.bib14),[15](https://arxiv.org/html/2605.17892#bib.bib15)\] optimize generated RTL code using LLMs and symbolic reasoning techniques, while \[[16](https://arxiv.org/html/2605.17892#bib.bib16),[17](https://arxiv.org/html/2605.17892#bib.bib17),[18](https://arxiv.org/html/2605.17892#bib.bib18)\] use LLMs for RTL debugging and verification\. These works demonstrate the potential of LLMs across RTL generation, optimization, debugging, and verification\. They primarily treat RTL as the generation target, whereas CPPL studies a complementary question: how to expose compiler\-level hardware design flows to LLMs without requiring the model to directly emit low\-level CIRCT IR\.
### II\-B Circuit IR Compilers and Tools
Circuit IR Compilers and Tools \(CIRCT\) \[[9](https://arxiv.org/html/2605.17892#bib.bib9)\] is a compiler infrastructure built on top of the Multi\-Level Intermediate Representation \(MLIR\) \[[19](https://arxiv.org/html/2605.17892#bib.bib19)\] for hardware design \[[20](https://arxiv.org/html/2605.17892#bib.bib20),[21](https://arxiv.org/html/2605.17892#bib.bib21),[22](https://arxiv.org/html/2605.17892#bib.bib22),[23](https://arxiv.org/html/2605.17892#bib.bib23)\], optimization \[[24](https://arxiv.org/html/2605.17892#bib.bib24),[25](https://arxiv.org/html/2605.17892#bib.bib25)\], and simulation \[[26](https://arxiv.org/html/2605.17892#bib.bib26),[27](https://arxiv.org/html/2605.17892#bib.bib27)\]\. CIRCT provides core dialects to represent circuits at a unified level of abstraction, including:
- The `hw` dialect offers function\-like semantics to represent module information and data types\. For instance, `hw.module` handles the details of a module, while `hw.instance` represents the instantiation of these modules\.
- The `comb` dialect represents combinational components in RTL\. For example, `comb.add` models a multi\-input adder in combinational logic\.
- The `seq` dialect represents sequential logic\. `seq.compreg` models registers in CIRCT, containing the piped value and reset as inputs\. The `seq` dialect also includes a memory type that describes memory behavior\.
- The `sv` dialect represents the semantics of SystemVerilog\. For example, `sv.always` represents an always block, which is commonly used to define sequential logic\.
The core dialects of CIRCT effectively support flexible transformations and optimization, while targeting different backends for Verilog generation, simulation, and verification\.
### II\-C Motivations for CPPL
The previous works on LLMs for RTL generation discussed in [Section II\-A](https://arxiv.org/html/2605.17892#S2.SS1) primarily follow an end\-to\-end paradigm, where the LLM directly generates the final RTL code\. This approach is convenient, but it exposes several limitations when used as a design methodology:
1. Generated RTL may contain syntax or tool\-compatibility errors that are only discovered after downstream compilation\.
2. Functional behavior, bitwidth consistency, and structural hierarchy are entangled in a single text\-generation task, making failures difficult to localize\.
3. Compiler analyses and transformations are applied only after RTL emission, so the generation process receives little benefit from typed IRs, legality checks, and canonical compiler optimizations\.
Compiler\-based hardware design flows address many of these issues by making structure, legality, and types explicit before backend code generation\. For example, CIRCT can lower well\-formed circuit IR into synthesizable Verilog for backend EDA tools\. It also provides standard optimization passes, including constant folding \(CF\), dead code elimination \(DCE\), and common subexpression elimination \(CSE\), to improve the generated design\. Furthermore, CIRCT IR follows a strict type discipline derived from the LLVM/MLIR ecosystem \[[28](https://arxiv.org/html/2605.17892#bib.bib28)\], enabling early verification before final RTL emission\.
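To make the three passes concrete, the following is a minimal Python sketch over a toy straight\-line operation list\. It is not CIRCT's implementation: the tuple format `(result, opcode, operands)` and the function names `constant_fold`, `cse`, `dce`, and `optimize` are hypothetical and chosen only for illustration, and bit\-width wraparound and commutativity are ignored\.

```python
# Toy op format (an assumption for illustration): (result, opcode, operands).
# Operands are names of inputs/earlier results, or Python ints after folding.

def constant_fold(ops):
    """CF: replace an 'add' whose operands are all known constants by a 'const'."""
    consts, out = {}, []
    for res, opc, args in ops:
        args = tuple(consts.get(a, a) for a in args)  # substitute known constants
        if opc == "const":
            consts[res] = args[0]
        elif opc == "add" and all(isinstance(a, int) for a in args):
            consts[res] = sum(args)  # width wraparound is ignored in this sketch
            opc, args = "const", (consts[res],)
        out.append((res, opc, args))
    return out

def cse(ops):
    """CSE: keep the first op for each (opcode, operands) pair, alias the rest."""
    seen, alias, out = {}, {}, []
    for res, opc, args in ops:
        args = tuple(alias.get(a, a) for a in args)
        key = (opc, args)
        if key in seen:
            alias[res] = seen[key]
        else:
            seen[key] = res
            out.append((res, opc, args))
    return out, alias

def dce(ops, outputs):
    """DCE: walk backwards and keep only ops that reach a module output."""
    live, kept = set(outputs), []
    for res, opc, args in reversed(ops):
        if res in live:
            kept.append((res, opc, args))
            live.update(a for a in args if isinstance(a, str))
    return kept[::-1]

def optimize(ops, outputs):
    ops = constant_fold(ops)
    ops, alias = cse(ops)
    outputs = [alias.get(o, o) for o in outputs]
    return dce(ops, outputs), outputs

ops = [
    ("c1", "const", (1,)),
    ("c2", "const", (2,)),
    ("k", "add", ("c1", "c2")),   # folds to the constant 3
    ("t0", "and", ("a", "b")),
    ("t1", "and", ("a", "b")),    # duplicate of t0
    ("u", "or", ("t1", "k")),
    ("dead", "xor", ("a", "b")),  # never used by an output
]
optimized, outs = optimize(ops, ["u"])
print(optimized)  # [('t0', 'and', ('a', 'b')), ('u', 'or', ('t0', 3))]
```

Seven operations shrink to two, which is the kind of reduction that later shows up as a smaller synthesized netlist\.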
Based on the above discussion, we propose a compiler\-mediated generation paradigm that combines the natural\-language and code\-generation capabilities of LLMs with the typed compilation flow provided by CIRCT\. We designate this paradigm as CPPL \(Circuit Prompt Programming Language\), a hardware generation framework that inserts an LLM\-friendly, statically checkable circuit IR between model generation and CIRCT lowering\. CPPL exposes structural and type constraints before backend code generation, while still allowing designers to express implementation intent at a high level\.
### II\-D Challenges for Leveraging CIRCT IR in CPPL
Although CIRCT provides the desired compiler support, directly exposing CIRCT IR as the LLM output target presents several challenges:
1. There are limited public datasets of CIRCT IR examples, which makes it difficult to train LLMs to understand and generate code in this intermediate representation\.
2. CIRCT is an actively developing project, and its IR syntax and dialect definitions continue to evolve\. This makes it difficult for LLMs to maintain compatibility with a specific compiler version\.
3. As discussed in \[[29](https://arxiv.org/html/2605.17892#bib.bib29)\], the Static Single Assignment \(SSA\) form used by CIRCT IR can be difficult for LLMs to understand and generate\.
We conduct a profiling experiment to assess LLM performance in end\-to\-end generation of Verilog code versus CIRCT IR\. In this experiment, we use system prompts to instruct LLMs to generate CIRCT IR compatible with our experimental setup in [Section IV\-A](https://arxiv.org/html/2605.17892#S4.SS1)\. We use the RTLLM benchmark \[[5](https://arxiv.org/html/2605.17892#bib.bib5)\] to evaluate the pass@1 metric on a set of recent commercial LLMs \(see [Section IV\-A](https://arxiv.org/html/2605.17892#S4.SS1) for configuration details\)\. As [Fig\. 2](https://arxiv.org/html/2605.17892#S2.F2) shows, all evaluated models exhibit a clear performance gap between generating Verilog code and CIRCT IR, with pass@1 scores for CIRCT IR generation being substantially lower\. This result indicates that the direct compiler\-IR path is not yet a reliable frontend for LLM\-assisted hardware design\.
Figure 2: pass@1 score gap between Verilog and CIRCT IR for syntax correctness\.
Figure 3: The geometric average error type breakdown of CIRCT IR generation across all evaluated models on the RTLLM benchmark\.
We further evaluate the detailed error types in generated CIRCT IR and categorize them into three main categories: violations of MLIR format semantics \(MLIR Syntax Error\), use of unsupported or incorrect CIRCT operations \(CIRCT Operation Error\), and type system mismatches \(Type Error\)\. As shown in [Fig\. 3](https://arxiv.org/html/2605.17892#S2.F3), the largest error category is CIRCT operation errors, indicating that LLMs often fail to satisfy dialect\-specific operation constraints\. MLIR syntax errors and type errors also contribute to the overall error rate, confirming that both the concrete IR format and the compiler type system are challenging generation targets\.
These challenges motivate an intermediate representation that keeps the compiler\-visible structure and type information needed by CIRCT, but presents them in a form that is easier for LLMs to generate and easier for the compiler to validate\.
## III CPPL Framework
Figure 4: CPPL framework overview\.
In this section, we introduce the CPPL framework, a compiler\-mediated design flow for LLM\-assisted hardware generation with CIRCT as the backend compilation infrastructure\. [Fig\. 4](https://arxiv.org/html/2605.17892#S3.F4) illustrates the overall workflow: CPPL provides a Python\-based frontend DSL for specifying interfaces and hierarchy, an LLM\-facing JSON IR for behavioral generation, a compiler\-driven refinement loop, and a deterministic lowering path to CIRCT\. CPPL is implemented through APPL \[[30](https://arxiv.org/html/2605.17892#bib.bib30)\] and CIRCT’s Python bindings, which together provide a flexible platform for hardware design\.
```python
from cppl import module, In, Out


@module
def Adder8(a: In[8], b: In[8]) -> {"sum": Out[8]}:
    """out equals a plus b (8-bit addition)."""


@module
def ALU(op_code: In[2], op_a: In[8], op_b: In[8]) -> {"res": Out[8], "zero": Out[1]}:
    return f"""
    Simple ALU that uses an Adder8 instance for addition.
    Based on op_code (2-bit selector):
    - 00: res = {Adder8(op_a, op_b)} (result from Adder8 instance)
    - 01: res = op_a - op_b
    - 10: res = op_a & op_b (bitwise AND)
    - 11: res = op_a | op_b (bitwise OR)
    zero is 1 when res equals 0, otherwise 0.
    """
```
Figure 5: The CPPL description of a 2\-bit opcode ALU with an 8\-bit adder instance\.
### III\-A Frontend DSL for Hardware Design
Language Design\. [Fig\. 5](https://arxiv.org/html/2605.17892#S3.F5) shows a CPPL program that defines an `ALU` module\. CPPL follows a function\-oriented style: each hardware module is written as a Python function decorated with the `@module` decorator, which triggers JIT compilation by the CPPL compiler\. The function signature specifies the module boundary\. Arguments annotated with `In` define input ports, and return values annotated with `Out` define output ports, together with their bit widths\. Because the interface is declared explicitly, generated hardware is constrained to use fixed and well\-typed ports, reducing the opportunity for the LLM to introduce missing, extra, or width\-mismatched interfaces\.
The function body provides the implementation intent through a docstring\. This docstring can mix natural\-language descriptions with structured CPPL constructs that describe module instantiations and connections\. For example, in [Fig\. 5](https://arxiv.org/html/2605.17892#S3.F5), the `ALU` module instantiates an `Adder8` module for operand addition\. The instantiation is expressed using Python string\-formatting syntax, where a formatted module call explicitly binds the child module to ports declared in the enclosing function signature or in the local implementation context\. In this way, CPPL keeps the high\-level specification natural for designers while making the module hierarchy and interface bindings explicit enough for deterministic frontend elaboration\.
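One way a formatted module call could be turned into an explicit instance record is sketched below\. This is an illustrative assumption, not the CPPL implementation, whose internals are not described here: the `Port` class, the `_recorded` list, the `elaborate` helper, and the `<inst ...>` placeholder are all hypothetical, and the local `module`, `In`, and `Out` are simplified stand\-ins rather than the `cppl` package\.

```python
import inspect

class In:
    def __class_getitem__(cls, width):
        return ("input", width)

class Out:
    def __class_getitem__(cls, width):
        return ("output", width)

class Port:
    """Symbolic stand-in for a port handed to the module function."""
    def __init__(self, name):
        self.name = name

_recorded = []  # instances seen while one parent body is being evaluated

def module(fn):
    params = list(inspect.signature(fn).parameters)

    def as_instance(*args):
        # Invoked from a parent's f-string: log the binding, return a marker
        # so the remaining text still reads naturally for the LLM.
        bindings = {p: a.name for p, a in zip(params, args)}
        _recorded.append((fn.__name__, bindings))
        return f"<inst {fn.__name__}>"

    def elaborate():
        # Evaluate the body with symbolic ports; the f-string runs here.
        _recorded.clear()
        intent = fn(*(Port(p) for p in params)) or fn.__doc__
        return {"name": fn.__name__, "instances": list(_recorded), "intent": intent}

    as_instance.elaborate = elaborate
    return as_instance

@module
def Adder8(a: In[8], b: In[8]) -> {"sum": Out[8]}:
    """sum equals a plus b (8-bit addition)."""

@module
def ALU(op_a: In[8], op_b: In[8]) -> {"res": Out[8]}:
    return f"res = {Adder8(op_a, op_b)}"

alu = ALU.elaborate()
print(alu["instances"])  # [('Adder8', {'a': 'op_a', 'b': 'op_b'})]
```

The point of the sketch is that the hierarchy and port bindings are obtained by ordinary Python evaluation before any model is queried, so they never depend on the LLM's reading of the prose\.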
Structural Semantics Preservation\. We next formalize the structural part of the frontend DSL and show that the hierarchy encoded by CPPL is preserved in the generated circuit representation\. The key point is that string\-formatting constructs are not treated as unconstrained free\-form text\. Instead, each formatted module call is interpreted as an explicit structural directive: it is lowered to a module\-instance vertex, while the associated port bindings are lowered to wiring edges\. Therefore, the frontend does not rely on the LLM to rediscover hierarchy from natural language; it deterministically elaborates the hierarchy specified by the CPPL syntax\. Let $s$ be a well\-formed architectural phrase under scope environment $\Gamma$\. The judgment $\Gamma;\rho\vdash s\Downarrow H$ means that, under parent module or instance $\rho$, phrase $s$ elaborates to a circuit graph $H=(V,E_{h},E_{w})$, where $V$ is the set of module/instance vertices, $E_{h}$ is the set of hierarchy edges, and $E_{w}$ is the set of wiring edges\. Let $\mathsf{Struct}(H)=(V,E_{h})$ denote the hierarchy\-only projection, and let $\mathsf{Tree}_{\rho}(s)$ denote the module\-instance parse tree obtained from the syntactic nesting of module declarations and instance calls in $s$ under root $\rho$\.
The following core big\-step rules capture the relevant structural behavior:
$$
\frac{}{\Gamma;\rho\vdash\texttt{module }m\{\}\Downarrow(\{m\},\{(\rho,m)\},\emptyset)}\ \textsc{(S-Prim)}
$$

$$
\frac{\Gamma;\rho\vdash s\Downarrow(V,E_{h},E_{w})\quad M\in\Gamma}{\Gamma;\rho\vdash\texttt{inst }x{:}M;\ s\Downarrow(V\cup\{x\},E_{h}\cup\{(\rho,x)\},E_{w})}\ \textsc{(S-Inst)}
$$

$$
\frac{\Gamma;\rho\vdash s\Downarrow(V,E_{h},E_{w})\quad p,q\in\mathsf{Ports}_{\Gamma}(V)}{\Gamma;\rho\vdash\texttt{connect }p\to q;\ s\Downarrow(V,E_{h},E_{w}\cup\{(p,q)\})}\ \textsc{(S-Conn)}
$$

$$
\frac{\Gamma;\rho\vdash s_{1}\Downarrow(V_{1},E_{h1},E_{w1})\quad\Gamma;\rho\vdash s_{2}\Downarrow(V_{2},E_{h2},E_{w2})}{\Gamma;\rho\vdash s_{1};s_{2}\Downarrow(V_{1}\cup V_{2},E_{h1}\cup E_{h2},E_{w1}\cup E_{w2})}\ \textsc{(S-Seq)}
$$
###### Theorem 1 \(Structural Semantics Preservation\)\.
If $s$ is well formed and $\Gamma;\rho\vdash s\Downarrow H$, then $\mathsf{Struct}(H)$ is isomorphic to $\mathsf{Tree}_{\rho}(s)$\.
###### Proof\.
By induction on the derivation of $\Gamma;\rho\vdash s\Downarrow H$\. S\-Prim adds one module vertex and one hierarchy edge from $\rho$ to that module, matching the corresponding leaf in $\mathsf{Tree}_{\rho}(s)$\. S\-Inst adds exactly one instance vertex and one hierarchy edge from the current parent $\rho$ to the instance $x$; by the induction hypothesis, the remaining phrase $s$ preserves its syntactic hierarchy, so the whole graph matches the parse tree after the instance expansion\. S\-Conn only updates $E_{w}$, and thus leaves $\mathsf{Struct}(H)$ unchanged\. S\-Seq follows from the induction hypotheses for $s_{1}$ and $s_{2}$ and from taking the union of their hierarchy vertices and edges under the same parent $\rho$\. Therefore, for every well\-formed phrase, the hierarchy represented in the generated graph is the same as the hierarchy specified by the DSL syntax\. ∎
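The four elaboration rules and the statement of Theorem 1 can be exercised on a small example with the Python sketch below\. The tuple encoding of phrases, the function names `elaborate`, `struct`, and `tree`, and the `"vertex.port"` naming used to approximate $\mathsf{Ports}_{\Gamma}(V)$ are assumptions made for illustration only\. Because both sides use the same vertex names, isomorphism reduces to equality in this check\.

```python
def elaborate(gamma, rho, s):
    """Return (V, Eh, Ew) for phrase s under parent rho."""
    tag = s[0]
    if tag == "module":                      # S-Prim
        _, m = s
        return {m}, {(rho, m)}, set()
    if tag == "inst":                        # S-Inst
        _, x, M, rest = s
        if M not in gamma:
            raise ValueError(f"unknown module {M}")
        V, Eh, Ew = elaborate(gamma, rho, rest)
        return V | {x}, Eh | {(rho, x)}, Ew
    if tag == "connect":                     # S-Conn
        _, p, q, rest = s
        V, Eh, Ew = elaborate(gamma, rho, rest)
        for port in (p, q):
            # Assumed stand-in for the premise p, q in Ports_Gamma(V):
            # a port is written "vertex.port" and its vertex must be in V.
            if port.split(".")[0] not in V:
                raise ValueError(f"{port} does not belong to a known vertex")
        return V, Eh, Ew | {(p, q)}
    if tag == "seq":                         # S-Seq
        _, s1, s2 = s
        V1, Eh1, Ew1 = elaborate(gamma, rho, s1)
        V2, Eh2, Ew2 = elaborate(gamma, rho, s2)
        return V1 | V2, Eh1 | Eh2, Ew1 | Ew2
    raise ValueError(f"unknown phrase {tag}")

def struct(H):
    """Hierarchy-only projection Struct(H) = (V, Eh)."""
    V, Eh, _ = H
    return V, Eh

def tree(rho, s):
    """Hierarchy read directly off the syntax; connect contributes nothing."""
    tag = s[0]
    if tag == "module":
        return {s[1]}, {(rho, s[1])}
    if tag == "inst":
        V, E = tree(rho, s[3])
        return V | {s[1]}, E | {(rho, s[1])}
    if tag == "connect":
        return tree(rho, s[3])
    V1, E1 = tree(rho, s[1])
    V2, E2 = tree(rho, s[2])
    return V1 | V2, E1 | E2

gamma = {"Adder8"}
phrase = ("connect", "add0.sum", "flag.d",
          ("inst", "add0", "Adder8", ("module", "flag")))
H = elaborate(gamma, "ALU", phrase)
assert struct(H) == tree("ALU", phrase)   # Theorem 1 on this instance
```

Note that `connect` only grows the third component of `H`, which is why the wiring never disturbs the hierarchy comparison\.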
Code Generation\. [Fig\. 6](https://arxiv.org/html/2605.17892#S3.F6) illustrates the compilation flow from the CPPL `ALU` module to Verilog\. The frontend first parses the module signatures, structured docstring constructs, and hierarchy implied by formatted module calls\. It then emits the corresponding CIRCT IR and invokes CIRCT’s code\-generation pipeline to produce Verilog\. This flow separates the LLM\-facing specification from the backend representation: the LLM helps generate implementation details, while CPPL and CIRCT enforce structural consistency and backend legality\.
```python
from cppl import Design

design = Design()    # Create a new design
design.add(ALU)      # Add the top module to the design
design.to_verilog()  # Compile the design to Verilog
```
Figure 6: Code snippet using CPPL to compile the ALU example into Verilog\.
```json
[
  {
    "name": "Adder8",
    "ports": {
      "a": {"dir": "input", "width": 8},
      "b": {"dir": "input", "width": 8},
      "sum": {"dir": "output", "width": 8}
    },
    "body": [
      {"id": "sum_val", "op": "add", "args": ["a", "b"]},
      {"op": "output", "args": {"sum": "sum_val"}}
    ]
  },
  {
    "name": "ALU",
    "ports": {
      "op_code": {"dir": "input", "width": 2},
      "op_a": {"dir": "input", "width": 8},
      "op_b": {"dir": "input", "width": 8},
      "res": {"dir": "output", "width": 8},
      "zero": {"dir": "output", "width": 1}
    },
    "body": [
      {"id": ["adder8_sum"], "op": "instance", "module": "Adder8", "args": {"a": "op_a", "b": "op_b"}},
      {"id": "sel0", "op": "extract", "args": ["op_code"], "lowBit": 0, "width": 1},
      {"id": "sel1", "op": "extract", "args": ["op_code"], "lowBit": 1, "width": 1},
      {"id": "sub_res", "op": "sub", "args": ["op_a", "op_b"]},
      {"id": "and_res", "op": "and", "args": ["op_a", "op_b"]},
      {"id": "or_res", "op": "or", "args": ["op_a", "op_b"]},
      {"id": "mux_lo", "op": "mux", "args": ["sel0", "sub_res", "adder8_sum"]},
      {"id": "mux_hi", "op": "mux", "args": ["sel0", "or_res", "and_res"]},
      {"id": "res_mux", "op": "mux", "args": ["sel1", "mux_hi", "mux_lo"]},
      {"id": "any_set", "op": "or_reduce", "args": ["res_mux"]},
      {"id": "is_zero", "op": "not", "args": ["any_set"]},
      {"op": "output", "args": {"res": "res_mux", "zero": "is_zero"}}
    ]
  }
]
```
Figure 7: The CPPL IR generated from the CPPL code of the ALU example\.
### III\-B JSON\-based Intermediate Representation
As discussed in [Section II\-D](https://arxiv.org/html/2605.17892#S2.SS4), directly generating CIRCT IR poses challenges for LLMs\. However, LLMs’ demonstrated ability to generate Verilog code suggests they possess some understanding of circuit structure and semantics\. We leverage this capability by introducing an LLM\-friendly intermediate representation that preserves compiler\-visible structure while avoiding raw MLIR syntax\. JSON is a widely adopted data format with strong LLM support, as evidenced by its use in LLM serving and application frameworks \[[31](https://arxiv.org/html/2605.17892#bib.bib31),[32](https://arxiv.org/html/2605.17892#bib.bib32),[33](https://arxiv.org/html/2605.17892#bib.bib33)\]\. To this end, we design a JSON\-based IR, CPPL IR, that captures essential structural and behavioral circuit information in a regular, statically checkable format\.
Design Principles\. CPPL IR is designed around three principles\. First, it separates the model\-facing representation from the backend IR syntax, so LLMs do not need to emit MLIR punctuation, SSA names, or dialect assembly forms directly\. Second, it preserves compiler\-relevant information: modules, ports, instances, operation identifiers, and operands are explicit fields rather than unstructured text\. Third, it defines a deterministic lowering contract to CIRCT, allowing the frontend to validate generated programs before Verilog emission\. This design enables early error detection and iterative refinement, while insulating the LLM from version\-specific CIRCT syntax changes\. To reduce type mismatch errors, CPPL IR is also designed with an inferable width system that can be statically checked during JSON\-to\-CIRCT compilation, as discussed in [Section III\-C](https://arxiv.org/html/2605.17892#S3.SS3)\.
[Fig\. 7](https://arxiv.org/html/2605.17892#S3.F7) shows the CPPL IR generated from the CPPL description of the ALU example in [Fig\. 5](https://arxiv.org/html/2605.17892#S3.F5)\. The function signature is translated into module declarations and port maps, corresponding to the CPPL code structure\. Instance items are automatically inserted into the module body by the CPPL compiler through parsing the docstring and formatted module calls prior to LLM generation, preserving module hierarchy and port bindings\. The remaining operations are then generated by the LLM based on the implementation intent described in the docstring\.
$$
\begin{array}{llll}
m\in\mathsf{ModName} & p\in\mathsf{PortName} & x\in\mathsf{Id} & o\in\mathsf{Opcode}
\end{array}
\quad w,n\in\mathbb{N}^{+}
$$

$$
\begin{array}{rcl}
D &::=& [M_{1},\ldots,M_{n}]\\
M &::=& \{\texttt{"name"}:m,\ \texttt{"ports"}:P,\ \texttt{"body"}:B\}\\
P &::=& \{p_{i}:\Pi_{i}\}_{i=1}^{n}\\
\Pi &::=& \{\texttt{"dir"}:\delta,\ \texttt{"width"}:w\}\\
\delta &::=& \texttt{"input"}\mid\texttt{"output"}\\
B &::=& [O_{1},\ldots,O_{n}]\\[1ex]
O &::=& \{\texttt{"id"}:r,\ \texttt{"op"}:o,\ \texttt{"args"}:A,\ \kappa\}\\
 &\mid& \{\texttt{"id"}:[x_{1},\ldots,x_{n}],\ \texttt{"op"}:\texttt{"instance"},\\
 && \qquad\texttt{"module"}:m,\ \texttt{"args"}:\Gamma,\ \kappa\}\\
 &\mid& \{\texttt{"op"}:\texttt{"output"},\ \texttt{"args"}:\Gamma\}\\[1ex]
r &::=& x\mid[x_{1},\ldots,x_{n}]\\
A &::=& [x_{1},\ldots,x_{n}]\mid\Gamma\\
\Gamma &::=& \{p_{i}:x_{i}\}_{i=1}^{n}\\
\kappa &::=& \epsilon\mid a:v,\kappa
\end{array}
$$
Figure 8: Formal syntax of the JSON\-based IR\.
IR Semantics\. We formalize the syntax of the CPPL IR in [Fig\. 8](https://arxiv.org/html/2605.17892#S3.F8)\. The grammar follows the concrete JSON structure: a design is a list of modules, each module contains a port map and a body of dictionary\-based operations\. The body of each module consists of operations and module instances, each identified by a unique identifier except for output operations\. This identifier scheme enforces a definition\-before\-use discipline, maintaining consistency with CIRCT IR semantics\. The CPPL compiler can also insert predefined items, such as module instances extracted from the frontend DSL, into the module body before LLM\-generated operations are compiled\. The syntax rules serve as the system prompt to guide LLMs in generating CPPL IR that can be correctly parsed and compiled to CIRCT IR\. The CPPL IR used in the CPPL framework bridges LLM generation capabilities with CIRCT’s static compilation flow, enabling more reliable LLM\-assisted hardware design\.
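The unique\-identifier and definition\-before\-use discipline described above can be checked with a single pass over a module body\. The Python sketch below is a hedged illustration on the JSON layout of the IR, not the CPPL compiler's validator: the function name `check_def_before_use` and the error message format are hypothetical, and cases such as literal operands or feedback through registers are not handled\.

```python
def check_def_before_use(module):
    """Return a list of error strings for one IR module (empty if it passes)."""
    # Input ports are the only names defined before the body starts.
    defined = {p for p, spec in module["ports"].items() if spec["dir"] == "input"}
    errors = []
    for i, op in enumerate(module["body"]):
        args = op["args"]
        # "args" is either a positional list or a {port: name} binding map.
        operands = list(args.values()) if isinstance(args, dict) else list(args)
        for name in operands:
            if name not in defined:
                errors.append(f"op {i}: '{name}' used before definition")
        if op["op"] == "output":
            continue  # output operations carry no identifier
        ids = op["id"] if isinstance(op["id"], list) else [op["id"]]
        for x in ids:
            if x in defined:
                errors.append(f"op {i}: '{x}' defined twice")
            defined.add(x)
    return errors

good = {
    "name": "Adder8",
    "ports": {
        "a": {"dir": "input", "width": 8},
        "b": {"dir": "input", "width": 8},
        "sum": {"dir": "output", "width": 8},
    },
    "body": [
        {"id": "sum_val", "op": "add", "args": ["a", "b"]},
        {"op": "output", "args": {"sum": "sum_val"}},
    ],
}

# Same module, but the add consumes "t" one step before "t" is produced.
bad = dict(good, body=[
    {"id": "sum_val", "op": "add", "args": ["a", "t"]},
    {"id": "t", "op": "not", "args": ["b"]},
    {"op": "output", "args": {"sum": "sum_val"}},
])

print(check_def_before_use(good))  # []
print(check_def_before_use(bad))   # ["op 0: 't' used before definition"]
```

Messages of this kind are the sort of compiler feedback that could be returned to the model during refinement instead of a low\-level MLIR parse error\.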
$$
\begin{gathered}
\dfrac{}{\Sigma;\Gamma\vdash\mathsf{const}(x,w)\Rightarrow\Gamma[x\mapsto i^{w}]}\ \textsc{T-Const}
\\[2ex]
\dfrac{\Gamma(a)=i^{w}\qquad u\in\mathsf{Unary}}{\Sigma;\Gamma\vdash\mathsf{unary}_{u}(x,a)\Rightarrow\Gamma[x\mapsto i^{w}]}\ \textsc{T-Unary}
\\[2ex]
\dfrac{\Gamma(a)=i^{w}\qquad r\in\mathsf{Reduce}}{\Sigma;\Gamma\vdash\mathsf{reduce}_{r}(x,a)\Rightarrow\Gamma[x\mapsto i^{1}]}\ \textsc{T-Reduce}
\\[2ex]
\dfrac{\Gamma(a_{1})=i^{w}\qquad\Gamma(a_{2})=i^{w}\qquad b\in\mathsf{Binary}}{\Sigma;\Gamma\vdash\mathsf{bin}_{b}(x,a_{1},a_{2})\Rightarrow\Gamma[x\mapsto i^{w}]}\ \textsc{T-Bin}
\\[2ex]
\dfrac{\Gamma(a_{1})=i^{w}\qquad\Gamma(a_{2})=i^{w}\qquad c\in\mathsf{Cmp}}{\Sigma;\Gamma\vdash\mathsf{cmp}_{c}(x,a_{1},a_{2})\Rightarrow\Gamma[x\mapsto i^{1}]}\ \textsc{T-Cmp}
\\[2ex]
\dfrac{\Gamma(s)=i^{1}\qquad\Gamma(t)=i^{w}\qquad\Gamma(f)=i^{w}}{\Sigma;\Gamma\vdash\mathsf{mux}(x,s,t,f)\Rightarrow\Gamma[x\mapsto i^{w}]}\ \textsc{T-Mux}
\\[2ex]
\dfrac{\Gamma(a)=i^{w_{s}}\qquad w_{s}\leq w}{\Sigma;\Gamma\vdash\mathsf{cast}(x,a,w)\Rightarrow\Gamma[x\mapsto i^{w}]}\ \textsc{T-Cast}
\\[2ex]
\dfrac{\Gamma(a_{i})=i^{w_{i}}\ (1\leq i\leq n)}{\Sigma;\Gamma\vdash\mathsf{concat}(x,\vec{a})\Rightarrow\Gamma[x\mapsto i^{\sum_{i=1}^{n}w_{i}}]}\ \textsc{T-Concat}
\\[2ex]
\dfrac{\Gamma(a)=i^{w_{s}}\qquad l+w\leq w_{s}}{\Sigma;\Gamma\vdash\mathsf{extract}(x,a,l,w)\Rightarrow\Gamma[x\mapsto i^{w}]}\ \textsc{T-Extract}
\\[2ex]
\dfrac{\Gamma(d)=i^{w}\qquad\Gamma(clk)=i^{1}\qquad\Gamma(en)=i^{1}}{\Sigma;\Gamma\vdash\mathsf{reg}(x,d,clk,en,w)\Rightarrow\Gamma[x\mapsto i^{w}]}\ \textsc{T-Reg}
\\[2ex]
\dfrac{\Sigma(m)=(\{p_{i}:i^{w_{i}}\}_{i=1}^{n},\{q_{j}:i^{v_{j}}\}_{j=1}^{k})\qquad\rho(p_{i})=a_{i}\qquad\Gamma(a_{i})=i^{w_{i}}\ (1\leq i\leq n)}{\Sigma;\Gamma\vdash\mathsf{inst}(\vec{x},m,\rho)\Rightarrow\Gamma[x_{j}\mapsto i^{v_{j}}]_{j=1}^{k}}\ \textsc{T-Inst}
\\[2ex]
\dfrac{O(p_{i})=i^{w_{i}}\qquad\rho(p_{i})=a_{i}\qquad\Gamma(a_{i})=i^{w_{i}}}{\Sigma;\Gamma\vdash\mathsf{out}(\rho,O)\Rightarrow\Gamma}\ \textsc{T-Out}
\end{gathered}
$$
Figure 9: Representative width inference rules for CPPL IR\.
### III-C Backend Compilation
The CPPL compiler lowers CPPL IR into CIRCT IR and invokes CIRCT's code-generation pipeline to emit Verilog. During JSON-to-CIRCT compilation, two steps are central to reliability: type inference and code refinement.
Type Inference. CPPL IR uses bitwidths as its type system, aligning with CIRCT IR's type system where `hw.integer` types represent unsigned integers of specific widths. For a module $m$, its signature environment $\Sigma(m)=(I,O)$ records the declared input and output port types. Starting from the input-port environment $\Gamma_{0}=I$, the compiler infers each internal SSA value by propagating widths through operations. We write $\Sigma;\Gamma\vdash O\Rightarrow\Gamma'$ to denote that operation $O$ is well typed under $\Gamma$ and extends it to $\Gamma'$. [Fig. 9](https://arxiv.org/html/2605.17892#S3.F9) summarizes the representative rules: arithmetic and logic operations preserve width, comparisons and reductions produce one-bit values, muxes require equal-width branches, and instances obtain result widths from callee signatures. The implementation applies these rules to a fixpoint and then checks the inferred output values against $O$. This rejects width errors such as unequal binary operands, non-1-bit mux selectors, invalid extracts, and output bindings whose inferred widths do not match the declared ports.
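The width rules can be read as a small checker over an environment that maps value names to bitwidths. The following is a minimal sketch under stated assumptions: the operation encoding (`op`, `id`, `args`, `low`, `width`, `bind`), the operator names, and the `WidthError` class are hypothetical, and the sketch makes a single pass in definition order rather than iterating to a fixpoint as the implementation is described to do.

```python
class WidthError(Exception):
    """Raised when an operation violates a width rule."""


BINARY = {"add", "sub", "and", "or", "xor"}  # width-preserving (T-Bin)
COMPARE = {"eq", "ne", "ult", "ugt"}         # one-bit result (T-Cmp)


def infer_widths(inputs: dict, outputs: dict, body: list) -> dict:
    """Propagate widths from the input ports through the body."""
    env = dict(inputs)  # Gamma_0 = I
    for op in body:
        kind = op["op"]
        widths = [env[a] for a in op.get("args", [])]
        if kind == "output":  # T-Out: bound values must match declared ports
            for port, value in op["bind"].items():
                if env[value] != outputs[port]:
                    raise WidthError(f"port {port!r}: expected i{outputs[port]}, got i{env[value]}")
            continue
        if kind == "const":  # T-Const
            result = op["width"]
        elif kind in BINARY or kind in COMPARE:
            if widths[0] != widths[1]:
                raise WidthError(f"{kind}: operand widths {widths} differ")
            result = widths[0] if kind in BINARY else 1
        elif kind == "mux":  # T-Mux
            sel, on_true, on_false = widths
            if sel != 1 or on_true != on_false:
                raise WidthError(f"mux: bad widths {widths}")
            result = on_true
        elif kind == "concat":  # T-Concat
            result = sum(widths)
        elif kind == "extract":  # T-Extract: l + w <= w_s
            if op["low"] + op["width"] > widths[0]:
                raise WidthError("extract: slice exceeds source width")
            result = op["width"]
        else:
            raise WidthError(f"unsupported operation {kind!r}")
        env[op["id"]] = result
    return env


inputs = {"op_code": 2, "op_a": 8, "op_b": 8}
outputs = {"res": 8, "zero": 1}
body = [
    {"id": "sel", "op": "extract", "args": ["op_code"], "low": 0, "width": 1},
    {"id": "and_ab", "op": "and", "args": ["op_a", "op_b"]},
    {"id": "or_ab", "op": "or", "args": ["op_a", "op_b"]},
    {"id": "res_v", "op": "mux", "args": ["sel", "or_ab", "and_ab"]},
    {"id": "zero_c", "op": "const", "width": 8, "value": 0},
    {"id": "is_zero", "op": "eq", "args": ["res_v", "zero_c"]},
    {"op": "output", "bind": {"res": "res_v", "zero": "is_zero"}},
]
env = infer_widths(inputs, outputs, body)
```

In this sketch the LLM-facing operations never state a result width, except where the rule itself takes one (`const`, `extract`); every other width is recovered from the declared ports.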
```mlir
module {
  hw.module @Adder8(in ... hw.output }
  hw.module @ALU(in ... hw.output }
}
```
Figure 10: The generated CIRCT IR of the ALU example compiled from the JSON-based IR in [Fig. 7](https://arxiv.org/html/2605.17892#S3.F7).

Code Refinement. In addition to reporting type errors, the CPPL compiler performs static analyses on CPPL IR to identify potential issues before CIRCT lowering. The core analyses are as follows:
- Syntax validation: The CPPL compiler validates the syntax of the generated CPPL IR to ensure it conforms to the defined grammar and structure. This includes verifying identifier symbol tables, terminator coverage, and instance graph connectivity.
- Dead code elimination: The CPPL compiler identifies and eliminates unused operations or module instances in the final design, reducing unnecessary IR before backend compilation.
- Combinational loop detection: The CPPL compiler detects combinational loops in the generated CPPL IR, which would result in non-synthesizable designs.
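As an illustration of the last analysis, combinational loop detection can be phrased as cycle detection on the dependency graph in which register outputs cut the path. This is a minimal sketch, not the CPPL compiler's code: the operation encoding (`id`, `op`, `args`), the `reg` operation name, and the function `find_combinational_loop` are assumptions made for the example.

```python
def find_combinational_loop(body: list):
    """Return one combinational cycle as a list of value names, or None.

    Register outputs are treated as having no combinational dependencies,
    so feedback through a register is not reported.
    """
    deps = {
        op["id"]: ([] if op["op"] == "reg" else list(op.get("args", [])))
        for op in body
        if "id" in op
    }
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {name: WHITE for name in deps}

    def visit(node, path):
        color[node] = GREY
        path.append(node)
        for dep in deps[node]:
            if dep not in deps:
                continue  # module input port: no further dependencies
            if color[dep] == GREY:
                return path[path.index(dep):] + [dep]
            if color[dep] == WHITE:
                cycle = visit(dep, path)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for name in deps:
        if color[name] == WHITE:
            cycle = visit(name, [])
            if cycle:
                return cycle
    return None


looped = [
    {"id": "a", "op": "and", "args": ["b", "x"]},
    {"id": "b", "op": "or", "args": ["a", "x"]},
]
registered = [
    {"id": "q", "op": "reg", "args": ["d", "clk", "en"]},
    {"id": "d", "op": "add", "args": ["q", "x"]},
]
loop = find_combinational_loop(looped)
no_loop = find_combinational_loop(registered)
```

The recursive traversal is adequate for small module bodies; a production analysis would likely use an iterative traversal to avoid recursion limits.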
Each identified issue is wrapped in a Python exception with a descriptive error message, which is fed back to the LLM for iterative refinement of the generated CPPL IR. This refinement loop is shown in [Fig. 4](https://arxiv.org/html/2605.17892#S3.F4). The maximum number of iterations is denoted as $N_{\max}$; iteration continues until a valid CPPL IR is generated or the maximum number of iterations is reached.
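The feedback loop can be sketched as follows. The callables `generate` and `compile_ir` are hypothetical stand-ins for the LLM call and the CPPL compiler, and the sketch assumes that $N_{\max}$ bounds the total number of generation attempts, which the text does not state explicitly.

```python
def refine(generate, compile_ir, prompt: str, n_max: int = 3):
    """Generate IR, compile it, and feed compiler errors back until success.

    Returns (compiled_result, attempts_used); raises RuntimeError if no
    attempt compiles within n_max attempts.
    """
    feedback = None
    for attempt in range(1, n_max + 1):
        ir = generate(prompt, feedback)
        try:
            return compile_ir(ir), attempt
        except Exception as err:  # compiler issues arrive as Python exceptions
            feedback = str(err)   # the message becomes feedback for the next attempt
    raise RuntimeError(f"no valid IR after {n_max} attempts; last error: {feedback}")


# Stub LLM and compiler, used only to exercise the loop.
def fake_generate(prompt, feedback):
    return "bad-ir" if feedback is None else "good-ir"


def fake_compile(ir):
    if ir == "bad-ir":
        raise ValueError("mux selector must be 1 bit")
    return f"compiled:{ir}"


result, attempts = refine(fake_generate, fake_compile, "8-bit ALU")
```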
```verilog
module ALU(
  input  [1:0] op_code,
  input  [7:0] op_a,
               op_b,
  output [7:0] res,
  output       zero
);

  wire [7:0] _Adder8_0_sum;
  Adder8 Adder8_0 (
    .a   (op_a),
    .b   (op_b),
    .sum (_Adder8_0_sum)
  );
  wire [7:0] _GEN =
    op_code[1]
      ? (op_code[0] ? op_a | op_b : op_a & op_b)
      : op_code[0] ? op_a - op_b : _Adder8_0_sum;
  assign res = _GEN;
  assign zero = ~(|_GEN);
endmodule
```

Figure 11: The generated Verilog code of the ALU example compiled from the CIRCT IR in [Fig. 10](https://arxiv.org/html/2605.17892#S3.F10).
### III-D Code Generation
[Fig. 10](https://arxiv.org/html/2605.17892#S3.F10) shows the CIRCT IR generated for the ALU example, compiled from the CPPL IR in [Fig. 7](https://arxiv.org/html/2605.17892#S3.F7). Module declarations and port mappings are directly translated from the CPPL IR, while operations are generated by the LLM based on the implementation intent described in the CPPL code's docstring. The generated CIRCT IR is then passed through CIRCT's code generation pipeline to emit the final Verilog code shown in [Fig. 11](https://arxiv.org/html/2605.17892#S3.F11). The resulting Verilog code is syntactically valid and accepted by backend EDA tools in our evaluation.
[Fig. 12](https://arxiv.org/html/2605.17892#S3.F12) illustrates the compilation pipeline for Verilog code generation. The highlighted lines show the optimization passes applied before Verilog emission, which include the standard optimizations discussed in [Section II-C](https://arxiv.org/html/2605.17892#S2.SS3). CIRCT's code generation pipeline applies standard canonicalization and optimization passes, thereby improving the quality of the emitted Verilog without requiring the LLM to implement these transformations explicitly.
Overall, CPPL provides a structured and modular approach to LLM-assisted hardware generation, combining LLM generation with compiler-checked legality and backend optimization.
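```python
# Assumed import: the original listing omits it; `passmanager` is taken to be
# the pass manager module of the CIRCT Python bindings.
from circt import passmanager

# `mod` is the CIRCT module lowered from CPPL IR (defined elsewhere).
pipeline = [
    "hw.module(lower-seq-hlmem)",
    "lower-seq-to-sv",
    "canonicalize",
    "cse",
    "hw.module(prettify-verilog)",
    "hw.module(hw-cleanup)",
]
pm = passmanager.PassManager.parse("builtin.module(" + ",".join(pipeline) + ")")
pm.run(mod.operation)
```

Figure 12: The CIRCT pass pipeline used in the CPPL compiler to generate Verilog code from the generated CIRCT IR.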
## IV Evaluation
TABLE I: Syntax and functional correctness of direct SystemVerilog generation on RTLLM.

| Model | Syntax pass@1 | Syntax pass@2 | Syntax pass@5 | Functionality pass@1 | Functionality pass@2 | Functionality pass@5 |
| --- | --- | --- | --- | --- | --- | --- |
| Claude-opus-4.6 | 0.932 | 0.932 | 0.956 | 0.725 | 0.761 | 0.778 |
| GPT-5.3-codex | 0.957 | 0.963 | 0.964 | 0.675 | 0.710 | 0.752 |
| Gemini-3.1-pro | 0.639 | 0.709 | 0.759 | 0.554 | 0.603 | 0.642 |
| Kimi-k2.5 | 0.925 | 0.963 | 0.964 | 0.704 | 0.754 | 0.782 |
| Qwen-3.6-plus | 0.918 | 0.943 | 0.961 | 0.679 | 0.737 | 0.798 |
| Minimax-2.5 | 0.689 | 0.799 | 0.912 | 0.543 | 0.631 | 0.748 |

TABLE II: Syntax and functional correctness of direct CIRCT IR generation on RTLLM.

| Model | Syntax pass@1 | Syntax pass@2 | Syntax pass@5 | Functionality pass@1 | Functionality pass@2 | Functionality pass@5 |
| --- | --- | --- | --- | --- | --- | --- |
| Claude-opus-4.6 | 0.661 | 0.790 | 0.884 | 0.500 | 0.598 | 0.693 |
| GPT-5.3-codex | 0.425 | 0.542 | 0.679 | 0.318 | 0.394 | 0.471 |
| Gemini-3.1-pro | 0.257 | 0.316 | 0.381 | 0.239 | 0.285 | 0.328 |
| Kimi-k2.5 | 0.386 | 0.476 | 0.580 | 0.264 | 0.325 | 0.384 |
| Qwen-3.6-plus | 0.093 | 0.137 | 0.208 | 0.082 | 0.117 | 0.168 |
| Minimax-2.5 | 0.154 | 0.233 | 0.403 | 0.089 | 0.113 | 0.171 |

TABLE III: Functional correctness of generated Verilog code from CPPL implementations on different models.

| Model | Functionality pass@1 | Functionality pass@2 | Functionality pass@5 |
| --- | --- | --- | --- |
| CPPL (Claude-opus-4.6) | 0.800 | 0.817 | 0.838 |
| CPPL (GPT-5.3-codex) | 0.782 | 0.840 | 0.874 |
| CPPL (Qwen-3.6-plus) | 0.768 | 0.812 | 0.821 |
TABLE IV: Average AIG node counts of CPPL-generated Verilog (Claude-opus-4.6) vs. RTLLM reference designs, with and without CIRCT optimizations ([Fig. 12](https://arxiv.org/html/2605.17892#S3.F12)). Columns: Design, Ref, Verilog, w/o Opt, Opt.

ram7004147458\.67413accu318309290290adder\_16bit608368386352adder\_32bit19821038\.11045784adder\_8bit160200184168adder\_pipe\_64bit1892189219091892alu101596571\.758642\.678413calendar352352295\.78229div\_16bit16397660359453945edge\_detect14777freq\_div207170\.1140125fsm13091\.8927958\.12multi\_16bit24092644\.921462146multi\_pipe\_4bit345345333333multi\_pipe\_8bit193618771840\.711834\.6parallel2serial63634949radix2\_div941–18631863serial2parallel127–8181signal\_generator154–221125traffic\_light493–437425\.4Geo\. Avg513\.89525\.09439\.88368\.16
*Note:* Lower is better. The legend distinguishes the RTLLM reference; among generated results, the best in each row, increasingly larger node counts, and counts more than $1.5\times$ the row best; and unavailable entries (shown as –).
### IV-A Experiment Setup
We evaluate the performance of CPPL on the RTLLM benchmark [[5](https://arxiv.org/html/2605.17892#bib.bib5)], which consists of 29 problems covering a wide range of hardware design tasks, including combinational logic, sequential logic, and memory design. The LLMs evaluated in our experiments include Claude-opus-4.6, GPT-5.3-codex, Gemini-3.1-pro, Kimi-k2.5, Qwen-3.6-plus, and Minimax-2.5. These models are selected based on their strong performance in code generation tasks and their wide adoption in recent research.
We employ the pass@k metric [[4](https://arxiv.org/html/2605.17892#bib.bib4),[34](https://arxiv.org/html/2605.17892#bib.bib34)] used in LLM code generation evaluation to assess the performance of LLMs in generating correct CIRCT IR and Verilog code. It is defined as follows:

$$\text{pass@}k := \mathbb{E}_{\text{problems}}\left[1-\frac{\binom{n-c}{k}}{\binom{n}{k}}\right], \qquad (1)$$

where $n\geq k$ samples are generated per problem, $c$ is the number of those samples that pass the unit tests, and a problem is solved if any of the $k$ samples passes the unit tests. In our experiments, we sample $n=10$ code completions per problem for each downstream task and measure pass@k with $k=1,2,5$. For all evaluation tasks, we set the model temperature to $0.1$ to balance diversity and correctness in generated code. The maximum token length is set to $4096$, covering reasoning traces, regenerations, and final code generation. For the CPPL refinement loop, we set the maximum number of refinement attempts to $N_{\max}=3$.
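For concreteness, Eq. (1) can be computed directly with binomial coefficients. This is a minimal sketch; the function names `pass_at_k` and `benchmark_pass_at_k` and the example counts are illustrative and are not taken from the paper's evaluation scripts.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Per-problem pass@k estimate: 1 - C(n-c, k) / C(n, k).

    n: samples generated, c: samples that pass the unit tests, k: budget.
    """
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k samples fail, giving pass@k = 1.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(correct_counts: list, n: int, k: int) -> float:
    """Average the per-problem estimate over all problems."""
    return sum(pass_at_k(n, c, k) for c in correct_counts) / len(correct_counts)


# Illustrative counts for three problems with n = 10 samples each.
example_counts = [10, 3, 0]
score = benchmark_pass_at_k(example_counts, n=10, k=2)
```

With $n=10$ and $c=3$, the estimate is $1-\binom{7}{1}/\binom{10}{1}=0.3$ for $k=1$ and $1-\binom{7}{2}/\binom{10}{2}=1-21/45\approx 0.533$ for $k=2$.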
We evaluate the main stages of the CPPL design flow, including syntax correctness, functionality verification, and backend performance evaluation. Syntax correctness and functionality verification are assessed using `iverilog` [[35](https://arxiv.org/html/2605.17892#bib.bib35)] with the simulation scripts and unit tests from the RTLLM benchmark. Backend performance evaluation is conducted using the Yosys synthesis tool [[36](https://arxiv.org/html/2605.17892#bib.bib36)] to measure resource utilization of the generated RTL code. The prompts used for generating CIRCT IR, Verilog code, and CPPL implementations share the same design description, differing only in the instruction specifying the target output format. For CIRCT IR generation, we include additional system prompts instructing the LLM to adhere to the CIRCT IR format and semantics compatible with our experimental framework.

We use direct Verilog and direct CIRCT IR generation as the primary baselines to isolate the effect of replacing unconstrained text generation with a compiler-checkable frontend representation. Existing LLM4RTL systems [[6](https://arxiv.org/html/2605.17892#bib.bib6),[7](https://arxiv.org/html/2605.17892#bib.bib7),[8](https://arxiv.org/html/2605.17892#bib.bib8),[10](https://arxiv.org/html/2605.17892#bib.bib10),[11](https://arxiv.org/html/2605.17892#bib.bib11),[12](https://arxiv.org/html/2605.17892#bib.bib12),[14](https://arxiv.org/html/2605.17892#bib.bib14)] primarily target final RTL and often incorporate task-specific prompting, fine-tuning, retrieval, or repair strategies; they do not evaluate LLM generation of CIRCT IR or expose a comparable compiler-IR path. Therefore, a direct end-to-end comparison with such systems would conflate our representation and compiler design with orthogonal model- and workflow-level techniques. CPPL is complementary to these techniques: the same prompting, retrieval, or repair strategies could be used to generate CPPL IR instead of raw RTL.
### IV-B Performance on Verilog and CIRCT IR Generation
[Table I](https://arxiv.org/html/2605.17892#S4.T1) and [Table II](https://arxiv.org/html/2605.17892#S4.T2) present the syntax and functional correctness of direct SystemVerilog and CIRCT IR generation on the RTLLM benchmark, respectively.
Direct Verilog generation achieves high syntax correctness but still leaves a large gap to functional correctness. Four models exceed 0.9 syntax pass@1, with GPT-5.3-codex reaching 0.957, yet the best functional pass@1 is only 0.725 from Claude-opus-4.6. This indicates that syntactically valid RTL does not necessarily implement the intended behavior, and that syntax checking alone is insufficient as a design-flow interface.
Direct CIRCT IR generation is substantially harder. Even with format-specific system prompts, the syntax pass@1 drops sharply for most models; for example, Qwen-3.6-plus and Minimax-2.5 achieve only 0.093 and 0.154, respectively. Functional correctness is also limited, with all models except Claude-opus-4.6 staying below 0.5. These results motivate an intermediate frontend representation that preserves the benefits of compiler-based hardware generation without requiring LLMs to emit raw CIRCT IR.
### IV-C Performance of CPPL
To evaluate the effectiveness of CPPL, we conduct experiments on selected models that span different direct CIRCT IR generation capabilities: Claude-opus-4.6, GPT-5.3-codex, and Qwen-3.6-plus. [Table III](https://arxiv.org/html/2605.17892#S4.T3) presents the functional correctness of generated Verilog code from CPPL implementations on different models. All evaluated models produce syntactically valid CIRCT IR and Verilog through the CPPL compilation flow for all RTLLM designs in our runs. This result indicates that the JSON-based CPPL IR and compiler checks avoid a major source of raw CIRCT IR generation failures. The generated Verilog also achieves higher functional correctness than both direct Verilog generation and direct CIRCT IR generation across the evaluated models. Notably, Qwen-3.6-plus, which performs poorly in direct CIRCT IR generation, reaches a pass@1 score of 0.768 with CPPL, compared with 0.082 for direct CIRCT IR functional correctness. These results show that CPPL can expose compiler-backed hardware generation to models that are otherwise unreliable at emitting raw CIRCT IR.
### IV-D Synthesis Node Reduction
To evaluate the backend performance of CPPL-generated Verilog code, we conduct synthesis experiments using Yosys to measure AIG node counts. Each design is processed by `read_verilog -sv`, `hierarchy -top`, `synth -top -noabc`, `aigmap`, and `stat -json`; we report the post-`aigmap` cell count. Similar synthesis-level proxies have also been used in prior works [[7](https://arxiv.org/html/2605.17892#bib.bib7),[13](https://arxiv.org/html/2605.17892#bib.bib13),[37](https://arxiv.org/html/2605.17892#bib.bib37)]. To isolate the impact of compiler optimizations, we perform an ablation study comparing designs generated with and without CIRCT optimization passes in the compilation pipeline. As shown in [Table IV](https://arxiv.org/html/2605.17892#S4.T4), CPPL-generated Verilog often yields lower post-`aigmap` node counts than direct LLM-generated Verilog and the RTLLM reference designs. The ablation further shows that enabling CIRCT optimization passes reduces the geometric average node count from 439.88 to 368.16, a 16.3% reduction. These results support the central design choice of CPPL: once LLM output is represented as a compiler-checkable circuit IR, conventional hardware compiler optimizations can improve the emitted RTL without requiring the model to implement those transformations directly.
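The measurement flow and the reported reduction can be sketched as follows. The helper names, the file name `alu.v`, and the top module `ALU` are hypothetical; the script only assembles the Yosys commands listed above and does not invoke Yosys, and how the authors actually drive the tool is not specified in the text.

```python
from math import exp, log


def yosys_aig_script(verilog_path: str, top: str) -> str:
    """Build a Yosys script string from the command sequence described above."""
    return "; ".join([
        f"read_verilog -sv {verilog_path}",
        f"hierarchy -top {top}",
        f"synth -top {top} -noabc",
        "aigmap",
        "stat -json",
    ])


def geometric_mean(values: list) -> float:
    """Geometric mean of positive values, as used for the Geo. Avg row."""
    return exp(sum(log(v) for v in values) / len(values))


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which `after` is smaller than `before`."""
    return (before - after) / before


script = yosys_aig_script("alu.v", "ALU")
# Geometric averages reported for CPPL without and with CIRCT optimization.
reduction = relative_reduction(439.88, 368.16)
```

One plausible way to run such a script is `yosys -p "<script>"`, after which the cell count can be read from the JSON statistics. The reduction evaluates to $(439.88-368.16)/439.88\approx 0.163$, matching the 16.3% stated above.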
## V Conclusion
In this paper, we presented CPPL, a compiler-mediated framework for LLM-assisted hardware generation. CPPL combines a Python frontend DSL, a statically checkable JSON-based circuit IR, and deterministic lowering to CIRCT, allowing LLMs to generate hardware through a structured representation rather than raw RTL or raw compiler IR. By recovering widths from module ports, validating generated operations, and applying CIRCT backend optimizations, CPPL exposes compiler checks and transformations to the LLM generation flow. Our evaluation on RTLLM shows that this design improves functional correctness over direct Verilog and CIRCT IR generation and reduces post-synthesis AIG node counts through compiler optimization. These results point to compiler-mediated generation as a promising direction for reliable LLM hardware design that can benefit from backend optimization.
## References
- \[1\]A\. Yang, A\. Li, B\. Yang, B\. Zhang, B\. Hui, B\. Zheng, B\. Yu, C\. Gao, C\. Huang, C\. Lv*et al\.*, “Qwen3 technical report,”*arXiv preprint*, 2025\.
- \[2\]A\. Liu, B\. Feng, B\. Xue, B\. Wang, B\. Wu, C\. Lu, C\. Zhao, C\. Deng, C\. Zhang, C\. Ruan*et al\.*, “Deepseek\-v3 technical report,”*arXiv preprint*, 2024\.
- \[3\]J\. Achiam, S\. Adler, S\. Agarwal, L\. Ahmad, I\. Akkaya, F\. L\. Aleman, D\. Almeida, J\. Altenschmidt, S\. Altman, S\. Anadkat*et al\.*, “Gpt\-4 technical report,”*arXiv preprint*, 2023\.
- \[4\]M\. Liu, N\. Pinckney, B\. Khailany, and H\. Ren, “VerilogEval: Evaluating Large Language Models for Verilog Code Generation,” in*Proc\. ICCAD*, 2023\.
- \[5\]Y\. Lu, S\. Liu, Q\. Zhang, and Z\. Xie, “RTLLM: An Open\-source Benchmark for Designing RTL Generation with Large Language Model,” in*Proc\. ASPDAC*, 2024\.
- \[6\]F\. Cui, C\. Yin, K\. Zhou, Y\. Xiao, G\. Sun, Q\. Xu, Q\. Guo, Y\. Liang, X\. Zhang, D\. Song*et al\.*, “Origen: Enhancing RTL Code Generation with Code\-to\-Code Augmentation and Self\-Reflection,” in*Proc\. ICCAD*, 2024\.
- \[7\]Z\. Pei, H\.\-L\. Zhen, M\. Yuan, Y\. Huang, and B\. Yu, “BetterV: Controlled Verilog Generation with Discriminative Guidance,”*Proc\. ICML*, 2024\.
- \[8\]S\. Liu, W\. Fang, Y\. Lu, J\. Wang, Q\. Zhang, H\. Zhang, and Z\. Xie, “RTLCoder: Fully Open\-Source and Efficient LLM\-Assisted RTL Code Generation Technique,”*IEEE TCAD*, 2024\.
- \[9\]“CIRCT: Circuit IR Compilers and Tools,”[https://circt\.llvm\.org/](https://circt.llvm.org/), 2026\.
- \[10\]M\. Akyash, K\. Azar, and H\. Kamali, “RTL\+\+: Graph\-Enhanced LLM for RTL Code Generation,” in*Proc\. ICLAD*, 2025\.
- \[11\]Y\. Zhao, H\. Zhang, H\. Huang, Z\. Yu, and J\. Zhao, “MAGE: A Multi\-Agent Engine for Automated RTL Code Generation,” in*Proc\. DAC*, 2025\.
- \[12\]Z\. Yu, M\. Liu, M\. Zimmer, Y\. Celine, Y\. Liu, and H\. Ren, “Spec2RTL\-Agent: Automated Hardware Code Generation from Complex Specifications Using LLM Agent Systems,” in*Proc\. ICLAD*, 2025\.
- \[13\]Y\. Wang, W\. Ye, P\. Guo, Y\. He, Z\. Wang, B\. Tian, S\. He, G\. Sun, Z\. Shen, S\. Chen*et al\.*, “SymRTLO: Enhancing RTL Code Optimization with LLMs and Neuron\-Inspired Symbolic Reasoning,”*Proc\. NIPS*, 2026\.
- \[14\]X\. Yao, Y\. Wang, X\. Li, Y\. Lian, R\. Chen, L\. Chen, M\. Yuan, H\. Xu, and B\. Yu, “RTLRewriter: Methodologies for Large Models Aided RTL Code Optimization,” in*Proc\. ICCAD*, 2024\.
- \[15\]N\. Zhang, C\. Deng, J\. M\. Kuehn, C\.\-T\. Ho, C\. Yu, Z\. Zhang, and H\. Ren, “ASPEN: LLM\-Guided E\-Graph Rewriting for RTL Datapath Optimization,” in*Proc\. MLCAD*, 2025\.
- \[16\]Y\. Bai and H\. Ren, “Learning to Debug: LLM\-Organized Knowledge Trees for Solving RTL Assertion Failures,”*arXiv preprint*, 2025\.
- \[17\]J\. Wang, S\. Liu, Y\. Lu, and Z\. Xie, “HLSDebugger: Identification and Correction of Logic Bugs in HLS Code with LLM Solutions,” in*Proc\. ICCAD*, 2025\.
- \[18\]Z\. Yan, W\. Fang, M\. Li, M\. Li, S\. Liu, Z\. Xie, and H\. Zhang, “Assertllm: Generating hardware verification assertions from design specifications via multi\-llms,” in*Proc\. ASPDAC*, 2025\.
- \[19\]C\. Lattner, M\. Amini, U\. Bondhugula, A\. Cohen, A\. Davis, J\. Pienaar, R\. Riddle, T\. Shpeisman, N\. Vasilache, and O\. Zinenko, “MLIR: Scaling compiler infrastructure for domain specific computation,” in*Proc\. CGO*, 2021\.
- \[20\]J\. Weng, B\. Han, D\. Gao, R\. Gao, W\. Zhang, A\. Zhong, C\. Xu, J\. Xin, Y\. Luo, L\. W\. Wills*et al\.*, “Assassyn: A Unified Abstraction for Architectural Simulation and Implementation,” in*Proc\. ISCA*, 2025\.
- \[21\]Y\. Xiao, Z\. Luo, K\. Zhou, and Y\. Liang, “Cement: Streamlining FPGA Hardware Design with Cycle\-Deterministic EHDL and Synthesis,” in*Proc\. FPGA*, 2024\.
- \[22\]J\. Bachrach, H\. Vo, B\. Richards, Y\. Lee, A\. Waterman, R\. Avižienis, J\. Wawrzynek, and K\. Asanović, “Chisel: constructing hardware in a scala embedded language,” in*Proc\. DAC*, 2012\.
- \[23\]R\. Nigam, S\. Thomas, Z\. Li, and A\. Sampson, “A Compiler Infrastructure For Accelerator Generators,” in*Proc\. ASPLOS*, 2021\.
- \[24\]S\. Yin, F\. Liu, L\. Zou, R\. Fu, W\. Zhao, C\. Bai, T\.\-Y\. Ho, Y\. Xie, and B\. Yu, “PipeRTL: Timing\-Aware Pipeline Optimization at IR\-Level for RTL Generation,”*arXiv preprint*, 2026\.
- \[25\]H\. Zheng, Z\. He, S\. Yin, Y\. Ma, and B\. Yu, “CombRewriter: Enabling Combinational Logic Simplification in MLIR\-Based Hardware Compiler,” in*Proc\. ASPDAC*, 2026\.
- \[26\]F\. Schuiki, A\. Kurth, T\. Grosser, and L\. Benini, “LLHD: A Multi\-level Intermediate Representation For Hardware Description Languages,” in*Proc\. PLDI*, 2020\.
- \[27\]K\. Zhou, Y\. Liang, Y\. Lin, R\. Wang, and R\. Huang, “Khronos: Fusing Memory Access for Improved Hardware RTL Simulation,” in*Proc\. MICRO*, 2023\.
- \[28\]C\. Lattner and V\. Adve, “LLVM: A compilation framework for lifelong program analysis & transformation,” in*Proc\. CGO*, 2004\.
- \[29\]H\. Jiang, J\. Zhu, Y\. Wan, B\. Fang, H\. Zhang, R\. Jin, and Q\. Guan, “Can Large Language Models Understand Intermediate Representations in Compilers?” in*Proc\. ICML*, 2025\.
- \[30\]H\. Dong, Q\. Su, Y\. Gao, Z\. Li, Y\. Ruan, G\. Pekhimenko, C\. J\. Maddison, and X\. Si, “Appl: A prompt programming language for harmonious integration of programs and large language model prompts,” in*Proc\. ACL*, 2025\.
- \[31\]L\. Zheng, L\. Yin, Z\. Xie, C\. Sun, J\. Huang, C\. H\. Yu, S\. Cao, C\. Kozyrakis, I\. Stoica, J\. E\. Gonzalez*et al\.*, “Sglang: Efficient execution of structured language model programs,”*Proc\. NIPS*, 2024\.
- \[32\]W\. Kwon, Z\. Li, S\. Zhuang, Y\. Sheng, L\. Zheng, C\. H\. Yu, J\. Gonzalez, H\. Zhang, and I\. Stoica, “Efficient memory management for large language model serving with pagedattention,” in*Proc\. SOSP*, 2023\.
- \[33\]“LlamaIndex,”[https://github\.com/jerryjliu/llama\_index](https://github.com/jerryjliu/llama_index), 2022\.
- \[34\]M\. Chen, J\. Tworek, H\. Jun, Q\. Yuan, H\. P\. D\. O\. Pinto, J\. Kaplan, H\. Edwards, Y\. Burda, N\. Joseph, G\. Brockman*et al\.*, “Evaluating large language models trained on code,”*arXiv preprint*, 2021\.
- \[35\]S\. Williams, “The ICARUS Verilog Compilation System,”[https://steveicarus\.github\.io/iverilog/](https://steveicarus.github.io/iverilog/), 2002\.
- \[36\]C\. Wolf, J\. Glaser, and J\. Kepler, “Yosys\-a free verilog synthesis suite,”[https://yosyshq\.net/yosys/](https://yosyshq.net/yosys/), 2013\.
- \[37\]X\. Li, X\. Li, L\. Chen, X\. Zhang, M\. Yuan, and J\. Wang, “Logic synthesis with generative deep neural networks,”*arXiv preprint*, 2024\.