PRISMat: Policy-Driven, Permutation-Invariant Autoregressive Material Generation

arXiv cs.AI 05/19/26, 04:00 AM Papers
Summary
PRISMat is a cost-effective, permutation-invariant autoregressive model for generating crystal slabs conditioned on surface properties, achieving 4× lower error than previous models while being more efficient than LLMs.
arXiv:2605.16612v1 Announce Type: new
# PRISMat: Policy-Driven, Permutation-Invariant Autoregressive Material Generation
Source: [https://arxiv.org/html/2605.16612](https://arxiv.org/html/2605.16612)
Claire Schlesinger, Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, schlesinger.e@northeastern.edu

Circe Hsu, Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, hsu.circe@northeastern.edu

Peter Schindler, College of Engineering, Northeastern University, Boston, MA 02115, p.schindler@northeastern.edu

Robin Walters, Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, r.walters@northeastern.edu

###### Abstract

Rapid identification of candidate materials with target properties has become a key task in materials science. Machine learning has emerged as an alternative to physics-based simulation, offering a faster and cheaper way to filter materials based on their stability and other target properties, reducing the number of candidates that reach the costly synthesis stage. Recently, Large Language Models (LLMs) have been applied to this role, but these models are parameter-heavy and computationally expensive both during training and at inference time, making them unsuitable for high-throughput tasks. This inefficiency stems from both the large over-parameterization of language models and the difficulty of framing material generation as a sequence learning problem. In this paper, we present PRISMat, a cost-effective, permutation-invariant model, which addresses these limitations. We show that PRISMat, despite taking less time for inference, is able to outperform LLMs in generating crystal slabs conditioned on critical materials’ surface properties. In targeted material discovery, we achieve mean absolute errors of 0.188 eV/Å$^2$ and 2.79 eV for cleavage energy and work function tasks, respectively, reducing the error of the next best model by 4$\times$.

## 1 Introduction

![Refer to caption](https://arxiv.org/html/2605.16612v1/x1.png)

Figure 1: Pareto frontier of time it takes to generate a single crystal versus the negative rate of metastable, unique, and novel (MSUN) crystals. Orange squares indicate autoregressive techniques, while blue circles indicate pure diffusion techniques. PRISMat performs on par with the best diffusion models and clearly outperforms other autoregressive LLM systems, showing that it is the most efficient autoregressive system.

Identifying novel materials is slow and expensive due to the vast search space and the difficulty of synthesizing and evaluating candidates. The challenge is even greater when trying to identify candidates with specific desirable properties. To accelerate this process, machine learning systems have been used to both rapidly evaluate the stability and properties of candidates and to discover new materials\[[31](https://arxiv.org/html/2605.16612#bib.bib15),[30](https://arxiv.org/html/2605.16612#bib.bib16),[37](https://arxiv.org/html/2605.16612#bib.bib26),[10](https://arxiv.org/html/2605.16612#bib.bib32),[6](https://arxiv.org/html/2605.16612#bib.bib33),[20](https://arxiv.org/html/2605.16612#bib.bib34)\]. For example, several methods have been proposed to generate candidate materials with a target band gap or space group\[[42](https://arxiv.org/html/2605.16612#bib.bib8),[4](https://arxiv.org/html/2605.16612#bib.bib40),[7](https://arxiv.org/html/2605.16612#bib.bib9),[19](https://arxiv.org/html/2605.16612#bib.bib5)\]. However, these methods generate only bulk crystals, meaning idealized, defect-free structures that repeat infinitely in all three dimensions. In reality, all crystals are finite and therefore bounded by surfaces, which for many technologically relevant applications are the primary determinants of material properties.
These properties are crucial for electron emission devices and heterogeneous catalysis\[[24](https://arxiv.org/html/2605.16612#bib.bib43),[2](https://arxiv.org/html/2605.16612#bib.bib44),[9](https://arxiv.org/html/2605.16612#bib.bib45),[43](https://arxiv.org/html/2605.16612#bib.bib46),[29](https://arxiv.org/html/2605.16612#bib.bib47)\]\. They also determine contact barriers at semiconductor interfaces and yield the approximate shape of nanoparticles\[[12](https://arxiv.org/html/2605.16612#bib.bib48),[5](https://arxiv.org/html/2605.16612#bib.bib49),[38](https://arxiv.org/html/2605.16612#bib.bib50),[27](https://arxiv.org/html/2605.16612#bib.bib51)\]\.

Beyond their inability to capture real-world structural complexity, existing generative methods carry a further practical limitation: computational cost, with LLM-based approaches in particular requiring prohibitively long inference times for high-throughput screening. Typically, LLM-based methods generate materials in one shot, often directly outputting a Crystallographic Information File (CIF) or other crystal encoding to represent the structure. This representation is heavily redundant: reordering the atoms produces many different CIF files that encode the same crystal. This dramatically enlarges the output space, making the distribution of ideal structures more difficult to learn. Current methods attempt to address this limitation by imposing a canonical order in the CIF or using extensive data augmentation during training\[[7](https://arxiv.org/html/2605.16612#bib.bib9),[14](https://arxiv.org/html/2605.16612#bib.bib12)\]. Enforcing an atom ordering through canonicalization does not consider any underlying physics and can lead to instabilities\[[26](https://arxiv.org/html/2605.16612#bib.bib52)\]. Similarly, permuting atom order as an augmentation may encourage the model to learn permutation invariance, but it significantly increases training time without fully resolving the problem. Despite these drawbacks, autoregressive systems are still an attractive choice for policy-guided generation, as they can correct or reject certain crystals that do not fall within desired generation guidelines.

We propose PRISMat (PeRmutation-Invariant Sequential Material generation), a system which combines autoregressive generation with flow matching to generate novel materials. This allows us to combine the controllability of policy-driven autoregressive generation with the speedups provided by flow matching. PRISMat addresses the issues with autoregressive generation by representing crystals in a more efficient format and by being trained in a permutation-invariant manner. PRISMat is composed of three parts: 1) a Gaussian mixture model which predicts lattice parameters, 2) an $E(3)$ equivariant graph neural network (GNN) which autoregressively predicts atoms, and 3) an $E(3)$ equivariant flow matching model to assign positions to each atom\[[25](https://arxiv.org/html/2605.16612#bib.bib22),[32](https://arxiv.org/html/2605.16612#bib.bib21)\]. This three-part setup enables interventions after each step, so we can customize the generation process.

PRISMat achieves permutation invariance by reinterpreting the conditional output distribution of the autoregressive model as the cumulative distribution of all atom types remaining in the crystal instead of the probability distribution of the immediate next token\. While next token prediction is well\-suited for text generation, where order is critical, it is actually a hindrance in material generation, where no physically meaningful order exists\. By reinterpreting the output distribution, our method allows the use of the same underlying architectures used for text generation while enforcing permutation invariance appropriate to the setting\.

Our contributions are:

- We introduce PRISMat, the first autoregressive, permutation invariant framework for crystal generation.
- We analyze PRISMat’s performance to identify the benefit of permutation invariance and policy-guided generation as well as optimizing the sampling parameters for PRISMat.
- We evaluate our method’s ability to do conditional material generation on cleavage energy and work functions using the crystal slab dataset from Schindler et al.\[[33](https://arxiv.org/html/2605.16612#bib.bib30)\] and show it outperforms other autoregressive, conditioned LLM techniques by quartering the overall error on conditioned generation and having the lowest time per generated metastable, unique, and novel (MSUN) structure of any autoregressive model.

## 2 Related Work

#### Autoregressive Graph Generation

Permutation invariance has long been a challenge in autoregressive graph generation. G-SchNet\[[13](https://arxiv.org/html/2605.16612#bib.bib1)\] is an autoregressive method for the generation of rotationally invariant point graphs used for molecule generation. G-SchNet handles permutation invariance by exploiting the structure of molecules: it selects an atom to focus on and then predicts an atom to bond to that atom. GraphRNN builds a graph by representing it as a unique sequence and predicting those sequences to generate graphs\[[41](https://arxiv.org/html/2605.16612#bib.bib2)\]. GraphRNN uses a BFS node ordering scheme to reduce the complexity due to the large number of possible node orderings. GCPN autoregressively constructs a graph using a generative adversarial network (GAN) and reinforcement learning policies as guidance during training\[[40](https://arxiv.org/html/2605.16612#bib.bib20)\]. GCPN gets its permutation invariance through its discriminator by only looking at the final generated structure, and its reward function does not rely on the order of atom placement. Our method is distinct because, unlike GCPN, G-SchNet, or GraphRNN, it works on crystals whose periodic structure means there is no canonical and physically meaningful ordering to predict atoms in.

#### Material Generation Via Diffusion

Diffusion models are a popular choice for *de novo* crystal generation\[[39](https://arxiv.org/html/2605.16612#bib.bib3),[18](https://arxiv.org/html/2605.16612#bib.bib4),[23](https://arxiv.org/html/2605.16612#bib.bib19),[19](https://arxiv.org/html/2605.16612#bib.bib5),[28](https://arxiv.org/html/2605.16612#bib.bib6),[42](https://arxiv.org/html/2605.16612#bib.bib8),[7](https://arxiv.org/html/2605.16612#bib.bib9)\]. CDVAE uses a variational autoencoder (VAE) to add the ability to do inverse design, the process of creating a material with a desired property, and then uses a diffusion model to denoise the output to produce the new material\[[39](https://arxiv.org/html/2605.16612#bib.bib3)\]. Our method differs because, rather than using a VAE and optimizing for the property in latent space, we use autoregressive generation and directly condition on the desired properties, simplifying training and generation. Another approach to *de novo* crystal generation is DiffCSP, a pure diffusion model which utilizes a periodic $E(3)$ invariant GNN\[[18](https://arxiv.org/html/2605.16612#bib.bib4)\]. An alternative to diffusion-based methods is flow matching, with FlowMM leveraging Riemannian flow matching to simultaneously predict crystal structures and atom types\[[28](https://arxiv.org/html/2605.16612#bib.bib6)\]. We also use Riemannian flow matching to predict the atom positions, but use autoregressive generation for atom types, as it allows for fewer generation steps.

In crystals, symmetry is fully described by space groups, which classify all symmetry operations consistent with three-dimensional lattice periodicity. These space groups are highly relevant to chemical properties and the structure of the unit cell. SymmCD decomposes the unit cell of a crystal into the asymmetric unit, the smallest unit that can reproduce the unit cell through these symmetry transformations, and predicts it with a diffusion model\[[23](https://arxiv.org/html/2605.16612#bib.bib19)\]. MatterGen, DiffCSP++, and SGEquiDiff are all diffusion models that use information about the space group when generating the crystal unit cell\[[42](https://arxiv.org/html/2605.16612#bib.bib8),[19](https://arxiv.org/html/2605.16612#bib.bib5),[7](https://arxiv.org/html/2605.16612#bib.bib9)\]. MatterGen is conditioned on the space group number. DiffCSP++ uses the restrictions on atom counts and positions to help with diffusion, and SGEquiDiff is fully equivariant to the space group, using the restrictions on atoms and Wyckoff positions to place and diffuse the structure. Interestingly, SGEquiDiff uses an autoregressive method to predict atom types and Wyckoff positions before diffusing their position, but enforces an ordering on the types of atoms predicted. Our method is similar to SGEquiDiff, but it does not require a predefined atom ordering. Instead, permutation invariance is enforced during training through the choice of loss function. In addition, our approach does not incorporate any explicit information about the crystal’s space group, which increases our speed in inference.

#### Autoregressive Material Generation

Some autoregressive methods use LLMs to predict novel materials by predicting a Crystallographic Information File (CIF)\[[1](https://arxiv.org/html/2605.16612#bib.bib10),[14](https://arxiv.org/html/2605.16612#bib.bib12),[35](https://arxiv.org/html/2605.16612#bib.bib7),[21](https://arxiv.org/html/2605.16612#bib.bib13)\]. CrystaLLM trains an LLM from scratch on CIFs to predict novel CIFs\[[1](https://arxiv.org/html/2605.16612#bib.bib10)\]. CrystaLLM-$\pi$ expands on CrystaLLM by adding a system that passes property values directly into every layer of the transformer rather than in the text prompt\[[4](https://arxiv.org/html/2605.16612#bib.bib40)\]. CrystalLLM uses a pretrained LLM and finetunes it on additional CIFs in order to produce novel CIFs\[[14](https://arxiv.org/html/2605.16612#bib.bib12)\]. FlowLLM takes the CIFs output from CrystalLLM and uses a flow matching model to refine the outputs\[[35](https://arxiv.org/html/2605.16612#bib.bib7)\]. CrysLLMGen is similar to FlowLLM but uses a diffusion model for refinement\[[21](https://arxiv.org/html/2605.16612#bib.bib13)\]. LLMatDesign differs from the others: it starts an LLM from an initial composition and design conditions and uses it to autoregressively predict changes until the desired material properties are achieved\[[17](https://arxiv.org/html/2605.16612#bib.bib11)\]. While our method is inspired by LLM-style training and generation, it does not operate on language, as the CIF is a poorer representation of crystals than the unit cell. Instead, it directly predicts atom types and does not employ causal masking, opting instead to enforce permutation invariance in the model design.

## 3 Background

In this section, we cover the background needed to understand PRISMat. We give a mathematical formulation of crystal unit cells, outline the process of autoregressive generation and Riemannian flow matching, provide a formal definition of equivariance, and introduce the concepts relevant to our conditional slab generation task.

### 3.1 Definition of a Crystal Cell and Crystal Slab

The defining characteristic of a crystal is the periodic structure of atoms. Due to this periodicity, crystals can be compactly represented by a unit cell, a parallelepiped-shaped subsection of the crystal. The unit cell provides a computationally efficient representation of the infinitely repeating crystal structure. The full crystal can be reconstructed by tiling the unit cell along all three lattice basis vectors. A crystal $C$ is defined by a tuple $C = (L, A, X)$, where $L = (l_{1}, l_{2}, l_{3}) \in \mathbb{R}^{3 \times 3}$ are the three lattice vectors that define the periodic boundaries of the crystal; $A = (a_{1}, a_{2}, \ldots, a_{N}) \in atoms^{N}$ are the atom types, where $atoms$ is a set of available elements and $N$ is the number of atoms in the unit cell; and $X = (x_{1}, x_{2}, \ldots, x_{N}) \in [0,1)^{N \times 3}$ are the fractional coordinates, which give the atoms’ positions in the unit cell as a fraction of the distance along the lattice vectors. To get the Cartesian coordinates from the fractional coordinates and lattice vectors, we simply multiply the fractional coordinates by the lattice vectors: $X L^{T} = X_{\mathrm{Cartesian}}$.
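The conversion $X L^{T} = X_{\mathrm{Cartesian}}$ can be illustrated with a minimal sketch. It assumes that the lattice vectors are stored as the columns of $L$; the function name `frac_to_cart` and the example cell are illustrative and do not come from the paper.

```python
def frac_to_cart(frac_coords, lattice):
    """Compute X @ L^T with plain nested lists.

    frac_coords: N rows of 3 fractional coordinates in [0, 1).
    lattice: 3x3 nested list whose columns are the lattice vectors l1, l2, l3
             (a storage-layout assumption made for this sketch).
    """
    cart = []
    for x in frac_coords:
        # (X L^T)[i][k] = sum_j X[i][j] * L[k][j], i.e. row i equals sum_j x_j * l_j.
        cart.append([sum(x[j] * lattice[k][j] for j in range(3)) for k in range(3)])
    return cart


# Hypothetical orthorhombic cell with edge lengths 4, 5, and 6 angstrom.
lattice = [[4.0, 0.0, 0.0],
           [0.0, 5.0, 0.0],
           [0.0, 0.0, 6.0]]
frac = [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]]
cart = frac_to_cart(frac, lattice)
```

Here the atom at the fractional body center lands at half of each edge length in Cartesian space.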

A crystal slab can still be defined by a crystal tuple $C = (L, A, X)$, but slab structures inherit a broken $E(3) \rightarrow SO(2)$ symmetry via cutting of the bulk crystal structure, creating a unique design challenge for fully equivariant models. Crystal slabs are more reflective of realistic structures as they contain the termination that occurs at the boundary of a crystal. Here, we consider two critical properties of crystal slabs: the cleavage energy and work function. The cleavage energy is the amount of energy required to split the crystal along a specific Miller index. The cleavage energy (which is equal to the surface energy for symmetric slabs) determines the stability of surfaces. A slab has two work functions, one for the bottom surface of the slab and one for the top. The work function is the amount of energy necessary to free an electron from the surface of a slab.

### 3.2 Autoregressive Generation

Autoregressive generation is the process of predicting a sequence using the previously predicted elements of the sequence. A model $p$ works by producing a distribution over the tokens in the vocabulary $V$, $p(v_{t} \mid v_{0}, v_{1}, \ldots, v_{t-1})$ where $v_{i} \in V$, and selecting the next token from that distribution.

The variance of the distribution $p$ is controllable by changing the temperature, $\tau$, and nucleus sampling, $P$, hyperparameters. The temperature modulates the softmax function. Temperatures that are less than $1$ increase the likelihood of more probable tokens, while temperatures that are greater than $1$ increase the likelihood of less probable tokens from the original distribution. Nucleus sampling cuts off very improbable tokens, which increases consistency.
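The effect of $\tau$ and $P$ can be sketched as follows. This is a generic illustration of temperature scaling and nucleus (top-$P$) filtering, not PRISMat's implementation; the function names and the example logits are assumptions.

```python
import math
import random


def apply_temperature(logits, tau):
    """Softmax of logits / tau; tau < 1 sharpens, tau > 1 flattens."""
    scaled = [z / tau for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nucleus_filter(probs, top_p):
    """Keep the smallest set of most probable tokens whose mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass  # renormalize over the kept tokens
    return filtered


def sample(logits, tau=1.0, top_p=1.0, rng=random):
    """Draw one token index after temperature scaling and nucleus filtering."""
    probs = nucleus_filter(apply_temperature(logits, tau), top_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.0, -3.0]  # hypothetical model outputs
sharp = apply_temperature(logits, 0.5)
flat = apply_temperature(logits, 2.0)
trimmed = nucleus_filter(apply_temperature(logits, 1.0), 0.9)
```

With these example logits, the low temperature concentrates more mass on the first token than the high temperature does, and the nucleus filter at $P = 0.9$ zeroes out the two least probable tokens.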

### 3.3 Equivariance

A function $f\colon X \rightarrow Y$ is equivariant to a group $G$ if for any $x \in X$ and any $g \in G$, $f(gx) = g f(x)$. The function $f$ is invariant if $f(gx) = f(x)$. In this work, we use $E(3)$ equivariant GNNs, where $E(3)$ is the group of all rotations, reflections, and translations over $\mathbb{R}^{3}$\[[32](https://arxiv.org/html/2605.16612#bib.bib21)\]. $E(3)$ equivariance is a desired property when working with crystals, as the crystal’s properties are unchanged by any rotation or translation. In crystal generation, the problem is $E(3)$ invariant, as no matter the orientation, chirality, or position of the crystal, the target unit cell remains constant.
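The two definitions can be checked numerically on toy functions. In the sketch below, which is unrelated to the GNNs used in the paper, the sorted pairwise distances of a point set serve as an invariant function and the centroid serves as an equivariant one; the helper names and the chosen group element are illustrative.

```python
import math


def transform(points, angle, shift, reflect=False):
    """Apply an E(3) element: optionally mirror x, rotate about z, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    out = []
    for x, y, z in points:
        if reflect:
            x = -x
        out.append((c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2]))
    return out


def centroid(points):
    """Mean position of the points."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


def sorted_distances(points):
    """All pairwise Euclidean distances, sorted."""
    return sorted(math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:])


points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.5)]
g = dict(angle=0.7, shift=(1.0, -2.0, 3.0), reflect=True)
moved = transform(points, **g)

# Invariance: f(g x) = f(x) for the sorted pairwise distances.
invariant_gap = max(abs(a - b) for a, b in zip(sorted_distances(moved), sorted_distances(points)))
# Equivariance: f(g x) = g f(x) for the centroid.
equivariant_gap = math.dist(centroid(moved), transform([centroid(points)], **g)[0])
```

Both gaps vanish up to floating-point error, which is what the definitions require.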

### 3.4 Riemannian Flow Matching

Flow matching is the process of learning a vector field that transports some simple probability distribution, usually a normal distribution, to the true data distribution\[[25](https://arxiv.org/html/2605.16612#bib.bib22)\]. A flow map is learned from data to noise, while generation reverses the process to go from noise to data. Velocities $u_{t}$ are predicted, going from the true data distribution to the noise distribution. The velocities are computed at train time by drawing one sample from the initial noise distribution and one sample from the true data distribution and finding the displacement between them.

Due to the periodicity inherent to crystal structures, it is necessary to define a manifold that properly captures this periodicity and to learn the flow on it. Without this periodicity, it is possible to predict positions that fall outside the unit cell, requiring a postprocessing step to map external atoms back into the unit cell. Riemannian flow matching extends the conventional flow matching algorithm to enable flows on general geometries\[[8](https://arxiv.org/html/2605.16612#bib.bib14)\]. This allows us to implement flows that respect the periodicity of the unit cell. To remove redundant positions, we use a 3-torus manifold, as it correctly implements the periodicity of the unit cell.
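To make the role of the 3-torus concrete, the following sketch shows how fractional coordinates wrap at the cell boundary and how a shortest displacement can be measured across it. It is a generic illustration of periodic arithmetic, not the paper's flow matching code, and all function names are assumptions.

```python
def wrap(x):
    """Map a fractional coordinate back into [0, 1)."""
    return x % 1.0


def torus_displacement(src, dst):
    """Shortest signed displacement from src to dst on a circle of circumference 1."""
    d = (dst - src) % 1.0
    return d - 1.0 if d >= 0.5 else d


def step(position, velocity, dt):
    """One Euler step on the 3-torus: move, then wrap each coordinate."""
    return [wrap(p + dt * v) for p, v in zip(position, velocity)]


# An atom near a cell face crosses the boundary and re-enters on the other side.
moved = step([0.95, 0.10, 0.50], [0.20, 0.00, -0.60], dt=1.0)
# Going from 0.9 to 0.1 is a short hop of about +0.2 across the boundary, not -0.8.
short = torus_displacement(0.9, 0.1)
```

Because every step ends with a wrap, no separate postprocessing pass is needed to bring atoms back into the unit cell.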

## 4 Method

![Refer to caption](https://arxiv.org/html/2605.16612v1/x2.png)

Figure 2: The three stages of PRISMat. We start by predicting the periodic boundaries of the unit cell. We then autoregressively predict each atom until we reach an <end\> token. Finally, we use flow matching to position each atom in the unit cell. By splitting generation into three stages, we allow for more control over generation and faster rejection of impossible crystals. Mandatory user inputs are $\tau$, $P$, $maxAtoms$, and $numSteps$, while the policy and target materials properties for conditional generation are optional. The pseudocode can be found in [Algorithm 2](https://arxiv.org/html/2605.16612#alg2).

PRISMat is made up of three parts, each responsible for building a section of the crystal: the $\operatorname{LatticeGenerator}$, a Gaussian mixture model which is responsible for generating the lattice vectors, the $\operatorname{AtomGenerator}$, an $E(3)$ invariant GNN which autoregressively predicts the atoms in a crystal, and the $\operatorname{PositionGenerator}$, an $E(3)$ equivariant GNN which runs a flow matching algorithm to predict the crystal structure. In this section, we discuss each portion of PRISMat, how they operate, and why that method was selected. A full overview of PRISMat can be seen in [Fig. 2](https://arxiv.org/html/2605.16612#S4.F2). [Figure 4](https://arxiv.org/html/2605.16612#A3.F4) showcases a real crystal from MP-20 compared to a crystal PRISMat generated after being trained on MP-20.

### 4.1 $\operatorname{LatticeGenerator}$

PRISMat starts by predicting the three lattice vectors that make up the periodic boundary of the material. We build the $\operatorname{LatticeGenerator}$ by training a Gaussian mixture model using expectation maximization over the lattice parameters from our training dataset. $\operatorname{LatticeGenerator}$ is therefore a distribution that we can sample from to get a set of lattice parameters $lattice \sim \operatorname{LatticeGenerator}$. Since periodic boundaries are mutually dependent, we model them jointly using a Gaussian mixture over $\mathbb{R}^{3 \times 3}$. The periodic boundaries also determine the size of the unit cell, so we reject any periodic boundaries that produce unit cells with a volume of less than $10\,\mathrm{\AA}^{3}$.
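The sample-and-reject step can be sketched as below. The mixture parameters are hypothetical stand-ins for a fitted model, and the sketch uses diagonal covariances only to stay dependency-free; a full covariance per component would be needed to capture joint dependence between lattice entries within a component. The function names are likewise assumptions.

```python
import random


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def sample_lattice(weights, means, stds, min_volume=10.0, rng=random, max_tries=1000):
    """Draw a 3x3 lattice from a diagonal-covariance mixture, rejecting small cells."""
    for _ in range(max_tries):
        k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
        flat = [rng.gauss(mu, sd) for mu, sd in zip(means[k], stds[k])]
        lattice = [flat[0:3], flat[3:6], flat[6:9]]
        if abs(det3(lattice)) >= min_volume:  # cell volume in cubic angstrom
            return lattice
    raise RuntimeError("no lattice passed the volume check")


# Hypothetical two-component mixture: a ~4 angstrom cubic cell and a ~1 angstrom
# cell whose volume is far below the threshold and is therefore always rejected.
weights = [0.5, 0.5]
means = [[4.0, 0, 0, 0, 4.0, 0, 0, 0, 4.0], [1.0, 0, 0, 0, 1.0, 0, 0, 0, 1.0]]
stds = [[0.1] * 9, [0.05] * 9]
lattice = sample_lattice(weights, means, stds, rng=random.Random(0))
```

The cell volume is the absolute determinant of the lattice matrix, so the check does not depend on whether the lattice vectors are stored as rows or columns.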

Periodic boundaries partially determine the bonding angles of the atoms on the edge of the unit cell, so by predicting the periodic boundaries first, we provide some chemical information about the atoms that should belong in the unit cell\. Because of this, we provide the lattice information to all subsequent parts of PRISMat as it will help determine atom types and their positions\. Additionally, it also enables more design choices by allowing us to restrict boundaries to specific crystal systems\.

### 4.2 $\operatorname{AtomGenerator}$

After predicting the lattice vectors, we autoregressively predict the atom types\. We define two virtual nodes besides the standard atom nodes, a <start\> node and an <end\> node\. The <start\> node is used to prompt the model to generate additional nodes, while the <end\> node is used to stop generation and implicitly determine the number of atoms in the unit cell\.

After the start node, we autoregressively generate a sequence of atom types until we generate an <end\> token. As input, $\operatorname{AtomGenerator}$ takes the lattice parameters sampled from $\operatorname{LatticeGenerator}$ and a <start\> token. If the atoms $a_{0}, \ldots, a_{t-1}$ have already been generated, then the model $\operatorname{AtomGenerator}$ predicts the distribution of next possible atom types $p_{t} = \operatorname{AtomGenerator}(lattice, start, a_{0}, a_{1}, \ldots, a_{t-1})$. We sample from this distribution to get the next atom type $a_{t} \sim p_{t}$. We then use this predicted atom in the next step of atom generation. This repeats until $a_{t} = END$. We use an $E(3)$ invariant GNN, as neither the orientation of the input lattice vectors nor the order of the input atoms should affect the final output\[[32](https://arxiv.org/html/2605.16612#bib.bib21)\]. The input atoms are represented as a fully connected graph where the lattice parameters are added into the node features.
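The sampling loop described above can be sketched with a stand-in model. `toy_atom_model` is a hypothetical placeholder for $\operatorname{AtomGenerator}$ (it is not a GNN and ignores the lattice); only the loop structure, the <start\> and <end\> tokens, and the $maxAtoms$ cap follow the text.

```python
import random

START, END = "<start>", "<end>"


def toy_atom_model(lattice, prefix):
    """Hypothetical stand-in for AtomGenerator: returns {token: probability}.

    It depends only on how many atoms exist so far, so it is trivially
    invariant to the order of `prefix`.
    """
    n_atoms = len(prefix) - 1  # exclude the <start> token
    p_end = min(1.0, 0.25 * n_atoms)
    rest = (1.0 - p_end) / 2.0
    return {"Ni": rest, "Ti": rest, END: p_end}


def generate_atoms(model, lattice, max_atoms, rng=random):
    """Sample atom types one at a time until <end> or max_atoms is reached."""
    prefix = [START]
    while len(prefix) - 1 < max_atoms:
        dist = model(lattice, prefix)
        tokens = list(dist)
        token = rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]
        if token == END:
            break
        prefix.append(token)
    return prefix[1:]


atoms = generate_atoms(toy_atom_model, lattice=None, max_atoms=8, rng=random.Random(0))
```

Because the number of atoms is decided by when <end\> is drawn, the unit cell size does not need to be fixed in advance.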

#### Permutation Invariant Autoregression\.

![Refer to caption](https://arxiv.org/html/2605.16612v1/x3.png)

Figure 3: A visualization of the training input and labels provided for a fictitious nickel-titanium crystal. The true distribution has been shortened to only show <start\>, nickel, titanium, and <end\> in that order. Regardless of the order in which the atoms are added to the training input, steps $t_{0}$, $t_{1}$, and $t_{3}$ have the same true label as they have the same input atoms, whereas there are two possible orderings shown for step $t_{2}$. By design, the final chemical composition is identical for both permutations.

To train the autoregressive atom predictor, rather than imposing an implicit ordering on the atoms it predicts, we predict a categorical distribution over all atoms that remain in the unit cell at each time step. If no atoms remain in the unit cell, we predict a distribution that is 100% the <end\> token. Since this distribution remains the same no matter how the remaining atoms in the crystal are ordered, training becomes permutation invariant. In training, we minimize the KL-divergence loss between the output distribution of $\operatorname{AtomGenerator}$ and the true categorical distribution of the remaining atoms\[[22](https://arxiv.org/html/2605.16612#bib.bib39)\]. Generally, if we are learning to predict a crystal $C = (L, A, X)$, then the model $\operatorname{AtomGenerator}$ would take some subset of the atoms $a \subseteq A$ and learn to predict the categorical distribution of the remaining atoms $\bar{a} = A \setminus a$. Compared to the training algorithms of LLMs or SGEquiDiff where the training target is a single token, our distributional approach makes it clear that there can be multiple feasible tokens at each training step. A proof showcasing our permutation invariant training can be found in [Proposition 1](https://arxiv.org/html/2605.16612#Thmproposition1).
A full description of the training algorithm can be found in [Algorithm 1](https://arxiv.org/html/2605.16612#alg1). To illustrate this process, [Fig. 3](https://arxiv.org/html/2605.16612#S4.F3) shows a fictitious Ni$_2$Ti$_2$ crystal with two possible permutations and the corresponding true labels (distributions), which ensure that the final outcomes are identical.
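A minimal sketch of the training target follows, assuming that the "categorical distribution of the remaining atoms" is the normalized count of each atom type still missing from the unit cell. The helper names are illustrative, and the predicted distribution in the example is made up.

```python
import math
from collections import Counter

END = "<end>"


def remaining_target(all_atoms, placed_atoms):
    """Categorical distribution over atom types still missing from the cell."""
    remaining = Counter(all_atoms) - Counter(placed_atoms)
    total = sum(remaining.values())
    if total == 0:
        return {END: 1.0}  # nothing left: all mass on <end>
    return {atom: count / total for atom, count in remaining.items()}


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted), summed over the support of the target."""
    return sum(p * math.log(p / max(predicted.get(tok, 0.0), eps))
               for tok, p in target.items() if p > 0.0)


crystal = ["Ni", "Ni", "Ti", "Ti"]  # the fictitious Ni2Ti2 cell
after_ni = remaining_target(crystal, ["Ni"])
order_a = remaining_target(crystal, ["Ni", "Ti"])
order_b = remaining_target(crystal, ["Ti", "Ni"])
done = remaining_target(crystal, ["Ti", "Ni", "Ni", "Ti"])
loss = kl_divergence(after_ni, {"Ni": 0.5, "Ti": 0.5, END: 0.0})
```

The targets `order_a` and `order_b` coincide because they depend only on which atoms have been placed, not on the order in which they were placed, which is the property the loss relies on.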

#### Policy Guidance

Part of the benefit of autoregressive generation is that fewer overall steps are needed to generate the atom types compared to diffusion models. This allows for faster rejection and resampling based on policy guidance. We try three policy methods: 1) *partial policy*, where we reject if any step of the partially generated result seems unrealistic during atom generation; 2) *full policy*, where we reject after the atom generation step if it seems like an unrealistic set of atoms and lattice parameters; and 3) *SMACT*, where we use the SMACT solver on the final chemical composition to reject if it is not charge-balanced, assuming known oxidation states of individual elements\[[11](https://arxiv.org/html/2605.16612#bib.bib42)\].

To train the partial and full policy methods, we utilize an $E(3)$ invariant GNN and train it on the MP-20 dataset, where crystals in the MP-20 dataset are treated as realistic examples, and fake examples are made by removing, adding, or changing atoms in those realistic examples. The models are trained to classify whether a given crystal is real or fake based on the involved atoms and lattice parameters. Any crystals these policies mark fake are rejected during generation.
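The construction of real and fake training examples might look like the sketch below. It covers only the composition and ignores lattice parameters; the element list, the one-edit corruption, and the function names are assumptions made for illustration rather than the paper's procedure.

```python
import random

ELEMENTS = ["Li", "O", "Ni", "Ti", "Fe"]  # illustrative subset of the element set


def corrupt(atoms, rng=random):
    """Make a 'fake' composition by removing, adding, or changing one atom."""
    atoms = list(atoms)
    ops = ["add"]
    if atoms:  # removing or changing needs at least one atom
        ops += ["remove", "change"]
    op = rng.choice(ops)
    if op == "remove":
        atoms.pop(rng.randrange(len(atoms)))
    elif op == "add":
        atoms.append(rng.choice(ELEMENTS))
    else:
        i = rng.randrange(len(atoms))
        atoms[i] = rng.choice([e for e in ELEMENTS if e != atoms[i]])
    return atoms


def make_training_pairs(real_examples, rng=random):
    """Label each real composition 1 and one corrupted copy 0."""
    pairs = []
    for atoms in real_examples:
        pairs.append((sorted(atoms), 1))
        pairs.append((sorted(corrupt(atoms, rng)), 0))
    return pairs


real = [["Ni", "Ni", "Ti", "Ti"], ["Li", "Fe", "O", "O"]]
pairs = make_training_pairs(real, random.Random(0))
```

Each of the three edits changes the multiset of atoms, so a corrupted copy never equals the composition it was derived from, although it could in principle coincide with some other realistic composition.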

### 4.3 $\operatorname{PositionGenerator}$

The final step of our procedure is a Riemannian flow matching step to determine atom positions. We use an $E(3)$ equivariant GNN to construct $\operatorname{PositionGenerator}$\[[32](https://arxiv.org/html/2605.16612#bib.bib21)\]. We instantiate our positions by $X_{i} \sim \operatorname{uniform}([0,1)^{N \times 3})$ and then run Riemannian flow matching using a 3-torus as our manifold. We utilize flow matching as it is the most efficient method of position generation, and because positioning is a global problem, global flow matching is most effective.

To train $\operatorname{PositionGenerator}$, for each crystal $C = (L, A, X)$ in our dataset, we generate a set of random positions $X^{\prime} \sim \operatorname{uniform}([0,1)^{N \times 3})$ and compute the velocities $V = X^{\prime} - X$. We then sample a random timestep $t \sim \operatorname{uniform}([0,1])$ and compute a position $X_{in} = (1-t)X + tX^{\prime}$. Finally, we train $\operatorname{PositionGenerator}$ to minimize $\|\operatorname{PositionGenerator}(L, A, X_{in}, t) - V\|_{2}^{2}$.
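The construction of one training example can be sketched directly from these formulas. The sketch applies them literally in fractional coordinates and does not wrap displacements across the cell boundary, which a torus-aware implementation would presumably do; the function names are assumptions, and no network is trained here.

```python
import random


def make_training_example(x_data, rng=random):
    """Build (X_in, t, V) for one crystal with N atoms, following the formulas above."""
    # X' ~ uniform([0, 1)^{N x 3})
    x_noise = [[rng.random() for _ in range(3)] for _ in x_data]
    # t ~ uniform; note that rng.random() samples the half-open interval [0, 1)
    t = rng.random()
    # V = X' - X
    velocity = [[xn - xd for xn, xd in zip(rn, rd)] for rn, rd in zip(x_noise, x_data)]
    # X_in = (1 - t) X + t X'
    x_in = [[(1 - t) * xd + t * xn for xn, xd in zip(rn, rd)]
            for rn, rd in zip(x_noise, x_data)]
    return x_in, t, velocity


def squared_error(predicted, target):
    """Squared L2 norm of the difference, summed over atoms and coordinates."""
    return sum((p - v) ** 2 for rp, rv in zip(predicted, target) for p, v in zip(rp, rv))


x_data = [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]]  # hypothetical two-atom cell
x_in, t, velocity = make_training_example(x_data, random.Random(0))
# A perfect predictor would output V itself, giving zero loss.
loss_at_optimum = squared_error(velocity, velocity)
```

Since $X_{in} = X + tV$, the interpolated input lies on the straight line from the data positions toward the noise positions.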

### 4.4 Property-Targeted Generation

PRISMat has the ability to be conditioned to produce materials with desired properties\. We change the Gaussian mixture model in the first step to a conditional one that changes means and standard deviations by applying Gaussian conditioning to each component and reweighting the mixture so the result is a new Gaussian mixture model representing the distribution of target variables given observed ones\[[36](https://arxiv.org/html/2605.16612#bib.bib38)\]\. Additionally, for the GNNs in the atom and position generation steps, we concatenate the desired property values to the input features to allow property\-targeted generation\.
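The conditioning of the mixture can be illustrated in the smallest possible setting: a two-dimensional mixture over one lattice variable $a$ and one property $b$. The sketch uses the standard Gaussian conditioning formulas per component and reweights each component by its likelihood of the observed property; the mixture parameters and function names are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a 1D normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def condition_mixture(components, b_obs):
    """Condition a 2D Gaussian mixture over (a, b) on an observed b.

    components: list of (weight, mean_a, mean_b, var_a, var_b, cov_ab).
    Returns a list of (weight, mean, var) describing the 1D mixture p(a | b).
    """
    conditioned = []
    for w, mu_a, mu_b, var_a, var_b, cov_ab in components:
        mean = mu_a + cov_ab / var_b * (b_obs - mu_b)  # conditional mean
        var = var_a - cov_ab ** 2 / var_b              # conditional variance
        conditioned.append((w * normal_pdf(b_obs, mu_b, var_b), mean, var))
    total = sum(w for w, _, _ in conditioned)
    return [(w / total, mean, var) for w, mean, var in conditioned]


# Hypothetical mixture: a = lattice length (angstrom), b = work function (eV).
components = [
    (0.5, 4.0, 3.0, 0.25, 1.0, 0.4),
    (0.5, 6.0, 5.0, 0.25, 1.0, 0.4),
]
posterior = condition_mixture(components, b_obs=5.0)
```

Conditioning on a work function of 5 eV shifts most of the mixture weight to the second component, whose property mean matches the target, and shrinks the variance of both components.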

## 5Experiments

In this section, we examine the benefits of our method and show how permutation invariance, autoregressive generation, policy guidance, and flow matching enable a fast and efficient model that can generate novel materials. We evaluate our performance on the MP-20 subset of the Materials Project and on LeMat-GenBench, as they are a standard crystal training dataset and evaluation set for *de novo* generation \[[16](https://arxiv.org/html/2605.16612#bib.bib23),[39](https://arxiv.org/html/2605.16612#bib.bib3),[3](https://arxiv.org/html/2605.16612#bib.bib41)\]. Additionally, we evaluate PRISMat on conditional generation of crystal slabs using the dataset from Schindler et al. \[[33](https://arxiv.org/html/2605.16612#bib.bib30)\] to show that our model handles larger, more realistic structures and conditional generation better than other autoregressive systems with conditional generation. We find that PRISMat can be successfully policy driven while remaining the fastest autoregressive model.

Table 1: Comparison of PRISMat versus an ablation using the same GNN setup but trained to learn permutation invariance rather than using the permutation-invariant loss function. We see that PRISMat outperforms the ablation in almost all categories.

### 5.1 Ablation

To show that permutation invariance improves performance, we run an ablation of our method where we remove the permutation invariance and instead have the model learn permutation invariance by randomly shuffling the order of the input and output atoms at every training step. Under these conditions, the MSUN rate drops from 1.36% to 1.00%. The ablated method also underperforms on all distributional and validity categories, as shown in [Table 1](https://arxiv.org/html/2605.16612#S5.T1). This drop is significant, as it means it would take much longer to find materials in high-throughput applications. It also suggests that other methods, like SGEquiDiff, would likely see an improvement in performance if they used a permutation-invariant approach to atom generation.

Table 2: Comparison of PRISMat for various hyperparameters, trained on the MP-20 dataset. The best sampling parameters are $\tau=0.7$ and $P=0.9$, as they produce the highest MSUN rate.

### 5.2 Controllability

Autoregressive generation can be modified by two sampling parameters: temperature $\tau$ and nucleus sampling $P$. We test different values of these parameters to see their effect on the generated samples and MSUN rate in [Table 2](https://arxiv.org/html/2605.16612#S5.T2). We find that the sampling parameters $\tau=0.7$ and $P=0.9$ produce the highest MSUN rate. A lower temperature makes the model predict less diverse structures, as it decreases the chance that the model will pick something out of distribution. Nucleus sampling removes some of the noise from the model's output, also helping it produce more consistent results. We see this reflected in the lower numbers of unique and novel structures and in the improved distance metrics.
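
To make the two sampling parameters concrete, here is a minimal NumPy sketch of temperature scaling followed by nucleus (top-$P$) filtering over a vector of atom-type logits. It assumes that $P$ is the usual cumulative-probability threshold of nucleus sampling; the function names and the toy logits are illustrative and are not taken from the PRISMat code.

```python
import numpy as np

def temperature_nucleus_probs(logits, tau=0.7, top_p=0.9):
    """Apply temperature tau, then keep the smallest set of types with total mass >= top_p."""
    scaled = np.asarray(logits, dtype=float) / tau
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    order = np.argsort(-probs)              # most likely type first
    cumulative = np.cumsum(probs[order])
    # Keep every type up to and including the first one that reaches top_p.
    cutoff = int(np.searchsorted(cumulative, top_p)) + 1
    kept = order[:cutoff]
    filtered = np.zeros_like(probs)
    filtered[kept] = probs[kept]
    return filtered / filtered.sum()

def sample_atom(logits, rng, tau=0.7, top_p=0.9):
    """Draw one atom-type index from the filtered distribution."""
    probs = temperature_nucleus_probs(logits, tau, top_p)
    return int(rng.choice(len(probs), p=probs))

toy_logits = [2.0, 1.0, 0.0, -1.0]
narrow = temperature_nucleus_probs(toy_logits, tau=1.0, top_p=0.5)
default = temperature_nucleus_probs(toy_logits)
choice = sample_atom(toy_logits, np.random.default_rng(0))
```

With `tau=1.0` and `top_p=0.5`, the most likely type already holds more than half of the probability mass, so it is the only one kept. Lowering `tau` sharpens the distribution in the same direction, which matches the reduced diversity described above.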

Table 3: Comparison of different policy methods for rejection sampling. Time is computed by generating 1000 structures and dividing the total time to run the program by the number of structures generated. The SMACT policy method performs best, with the highest MSUN, but partial policy generation has a lower distance to the true data. It is likely that the partial policy overfits and makes the model generate structures that are in the training data.

### 5.3 Policy Guided Generation

We test our three different policy setups to filter the partially generated structures. Since our method is autoregressive, rejecting partially generated structures makes more sense than in a denoising setup, as we are confident that the partially generated structure will appear in the final result. [Table 3](https://arxiv.org/html/2605.16612#S5.T3) shows that SMACT policy rejection works best, while the other two methods actually reduce MSUN. This is likely because these methods prematurely cut off potentially stable crystals that fall out of distribution compared to MP-20, which suggests that a more sophisticated policy model trained on a more diverse dataset may be necessary. SMACT works best because it explicitly enforces physical laws on the chosen atoms to ensure that they are properly charge-balanced.
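
The idea of a charge-balance policy can be illustrated with a small, self-contained check. The sketch below is not SMACT's implementation and does not call its API; SMACT's own screening is more extensive. The oxidation-state table is a hypothetical, hand-written stand-in, and the function names are made up for this example.

```python
from itertools import product

# Hypothetical, hand-written oxidation states for illustration only.
OXIDATION_STATES = {"Na": [1], "Cl": [-1], "Fe": [2, 3], "O": [-2]}

def is_charge_balanced(counts, oxidation_states=OXIDATION_STATES):
    """True if some choice of one oxidation state per element gives zero net charge."""
    elements = list(counts)
    options = [oxidation_states[el] for el in elements]
    for states in product(*options):
        total = sum(q * counts[el] for q, el in zip(states, elements))
        if total == 0:
            return True
    return False

def filter_candidates(candidates):
    """Rejection step: keep only compositions that pass the charge-balance policy."""
    return [c for c in candidates if is_charge_balanced(c)]

nacl = {"Na": 1, "Cl": 1}
na2cl = {"Na": 2, "Cl": 1}
fe2o3 = {"Fe": 2, "O": 3}
kept = filter_candidates([nacl, na2cl, fe2o3])
```

Here `na2cl` is rejected because no assignment of the listed oxidation states sums to zero, while `nacl` and `fe2o3` pass. How such a check is applied to partially generated atom lists in PRISMat is not specified in this section, so the sketch only covers complete compositions.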

Part of the benefit of PRISMat is the stepwise development of crystals. It uses at most 20 autoregressive steps, far fewer than the 250 or more steps required for flow matching or diffusion models. Additionally, our model generates much cleaner intermediates faster than LLMs, so rejection of samples happens faster and more reliably. Moreover, the benefit of physics-informed policies suggests that adding physics-informed systems such as machine learning interatomic potentials may improve the generation performance of other models.

### 5.4 De Novo Generation

Table 4: The following models were all trained on MP-20 and evaluated using LeMat-GenBench \[[3](https://arxiv.org/html/2605.16612#bib.bib41)\]. The parameter counts were taken from Chang et al. \[[7](https://arxiv.org/html/2605.16612#bib.bib9)\]. Inference time is the time to predict one sample, averaged over one thousand samples on a single NVIDIA RTX 2080 Ti GPU. FlowLLM was run on two NVIDIA L40S GPUs and CrysLLMGen was run on a single L40S, as required due to the size of the LLM, but both of their diffusion or flow matching steps were run on an RTX 2080 Ti. For system type, AR means autoregressive, FM means flow matching, and DF means diffusion. PRISMat is the fastest autoregressive system in terms of seconds per metastable, unique, and novel crystal.

We analyze the performance across a variety of models trained on the MP-20 dataset. We focus on finding the model that performs best under time/MSUN, the time needed to produce one metastable, unique, and novel (MSUN) structure. This metric matters most in industry, where speed means more candidate materials can be tested in a given time span. While some models may produce a higher fraction of MSUN structures, the extra time they take means that more MSUN structures could be produced by running other methods for longer. [Table 4](https://arxiv.org/html/2605.16612#S5.T4) shows that our model is the third best under time/MSUN and the best of all autoregressive models.
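
The trade-off behind the time/MSUN metric can be shown with a short calculation. The sketch below assumes the metric is the per-sample inference time divided by the MSUN rate; the two models and all numbers are hypothetical and are not values from Table 4.

```python
def seconds_per_msun(seconds_per_sample, msun_rate):
    """Expected seconds of inference needed to obtain one MSUN structure."""
    if msun_rate <= 0:
        raise ValueError("msun_rate must be positive")
    return seconds_per_sample / msun_rate

# Hypothetical models: a fast one with a low MSUN rate, a slow one with a higher rate.
fast_model = seconds_per_msun(seconds_per_sample=0.2, msun_rate=0.01)
slow_model = seconds_per_msun(seconds_per_sample=1.0, msun_rate=0.02)
```

In this made-up example the fast model needs roughly 20 seconds per MSUN structure and the slow model roughly 50 seconds, so the fast model yields more MSUN structures per unit time even though its MSUN rate is half as large.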

### 5.5 Conditional Generation

Table 5: Performance of each autoregressive system on the slab dataset. Cleavage energy and work function are the mean absolute difference between the true values and the cleavage energy and work function predicted by FIRE-GNN \[[15](https://arxiv.org/html/2605.16612#bib.bib28)\]. Time was computed by taking the total time to generate one slab per cleavage energy and work function combination in the test dataset and dividing by the total number of slabs generated, running on an NVIDIA L40S. Failure rate is the number of unparseable CIFs generated by each model. Time/Slab is the number of seconds to generate a successful slab.

We validate PRISMat on a dataset of crystal slabs annotated with cleavage energy and work function \[[34](https://arxiv.org/html/2605.16612#bib.bib29)\]. This dataset poses a unique challenge: each bulk crystal can yield up to 13 distinct slabs, expanding $\approx$3,000 bulk structures into $\approx$33,000 slabs with up to 90 atoms each. Moreover, surface structures and surface properties such as the work function are absent from bulk benchmarks, making this a more realistic generation domain.

We baseline PRISMat against various LLM techniques that enable conditioned generation. For each pair of cleavage energy and work function values found in the testing dataset, we generate a candidate slab per model and evaluate the generated slab based on how close its actual cleavage energy and work function are to the desired values. To compute these properties, we use a machine learning method called FIRE-GNN \[[15](https://arxiv.org/html/2605.16612#bib.bib28)\].

[Table 5](https://arxiv.org/html/2605.16612#S5.T5) showcases the performance of our model and baselines. It shows that PRISMat performs the best at producing slabs with properties close to the true value. Additionally, we are the fastest among all the models at generating successful slabs.

## 6Conclusion

We propose PRISMat, an efficient and flexible generative method for guided generation of novel materials, which efficiently incorporates permutation invariance into the generative process by predicting over a space of probabilities on atom types\. In standard benchmarks on the MP\-20 dataset, PRISMat demonstrates comparable performance to existing state\-of\-the\-art baselines, while retaining a clear advantage in computational efficiency necessary for high\-throughput screening tasks\. Furthermore, we demonstrate how PRISMat can be steered by policy during the autoregressive atom generation stage, allowing finer control over generated materials\. We additionally demonstrate how PRISMat is flexible, allowing for extensions beyond bulk crystal generation, by performing a novel conditional generation task on crystal slabs\.

#### Limitations and Future Work

One of the limitations of our method is that we use a separate model for each of the three steps in our generation process which does increase complexity, but it allows for more control over generation, including selecting the choice of periodic boundaries and enabling policy\-guided generation\. Future work may incorporate additional policy and physics\-informed techniques into the generation pipeline\.

## References

- \[1\]\(2024\)Crystal structure generation with autoregressive large language modeling\.Nature Communications15\(1\),pp\. 10570\.Cited by:[§2](https://arxiv.org/html/2605.16612#S2.SS0.SSS0.Px3.p1.1)\.
- \[2\]A\. Bellucci, D\. M\. Trucchi, G\. Segev, R\. Jacobs, D\. Morgan, J\. Booske, M\. Hasan, R\. Zulkharnay, N\. Fox, P\. W\. May, L\. Ang, C\. Franey, M\. Ghashami, A\. Mezzi, J\. Schwede, N\. A\. Loubet, K\. Bezdjian, E\. Lopez, A\. Datas, M\. Jalili, A\. Nojeh, E\. Rahman, X\. Zhang, P\. Schindler, E\. D\. Juette, V\. P\. Carey, J\. Fleurial, M\. Mastellone, G\. Zheng, L\. Wang, X\. Gang, and H\. Qiu\(2026\)The 2025 thermionic converters roadmap\.Journal of Physics D: Applied Physics\.External Links:[Link](http://iopscience.iop.org/article/10.1088/1361-6463/ae611e)Cited by:[§1](https://arxiv.org/html/2605.16612#S1.p1.1)\.
- \[3\]S\. Betala, S\. P\. Gleason, A\. Ramlaoui, A\. Xu, G\. Channing, D\. Levy, C\. Fourrier, N\. Kazeev, C\. K\. Joshi, S\. Kaba, F\. Therrien, A\. Hernandez\-Garcia, R\. Mercado, N\. M\. A\. Krishnan, and A\. Duval\(2026\)LeMat\-genbench: a unified evaluation framework for crystal generative models\.External Links:2512\.04562,[Link](https://arxiv.org/abs/2512.04562)Cited by:[Table 4](https://arxiv.org/html/2605.16612#S5.T4),[Table 4](https://arxiv.org/html/2605.16612#S5.T4.14.2),[§5](https://arxiv.org/html/2605.16612#S5.p1.1)\.
- \[4\]C\. Bone, M\. Walker, K\. Leng, L\. M\. Antunes, R\. Grau\-Crespo, A\. Aligayev, J\. Dominguez, and K\. T\. Butler\(2025\)Discovery and recovery of crystalline materials with property\-conditioned transformers\.External Links:2511\.21299,[Link](https://arxiv.org/abs/2511.21299)Cited by:[§1](https://arxiv.org/html/2605.16612#S1.p1.1),[§2](https://arxiv.org/html/2605.16612#S2.SS0.SSS0.Px3.p1.1),[Table 5](https://arxiv.org/html/2605.16612#S5.T5.10.10.10.1)\.
- \[5\]G\. Çankaya and N\. Uçar\(2004\)Schottky barrier height dependence on the metal work function for p\-type si schottky diodes\.Zeitschrift für Naturforschung A59\(11\),pp\. 795–798\.External Links:[Link](https://doi.org/10.1515/zna-2004-1112),[Document](https://dx.doi.org/doi%3A10.1515/zna-2004-1112)Cited by:[§1](https://arxiv.org/html/2605.16612#S1.p1.1)\.
- \[6\]C\. H\. Chan, M\. Sun, and B\. Huang\(2022\)Application of machine learning for advanced material prediction and design\.EcoMat4\(4\),pp\. e12194\.Cited by:[§1](https://arxiv.org/html/2605.16612#S1.p1.1)\.
- \[7\]R\. Chang, A\. Pak, A\. Guerra, N\. Zhan, N\. Richardson, E\. Ertekin, and R\. P\. Adams\(2025\)Space group equivariant crystal diffusion\.InThe Thirty\-ninth Annual Conference on Neural Information Processing Systems,External Links:[Link](https://openreview.net/forum?id=NWP8KYKC0c)Cited by:[§1](https://arxiv.org/html/2605.16612#S1.p1.1),[§1](https://arxiv.org/html/2605.16612#S1.p2.1),[§2](https://arxiv.org/html/2605.16612#S2.SS0.SSS0.Px2.p1.1),[§2](https://arxiv.org/html/2605.16612#S2.SS0.SSS0.Px2.p2.1),[Table 4](https://arxiv.org/html/2605.16612#S5.T4),[Table 4](https://arxiv.org/html/2605.16612#S5.T4.11.11.21.9.1),[Table 4](https://arxiv.org/html/2605.16612#S5.T4.14.2)\.
- \[8\]R\. T\. Q\. Chen and Y\. Lipman\(2024\)Flow matching on general geometries\.InThe Twelfth International Conference on Learning Representations,External Links:[Link](https://openreview.net/forum?id=g7ohDlTITL)Cited by:[§3\.4](https://arxiv.org/html/2605.16612#S3.SS4.p2.1)\.
- \[9\]Z\. Chen, T\. Ma, W\. Wei, W\. Wong, C\. Zhao, and B\. Ni\(2024\)Work function\-guided electrocatalyst design\.Advanced Materials36\(29\),pp\. 2401568\.External Links:[Document](https://dx.doi.org/https%3A//doi.org/10.1002/adma.202401568),[Link](https://advanced.onlinelibrary.wiley.com/doi/abs/10.1002/adma.202401568),https://advanced\.onlinelibrary\.wiley\.com/doi/pdf/10\.1002/adma\.202401568Cited by:[§1](https://arxiv.org/html/2605.16612#S1.p1.1)\.
- \[10\]S\. Chibani and F\. Coudert\(2020\)Machine learning approaches for the prediction of materials properties\.APL Mater\.8\(8\),pp\. 080701\.Cited by:[§1](https://arxiv.org/html/2605.16612#S1.p1.1)\.
- \[11\]D\. W\. Davies, K\. T\. Butler, A\. J\. Jackson, J\. M\. Skelton, K\. Morita, and A\. Walsh\(2019\)SMACT: semiconducting materials by analogy and chemical theory\.Journal of Open Source Software4\(38\),pp\. 1361\.External Links:[Document](https://dx.doi.org/10.21105/joss.01361),[Link](https://doi.org/10.21105/joss.01361)Cited by:[§4\.2](https://arxiv.org/html/2605.16612#S4.SS2.SSS0.Px2.p1.1)\.
- \[12\]J\. L\. Freeouf and J\. M\. Woodall\(1981\-11\)Schottky barriers: an effective work function model\.Applied Physics Letters39\(9\),pp\. 727–729\.External Links:ISSN 0003\-6951,[Document](https://dx.doi.org/10.1063/1.92863),[Link](https://doi.org/10.1063/1.92863),https://pubs\.aip\.org/aip/apl/article\-pdf/39/9/727/18445004/727\_1\_online\.pdfCited by:[§1](https://arxiv.org/html/2605.16612#S1.p1.1)\.
- \[13\]N\. Gebauer, M\. Gastegger, and K\. Schütt\(2019\)Symmetry\-adapted generation of 3d point sets for the targeted discovery of molecules\.InAdvances in Neural Information Processing Systems 32,H\. Wallach, H\. Larochelle, A\. Beygelzimer, F\. d'Alché\-Buc, E\. Fox, and R\. Garnett \(Eds\.\),pp\. 7566–7578\.External Links:[Link](http://papers.nips.cc/paper/8974-symmetry-adapted-generation-of-3d-point-sets-for-the-targeted-discovery-of-molecules.pdf)Cited by:[§2](https://arxiv.org/html/2605.16612#S2.SS0.SSS0.Px1.p1.1)\.
- \[14\]N\. Gruver, A\. Sriram, A\. Madotto, A\. G\. Wilson, C\. L\. Zitnick, and Z\. W\. Ulissi\(2024\)Fine\-tuned language models generate stable inorganic materials as text\.InInternational Conference on Learning Representations 2024,Cited by:[§1](https://arxiv.org/html/2605.16612#S1.p2.1),[§2](https://arxiv.org/html/2605.16612#S2.SS0.SSS0.Px3.p1.1),[Table 5](https://arxiv.org/html/2605.16612#S5.T5.10.10.11.1.1)\.
- \[15\]C\. Hsu, C\. Schlesinger, K\. Mudaliar, J\. Leung, R\. Walters, and P\. Schindler\(2025\-12\)FIRE\-GNN: Force\-informed, Relaxed Equivariance Graph Neural Network for Rapid and Accurate Prediction of Surface Properties\.Advanced Intelligent Discovery0\(0\),pp\. 0\.External Links:[Document](https://dx.doi.org/10.1002/aidi.202500162)Cited by:[§5\.5](https://arxiv.org/html/2605.16612#S5.SS5.p2.1),[Table 5](https://arxiv.org/html/2605.16612#S5.T5),[Table 5](https://arxiv.org/html/2605.16612#S5.T5.13.2)\.
- \[16\]A\. Jain, S\. P\. Ong, G\. Hautier, W\. Chen, W\. D\. Richards, S\. Dacek, S\. Cholia, D\. Gunter, D\. Skinner, G\. Ceder, and K\. a\. Persson\(2013\)The materials project: a materials genome approach to accelerating materials innovation\.APL Materials1\(1\),pp\. 011002\.External Links:[Document](https://dx.doi.org/10.1063/1.4812323),ISSN 2166532X,[Link](http://link.aip.org/link/AMPADS/v1/i1/p011002/s1%5C&Agg=doi)Cited by:[Figure 4](https://arxiv.org/html/2605.16612#A3.F4),[Figure 4](https://arxiv.org/html/2605.16612#A3.F4.3.2),[§5](https://arxiv.org/html/2605.16612#S5.p1.1)\.
- \[17\]S\. Jia, C\. Zhang, and V\. Fung\(2024\)LLMatDesign: autonomous materials discovery with large language models\.External Links:2406\.13163,[Link](https://arxiv.org/abs/2406.13163)Cited by:[§2](https://arxiv.org/html/2605.16612#S2.SS0.SSS0.Px3.p1.1)\.
- \[18\]R\. Jiao, W\. Huang, P\. Lin, J\. Han, P\. Chen, Y\. Lu, and Y\. Liu\(2023\)Crystal structure prediction by joint equivariant diffusion\.InThirty\-seventh Conference on Neural Information Processing Systems,External Links:[Link](https://openreview.net/forum?id=DNdN26m2Jk)Cited by:[§2](https://arxiv.org/html/2605.16612#S2.SS0.SSS0.Px2.p1.1),[Table 4](https://arxiv.org/html/2605.16612#S5.T4.11.11.14.2.1)\.
- \[19\]R\. Jiao, W\. Huang, Y\. Liu, D\. Zhao, and Y\. Liu\(2024\)Space group constrained crystal generation\.InThe Twelfth International Conference on Learning Representations,External Links:[Link](https://openreview.net/forum?id=jkvZ7v4OmP)Cited by:[§1](https://arxiv.org/html/2605.16612#S1.p1.1),[§2](https://arxiv.org/html/2605.16612#S2.SS0.SSS0.Px2.p1.1),[§2](https://arxiv.org/html/2605.16612#S2.SS0.SSS0.Px2.p2.1),[Table 4](https://arxiv.org/html/2605.16612#S5.T4.11.11.15.3.1)\.
- \[20\]S\. Kadulkar, Z\. M\. Sherman, V\. Ganesan, and T\. M\. Truskett\(2022\)Machine learning–assisted design of material properties\.Annu\. Rev\. Chem\. Biomol\. Eng\.13\(2022\),pp\. 235–254\.Cited by:[§1](https://arxiv.org/html/2605.16612#S1.p1.1)\.
- \[21\]S\. Khastagir, K\. DAS, P\. Goyal, S\. Lee, S\. Bhattacharjee, and N\. Ganguly\(2025\)LLM meets diffusion: a hybrid framework for crystal material generation\.InThe Thirty\-ninth Annual Conference on Neural Information Processing Systems,External Links:[Link](https://openreview.net/forum?id=E6gwPtWjb1)Cited by:[§2](https://arxiv.org/html/2605.16612#S2.SS0.SSS0.Px3.p1.1),[Table 4](https://arxiv.org/html/2605.16612#S5.T4.11.11.19.7.1)\.
- \[22\]S\. Kullback and R\. A\. Leibler\(1951\)On Information and Sufficiency\.The Annals of Mathematical Statistics22\(1\),pp\. 79 – 86\.External Links:[Document](https://dx.doi.org/10.1214/aoms/1177729694),[Link](https://doi.org/10.1214/aoms/1177729694)Cited by:[§4\.2](https://arxiv.org/html/2605.16612#S4.SS2.SSS0.Px1.p1.7)\.
- \[23\]D\. Levy, S\. S\. Panigrahi, S\. Kaba, Q\. Zhu, M\. Galkin, S\. Miret, and S\. Ravanbakhsh\(2024\)SymmCD: symmetry\-preserving crystal generation with diffusion models\.InAI for Accelerated Materials Design \- NeurIPS 2024,External Links:[Link](https://openreview.net/forum?id=V7x2KZQn2v)Cited by:[§2](https://arxiv.org/html/2605.16612#S2.SS0.SSS0.Px2.p1.1),[§2](https://arxiv.org/html/2605.16612#S2.SS0.SSS0.Px2.p2.1),[Table 4](https://arxiv.org/html/2605.16612#S5.T4.11.11.16.4.1)\.
- \[24\]L\. Lin, R\. Jacobs, T\. Ma, D\. Chen, J\. Booske, and D\. Morgan\(2023\-03\)Work function: fundamentals, measurement, calculation, engineering, and applications\.Phys\. Rev\. Appl\.19,pp\. 037001\.External Links:[Document](https://dx.doi.org/10.1103/PhysRevApplied.19.037001),[Link](https://link.aps.org/doi/10.1103/PhysRevApplied.19.037001)Cited by:[§1](https://arxiv.org/html/2605.16612#S1.p1.1)\.
- \[25\]Y\. Lipman, R\. T\. Q\. Chen, H\. Ben\-Hamu, M\. Nickel, and M\. Le\(2023\)Flow matching for generative modeling\.InThe Eleventh International Conference on Learning Representations,External Links:[Link](https://openreview.net/forum?id=PqvMRDCJT9t)Cited by:[§1](https://arxiv.org/html/2605.16612#S1.p3.2),[§3\.4](https://arxiv.org/html/2605.16612#S3.SS4.p1.1)\.
- \[26\]G\. Ma, Y\. Wang, D\. Lim, S\. Jegelka, and Y\. Wang\(2024\)A canonicalization perspective on invariant and equivariant learning\.External Links:2405\.18378,[Link](https://arxiv.org/abs/2405.18378)Cited by:[§1](https://arxiv.org/html/2605.16612#S1.p2.1)\.
- \[27\]T\. Maxson and T\. Szilvási\(2025\)Metal\-support interactions reshape nanoparticle catalyst surfaces\.Angewandte Chemie Novit1\(1\),pp\. e70008\.External Links:[Document](https://dx.doi.org/https%3A//doi.org/10.1002/anov.70008),[Link](https://onlinelibrary.wiley.com/doi/abs/10.1002/anov.70008),https://onlinelibrary\.wiley\.com/doi/pdf/10\.1002/anov\.70008Cited by:[§1](https://arxiv.org/html/2605.16612#S1.p1.1)\.
- \[28\]B\. K\. Miller, R\. T\. Q\. Chen, A\. Sriram, and B\. M\. Wood\(2024\)FlowMM: generating materials with riemannian flow matching\.InForty\-first International Conference on Machine Learning,External Links:[Link](https://openreview.net/forum?id=W4pB7VbzZI)Cited by:[§2](https://arxiv.org/html/2605.16612#S2.SS0.SSS0.Px2.p1.1),[Table 4](https://arxiv.org/html/2605.16612#S5.T4.11.11.17.5.1)\.
- \[29\]H\. Radinger, V\. Trouillet, F\. Bauer, and F\. Scheiba\(2022\)Work function describes the electrocatalytic activity of graphite for vanadium oxidation\.ACS Catalysis12\(10\),pp\. 6007–6015\.External Links:[Document](https://dx.doi.org/10.1021/acscatal.2c00334),[Link](https://doi.org/10.1021/acscatal.2c00334),https://doi\.org/10\.1021/acscatal\.2c00334Cited by:[§1](https://arxiv.org/html/2605.16612#S1.p1.1)\.
- \[30\]B\. Rhodes, S\. Vandenhaute, V\. Šimkus, J\. Gin, J\. Godwin, T\. Duignan, and M\. Neumann\(2025\)Orb\-v3: atomistic simulation at scale\.External Links:2504\.06231,[Link](https://arxiv.org/abs/2504.06231)Cited by:[§1](https://arxiv.org/html/2605.16612#S1.p1.1)\.
- \[31\]J\. Riebesell, R\. E\. A\. Goodall, P\. Benner, Y\. Chiang, B\. Deng, G\. Ceder, M\. Asta, A\. A\. Lee, A\. Jain, and K\. A\. Persson\(2025\-06\)A framework to evaluate machine learning crystal stability predictions\.Natural Machine Intelligence7\(6\),pp\. 836–847\(en\)\.External Links:[Document](https://dx.doi.org/10.1038/s42256-025-01055-1)Cited by:[§1](https://arxiv.org/html/2605.16612#S1.p1.1)\.
- \[32\]V\. G\. Satorras, E\. Hoogeboom, and M\. Welling\(2021\-18–24 Jul\)E\(n\) equivariant graph neural networks\.InProceedings of the 38th International Conference on Machine Learning,M\. Meila and T\. Zhang \(Eds\.\),Proceedings of Machine Learning Research, Vol\.139,pp\. 9323–9332\.External Links:[Link](https://proceedings.mlr.press/v139/satorras21a.html)Cited by:[§1](https://arxiv.org/html/2605.16612#S1.p3.2),[§3\.3](https://arxiv.org/html/2605.16612#S3.SS3.p1.12),[§4\.2](https://arxiv.org/html/2605.16612#S4.SS2.p2.7),[§4\.3](https://arxiv.org/html/2605.16612#S4.SS3.p1.3)\.
- \[33\]P\. Schindler, E\. R\. Antoniuk, G\. Cheon, Y\. Zhu, and E\. J\. Reed\(2024\)Discovery of stable surfaces with extreme work functions by high\-throughput density functional theory and machine learning\.Advanced Functional Materials34\(19\),pp\. 2401764\.External Links:[Document](https://dx.doi.org/https%3A//doi.org/10.1002/adfm.202401764),[Link](https://advanced.onlinelibrary.wiley.com/doi/abs/10.1002/adfm.202401764),https://advanced\.onlinelibrary\.wiley\.com/doi/pdf/10\.1002/adfm\.202401764Cited by:[3rd item](https://arxiv.org/html/2605.16612#S1.I1.i3.p1.1),[§5](https://arxiv.org/html/2605.16612#S5.p1.1)\.
- \[34\]P\. Schindler\(2024\-02\)Work function and cleavage energy dataset of paper "Discovery of stable surfaces with extreme work functions by high\-throughput density functional theory and machine learning"\.Zenodo\(eng\)\.External Links:[Link](https://zenodo.org/records/10703249)Cited by:[§5\.5](https://arxiv.org/html/2605.16612#S5.SS5.p1.2)\.
- \[35\]A\. Sriram, B\. K\. Miller, R\. T\. Q\. Chen, and B\. M\. Wood\(2024\)FlowLLM: flow matching for material generation with large language models as base distributions\.InNeurIPS 2024,External Links:LinkCited by:[§2](https://arxiv.org/html/2605.16612#S2.SS0.SSS0.Px3.p1.1),[Table 4](https://arxiv.org/html/2605.16612#S5.T4.11.11.20.8.1)\.
- \[36\]Cgmm: conditional gaussian mixture models for pythonExternal Links:[Link](https://github.com/sitmo/cgmm)Cited by:[§4\.4](https://arxiv.org/html/2605.16612#S4.SS4.p1.1)\.
- \[37\]B\. M\. Wood, M\. Dzamba, X\. Fu, M\. Gao, M\. Shuaibi, L\. Barroso\-Luque, K\. Abdelmaqsoud, V\. Gharakhanyan, J\. R\. Kitchin, D\. S\. Levine, K\. Michel, A\. Sriram, T\. Cohen, A\. Das, A\. Rizvi, S\. J\. Sahoo, Z\. W\. Ulissi, and C\. L\. Zitnick\(2025\)UMA: a family of universal models for atoms\.External Links:2506\.23971,[Link](https://arxiv.org/abs/2506.23971)Cited by:[§1](https://arxiv.org/html/2605.16612#S1.p1.1)\.
- \[38\]G\. Wulff\(1901\)XXV\. zur frage der geschwindigkeit des wachsthums und der auflösung der krystallflächen\.Zeitschrift für Kristallographie \- Crystalline Materials34\(1\-6\),pp\. 449–530\.External Links:[Link](https://doi.org/10.1524/zkri.1901.34.1.449),[Document](https://dx.doi.org/doi%3A10.1524/zkri.1901.34.1.449)Cited by:[§1](https://arxiv.org/html/2605.16612#S1.p1.1)\.
- \[39\]T\. Xie, X\. Fu, O\. Ganea, R\. Barzilay, and T\. Jaakkola\(2021\)Crystal diffusion variational autoencoder for periodic material generation\.arXiv preprint arXiv:2110\.06197\.Cited by:[§2](https://arxiv.org/html/2605.16612#S2.SS0.SSS0.Px2.p1.1),[Table 4](https://arxiv.org/html/2605.16612#S5.T4.11.11.13.1.1),[§5](https://arxiv.org/html/2605.16612#S5.p1.1)\.
- \[40\]J\. You, B\. Liu, R\. Ying, V\. Pande, and J\. Leskovec\(2018\)Graph convolutional policy network for goal\-directed molecular graph generation\.InProceedings of the 32nd International Conference on Neural Information Processing Systems,NIPS’18,Red Hook, NY, USA,pp\. 6412–6422\.Cited by:[§2](https://arxiv.org/html/2605.16612#S2.SS0.SSS0.Px1.p1.1)\.
- \[41\]J\. You, R\. Ying, X\. Ren, W\. Hamilton, and J\. Leskovec\(2018\-10–15 Jul\)GraphRNN: generating realistic graphs with deep auto\-regressive models\.InProceedings of the 35th International Conference on Machine Learning,J\. Dy and A\. Krause \(Eds\.\),Proceedings of Machine Learning Research, Vol\.80,pp\. 5708–5717\.External Links:[Link](https://proceedings.mlr.press/v80/you18a.html)Cited by:[§2](https://arxiv.org/html/2605.16612#S2.SS0.SSS0.Px1.p1.1)\.
- \[42\]C\. Zeni, R\. Pinsler, D\. Zügner, A\. Fowler, M\. Horton, X\. Fu, Z\. Wang, A\. Shysheya, J\. Crabbé, S\. Ueda, R\. Sordillo, L\. Sun, J\. Smith, B\. Nguyen, H\. Schulz, S\. Lewis, C\. Huang, Z\. Lu, Y\. Zhou, H\. Yang, H\. Hao, J\. Li, C\. Yang, W\. Li, R\. Tomioka, and T\. Xie\(2025\)A generative model for inorganic materials design\.Nature\.External Links:[Document](https://dx.doi.org/10.1038/s41586-025-08628-5)Cited by:[§1](https://arxiv.org/html/2605.16612#S1.p1.1),[§2](https://arxiv.org/html/2605.16612#S2.SS0.SSS0.Px2.p1.1),[§2](https://arxiv.org/html/2605.16612#S2.SS0.SSS0.Px2.p2.1),[Table 4](https://arxiv.org/html/2605.16612#S5.T4.11.11.18.6.1)\.
- \[43\]A\. R\. Zeradjanin, A\. Vimalanandan, G\. Polymeros, A\. A\. Topalov, K\. J\. J\. Mayrhofer, and M\. Rohwerder\(2017\)Balanced work function as a driver for facile hydrogen evolution reaction – comprehension and experimental assessment of interfacial catalytic descriptor\.Phys\. Chem\. Chem\. Phys\.19,pp\. 17019–17027\.External Links:[Document](https://dx.doi.org/10.1039/C7CP03081A),[Link](http://dx.doi.org/10.1039/C7CP03081A)Cited by:[§1](https://arxiv.org/html/2605.16612#S1.p1.1)\.

## Appendix A Training and Sampling PRISMat

- **Input:** $\eta$
- **Initialize** $\theta$
- **while** $\theta$ is not converged **do**
  - **for** $C=\{L,A,X\}$ in $D_{\text{train}}$ **do**
    - **for** every pair of subsets $a,\bar{a}$ in $A$ **do**
      - **if** $\bar{a}==\emptyset$ **then**
        - $\bar{a}\leftarrow END$
      - **end if**
      - $p_{\bar{a}}=\frac{1}{|\mathbf{a}|}\sum_{j=1}^{|\mathbf{a}|}\mathbf{e}_{a_{j}}$
      - $p_{a}=\operatorname{AtomGenerator}_{\theta}(start,L,a)$
      - $\ell=D_{KL}(p_{a}\,\Vert\,p_{\bar{a}})$
      - $\theta=\theta-\eta\nabla_{\theta}\ell$
    - **end for**
  - **end for**
- **end while**

Algorithm 1: Training $\operatorname{AtomGenerator}$

- **Input:** $numSteps,maxAtoms$
- $L\sim\operatorname{LatticeGenerator}(\mathbb{R}^{3\times 3})$
- **Initialize** $A=[\,],\ end=False$
- **while** not $end$ and $A$.length() < $maxAtoms$ **do**
  - $a=\operatorname{AtomGenerator}(start,L,A)$
  - **if** $a$ is $END$ **then**
    - $end=True$
  - **else**
    - $A\leftarrow a$
  - **end if**
- **end while**
- $N=\operatorname{len}(A)$
- **Initialize** $X\sim\operatorname{uniform}([0,1)^{N\times 3})$
- **for** $t=1$ to $0$ in $numSteps$ steps **do**
  - $v=\operatorname{PositionGenerator}(L,A,X,t)$
  - $X=X-v/numSteps$
- **end for**
- **return** $L,A,X$

Algorithm 2: Sampling from PRISMat
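
The position-sampling loop at the end of Algorithm 2 can be sketched as a simple Euler integration. The code below is an illustration under stated assumptions: the velocity function takes only $(X, t)$ instead of $(L, A, X, t)$, coordinates are wrapped back into $[0,1)$ after every step (the algorithm as printed does not show this wrap explicitly), and the trained network is replaced by a hand-written stand-in that knows a single hypothetical target `X_star`.

```python
import numpy as np

def sample_positions(velocity_fn, n_atoms, num_steps, rng):
    """Integrate X <- X - v / num_steps from t = 1 towards t = 0 on the unit 3-torus."""
    X = rng.uniform(0.0, 1.0, size=(n_atoms, 3))
    for k in range(num_steps):
        t = 1.0 - k / num_steps               # t = 1, 1 - 1/n, ..., 1/n
        v = velocity_fn(X, t)
        X = (X - v / num_steps) % 1.0         # assumed wrap of fractional coordinates
    return X

# Stand-in for a trained PositionGenerator: the exact velocity field for one
# hypothetical target, using X_t - X_star = t (X' - X_star) and ignoring wrap-around.
X_star = np.array([[0.25, 0.5, 0.75], [0.1, 0.2, 0.3]])

def ideal_velocity(X, t):
    return (X - X_star) / t

X_final = sample_positions(ideal_velocity, n_atoms=2, num_steps=20,
                           rng=np.random.default_rng(0))
```

With this idealized velocity field the loop lands on `X_star` up to floating-point error, which is a quick sanity check that the sign convention $X = X - v/numSteps$ matches the training target $V = X^{\prime} - X$.
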
## Appendix B Proof of Permutation-Invariant Training

###### Proposition 1

Given a crystal $C=(L,A,X)$, for any list of input atoms $a_{in}\subset A$ and output atoms $a_{out}=A\setminus a_{in}$, no matter the permutations $\pi_{1},\pi_{2}$ applied to $a_{in}$ and $a_{out}$, $\operatorname{AtomGenerator}$ would receive the same training update.

###### Proof 1

Let $p(a_{out})$ be the categorical distribution of $a_{out}$. Since $p$ is a distribution, it is invariant to any permutation of atoms, $p(a_{out})=p(\pi_{2}(a_{out}))$, as it only depends on the set, which $\pi_{2}$ doesn't affect. The model $\operatorname{AtomGenerator}$ is permutation invariant by construction, as GNNs ignore node ordering due to aggregating over all nodes, so $\operatorname{AtomGenerator}(L,start,a_{in})=\operatorname{AtomGenerator}(L,start,\pi_{1}(a_{in}))$. Therefore each training update will be permutation invariant, as the loss of each step will be permutation invariant: $D_{KL}(\operatorname{AtomGenerator}(L,start,a_{in}),p(a_{out}))=D_{KL}(\operatorname{AtomGenerator}(L,start,\pi_{1}(a_{in})),p(\pi_{2}(a_{out})))$.
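
The target side of this argument can be checked numerically. The sketch below builds the categorical target as the average of one-hot vectors over the remaining atoms and confirms that it, and therefore the KL loss against a fixed model output, is unchanged when the remaining atoms are shuffled. The atom list, the number of atom types, and the random stand-in for the model output are assumptions made for illustration; the invariance of the GNN itself under $\pi_{1}$ is not tested here.

```python
import numpy as np

def target_distribution(remaining_atoms, num_types):
    """Average of one-hot vectors over the atoms still to be generated."""
    counts = np.bincount(remaining_atoms, minlength=num_types).astype(float)
    return counts / counts.sum()

def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q); terms with p = 0 contribute zero, eps guards against q = 0."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / (q[mask] + eps))))

rng = np.random.default_rng(0)
num_types = 100
a_out = np.array([8, 8, 26, 11])        # hypothetical remaining atoms, by atomic number
a_out_perm = rng.permutation(a_out)     # the permutation pi_2 applied to a_out

target = target_distribution(a_out, num_types)
target_perm = target_distribution(a_out_perm, num_types)

# Stand-in for the AtomGenerator output: an arbitrary fixed distribution over types.
logits = rng.normal(size=num_types)
model_out = np.exp(logits) / np.exp(logits).sum()

loss = kl_divergence(model_out, target)
loss_perm = kl_divergence(model_out, target_perm)
```

The argument order of `kl_divergence` follows Algorithm 1, where the model output comes first and the target second.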

## Appendix CExample Crystals

![Refer to caption](https://arxiv.org/html/2605.16612v1/x4.png) (a) Example of a real Ga$_4$Te$_4$ crystal from MP-20.
![Refer to caption](https://arxiv.org/html/2605.16612v1/x5.png) (b) Example of a generated Re$_3$Ti$_2$ crystal before relaxation.
![Refer to caption](https://arxiv.org/html/2605.16612v1/x6.png) (c) Example of a generated Re$_3$Ti$_2$ crystal after relaxation.

Figure 4: Examples of real and generated crystals visualized by the Materials Project's Crystal Toolkit \[[16](https://arxiv.org/html/2605.16612#bib.bib23)\].