@THayes427: Also check out this @modal tutorial that walks through the underlying code from the notebook above with more detailed e…

Summary

A Modal tutorial demonstrating how to scale protein binder design using ESMFold2 and ESMC models, with code for iterative optimization and autoscaling infrastructure.

@modal Also check out this @modal tutorial that walks through the underlying code from the notebook above with more detailed explanations for scaling up design: https://t.co/P4tlqjf3Td

Cached at: 06/03/26, 07:54 PM


Design protein binders at scale with ESMFold2 and ESMC

Source: https://modal.com/docs/examples/esmfold2_binder_design

Protein folding was a landmark breakthrough in computational biology. But for many applications, we don’t just want to predict the structures of existing proteins; we want to design new proteins that can modulate biology.

One of the most important ways to do that is through binding. Protein-protein interactions drive much of biological function, and the ability to design molecules that bind specific targets opens the door to new research tools and therapeutics. Recent AI approaches have tackled binder design by inverting structure prediction models via an iterative optimization process:

  1. Fold a candidate binder together with the target protein.
  2. Score the resulting structure based on how well the binder folds and binds.
  3. Take a step in sequence space that improves the score.
  4. Repeat.
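The four steps above can be sketched as a small loop. This is a toy illustration only: `toy_score` is a hypothetical stand-in for folding and scoring a complex, and random single-residue mutation stands in for the gradient-guided step that the real designer takes. None of these names come from the tutorial's code.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_score(binder: str, target: str) -> float:
    """Hypothetical stand-in for steps 1-2 (fold, then score).

    Returns the fraction of binder positions that match the target
    residue at the same (wrapped) index. A real pipeline would fold the
    complex and score the predicted structure instead.
    """
    matches = sum(
        1 for i, residue in enumerate(binder) if residue == target[i % len(target)]
    )
    return matches / len(binder)


def design_binder(target: str, length: int, steps: int, seed: int):
    """Hill-climb in sequence space; returns (best sequence, best score)."""
    rng = random.Random(seed)
    binder = "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
    best = toy_score(binder, target)
    for _ in range(steps):
        # Step 3: propose a single-residue change (assumption: random
        # mutation replaces the gradient-guided update).
        pos = rng.randrange(length)
        candidate = binder[:pos] + rng.choice(AMINO_ACIDS) + binder[pos + 1:]
        score = toy_score(candidate, target)
        # Keep the change only if the score does not get worse.
        if score >= best:
            binder, best = candidate, score
    return binder, best


TARGET = "MKTAYIAKQR"  # made-up sequence, for illustration only
initial_seq, initial_score = design_binder(TARGET, length=12, steps=0, seed=0)
final_seq, final_score = design_binder(TARGET, length=12, steps=500, seed=0)
```

Because the loop only accepts non-worsening moves, `final_score` can never fall below `initial_score` for the same seed.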

In this example, we’ll demonstrate how to implement this process on Modal using ESMFold2 and ESMC, state-of-the-art models developed at Biohub that can predict the structure of biomolecular complexes. Check out their technical report to see how the models were developed and used to design and experimentally validate binders against therapeutically relevant targets.

We’ll start by building a Modal Function that designs a single binder; then with only a few more lines of code, we’ll write an orchestrator function that executes a large-scale search powered by Modal’s autoscaling infrastructure and global GPU capacity.

Setup

Defining our Modal Image

We’ll use Image.micromamba as our base image because a few of the packages we need are only available via Conda. We’ll also install the esm library from CZ Biohub (which pulls in a custom fork of transformers) and a few other helpful libraries for working with protein sequences.

We set CUBLAS_WORKSPACE_CONFIG, which lets us ensure reproducibility by calling torch.use_deterministic_algorithms(True) at the top of our remote code.
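A minimal sketch of that determinism setup follows. It makes two assumptions: the value `:4096:8` is one of the two settings listed in PyTorch's reproducibility notes (the other is `:16:8`), and the tutorial's actual choice is not shown here. The tutorial sets the variable on the image, while this sketch sets it in-process so it runs standalone. `enable_determinism` is a hypothetical helper name.

```python
import os

# cuBLAS reads this when it initializes, so set it before any CUDA work.
# Assumption: ":4096:8" is used here; ":16:8" is the other documented option.
os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")


def enable_determinism() -> bool:
    """Ask PyTorch to use deterministic algorithms.

    Returns False (and does nothing) when torch is not installed, so the
    sketch stays runnable outside the GPU image.
    """
    try:
        import torch
    except ImportError:
        return False
    torch.use_deterministic_algorithms(True)
    return True


determinism_enabled = enable_determinism()
```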

Caching weights and persisting results on Modal Volumes

ESMFold2 builds on the 6B-parameter ESMC encoder; together with the four critic models used for final scoring, the model weights come to about 50 GB. We cache them on a Modal Volume, which delivers much better cold-start performance than re-downloading from Hugging Face each time.

A second Volume will store our results.
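The caching idea can be illustrated with a download-on-miss helper. This sketch uses a temporary local directory in place of a mounted Volume. `ensure_cached` and `fake_download` are hypothetical names, and the byte string is placeholder data rather than real weights.

```python
import tempfile
from pathlib import Path
from typing import Callable


def ensure_cached(cache_dir: Path, name: str, download: Callable[[Path], None]) -> Path:
    """Return the cached path for `name`, downloading only on a cache miss."""
    path = cache_dir / name
    if not path.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        partial = cache_dir / (name + ".partial")
        download(partial)
        # Publish under the final name only after the download completes,
        # so an interrupted download is never mistaken for a cache hit.
        partial.rename(path)
    return path


download_calls = []


def fake_download(dest: Path) -> None:
    """Placeholder for a real fetch (e.g. from a model hub)."""
    download_calls.append(dest)
    dest.write_bytes(b"weights")


with tempfile.TemporaryDirectory() as tmp:
    cache_root = Path(tmp) / "weights-cache"
    first = ensure_cached(cache_root, "model.bin", fake_download)
    second = ensure_cached(cache_root, "model.bin", fake_download)
    cached_bytes = second.read_bytes()
```

The second call returns immediately because the file already exists, which is the behavior that saves time on later cold starts. On Modal, `cache_dir` would be wherever the weights Volume is mounted.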

Designing a binder on Modal

To run binder design on Modal, we define a BinderDesignService class and wrap it with the @app.cls decorator. The decorator takes arguments that describe the infrastructure our code needs: the Image and both Volumes we defined, plus an H100 GPU, which has enough memory for the 6B-parameter ESMC encoder and the four ESMFold2 “hero” critic models.

Inside the class, the @modal.enter() lifecycle hook downloads and initializes those models once per container start, so subsequent design calls on the same container reuse the loaded weights.

We decorate our design method with @modal.method() to enable remote execution. We’ll see it called both via .remote() (single design) and via .spawn() + modal.FunctionCall.gather (parallel sweep) further below. The class itself is a thin wrapper around ESMFold2Designer from the helper package, which handles the actual model loading and the gradient-guided optimization loop (design_binder in binder_design.design).

Fanning out a sweep with selection

A single design run gives you one candidate per batch slot. To recover the kind of hit rates reported in the paper, you want many seeds, several binder templates, and several targets, then a selection pass that ranks designs by a combined ipTM / distogram-ipTM-proxy score.
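A selection pass of this kind might look like the sketch below. The exact way the tutorial combines ipTM with the distogram-ipTM proxy is not stated here, so the equal-weight average in `combined_score` is an assumption. The `Design` record and the metric values are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Design:
    target: str
    binder: str
    seed: int
    iptm: float
    distogram_iptm_proxy: float


def combined_score(design: Design, weight: float = 0.5) -> float:
    """Weighted mean of the two metrics (assumed weighting, not the tutorial's)."""
    return weight * design.iptm + (1.0 - weight) * design.distogram_iptm_proxy


def select_top(designs: List[Design], k: int) -> List[Design]:
    """Rank designs by combined score, best first, and keep the top k."""
    return sorted(designs, key=combined_score, reverse=True)[:k]


# Made-up metric values for three seeds of the same (target, binder) pair.
candidates = [
    Design("pd-l1", "minibinder", seed=0, iptm=0.9, distogram_iptm_proxy=0.8),
    Design("pd-l1", "minibinder", seed=1, iptm=0.4, distogram_iptm_proxy=0.5),
    Design("pd-l1", "minibinder", seed=2, iptm=0.7, distogram_iptm_proxy=0.9),
]
top_two = select_top(candidates, k=2)
top_seeds = [design.seed for design in top_two]
```

Keeping the score in one small function makes it easy to swap in a different weighting once the real combination rule is known.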

We orchestrate from inside a Modal Function so you don’t have to worry about keeping a long-running process alive locally or installing any local dependencies.

From the command line

main runs a single design. Override the target_name / binder_name to try one of the bundled targets (cd45, ctla4, egfr, pd-l1, pdgfr) and binder templates (minibinder, trastuzumab_framework_vhvl, atezolizumab_framework_vhvl, ocankitug_framework_vhvl), or pass an arbitrary target_sequence / binder_sequence directly.

sweep runs a grid sweep across every (target, binder, seed) combination of the targets and binders you pass in, scaling design horizontally with Modal’s asynchronous job processing. The selection pass runs server-side and the resulting parquet is written to both the esmfold2-binder-design-results Volume and to a local file for inspection.

target_names and binder_names are passed as comma-separated strings. The defaults sweep one target across two binder modalities (a minibinder and the trastuzumab_framework_vhvl antibody template), so a single command fans out across both at once.
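How comma-separated names expand into a (target, binder, seed) grid can be sketched in a few lines. `parse_names`, `build_grid`, and the `num_seeds` parameter are hypothetical. The choice of pd-l1 as the single target is an arbitrary pick from the bundled list, not necessarily the tutorial's default.

```python
from itertools import product
from typing import List, Tuple


def parse_names(csv: str) -> List[str]:
    """Split a comma-separated string, dropping blanks and stray spaces."""
    return [name.strip() for name in csv.split(",") if name.strip()]


def build_grid(target_names: str, binder_names: str, num_seeds: int) -> List[Tuple[str, str, int]]:
    """Enumerate every (target, binder, seed) combination."""
    targets = parse_names(target_names)
    binders = parse_names(binder_names)
    return list(product(targets, binders, range(num_seeds)))


grid = build_grid(
    target_names="pd-l1",
    binder_names="minibinder, trastuzumab_framework_vhvl",
    num_seeds=3,
)
# One target x two binders x three seeds gives six independent jobs.
```

On Modal, each tuple in `grid` would correspond to one spawned design call, with the results gathered afterward for the selection pass.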
