Tag
SP-Mind is an autonomous AI agent that unifies spatial proteomics analysis pipelines, converting natural-language queries into end-to-end analytical workflows without fine-tuning, and achieves state-of-the-art performance on the new SP-Bench benchmark.
The Arc Institute introduces Proto, a high-level programming language for generative biology that allows protein, DNA, and RNA systems to be programmed via code or natural-language prompts.
A reflection on the broad implications of transformer architectures beyond LLMs, including potential impacts on linguistics, genetics, and causal modeling, comparing their significance to the Haber-Bosch process.
APCyc is a target-aware generative framework that designs cyclic peptides with controlled physicochemical properties by explicitly modeling cyclization patterns and using Bayesian posterior guidance.
LDARNet is a 120M-parameter hierarchical genomic foundation model that introduces learnable adaptive tokenization (inspired by H-Net's dynamic chunking) for masked language modeling on DNA sequences. It achieves state-of-the-art results on 5 histone modification tasks and outperforms models up to 20× larger on several genomic benchmarks, with learned token boundaries aligning with biological features like promoter motifs and splice junctions.
GENEB is a large-scale diagnostic benchmark that evaluates 40 genomic foundation models across 100 tasks in 13 functional categories under a unified probing protocol, exposing that aggregate leaderboards are unstable and that architectural alignment often outweighs model scale. The work addresses the fragmented evaluation landscape in genomic machine learning, analogous to what MTEB did for NLP.
BioManus is an MCP-native biomedical agent system that uses graph-scaffolded planning over structured biological capabilities instead of flat prompt-based tool retrieval, achieving better context efficiency and execution accuracy on biomedical benchmarks. The system introduces a BioinfoMCP Compiler to standardize heterogeneous bioinformatics tools and organizes them as a typed heterogeneous MCP graph for scalable reasoning.
Google DeepMind has open-sourced Science Skills, a collection of agent skills for scientific research tasks including genomics, structural biology, and cheminformatics, to accelerate agentic workflows with scientific grounding and higher token efficiency.
CellBRIDGE is a new method that enhances optimal transport for scRNA-seq trajectory inference by incorporating ligand-receptor interaction costs to model cell-cell communication, improving alignment and enabling interpretable in silico perturbations.
This paper introduces BioConCal, a supervised scorer that uses inference-time panel and candidate features to rank biomedical entity candidates surfaced by LLM panels, significantly improving over raw agreement for curator triage.
A tweet announces the creation of an Atlas of Cell Atlases, a data resource for cell biology, with code to be released later.
ProtoCol applies late-interaction retrieval to protein homology search, representing proteins as sets of residue embeddings and using MaxSim for scoring, outperforming pooled and alignment-based methods on remote homology benchmarks.
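The MaxSim scoring named in the ProtoCol summary can be illustrated with a minimal sketch. It assumes the ColBERT-style convention: each query residue is matched to its most similar target residue by cosine similarity, and the per-residue maxima are summed. The function names, the toy 2-D embeddings, and the choice of summing rather than averaging are assumptions for illustration, not details taken from ProtoCol itself.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def maxsim_score(query, target):
    """Late-interaction score: for each query residue embedding, take the
    best cosine match among the target residue embeddings, then sum."""
    if not query or not target:
        return 0.0
    return sum(max(cosine(q, t) for t in target) for q in query)


def rank_targets(query, database):
    """Rank database proteins (name -> list of residue embeddings) by MaxSim."""
    scored = [(name, maxsim_score(query, emb)) for name, emb in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy example with hypothetical 2-D residue embeddings.
query = [[1.0, 0.0], [0.0, 1.0]]
database = {
    "homolog": [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]],
    "unrelated": [[-1.0, 0.0], [-1.0, -1.0]],
}
ranking = rank_targets(query, database)
```

Because every residue keeps its own embedding, a protein that shares only a few conserved residues with the query can still score well, which a single pooled vector per protein would tend to wash out.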
TaxDistill proposes a knowledge distillation framework that uses a 500M-parameter genomic foundation model (GenomeOcean) as a teacher to improve metagenomic taxonomic annotation by reducing label noise from similarity search tools, achieving significant F1 improvements on CAMI2 datasets.
A GitHub repository called scientific-agent-skills contains 138 Agent Skills for scientific research tools, covering bioinformatics, drug discovery, clinical databases, and more. It supports one-line integration into AI agents, providing precise API calling methods.
This paper identifies three failure modes in existing antibody design methods (antigen blindness, vocabulary collapse, and convergence to the marginal distribution) and proposes AgForce, a novel encoder-decoder architecture using graph neural networks and mixture density networks, achieving state-of-the-art binding quality and sequence recovery on the Chimera-Bench benchmark.
A comprehensive open-source collection of 138 scientific agent skills that transform AI coding assistants like Claude Code and Codex into AI scientists, covering biology, chemistry, medicine, and more, with integration of over 100 scientific databases and specialized Python packages.
Rosalind is a deterministic genomics engine in Rust designed to run whole-genome pipelines with O(√t) memory, enabling bioinformatics on laptops and edge devices.
Google DeepMind launched Science Skills, a toolkit that integrates over 30 major life science sources, including UniProt and the AlphaFold Database, to help accelerate research workflows.
Hugging Face releases Carbon, an open-source DNA base model that is 275x faster than comparable models, enabling local processing of whole genomes on a single GPU.
Hugging Face released Carbon, a family of open DNA foundation models that matches state-of-the-art performance of Evo2-7B while being 275x faster, using 6-mer tokenization, factorized loss, and curated genomic data.
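As a rough illustration of the 6-mer tokenization mentioned in the Carbon summary, the sketch below splits a DNA sequence into non-overlapping 6-mers and maps each to an integer ID. The non-overlapping stride, the dropping of a trailing partial k-mer, the A/C/G/T-only vocabulary, and the single unknown-token ID are all assumptions made here; the summary does not say how Carbon handles these details.

```python
from itertools import product

K = 6
ALPHABET = "ACGT"

# Hypothetical vocabulary: all 4**6 = 4096 k-mers in lexicographic order.
VOCAB = {"".join(p): i for i, p in enumerate(product(ALPHABET, repeat=K))}
UNK_ID = len(VOCAB)  # assumed catch-all ID for k-mers with ambiguous bases


def tokenize_kmers(sequence, k=K):
    """Split a sequence into non-overlapping k-mers, dropping any
    trailing bases that do not fill a complete k-mer."""
    seq = sequence.upper()
    usable = len(seq) - len(seq) % k
    return [seq[i:i + k] for i in range(0, usable, k)]


def encode(sequence):
    """Map each 6-mer to its vocabulary ID, or UNK_ID if it is not in VOCAB."""
    return [VOCAB.get(kmer, UNK_ID) for kmer in tokenize_kmers(sequence)]


tokens = tokenize_kmers("acgtacgtacgtac")
ids = encode("AAAAAATTTTTTAAAAAN")
```

Under these assumptions, a sequence of n bases becomes roughly n/6 tokens, so the model processes a sequence about six times shorter than it would with single-nucleotide tokens.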