bioinformatics

#bioinformatics

SP-Mind: An Autonomous Reasoning Agent for Spatial Proteomics Analysis

arXiv cs.AI · 5d ago

SP-Mind is an autonomous AI agent that unifies spatial proteomics analysis pipelines, converting natural-language queries into end-to-end analytical workflows without fine-tuning, and achieves state-of-the-art performance on the new SP-Bench benchmark.

@arcinstitute: Imagine programming protein, DNA, and RNA systems like you would write computer code, or even by natural language promp…

X AI KOLs Following · 6d ago

The Arc Institute introduces Proto, a high-level programming language for generative biology that allows programming of protein, DNA, and RNA systems via code or natural language AI prompts.

The Transformer Pill

Reddit r/ArtificialInteligence · 2026-06-12

A reflection on the broad implications of transformer architectures beyond LLMs, including potential impacts on linguistics, genetics, and causal modeling, comparing their significance to the Haber-Bosch process.

APCyc: Property-Informed Design of Cyclic Peptides via Automated Cyclization

arXiv cs.AI · 2026-06-12

APCyc is a target-aware generative framework that designs cyclic peptides with controlled physicochemical properties by explicitly modeling cyclization patterns and using Bayesian posterior guidance.

LDARNet: DNA Adaptive Representation Network with Learnable Tokenization for Genomic Modeling

arXiv cs.CL · 2026-06-04

LDARNet is a 120M-parameter hierarchical genomic foundation model that introduces learnable adaptive tokenization (inspired by H-Net's dynamic chunking) for masked language modeling on DNA sequences. It achieves state-of-the-art results on 5 histone modification tasks and outperforms models up to 20× larger on several genomic benchmarks, with learned token boundaries aligning with biological features like promoter motifs and splice junctions.

GENEB: Why Genomic Models Are Hard to Compare

arXiv cs.CL · 2026-06-04

GENEB is a large-scale diagnostic benchmark that evaluates 40 genomic foundation models across 100 tasks in 13 functional categories under a unified probing protocol, exposing that aggregate leaderboards are unstable and that architectural alignment often outweighs model scale. The work addresses the fragmented evaluation landscape in genomic machine learning, analogous to what MTEB did for NLP.

Beyond Prompt-Based Planning: MCP-Native Graph Planning-based Biomedical Agent System

arXiv cs.AI · 2026-06-04

BioManus is an MCP-native biomedical agent system that uses graph-scaffolded planning over structured biological capabilities instead of flat prompt-based tool retrieval, achieving better context efficiency and execution accuracy on biomedical benchmarks. The system introduces a BioinfoMCP Compiler to standardize heterogeneous bioinformatics tools and organizes them as a typed heterogeneous MCP graph for scalable reasoning.

@googleaidevs: Building autonomous agents for scientific discovery? @GoogleDeepMind Science Skills is now available on GitHub. We've o…

X AI KOLs Timeline · 2026-06-02

Google DeepMind has open-sourced Science Skills, a collection of agent skills for scientific research tasks including genomics, structural biology, and cheminformatics, to accelerate agentic workflows with scientific grounding and higher token efficiency.

CellBRIDGE: Learning Cellular Trajectories via Interaction-Aware Alignment

arXiv cs.LG · 2026-06-01

CellBRIDGE is a new method that enhances optimal transport for scRNA-seq trajectory inference by incorporating ligand-receptor interaction costs to model cell-cell communication, improving alignment and enabling interpretable in silico perturbations.

Beyond Agreement: Scoring Panel-Surfaced Biomedical Entity Candidates for Curator Triage

arXiv cs.CL · 2026-06-01

This paper introduces BioConCal, a supervised scorer that uses inference-time panel and candidate features to rank biomedical entity candidates surfaced by LLM panels, significantly improving over raw agreement for curator triage.

@GallowayLabMIT: it is Atlases all the way down

X AI KOLs Timeline · 2026-05-30

A tweet announces the creation of an Atlas of Cell Atlases, a data resource for cell biology, with code to be released later.

PROTOCOL: Late Interaction Retrieval for Protein Homolog Search

arXiv cs.LG · 2026-05-29

ProtoCol applies late-interaction retrieval to protein homology search, representing proteins as sets of residue embeddings and using MaxSim for scoring, outperforming pooled and alignment-based methods on remote homology benchmarks.
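
The MaxSim scoring mentioned in the summary can be sketched as follows. This is a minimal illustration of generic late-interaction scoring, not ProtoCol's implementation: the function names (`maxsim_score`, `rank_by_maxsim`), the use of a plain dot product as the similarity, and the toy 2-dimensional "residue embeddings" are all assumptions made for the example. A real system would take embeddings from a protein language model and might normalize or weight them differently.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query: List[Vector], target: List[Vector]) -> float:
    """Late-interaction score: for each query residue embedding, take the
    best-matching target residue embedding, then sum those maxima."""
    if not query or not target:
        raise ValueError("query and target must each contain at least one embedding")
    return sum(max(dot(q, t) for t in target) for q in query)


def rank_by_maxsim(query: List[Vector], database: Dict[str, List[Vector]]) -> List[str]:
    """Return database keys sorted from highest to lowest MaxSim score."""
    return sorted(database, key=lambda name: maxsim_score(query, database[name]), reverse=True)


# Toy data (hypothetical): each protein is a set of per-residue embeddings.
query = [[1.0, 0.0], [0.0, 1.0]]
database = {
    "protein_a": [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    "protein_b": [[1.0, 0.0], [1.0, 0.0]],
}
ranking = rank_by_maxsim(query, database)
```

Because each protein keeps one vector per residue instead of a single pooled vector, a query residue can match its best counterpart anywhere in the target, which is the property the summary contrasts with pooled methods.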

TaxDistill: Improving Metagenomic Taxonomic Annotation via Distilled Genomic Foundation Models

arXiv cs.LG · 2026-05-29

TaxDistill proposes a knowledge distillation framework that uses a 500M-parameter genomic foundation model (GenomeOcean) as a teacher to improve metagenomic taxonomic annotation by reducing label noise from similarity search tools, achieving significant F1 improvements on the CAMI2 datasets.

@iluciddreaming: A repo on GitHub with 138 Agent Skills for scientific research tools, 24.7k stars. Covers bioinformatics, drug discovery, clinical databases—Scanpy, RDKit, DeepChem, UniProt, AlphaFold all included. …

X AI KOLs Timeline · 2026-05-24

A GitHub repository called scientific-agent-skills contains 138 Agent Skills for scientific research tools, covering bioinformatics, drug discovery, clinical databases, and more. It supports one-line integration into AI agents and gives them precise API-calling instructions for each tool.

AgForce Enables Antigen-conditioned Generative Antibody Design

arXiv cs.LG · 2026-05-22

This paper identifies three failure modes in existing antibody design methods (antigen blindness, vocabulary collapse, and convergence to the marginal distribution) and proposes AgForce, an encoder-decoder architecture built on graph neural networks and mixture density networks that achieves state-of-the-art binding quality and sequence recovery on the Chimera-Bench benchmark.

@Xudong07452910: 24K stars, a cross-disciplinary research assistant project: 138 ready-to-use scientific agent skills that turn Claude Code/Codex into an AI scientist with one click!

X AI KOLs Timeline · 2026-05-21

A comprehensive open-source collection of 138 scientific agent skills that transform AI coding assistants like Claude Code and Codex into AI scientists, covering biology, chemistry, medicine, and more, with integration of over 100 scientific databases and specialized Python packages.

Rosalind: A genomics toolkit in Rust running whole-genome pipelines on a laptop

Hacker News Top · 2026-05-21

Rosalind is a deterministic genomics engine in Rust designed to run whole-genome pipelines with O(√t) memory, enabling bioinformatics on laptops and edge devices.

@GoogleDeepMind: How can you accelerate your day to day research workflow? By giving AI the right scientific toolkit. We launched Scienc…

X AI KOLs · 2026-05-21

Google DeepMind launched Science Skills, a toolkit that integrates over 30 major life science sources, including UniProt and the AlphaFold Database, to help accelerate research workflows.

@ClementDelangue: The future of biology shouldn’t stay behind black-box APIs. Especially when it touches personal health. Whether you’re …

X AI KOLs Following · 2026-05-20

Hugging Face releases Carbon, an open-source DNA base model that is 275x faster than comparable models, enabling local processing of whole genomes on a single GPU.

Carbon: Decoding the Language of Life

Reddit r/LocalLLaMA · 2026-05-19

Hugging Face released Carbon, a family of open DNA foundation models that match the state-of-the-art performance of Evo2-7B while running 275x faster, using 6-mer tokenization, a factorized loss, and curated genomic data.
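
To make the 6-mer tokenization idea concrete, here is a minimal sketch. It assumes non-overlapping windows and keeps any shorter trailing fragment as its own token; the summary does not say whether Carbon uses overlapping or non-overlapping 6-mers or how it handles remainders, so the function name `kmer_tokenize` and both of those choices are illustrative assumptions only.

```python
from typing import List


def kmer_tokenize(sequence: str, k: int = 6) -> List[str]:
    """Split a DNA string into consecutive, non-overlapping k-mers.

    Assumption for this sketch: a trailing fragment shorter than k is
    kept as a final, shorter token rather than padded or dropped.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(0, len(sequence), k)]


tokens = kmer_tokenize("ACGTACGTACGTAC")
# Full 6-mers over the alphabet {A, C, G, T} give a vocabulary of 4**6 entries.
vocab_size = 4 ** 6
```

With non-overlapping 6-mers, a sequence of N bases becomes roughly N/6 tokens, which shortens the input the model has to process compared with single-nucleotide tokens.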
