bioinformatics

#bioinformatics

@adithya_s_k: Wake up ppl Huggingface just open sourced Genomic Foundational Models

X AI KOLs Following · 2026-05-19

Hugging Face has open-sourced genomic foundation models, including Carbon, a DNA model that is 275x faster than the next-best model and can process the entire human genome on a single GPU in under two days.

#bioinformatics

@lvwerra: We are releasing Carbon: a crazy fast DNA model Carbon is 275x faster than the next best model. So fast you can process…

X AI KOLs Following · 2026-05-19

Hugging Face releases Carbon, a DNA model that is 275x faster than the previous state of the art (Evo2), enabling processing of the entire human genome on a single GPU in under two days. The model uses a tokenizer that splits sequences into 6-base chunks while maintaining single-base resolution, and it comes with an interactive demo.
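
The 6-base chunking mentioned in this summary can be illustrated with a minimal sketch. This is not Carbon's tokenizer: the vocabulary layout, the catch-all id, and the helper names (`chunk_sequence`, `encode`, `base_to_token`) are all hypothetical, and the summary does not say how Carbon actually preserves single-base resolution. The position lookup below only shows that each base stays addressable after chunking.

```python
from itertools import product

K = 6  # chunk length in bases, as stated in the summary
BASES = "ACGT"

# Hypothetical vocabulary: one id per possible 6-mer (4**6 = 4096 entries).
VOCAB = {"".join(p): i for i, p in enumerate(product(BASES, repeat=K))}
UNK_ID = len(VOCAB)  # assumed catch-all id for short or ambiguous chunks


def chunk_sequence(seq, k=K):
    """Split a DNA string into consecutive, non-overlapping k-base chunks."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(0, len(seq), k)]


def encode(seq):
    """Map each chunk to its vocabulary id, falling back to UNK_ID."""
    return [VOCAB.get(chunk, UNK_ID) for chunk in chunk_sequence(seq)]


def base_to_token(position, k=K):
    """Return (token index, offset within token) for a 0-based base position."""
    return divmod(position, k)


if __name__ == "__main__":
    print(chunk_sequence("ACGTACGTACGT"))
    print(encode("AAAAAAAC"))
    print(base_to_token(7))
```

Chunking by six bases shortens the token sequence roughly sixfold relative to one token per base, which is one plausible contributor to the throughput the post describes.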

#bioinformatics

@daweifs: One command to turn AI into your dedicated research scientist! Great news for researchers! This GitHub gem equips AI with 133 professional scientific research skills, covering bioinformatics, drug discovery, clinical, multi-omics, and more, boosting efficiency instantly. How powerful is it? 1. 133 ready-to-use skills covering biology/chemistry/medicine/materials...

X AI KOLs Timeline · 2026-05-16

This GitHub repository provides 133 ready-to-use scientific AI skills covering biology, chemistry, medicine, materials, and other fields. They can be integrated into AI agents with a single command to accelerate research workflows.

#bioinformatics

scShapeBench: Discovering geometry from high dimensional scRNAseq data

arXiv cs.LG · 2026-05-14

Introduces scShapeBench, a benchmark dataset for shape detection in high-dimensional single-cell data, and scReebTower, a baseline method that uses diffusion geometry and Reeb graphs to classify data shapes into clusters, trajectories, multi-branches, and archetypes.

#bioinformatics

Structural Interpretations of Protein Language Model Representations via Differentiable Graph Partitioning

arXiv cs.LG · 2026-05-13

This paper proposes SoftBlobGIN, a framework that enhances the interpretability of protein language model representations by projecting them onto contact graphs for structure-aware message passing. It demonstrates improved performance on enzyme classification and binding-site detection while providing auditable structural explanations.

#bioinformatics

Towards Universal Gene Regulatory Network Inference: Unlocking Generalizable Regulatory Knowledge in Single-cell Foundation Models

arXiv cs.LG · 2026-05-12

This paper introduces a new paradigm for universal Gene Regulatory Network (GRN) inference using single-cell foundation models, proposing Virtual Value Perturbation and Gradient Trajectory methods to distill regulatory knowledge.

#bioinformatics

Better Protein Function Prediction by Modeling Survivorship Bias

arXiv cs.LG · 2026-05-11

This paper introduces Evo-PU, a positive-unlabeled learning framework that models survivorship bias in protein sequence data by leveraging evolutionary mutation processes. The authors demonstrate that Evo-PU outperforms standard PU methods and protein language models in predicting protein functionality for influenza, RSV, and SARS-CoV-2.

#bioinformatics

ProtSent: Protein Sentence Transformers

arXiv cs.LG · 2026-05-11

This paper introduces ProtSent, a contrastive fine-tuning framework for protein language models that improves embedding quality for downstream tasks such as remote homology detection and structural retrieval.

#bioinformatics

Transformer-Based Wildlife Species Classification from Daily Movement Trajectories

arXiv cs.LG · 2026-05-11

This paper presents a Transformer-based model for classifying wildlife species using only daily GPS movement trajectories, demonstrating superior accuracy over LSTM and CNN baselines across different studies and regions.

#bioinformatics

PlantMarkerBench: A Multi-Species Benchmark for Evidence-Grounded Plant Marker Reasoning

Hugging Face Daily Papers · 2026-05-11

This paper introduces PlantMarkerBench, a multi-species benchmark for evaluating language models' ability to interpret evidence for plant marker genes from scientific literature across four species. It highlights that while frontier models perform well on direct evidence, they struggle with functional and indirect evidence types.

#bioinformatics

@omarsar0: This is just mindblowing stuff! I couldn't resist replicating this workflow to generate 3D biological structures. In a …

X AI KOLs Following · 2026-05-10

The author describes a workflow using Gemini Nano Pro, Tripo, and Codex to generate 3D biological structures, highlighting AI's potential to accelerate education.

#bioinformatics

@AYi_AInotes: Damn, after watching this DeepMind documentary, I got goosebumps. There were no thrilling speeches or grand launch events—just an ordinary meeting room where a few engineers sat around a table and told Demis: We can now predict all the known 1.7 to 2 billion protein sequences in just one month. Dem…

X AI KOLs Timeline · 2026-05-10

The post reviews DeepMind's decision to open-source AlphaFold in 2021, praising Demis Hassabis's leadership for taking risks and making basic research freely available. It also notes that although the initiative generated no direct profit, it led to the creation of Isomorphic Labs, valued at $2 billion.

#bioinformatics

TD3B: Transition-Directed Discrete Diffusion for Allosteric Binder Generation

Hugging Face Daily Papers · 2026-05-10

TD3B is a sequence-based generative framework for designing allosteric binders with specific agonist or antagonist behaviors using transition-directed discrete diffusion. The paper introduces a method to control directional transitions in protein states, addressing limitations of static structure-based design.

#bioinformatics

Classification of Amino Acids

Hacker News Top · 2026-05-08

This content covers methodologies for categorizing amino acids, likely involving computational or biological analysis techniques.

#bioinformatics

GATHER: Convergence-Centric Hyper-Entity Retrieval for Zero-Shot Cell-Type Annotation

arXiv cs.CL · 2026-05-08

This paper introduces GATHER, a convergence-centric retrieval method for zero-shot cell-type annotation using knowledge graphs, which improves accuracy and reduces LLM costs compared to existing KG-RAG baselines.

#bioinformatics

Geometric coherence of single-cell CRISPR perturbations reveals regulatory architecture and predicts cellular stress

Hugging Face Daily Papers · 2026-04-17

This paper introduces Shesha, a geometric stability metric that quantifies directional coherence of single-cell CRISPR perturbation responses using mean cosine similarity, revealing regulatory architecture and predicting cellular stress across 2,200+ perturbations in five CRISPR datasets.
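
To make "directional coherence via mean cosine similarity" concrete, here is a minimal sketch of one plausible reading. It is not the paper's definition of Shesha, which this summary does not give. It assumes each perturbed cell's response is its expression vector minus the control centroid, and it scores coherence as the average cosine similarity between each response and the mean response. The function names are hypothetical.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def cosine(u, v):
    """Cosine similarity, or None when either vector has zero length."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return None
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def directional_coherence(perturbed, control):
    """Mean cosine similarity between per-cell shifts and the mean shift.

    Assumption: a cell's shift is its vector minus the control centroid.
    Returns 0.0 when no shift has a defined direction.
    """
    center = mean_vector(control)
    shifts = [[x - c for x, c in zip(cell, center)] for cell in perturbed]
    mean_shift = mean_vector(shifts)
    cosines = [c for c in (cosine(s, mean_shift) for s in shifts) if c is not None]
    return sum(cosines) / len(cosines) if cosines else 0.0


if __name__ == "__main__":
    control = [[0.0, 0.0], [0.0, 0.0]]
    print(directional_coherence([[1.0, 0.0], [2.0, 0.0]], control))  # aligned shifts
    print(directional_coherence([[1.0, 0.0], [0.0, 1.0]], control))  # divergent shifts
```

Under this reading, a score near 1 means the perturbed cells move in a shared direction away from control, while lower scores indicate scattered responses.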

#bioinformatics

Benchmarking Biology’s AI Agent: ML@B's Collaboration with LatchBio

ML at Berkeley · 2026-04-15

Machine Learning at Berkeley collaborated with LatchBio to benchmark their AI agent's performance on spatial transcriptomics workflows, evaluating its ability to automate complex bioinformatics tasks.

#bioinformatics

Significance and Stability Analysis of Gene-Environment Interaction using RGxEStat

Hugging Face Daily Papers · 2026-04-03

The paper introduces RGxEStat, a lightweight interactive tool that applies mixed-effect models to analyze gene-environment interactions, offering breeders a user-friendly alternative to complex SAS/R programming.

#bioinformatics

Training mRNA Language Models Across 25 Species for $165

Hugging Face Blog · 2026-03-31

This article details the development of CodonRoBERTa, an mRNA language model trained across 25 species for codon optimization at a reported training cost of $165, as part of a pipeline that also includes protein folding and sequence design.

#bioinformatics

Biohub/esm

GitHub Trending (daily) · 2026-05-29

Biohub releases ESMC, ESMFold2, and ESM Atlas — a world model for protein biology enabling state-of-the-art prediction, design, and discovery across scales, including a billion-structure atlas.
