Tag
This paper introduces a residualization-and-permutation diagnostic to separate predictability-driven from regulation-driven variance in regulatory importance scores from genomic foundation models, applied to dark genome elements at glioma-relevant loci.
TaxDistill proposes a knowledge distillation framework that uses a 500M-parameter genomic foundation model (GenomeOcean) as a teacher to improve metagenomic taxonomic annotation by reducing label noise from similarity-search tools, achieving significant F1 improvements on CAMI2 datasets.