TabPFN-3, a pre-trained tabular foundation model, was released with support for up to 1 million rows on a single GPU, 10x-1000x faster inference, and a 93% win rate over classical ML in benchmarks.
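For context on how such models are typically called: a minimal sketch, assuming TabPFN-3 keeps the scikit-learn-style interface of earlier `tabpfn` releases (the class name and defaults here are assumptions, not confirmed for this version):

```python
# Hedged sketch: assumes TabPFN-3 keeps the scikit-learn-style API of
# earlier `tabpfn` releases; class name and defaults are assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from tabpfn import TabPFNClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = TabPFNClassifier()   # pre-trained; no per-dataset training loop
clf.fit(X_train, y_train)  # stores the context for in-context learning at inference
pred = clf.predict(X_test)
print(accuracy_score(y_test, pred))
```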
Microsoft Research announces MatterSim updates including MatterSim-MT, a multi-task foundation model for materials characterization; a 3-5x simulation speedup; and experimental validation of thermal conductivity predictions for a new material.
MIT released FINGERS-7B, a 7-billion-parameter multi-omics foundation model trained on data from 30,000 individuals to predict Alzheimer's risk years in advance. The model is accessible via the AD Workbench and is accompanied by a research paper on OpenReview.
HiDream-ai has open-sourced HiDream-O1-Image (8B), a unified image generative foundation model built on a Pixel-level Unified Transformer (UiT) that natively handles text-to-image, image editing, and subject-driven personalization at up to 2048×2048 resolution without external VAEs or disjoint text encoders. It debuted at #8 in the Artificial Analysis Text to Image Arena and is positioned as a leading open-weights text-to-image model.
This paper proposes a foundation-model architecture that augments Flux Neural Operators with recurrent Vision Transformers to solve conservation laws. It demonstrates robust generalization and long-time prediction across diverse conservative systems without explicit access to the governing equations.
This paper introduces the Neural Rule Inducer (NRI), a foundation model for zero-shot logical rule induction that uses domain-agnostic statistical properties to generalize across tasks without retraining.
Google open-sourced TimesFM 2.5, a 200M-parameter, 16K-context zero-shot time-series forecasting foundation model that works out of the box on historical data.
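A minimal zero-shot forecasting sketch, based on the documented API of earlier TimesFM releases; the 2.5 checkpoint id and the hyperparameter values below are assumptions:

```python
# Hedged sketch based on the API of earlier TimesFM releases; the 2.5
# checkpoint id and hyperparameters are assumptions.
import numpy as np
import timesfm

tfm = timesfm.TimesFm(
    hparams=timesfm.TimesFmHparams(
        backend="cpu",
        context_len=512,   # the 2.5 release supports up to 16K context
        horizon_len=128,
    ),
    checkpoint=timesfm.TimesFmCheckpoint(
        huggingface_repo_id="google/timesfm-2.5-200m-pytorch",  # assumed id
    ),
)

# Zero-shot: pass raw history directly, no fine-tuning step.
history = [np.sin(np.linspace(0, 20, 400))]
point_forecast, _ = tfm.forecast(history, freq=[0])
print(point_forecast.shape)  # (1, horizon_len)
```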
LingBot-Map is a feed-forward 3D foundation model for streaming 3D reconstruction that uses a Geometric Context Transformer architecture, achieving state-of-the-art performance with stable real-time inference at ~20 FPS on long sequences exceeding 10,000 frames.
Tencent releases HY-Embodied-0.5, a suite of foundation models designed for embodied AI agents featuring a Mixture-of-Transformers (MoT) architecture with efficient 2B and powerful 32B variants for real-world robot control and spatial-temporal reasoning.
TRIBE v2 is a new foundation model that predicts how the human brain responds to complex stimuli.
Lightricks released LTX-2.3, an open-weight diffusion-based audio-video foundation model with improved quality and prompt adherence, available in multiple checkpoints including distilled and LoRA variants for local execution.
LTX-2 is introduced as an efficient joint audio-visual foundation model.
OpenAI releases GPT-5.1-Codex-Max, a frontier agentic coding model trained on software engineering tasks with native multi-context window support through compaction, designed to handle millions of tokens in a single task. The system card details comprehensive safety measures and preparedness framework evaluations across cybersecurity, biology, and AI self-improvement domains.
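A toy sketch of the general compaction idea, not OpenAI's implementation: when a transcript nears its context budget, older turns are folded into a model-written summary so work continues in a fresh window. The `summarize` callback, token proxy, and budget are hypothetical:

```python
# Toy illustration of context compaction (not OpenAI's implementation).
# `summarize` stands in for a model call that condenses old turns; the
# word-count token proxy and budget are hypothetical.
from typing import Callable

def compact(messages: list[str], budget: int,
            summarize: Callable[[list[str]], str]) -> list[str]:
    """Keep the transcript under `budget` tokens by folding the oldest
    messages into a running summary, preserving the most recent turns."""
    def total(msgs: list[str]) -> int:
        return sum(len(m.split()) for m in msgs)  # crude token proxy
    while total(messages) > budget and len(messages) > 2:
        half = len(messages) // 2
        summary = summarize(messages[:half])      # condense the older half
        messages = [f"[compacted summary] {summary}"] + messages[half:]
    return messages
```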
Google DeepMind introduces AlphaEarth Foundations, an AI model that integrates petabytes of Earth observation data into unified embeddings to map and monitor the planet at 10x10 meter resolution. The model's compact representations enable efficient planetary-scale analysis for applications in food security, deforestation tracking, and environmental monitoring.
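A hypothetical sketch of the kind of analysis compact per-pixel embeddings enable, such as similarity search for areas resembling a reference site; the 64-dimensional size and array layout below are assumptions, not the real data schema:

```python
# Hypothetical sketch of similarity search over per-pixel embeddings;
# the 64-dim size and layout are assumptions, not the real data schema.
import numpy as np

rng = np.random.default_rng(0)
embeddings = rng.standard_normal((100_000, 64)).astype(np.float32)  # one row per 10x10 m pixel
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)     # unit-normalize

reference = embeddings[42]                       # embedding of a known reference plot
similarity = embeddings @ reference              # cosine similarity on unit vectors
candidates = np.argsort(similarity)[-10:][::-1]  # 10 most similar pixels
print(candidates)
```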
Google DeepMind and Yale released C2S-Scale, a 27B parameter foundation model built on Gemma for single-cell analysis that discovered a promising drug combination (silmitasertib and interferon) to enhance immune visibility of "cold" tumors, with predictions validated through experimental confirmation.
OpenAI provides a first look at GPT-5, presented as a major advancement in large language models with potentially paradigm-shifting capabilities.
Kronos is a new foundation model for financial K-line data that uses a specialized tokenizer and autoregressive pre-training to outperform existing models in forecasting and synthetic data generation.
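A toy illustration of the general idea of discretizing K-line (OHLCV candlestick) data into tokens for autoregressive pre-training; this quantile-binning scheme is hypothetical, not Kronos's actual tokenizer:

```python
# Toy illustration of tokenizing K-line data for autoregressive pre-training.
# The quantile-binning scheme is hypothetical, not Kronos's actual tokenizer.
import numpy as np

def tokenize_kline(close: np.ndarray, n_bins: int = 256) -> np.ndarray:
    """Map each bar's log-return to a discrete token id via quantile bins."""
    returns = np.diff(np.log(close))
    edges = np.quantile(returns, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.digitize(returns, edges)  # token ids in [0, n_bins - 1]

prices = np.cumprod(1 + 0.01 * np.random.default_rng(1).standard_normal(1000)) * 100
tokens = tokenize_kline(prices)
print(tokens[:10])  # sequence fed to a decoder-only next-token objective
```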
Google has developed DolphinGemma, a large language model designed to learn and generate dolphin vocalizations, collaborating with Georgia Tech and the Wild Dolphin Project to advance understanding of dolphin communication patterns and enable potential interspecies dialogue.
This article presents the research paper on TimesFM (Time-Series Foundation Model), a decoder-only model that achieves near-optimal zero-shot performance across diverse time-series datasets by adapting large language model techniques.