A user shares their redesign of the 'AI Engineering from Scratch' website, a reference manual that explains AI concepts like transformers and backpropagation through from-scratch mathematical implementations.
Angeliki Giannou, co-inventor of Looped Transformers, has successfully defended her PhD thesis and is set to begin a new role. Congratulations were shared by Dimitris Papailiopoulos on social media.
FormalSLT is a Lean 4 library that formally proves finite-sample statistical learning theory results (ERM, VC bounds, Rademacher bounds, PAC-Bayes, etc.) with explicit assumptions and zero sorry statements, providing a machine-checked foundation for ML theory.
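To illustrate the kind of finite-sample result such a library formalizes, here is the textbook uniform-convergence bound for a finite hypothesis class, obtained from Hoeffding's inequality plus a union bound. This is a standard statement, not necessarily FormalSLT's exact formulation:

```latex
% For a finite class \mathcal{H}, a loss bounded in [0,1], and n i.i.d.
% samples, with probability at least 1 - \delta over the draw of the sample:
\sup_{h \in \mathcal{H}} \left| L(h) - \hat{L}_n(h) \right|
  \;\le\; \sqrt{\frac{\ln |\mathcal{H}| + \ln(2/\delta)}{2n}}
```

The ERM and VC/Rademacher results the summary mentions generalize this bound from finite classes to infinite ones by replacing $\ln|\mathcal{H}|$ with a complexity measure of the class.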
Scientists used a machine learning algorithm to analyze TESS data, identifying over 10,000 new exoplanet candidates, potentially tripling the known count. One candidate was confirmed as a hot Jupiter, validating the method.
Researchers propose applying the "global ignition" consciousness mechanism from cognitive science to long-context engineering, introducing the MiA-Signature method that uses submodular selection of high-level concepts to cover the activation space. Applied to RAG and agentic systems, it delivers consistent performance improvements across multiple long-context tasks.
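The submodular-selection step can be sketched with the standard greedy algorithm for monotone submodular coverage. This is a minimal illustration of greedy coverage maximization in general, not the paper's MiA-Signature implementation; the item names and coverage function are hypothetical:

```python
# Greedy maximization of a monotone submodular coverage objective:
# repeatedly pick the concept with the largest marginal coverage gain.
def greedy_submodular_select(items, coverage_of, k):
    """Pick up to k items greedily by marginal coverage gain.

    items       -- candidate concepts
    coverage_of -- maps an item to the set of regions it covers
    k           -- selection budget
    """
    covered = set()
    selected = []
    for _ in range(k):
        best, best_gain = None, 0
        for item in items:
            if item in selected:
                continue
            gain = len(coverage_of(item) - covered)
            if gain > best_gain:
                best, best_gain = item, gain
        if best is None:  # no remaining item adds coverage
            break
        selected.append(best)
        covered |= coverage_of(best)
    return selected, covered

# Toy usage: three concepts covering overlapping activation regions 0-5.
regions = {"a": {0, 1, 2}, "b": {2, 3}, "c": {3, 4, 5}}
picked, covered = greedy_submodular_select(list(regions), lambda i: regions[i], 2)
# "a" covers 3 new regions first, then "c" adds 3 more; "b" is redundant.
```

The classic Nemhauser-Wolsey-Fisher result guarantees this greedy rule achieves at least a (1 - 1/e) fraction of the optimal coverage, which is why submodular selection is a common tool for context curation.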
A comprehensive, open-source GitHub repository providing structured learning roadmaps and curated resources for mastering AI, machine learning, deep learning, and large language models from beginner to advanced levels. Designed for students and professionals, it covers foundational concepts, programming frameworks, career tracks, and emerging AI topics.
Andrew Ng shares his Stanford CS229 lecture covering core machine learning mathematics, including locally weighted regression, maximum likelihood, logistic regression, and Newton's method, providing developers with a comprehensive guide to ML fundamentals.
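Locally weighted regression, the first topic in that lecture, fits a separate weighted least-squares line for each query point, with Gaussian weights that emphasize nearby training examples. A minimal sketch following the standard CS229 formulation (the function name and bandwidth value are illustrative):

```python
import numpy as np

def lwr_predict(X, y, x_query, tau=0.5):
    """Locally weighted linear regression at a single query point.

    Each training point x_i gets weight w_i = exp(-(x_i - x)^2 / (2 tau^2)),
    then we solve the weighted normal equations for a local linear fit.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones_like(X), X])       # design matrix [1, x]
    w = np.exp(-((X - x_query) ** 2) / (2 * tau ** 2))
    W = np.diag(w)
    # theta = (A^T W A)^{-1} A^T W y
    theta = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)
    return theta[0] + theta[1] * x_query

# On exactly linear data the local fit recovers the line at any query point.
pred = lwr_predict([0.0, 1.0, 2.0, 3.0], [0.0, 2.0, 4.0, 6.0], 1.5)
```

The bandwidth tau controls the bias-variance trade-off: small tau tracks local structure closely, large tau approaches ordinary linear regression.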
Token AI releases a research paper introducing STAM, a new adaptive momentum optimizer designed to improve training stability and reduce memory usage compared to standard optimizers like AdamW.
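Since STAM's update rule isn't given here, a useful point of reference is the baseline it is compared against. Below is a sketch of one AdamW step, the standard adaptive-momentum update with decoupled weight decay; this is the comparison optimizer, not STAM itself:

```python
import numpy as np

def adamw_step(param, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999,
               eps=1e-8, weight_decay=0.01):
    """One AdamW update; returns the new (param, m, v) state.

    m, v are the first- and second-moment running averages; t is the
    1-indexed step count used for bias correction.
    """
    m = b1 * m + (1 - b1) * grad            # first-moment EMA
    v = b2 * v + (1 - b2) * grad ** 2       # second-moment EMA
    m_hat = m / (1 - b1 ** t)               # bias-corrected moments
    v_hat = v / (1 - b2 ** t)
    # Decoupled weight decay: applied directly to the parameter,
    # not folded into the gradient.
    param = param - lr * (m_hat / (np.sqrt(v_hat) + eps) + weight_decay * param)
    return param, m, v
```

Note the memory cost the summary alludes to: AdamW keeps two extra buffers (m and v) per parameter, which is the footprint a more frugal optimizer would aim to shrink.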
This paper introduces SDFlow, a similarity-driven flow matching framework for time series generation that addresses exposure bias in autoregressive models. It achieves state-of-the-art performance and inference speedups by operating in the frozen VQ latent space with low-rank manifold decomposition.
This paper proposes a locality-aware private class identification approach and a reliable optimal transport-based method (ReOT) to address domain adaptation challenges under extreme label shift, particularly distinguishing shared from private classes.
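ReOT builds on optimal transport, whose standard computational workhorse is Sinkhorn's algorithm for the entropically regularized problem. A minimal Sinkhorn sketch of that generic building block follows; this is not ReOT's reliability mechanism, and the regularization value is illustrative:

```python
import numpy as np

def sinkhorn(a, b, C, reg=0.1, n_iters=200):
    """Entropic-regularized OT plan between histograms a and b.

    C is the cost matrix; alternating scaling updates enforce the
    row marginals a and column marginals b of the transport plan.
    """
    K = np.exp(-C / reg)                    # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)                   # match column marginals
        u = a / (K @ v)                     # match row marginals
    return u[:, None] * K * v[None, :]      # transport plan diag(u) K diag(v)

# Toy usage: two matched masses with high cross-cost stay in place.
a = np.array([0.5, 0.5])
b = np.array([0.5, 0.5])
C = np.array([[0.0, 1.0], [1.0, 0.0]])
P = sinkhorn(a, b, C)
```

In domain adaptation, the plan P matches source and target samples; methods like ReOT then add structure on top of this matching to separate shared classes from private ones.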
This paper from Apple introduces Annotator Policy Models (APMs), which use interpretability techniques to infer annotators' internal safety policies from their labeling behavior, without requiring additional annotation effort. The authors show that APMs accurately model these policies and distinguish among sources of annotation disagreement, such as operational failures, policy ambiguity, and value pluralism.
This paper presents a comprehensive benchmark for evaluating adversarial attacks and defenses in Graph Neural Networks, highlighting the need for standardized and fair experimental protocols.
This paper introduces MOSAIC, a method for module discovery in scientific time series that combines causal representation learning with sparse additive identifiable causal learning. It aims to recover interpretable latent variables and their associated observations without post-hoc alignment, validated on domains like molecular dynamics and climate data.
This paper introduces NM-PPG, a non-myopic active feature acquisition method using pathwise policy gradients to optimize sequential feature selection in costly prediction scenarios.
This paper introduces TIDE, a method that addresses the Rare Token and Contextual Collapse problems in LLMs by injecting token identity into every layer via Embedding Memory. The authors demonstrate theoretical and empirical improvements across language modeling and downstream tasks.
This paper identifies a critical 'model collapse' issue in standard fine-tuning for causal reasoning and proposes a semantic loss function with graph-based logical constraints to prevent it.
This paper introduces SPADE, a novel algorithm for drug discovery that efficiently identifies high-quality ligands from sparse data using only ~40 tests. It demonstrates superior sample efficiency and speed compared to deep learning and Bayesian optimization methods.
This paper introduces CopyCop, an algorithm for verifying ownership of Graph Neural Networks by detecting surrogate models even when they differ in architecture, weights, or output transformations.
This paper presents an end-to-end pipeline for identifying and forecasting green skill demand using online job postings from Mexico's automotive industry. It benchmarks 15 time-series forecasting models, finding transformer-based models like FEDformer and Informer perform best, and introduces a two-dimensional framework to classify skills by growth dynamics.
This research paper introduces Chainwash, a multi-step rewriting attack that effectively removes statistical watermarks from diffusion language model (LLaDA-8B-Instruct) outputs, reducing detection rates from 87.9% to 4.86% after five chained rewrites.