Tag
A tweet argues that struggling with math, coding, or other hard skills is typically due to missing prerequisites rather than a lack of talent, and encourages learners to fill those gaps.
Kyle Kabasares claims to have used OpenAI's ChatGPT-5.5 Pro to generate a candidate counterexample to an open problem from Don Knuth's The Art of Computer Programming, and requests verification.
This paper introduces blockwise policy-drift gating, a lightweight method for on-policy distillation of language models that weights the loss by the shift between the old and current student probabilities, improving reasoning accuracy on math benchmarks.
The author used signed distance functions (SDFs) in Shadertoy to create a tower composed of bricks and mortar, achieving circumferential and vertical repetition along with randomization.
An open-source project offering 503 lessons across 20 phases, teaching each algorithm from raw mathematical foundations before introducing any framework.
A guide on building a Voice AI capable of performing mathematical calculations and generating accurate quotes.
The VibeThinker-3B model achieves state-of-the-art math and coding reasoning performance, scoring 94.3 on AIME'26 and 96.1% on unseen LeetCode problems, demonstrating that small models can reach frontier-level reasoning in verifiable domains.
VibeThinker-3B is a 3B-parameter model that achieves frontier-level reasoning on math, coding, and STEM benchmarks by optimizing the Spectrum-to-Signal Principle (SSP) post-training pipeline, reaching results comparable to those of much larger models.
The author shares a personal project implementing the Red & Black knights pattern from a Numberphile video, finding joy in the emergent patterns from a number spiral, and plans to optimize and extend it.
edulab, an open-source education skill tool, adds an analytic geometry problem type in this update. It supports random problem generation, a dynamic geometry board (2D Canvas drawing of curves, moving lines, moving points, vectors, etc.), and KaTeX step-by-step parsing.
A short mathematical write-up on Principal Component Analysis (PCA), explaining the concept and its applications.
A 178-page survey study from the University of Huddersfield covering math and generative AI foundations, titled 'The Little Book of Generative AI Foundations'.
This work identifies Supervision Fidelity Decay (SFD) in on-policy distillation, where teacher supervision degrades as student sequences lengthen, and proposes Lookahead Group Reward (LGR) to mitigate it, improving performance on math and code benchmarks.
A critique of the AI tutor Koji, highlighting flaws in its math teaching approach, such as allowing students to fumble without guidance and missing key conceptual explanations.
Anthropic's new AI model Claude Mythos, using the Claude Code framework, reportedly solved Erdős's distinct distances problem by finding alternative simple proofs, following OpenAI's earlier disproof. The post presents this as evidence that LLMs can make independent scientific breakthroughs.
A thread highlights two separate insights: a Google researcher found that adding 'you are an MIT mathematician' to a prompt fixes math errors in LLMs, and Alex Albert explains how Anthropic trains Claude's personality. Both resources are free and offer deep dives into how LLMs actually work.
A chart summarizing recent math problems that AI models have successfully solved, highlighting progress in automated reasoning and symbolic mathematics.
HRM-Text introduces a Hierarchical Recurrent Model that decouples computation into slow and fast layers, enabling efficient pretraining from scratch on only 40 billion tokens and a $1,500 budget while achieving performance competitive with larger models.
The paper proposes a method that uses mismatched wrong drafts from a weaker model to elicit superior reasoning in a stronger learner via GRPO, achieving state-of-the-art results with Mathstral-7B on the MATH-500 and AIME benchmarks.
Polypad is a free, interactive platform offering virtual manipulatives for math education, requiring no login and working across devices.