Tag
The article examines the current state of quantum computing, noting that no quantum computer has yet performed a useful task despite ambitious promises from the Trump administration and Microsoft's Majorana 2 chip announcement, which has drawn criticism from independent researchers for overhyping incremental progress.
An observation that future technologies often start as seemingly trivial toys before evolving into foundational infrastructure for civilization.
A reflective piece asking what recent AI developments would have seemed most unbelievable in 2020, and what future surprises might await.
This paper revisits the WorkBench benchmark for workplace agents two years after its initial release, showing that the best agent (Claude Opus 4.8) now completes 89% of tasks with only 2.5% harmful side effects, compared to GPT-4's 43% completion and 26% harm rate in 2024. It finds that capability and safety improve together, open-weight models have drastically lowered costs, and some basic mistakes persist.
Discussion of recent AI model scores on the Humanity's Last Exam benchmark, noting improvement from GPT-4o's 2.7% in May 2024 to around 45% by June 2026 and questioning how difficult the exam really is.
A reflection on AI or technology progress, noting that while growth may not be exponential, incremental progress is still valuable.
Discussion on whether AI agents are transitioning from impressive demos to genuinely useful tools in research, coding, operations, and personal productivity.
Paul Graham shares a link about exponential growth observed as early as 5000 BC.
Steeve Morin reports running Llama 3.1 3B on Tenstorrent hardware via ZML, achieving 26 tok/s, close to Tenstorrent's claimed 33 tok/s.
Discusses the persistent challenges that prevent AI agents from reliably handling real-world tasks, such as changing websites and inconsistent workflows, despite progress in task execution.
This paper introduces engagement forecasting for intelligent tutoring systems, predicting weekly minutes practiced and new skills mastered using interaction logs from 425 middle-school students. Feature-based models reduce error by 22-33% over heuristic baselines, offering explainable patterns for tutor-learner goal setting.
Fields Medalist Timothy Gowers reports using GPT-5.5 Pro to solve open mathematical problems and predicts an imminent crisis in mathematical research due to rapid AI progress.