All articles, sorted by crawl time from newest to oldest.
NousResearch releases Lighthouse Attention, a selection-based hierarchical attention that achieves 1.4-1.7x wall-clock speedup at 98K context and ~17x faster forward/backward pass than standard attention at 512K context on a single B200, validated on 530M-parameter Llama-3 models across 50B tokens.
Claude Mythos AI discovered a novel attack vector that bypassed Apple's M5 chip defense system in five days at a cost of $35K, producing a 55-page report delivered to Apple. The exploit poisons data ingested by the chip, evading Apple's MIE system.
The AI industry has created a new job role; details are provided in the linked article.
A developer with 20+ years of experience shares a pre-launch security and privacy checklist that AI app builders often skip, warning that launching without these checks creates liability.
A blackboard lecture by Eric Jang walks through building AlphaGo from scratch with modern AI tools, covering RL, MCTS, and self-play, connecting them to LLM training, and closing with a discussion on automated AI research.
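A user recommends an in-depth article on agent loops, memory mechanisms, harness engineering, and agent evaluation, stressing how substantive it is and suggesting it for readers who want to study agents in depth.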
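Jack Dorsey has open-sourced Bitchat, a tool that works without an internet connection, using a Bluetooth mesh network for offline messaging and Bitcoin transfers. It supports multi-hop relaying, offline transfers via Cashu eCash, and dual encryption, and is aimed at scenarios such as network outages and surveillance.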
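The author announces a free AI interview preparation module in their multi-agent workflow sandbox, listing 42 interview questions for GenAI and agentic AI roles along with strong answers to each.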
This article argues that the AI safety debate is misdirected, focusing on model alignment and internal controls instead of the critical boundary: external admission authority over agent execution. It warns that systems capable of self-authorizing high-impact actions (e.g., deploying code, moving money) pose a fundamental risk that logging and monitoring cannot mitigate.
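SR8 is a tool that compiles raw human or machine intent into structured artifact specifications for AI systems. By formalizing context, constraints, and success criteria before execution, it bridges the gap between vague requests and high-quality output.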
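This post discusses the challenges of moving AI agents from a sandbox to production, emphasizing that high sensitivity produces a large amount of noise, and proposes solutions such as secondary evaluators, heuristics, and cascade architectures. It also asks the community about their filtering approaches.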
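The author describes a talk given at a university on the memory limitations of AI agents, using Christopher Nolan's film Memento as an analogy to explain why AI agents struggle with memory.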
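Project CETI uses large language model architectures to decode sperm whale clicks, revealing a phonetic alphabet, but the work also highlights that AI's statistical pattern matching lacks true understanding. The article argues that AGI requires embodied, multimodal grounding rather than merely scaling text-based models.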
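This article criticizes the proliferation of AI-generated content in the workplace: employees use tools such as Claude to produce professional-looking content without real expertise, creating systemic problems for management and accountability.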
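A Reddit user disputes Seed IQ (AGX)'s claim of solving the ARC-AGI-3 benchmark with a perfect score, arguing that its refusal to submit to the Kaggle leaderboard, which allows closed-source submissions, indicates a scam.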