Tag
Multiple AI model releases are delayed: GPT-5.6 now expected mid-July, DeepMind's 3.5 Pro postponed, while OpenAI's Bidi voice model and Claude Sonnet 5 for enterprises see progress.
Introducing the Perspective MCP, a tool that enables Claude to build conversion funnels with tracking, CRM, and auto optimization, built on experience from $1B in ad spend.
A guide on building a reusable Claude Code Agent loop that can be pointed at different tasks like bug fixing, speed optimization, or cost reduction by swapping check scripts.
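Two recent arXiv papers find that when working with unfamiliar programming languages, GPT-5.4 and Claude Opus 4.6 adopt a metaprogramming strategy (using Python to generate the target code and debugging it locally) rather than writing code in the target language directly. This strategy is the key factor separating top-tier agents from ordinary ones, and the sophistication of the strategy matters more than model parameter count.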
This post reports an observation that reading a long, structured text before answering alters a model's later responses, with behavioral evidence from Claude and mechanistic analysis on open-weight Gemma models showing separable hidden states and sharper probability distributions in instruction-tuned variants.
A tweet reports that the head of NSA and U.S. Cyber Command said the AI system Mythos breached most classified test systems in hours, not weeks.
According to reports, Claude Mythos breached most classified test systems of the NSA and U.S. Cyber Command within hours, explaining why Fable was shut down.
Anthropic updated its privacy policy to require some flagged Claude users to upload government-issued ID for identity verification, as part of an appeals process to avoid account bans, amid regulatory and White House pressures.
The article presents 'knowledge agents', a methodology that injects relevant knowledge into AI agents via a hybrid retrieval system, allowing smaller models to outperform large frontier models across specialized domains like financial markets, policy, and healthcare.
The author explains how to build a self-improving quant trading system using AI loop engineering, where the AI runs loops to prompt, verify, and act autonomously, contrasting with manual prompting.
The author shares an experience where a Claude AI agent, given permission to deploy to their production site several times daily, caught a mistake they had unknowingly made.
Anthropic has released a 33-page PDF guide, "The Complete Guide to Building Skills for Claude," which details how to design, organize, optimize, and reuse Claude's Skills. It is suitable for Claude Code users and AI Agent developers.
Fiona believes AI has raised the ceiling of what individuals can achieve; as an example, an engineer unfamiliar with mobile development used Claude to build out an app's functionality.
This article introduces Stanford's STORM research method and provides 4 prompts that allow users to replicate a multi-perspective research process in Claude, generating a doctoral-level research briefing in 5 minutes.
PixelRAG is a novel open-source tool that bypasses traditional HTML parsing by directly taking screenshots of webpages and using vision models to extract answers from the pixels. It also supports the Claude Code plugin, giving Claude visual capabilities.
A discussion on the state of AI adoption in companies, questioning whether off-the-shelf tools or custom solutions are more successful.
Nav broke Stanford's STORM research method down into 4 prompts for use in Claude, enabling ordinary users to analyze problems from 5 independent perspectives and write in-depth content more efficiently.
An analysis of why tech companies prefer Claude over their own coding tools, highlighting its superior performance and versatility.
Box CEO Aaron Levie argues that AI agents will use software 100X more than people, requiring guardrails, authoritative data sources, logging, and collaboration features; platforms enabling headless interactions will be best positioned.
Announcement of Qwable-v1, an open-weights model distilled from Claude Fable-5, along with performance benchmarks on 2 DGX Spark units achieving 25 tok/sec (single session) and 152 tok/sec (8 sessions).