Tag
FrontierSmith automatically generates diverse open-ended coding problems from closed-ended tasks, improving LLM coding performance on benchmarks through enhanced agent interactions and training data synthesis.
Adaption launched AutoScientist, an AI tool that automates fine-tuning to help models learn capabilities quickly, aiming to make frontier AI training more accessible.
The article analyzes how SpaceX is emerging as a major compute provider for AI companies, with deals supplying GPUs to Anthropic and Cursor, and Google exploring orbital data centers through SpaceX.
The article argues that serious AI companies are moving from wrapping general models to training their own specialized models on proprietary interaction data. Specialization now routinely matches or beats frontier models on in-distribution agentic tasks, which improves unit economics.
Adaption AI introduces AutoScientist, a tool that automates the full research loop to make model training more accessible outside of frontier labs.
Thinking Machines Lab is hiring supercomputing engineers in NYC and SF to build infrastructure for real-time interactive models and large-scale training.
A Hollywood screenwriter details the transition from TV writing to AI training gigs amidst industry instability following the 2023 strikes. The article highlights the harsh realities of the AI labor market, including red-teaming tasks and gig platform dynamics.
Linus Ekenstam explains his preference for using HTML instead of Markdown when building context for AI, citing broader training data availability for HTML.
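The article explores the difficulty and cost of model distillation, using the distillation of DeepSeek R1 into Llama 3 8b and Qwen 2.5 7b as examples, and asks why distilled models are not more common.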
The article discusses the importance of quality control for reinforcement learning data, outlining the shortcomings of current data vendors and the evaluation criteria used by frontier AI labs for RL data.
AMD introduces the Instinct MI350P accelerator featuring CDNA 4 architecture in a PCIe form factor, though pricing and availability details are not yet announced.
Tendem by Toloka is a platform that connects AI developers with human experts for data annotation and training.
OpenAI has released MRC (Multipath Reliable Connection), a novel networking protocol developed with industry partners to improve performance and resilience in large-scale AI training clusters. The specification was published via the Open Compute Project to standardize infrastructure for efficient supercomputer operations.
The high cost of living in San Francisco pushes even high-earning physicians to take AI tutoring side gigs with companies like Mercor and Handshake.
Meta is mandating AI-training software on US employees’ work laptops that logs keystrokes and mouse movements, prompting internal backlash over privacy despite company claims of safeguards.
Meta is installing keystroke, mouse and screenshot monitoring software on employee PCs to gather real-world usage data for building AI agents, prompting internal unease.
Meta is deploying internal tracking software on US employees’ PCs to record mouse/keyboard actions and occasional screen snapshots, aiming to improve AI agents that automate workplace tasks.
Atlassian has enabled data collection by default to use customer data for training AI models, raising privacy concerns among enterprise users.
Teknium observes that the Hermes agent initially behaves inefficiently but gains large efficiency boosts after solving a task once, likening it to "linearized RL."
Commonwealth Bank of Australia is rolling out ChatGPT Enterprise to nearly 50,000 employees to build AI fluency across the organization and improve customer outcomes through better workflows and agent-powered use cases.