Linus Ekenstam explains his preference for using HTML instead of Markdown when building context for AI, citing the broader availability of HTML in training data.
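As an illustration of the idea above, a prompt's sections can be wrapped in semantic HTML rather than Markdown headings. The helper below is a hypothetical sketch, not code from the article; the tag choices are assumptions.

```python
def html_context(sections):
    # sections: dict mapping a heading to its body text.
    # Emits semantic HTML (<section>/<h2>/<p>) instead of Markdown
    # "## heading" blocks; structure and names are illustrative only.
    parts = []
    for heading, body in sections.items():
        parts.append(f"<section>\n<h2>{heading}</h2>\n<p>{body}</p>\n</section>")
    return "\n".join(parts)

print(html_context({
    "Role": "You are a careful technical editor.",
    "Task": "Fix typos without changing meaning.",
}))
```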
The article examines the difficulty and cost of model distillation, using DeepSeek R1 distilled into Llama 3 8B and Qwen 2.5 7B as examples, and asks why distilled models are not more common.
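For readers unfamiliar with the technique the article discusses, the core of logit-based distillation is training the student to match the teacher's temperature-softened output distribution. This is a minimal sketch of Hinton-style soft-label distillation on toy logits, not the specific recipe used for the DeepSeek R1 distillations.

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax over a list of logits.
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, T=2.0):
    # KL(teacher || student) on temperature-softened distributions,
    # scaled by T^2 so gradients stay comparable across temperatures.
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return T * T * kl

# A student that exactly matches the teacher incurs zero loss.
print(distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0]))  # 0.0
```

In practice the same loss is computed over every token position of large text corpora, which is one reason distillation remains expensive.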
The article discusses the importance of quality control for reinforcement learning data, outlining the shortcomings of current data vendors and the evaluation criteria used by frontier AI labs for RL data.
AMD introduces the Instinct MI350P accelerator, featuring the CDNA 4 architecture in a PCIe form factor, though pricing and availability details have not yet been announced.
OpenAI has released MP-RC (Multipath Reliable Connection), a novel networking protocol developed with industry partners to improve performance and resilience in large-scale AI training clusters. The specification was published via the Open Compute Project to standardize infrastructure for efficient supercomputer operations.
High cost of living in San Francisco pushes even high-earning physicians to take AI tutoring side gigs with companies like Mercor and Handshake.
Meta is mandating AI-training software on US employees’ work laptops that logs keystrokes and mouse movements, prompting internal backlash over privacy despite company claims of safeguards.
Meta is installing keystroke, mouse and screenshot monitoring software on employee PCs to gather real-world usage data for building AI agents, prompting internal unease.
Meta is deploying internal tracking software on US employees’ PCs to record mouse/keyboard actions and occasional screen snapshots, aiming to improve AI agents that automate workplace tasks.
Atlassian has enabled data collection by default to use customer data for training AI models, raising privacy concerns among enterprise users.
Teknium observes that the Hermes agent initially behaves inefficiently but gains large efficiency boosts after solving a task once, likening it to "linearized RL."
Commonwealth Bank of Australia is rolling out ChatGPT Enterprise to nearly 50,000 employees to build AI fluency across the organization and improve customer outcomes through better workflows and agent-powered use cases.
OpenAI announces a new Residency program offering a six-month pathway to full-time employment for researchers and engineers transitioning into AI, with participants compensated as salaried employees and support for diverse, unconventional backgrounds.
OpenAI researchers discovered that the gradient noise scale, a simple statistical metric, predicts the parallelizability of neural network training across a wide range of tasks. They found that more complex tasks and more powerful models tolerate larger batch sizes, suggesting future AI systems can scale further through increased parallelization.
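The metric in question, the "simple" gradient noise scale, is the trace of the per-example gradient covariance divided by the squared norm of the mean gradient; a larger value means larger batches remain useful. Below is a sketch of that estimator on synthetic gradients, with hypothetical data, not the paper's exact estimation procedure.

```python
import numpy as np

def simple_noise_scale(per_example_grads):
    # per_example_grads: (n, d) array, one flattened gradient per example.
    # B_simple = tr(Sigma) / |G|^2, where G is the mean gradient and
    # Sigma the per-example gradient covariance.
    g = np.asarray(per_example_grads, dtype=float)
    mean_grad = g.mean(axis=0)
    trace_sigma = g.var(axis=0, ddof=1).sum()  # sum of per-coordinate variances
    return trace_sigma / np.dot(mean_grad, mean_grad)

rng = np.random.default_rng(0)
# Hypothetical gradients: a shared signal plus noise. More noise relative
# to signal yields a larger noise scale, i.e. larger useful batch sizes.
grads = rng.normal(loc=1.0, scale=2.0, size=(1000, 10))
print(simple_noise_scale(grads))
```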
OpenAI releases benchmark results for OpenAI Five, their Dota 2 playing system, detailing training methodology across six major revisions with compute requirements ranging from 8 to 35 petaflop/s-days and introducing new network architecture tooling.
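To put the quoted range in absolute terms, a petaflop/s-day is 10^15 FLOP/s sustained for one day. A one-line conversion (the 8 and 35 figures are from the summary above):

```python
def pfs_days_to_flops(pfs_days):
    # One petaflop/s-day = 1e15 FLOP/s sustained for 86,400 seconds.
    return pfs_days * 1e15 * 86_400

for x in (8, 35):
    print(f"{x} petaflop/s-days = {pfs_days_to_flops(x):.3e} FLOPs")
```

So the six revisions span roughly 6.9e20 to 3.0e21 total floating-point operations.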
OpenAI Scholars is a mentorship and grant program for underrepresented groups in science and engineering to learn deep learning over a three-month period. Applications are open with rolling reviews starting March 14th and closing March 31st, 2018.
In a podcast episode, OpenAI discusses why AI training requires a new type of supercomputer network and introduces the Multipath Reliable Connection (MP-RC) protocol to address tail latency in synchronous workloads.