Tag
A tweet observes that current social media influencers serve as training data for the next generation of AI-generated influencers.
A developer shares surprising lessons from fine-tuning a small open model: base models are often already at ceiling on the capabilities a fine-tune is meant to improve, the real weakness is behavioral (caving), and fine-tuning requires careful measurement and balancing.
Meta employees are petitioning against the Model Capability Initiative (MCI), which collects computer-use data like keystrokes, mouse movements, and screen content for AI training, raising serious privacy and regulatory concerns.
A tweet observes that frontier AI labs are spending billions to hire professionals from diverse fields (poets, musicians, accountants, and others) to annotate massive datasets, and calls it a brute-force bet that seems to be working.
Advice on preventing artwork from being ingested by LLMs, covering options like not posting online, using login walls, or deploying crawler defenses like iocaine, while noting the difficulty of ensuring effectiveness.
A podcast with Peter Diamandis discusses how AI models learn villainous behavior from Hollywood depictions of AI, and introduces the Future Vision XPRIZE to incentivize positive visions of the future where AI collaborates with humanity.
A promotion for a complete hands-on Claude AI course, urging users to spend one hour learning to build from scratch and automate tasks, and so gain a new hard skill.
A discussion about pooling a community's GPUs to train a massive AI model, asking whether this is feasible and whether any existing projects attempt it, given known bottlenecks such as latency and weight poisoning.
Nvidia-backed startup Decart released Oasis 3, a world model designed to provide advanced training environments for physical AI and robotics, aiming to accelerate the robotics revolution.
A discussion explores whether AI training could be decentralized like Bitcoin mining, with participants contributing GPU resources to train open-source models in exchange for tokens, raising questions about verification, fake gradients, and efficiency.
An analysis of how Anthropic's Claude Fable was built, arguing that the key moat is verifiable training signals rather than architecture secrets, with the model using static and interactive optimal data for reinforcement learning.
Niantic Spatial, spun out of Niantic, used billions of real-world images from Pokémon Go players to train AI navigation systems for delivery robots and potentially military drones, raising privacy and ethical concerns.
A 25-year-old housewife in Chennai earns ₹250/hour filming her daily housework for AI companies training humanoid robots, as part of a growing gig economy where thousands in India record everyday tasks to train future robots.
This paper presents a mathematical forum platform that integrates an image-to-LaTeX conversion pipeline directly into the posting interface, reducing friction for users. The system is designed to generate a community-validated dataset of math problems and solutions for training AI reasoning systems.
Indian workers paid 250 rupees per hour strap phones to their heads to record themselves doing household chores, providing training data for AI robots. They film more than 90 different scenes and angles every day, a workload that highlights the labor issues behind AI training.
Launching Use Computer, infrastructure for evaluating and training AI models to use various computers.
An Indian woman earns $2.60 per hour recording herself performing household chores to provide training data for AI-powered robots.
Data collected from Pokémon Go scans has reportedly been used to train navigation technology for military drones, raising concerns about unintended military applications of consumer data.
Google faces a lawsuit from independent musicians alleging that it used their YouTube uploads without authorization to train its Lyria music AI. Google declines to confirm the practice, but its terms of service and past statements suggest that it does use such uploads.
Google is introducing a new Search Services History setting that saves images, audio, and video from Lens, Search Live, and Translate to improve its AI models and personalization, with an option to opt out.