Dan Shipper interviews Edwin Chen, CEO of Surge AI, about AI progress, the potential for AGI, and the implications for human motivation and uniqueness. They discuss AI's ability to solve novel math problems, the pitfalls of optimizing for engagement, and why AI still struggles with writing.
Merve (@mervenoyann) shares day-two findings from a pipeline that uses multiple small VLMs as judges for road sign detection, achieving mAP@50 = 0.8028 with only 1.3k examples. The thread compares model rejection rates and discusses dataset shrinking, super-specific prompts, and plans to generalize the library.
A World Bank report and documentary uncover that AI systems depend on a hidden workforce of 150-430 million low-paid data workers, often refugees or crisis-stricken populations in the Global South, who perform exploitative labeling and annotation tasks under secrecy.
The article exposes how millions of low-paid workers in developing countries perform essential data labeling and content moderation for AI models under exploitative conditions, while tech companies obscure this human labor to maintain an illusion of full automation.
Meta's three-month-old Applied AI unit, staffed by engineers forced into the role, is described as a 'soul-crushing gulag' due to tedious AI training tasks like generating puzzles and coding problems, leading to internal protests and low morale.
An Indian woman earns $2.60 per hour recording herself performing household chores to provide training data for AI-powered robots.
Gergely Orosz highlights that software engineers at Scale AI and Meta are being assigned manual data labeling tasks, a practice that new leadership at Scale AI stopped after finding it concerning.
Contract workers at Covalen, a company that provides content moderation and data labeling for Meta's AI products, protest layoffs outside Meta's Dublin office, demanding improved severance packages and an end to a six-month cooldown period.
Adala is an open-source framework for autonomous data labeling agents that learn skills iteratively through interaction with ground truth datasets and LLM runtimes.
Tack is a free, browser-based tool that allows users to mark points, draw polygons, and export coordinate data as JSON or YAML without uploading images to a server.