Newest

WildTableBench: Benchmarking Multimodal Foundation Models on Table Understanding In the Wild

Hugging Face Daily Papers ↗ · 2026-05-01 Cached

WildTableBench introduces the first question-answering benchmark for real-world table images, revealing that existing multimodal foundation models struggle significantly with structural perception and numerical reasoning, with only one model exceeding 50% accuracy.

Aligning Latent Geometry for Spherical Flow Matching in Image Generation

Hugging Face Daily Papers ↗ · yesterday Cached

This paper aligns latent geometry for spherical flow matching: latents are projected onto a fixed-radius sphere and interpolated with spherical linear interpolation (slerp), which consistently improves FID on class-conditional ImageNet.
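
As a rough illustration of the two operations named in the summary, the sketch below projects vectors onto a sphere of fixed radius and interpolates between them with slerp. It is not the paper's implementation: the function names, the `radius` parameter, and the pure-Python list representation are assumptions made for this example only.

```python
import math


def project_to_sphere(v, radius=1.0):
    """Rescale v so that its Euclidean norm equals `radius` (hypothetical helper)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot project the zero vector onto a sphere")
    return [radius * x / norm for x in v]


def slerp(a, b, t, radius=1.0):
    """Spherical linear interpolation between a and b, both first projected
    onto the sphere of the given radius. t=0 gives a, t=1 gives b."""
    a = project_to_sphere(a, radius)
    b = project_to_sphere(b, radius)
    # Angle between the two points, clamped to guard against rounding error.
    cos_omega = sum(x * y for x, y in zip(a, b)) / (radius * radius)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if sin_omega < 1e-8:
        if cos_omega > 0:
            return list(a)  # the two points (nearly) coincide
        raise ValueError("slerp is undefined for antipodal points")
    wa = math.sin((1.0 - t) * omega) / sin_omega
    wb = math.sin(t * omega) / sin_omega
    return [wa * x + wb * y for x, y in zip(a, b)]


if __name__ == "__main__":
    midpoint = slerp([3.0, 0.0, 0.0], [0.0, 0.0, 5.0], 0.5, radius=2.0)
    print(midpoint, math.sqrt(sum(x * x for x in midpoint)))
```

Unlike straight-line interpolation, every intermediate point produced this way keeps the same norm as the endpoints, so the path stays on the sphere.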

QR code generator

Simon Willison's Blog ↗ · 16h ago Cached

A QR code generator tool with customizable styling options, built with Claude's help. Supports URLs, text, and WiFi network codes.

More evidence of Mythos's strength in Cybersecurity/Hacking - compared to 5.5, it got 18/41 n-day exploits, vs 1/41. Open Source/Weights models get nothing

Reddit r/singularity ↗ · 2h ago

According to the post, Mythos succeeded on 18 of 41 n-day exploits, compared with 1 of 41 for 5.5, while open-source and open-weights models succeeded on none.

@SOURADIPCHAKR18: We describe early experiments on pedagogical RL: A bitter-lesson-pilled paradigm of training privileged self-teache…

X AI KOLs Following ↗ · 22h ago Cached

Introduces pedagogical RL, a paradigm where privileged self-teachers are trained to generate correct and easy-to-follow rollouts, showing it is a relatively easy RL problem.

@charles_irl: Added a fun lil widget to the LLM Engineer's Almanac -- a "Token Timing Simulator" so you can get a visceral feel for w…

X AI KOLs Following ↗ · 1h ago Cached

A "Token Timing Simulator" widget was added to the LLM Engineer's Almanac to give users a visceral feel for benchmark throughput numbers; it demonstrates the DFlash technique reaching ~1k tokens per second (TPS).

@0xKevin00: The more sarcastic back then, the more awkward now. The most dramatic moment at yesterday's state banquet: Lenovo CEO Yang Yuanqing and Musk had a subtle past. Ten years ago, they shared the stage on CCTV's <Dialogue>. Musk talked about Tesla not doing marketing, relying on products and word-of-mouth. Yang sneered: 'If we all stop advertising, will your industry still have food to eat...'

X AI KOLs Timeline ↗ · 12h ago

The post recalls a CCTV appearance ten years ago in which Lenovo CEO Yang Yuanqing mocked Tesla's no-marketing approach, and contrasts it with the present, when Musk's wealth far exceeds Yang's.

@AnjneyMidha: live today at 12pm pst on @CS153Systems office hours with Senior White House Policy Advisor on AI @Sriramk bring your q…

X AI KOLs Following ↗ · 1h ago Cached

Live office hours event today with Senior White House Policy Advisor on AI Sriram Krishnan.

@DivyanshT91162: Claude Code just crossed a dangerous line. It can now REVERSE-ENGINEER the UI of almost any website. Introducing AIDesi…

X AI KOLs Timeline ↗ · 17h ago

AIDesigner MCP v2 allows AI coding agents to reverse-engineer any website's UI, extracting branding, assets, and components to rebuild entire design systems automatically, enabling rapid cloning and redesign of elite SaaS interfaces.

@RoundtableSpace: NVIDIA CEO JUST SHOWED A $249 DESKTOP AI COMPUTER THAT CAN RUN LARGE LANGUAGE MODELS LOCALLY

X AI KOLs Timeline ↗ · 13h ago Cached

NVIDIA CEO revealed a $249 desktop AI computer that can run large language models locally, making AI more accessible.

@elonmusk: Most airlines are partnering with @Starlink. The others will have terrible WiFi and lose customers as a result.

X AI KOLs Following ↗ · 3h ago Cached

Elon Musk states that most airlines are partnering with Starlink for WiFi, and those that don't will have poor WiFi and lose customers.

@NousResearch: Today we release Lighthouse Attention, a selection-based hierarchical attention for long-context pre-training that deli…

X AI KOLs Following ↗ · 3h ago

NousResearch releases Lighthouse Attention, a selection-based hierarchical attention that achieves 1.4-1.7x wall-clock speedup at 98K context and ~17x faster forward/backward pass than standard attention at 512K context on a single B200, validated on 530M-parameter Llama-3 models across 50B tokens.

@cryptopunk7213: claude mythos just broke Apple's $2 billion defense system. it did so by discovering a completely different attack vect…

X AI KOLs Timeline ↗ · 4h ago

Claude Mythos AI discovered a novel attack vector that bypassed Apple's M5 chip defense system in five days at a cost of $35K, producing a 55-page report delivered to Apple. The exploit poisons data ingested by the chip, evading Apple's MIE system.

@atmoio: The AI industry just invented a new job. Wait until you hear what it does.

X AI KOLs Following ↗ · 5h ago Cached

The AI industry has created a new job role; details are provided in the linked article.

@PrajwalTomar_: Vibe coders are getting sued. People are launching apps with real users but skipping the boring stuff that can actually…

X AI KOLs Following ↗ · 6h ago

A developer with 20+ years of experience shares a pre-launch security and privacy checklist that AI app builders often skip, warning that launching without these checks creates liability.

@dwarkesh_sp: New blackboard lecture w @ericjang11 He walks through how to build AlphaGo from scratch, but with modern AI tools. Some…

X AI KOLs Timeline ↗ · 4h ago

A blackboard lecture by Eric Jang walks through building AlphaGo from scratch with modern AI tools, covering RL, MCTS, self-play, and connecting to LLM training, along with a discussion on automated AI research.

@rhymeleon: When I first skimmed through it, I only got a rough idea. It wasn't until I delved deeper into agents recently, combined with some questions from interviewers, that I truly realized the value of this article. The article provides in-depth explanations of agent loops, memory mechanisms, harness engineering, and agent evaluation. Highly recommended for anyone looking to get a thorough understanding.

X AI KOLs Timeline ↗ · yesterday

The poster recommends an article that explains agent loops, memory mechanisms, harness engineering, and agent evaluation in depth, and found it especially valuable after working with agents more closely.

@IndieDevHailey: The God-tier Open-source Project Bitchat by Billionaire Jack Dorsey is a Modern Black Tech! Completely Offline, Using Bluetooth to Send and Receive #Bitcoin + Chat!!! - On Planes, Concerts, Subways, Wilderness, Disaster Zones Without Internet... Real-time Communication + Transfer with No Signal at All -…

X AI KOLs Timeline ↗ · 10h ago

Jack Dorsey has open-sourced the Bitchat project, a tool that enables offline communication and Bitcoin transfers without internet, using Bluetooth Mesh networking. It supports multi-hop relay, Cashu eCash offline transfers, and double encryption, suitable for scenarios like network outages and surveillance.

We compiled 42 of the Generative & Agentic AI interview questions (and how to actually answer them).

Reddit r/AI_Agents ↗ · 3h ago

The author announces a free AI Interview Prep Module inside their multi-agent workflow sandbox, listing 42 interview questions for GenAI and Agentic AI roles with standout answers.

deterministic action-level attestation for ai-mediation

Reddit r/AI_Agents ↗ · 2h ago

A deterministic action-level attestation architecture for AI mediation was developed and validated in discussions with Microsoft's engineering team. The author seeks investors or partners for the software architecture.
