A Reddit user debunks claims from Seed IQ (AGX) about solving the ARC-AGI-3 benchmark with a perfect score, arguing that the company's refusal to submit to the Kaggle leaderboard (which allows closed-source submissions) suggests a scam.
I recently came across a post about Seed IQ on this subreddit and wanted to address it, because it makes me sad that the community is turning into this. I'm currently part of the ARC-AGI-3 competition. The benchmark is fully interactive now, designed to test actual fluid intelligence against a hidden evaluation set, and the rules are incredibly straightforward.

Lately, there's been a wave of spam posts about Seed IQ (coming from AGX) claiming their closed-source "control engine" magically solved the benchmark with a perfect score. When people ask why they aren't on the official Kaggle leaderboard, their excuse is always that the ARC rules require you to turn over your entire codebase and open-source your IP to be on the leaderboard, and that they have a billion-dollar commercial asset they can't give away. They've even said themselves that they're willing to forgo the prize money.

For awareness: you do not have to open-source your code to be on the leaderboard. The open-source requirement under Section 3.8 of the Kaggle rules applies *only* if you accept the prize money. You can submit a containerized model to run against the hidden evaluation set, get your verified score pinned to the top of the leaderboard for the whole world to see, and simply decline the cash prize nomination.

Your IP is completely safe during submission. The Kaggle notebook runs entirely in an isolated, black-box environment; the public never sees your code, your weights, or your proprietary logic unless you explicitly choose to publish it. Furthermore, high scores may trigger an audit to ensure nobody cheated, but stealing IP during an audit would be illegal. If Seed IQ actually had a system that blew past the state of the art, they could drop their compiled model into a notebook right now, prove it to the community, keep their IP completely hidden, and walk away from the check.
I've come across a lot of their spam posts, as well as the founders' posts on LinkedIn, and it honestly disappoints me that adults in this industry are behaving this way. The second anyone asks for basic technical validation or points out that their Kaggle logic makes no sense, they get incredibly hostile and rude. At the end of the day, it doesn't even matter whether their claims are real. I just think the community should be aware of how this scam operates: if a model won't run against a hidden test set, especially when the platform gives you a built-in way to completely protect your code, the claims mean absolutely nothing. We shouldn't let toxic hype and fabricated roadblocks derail a great benchmark.
Seed IQ achieves a perfect 14/14 score on ARC-AGI-3 games using an active inference, physics-driven multi-agent autonomous control engine, as shown in a behind-the-scenes video walkthrough.
Article references AGI (Artificial General Intelligence), likely a brief post or announcement about AGI progress or speculation. Limited content available beyond the title.
This paper introduces BenchJack, an automated red-teaming system that systematically audits AI agent benchmarks by identifying reward-hacking exploits. Applied to 10 popular benchmarks, it surfaces 219 distinct flaws, demonstrating that current evaluation pipelines lack an adversarial mindset; on four benchmarks, the system's fixes reduce the ratio of hackable tasks from near 100% to under 10%.
The author launches 'AI IQ', a new tool that scores frontier AI models on the human IQ scale, providing visualizations of model performance, intelligence costs, and EQ comparisons rather than standard leaderboard tables.
Andrew Ng discusses how U.S. policies are driving allies toward sovereign AI and open-source models, referencing DeepSeek, Qwen, and K2 Think as examples. He argues that open-source AI can help nations reduce reliance on U.S. technology.