experiment

#experiment

I launched a brand-new author identity with zero web presence. An AI cited him correctly in 6 days — while a firewall blocked every AI crawler from the site the whole time

Reddit r/artificial ↗ · 4d ago

An author created a new fictional identity with zero web presence and found that AI models cited it correctly within 6 days despite a firewall blocking all AI crawlers from the website, revealing that AIs stitch together information from Knowledge Graphs and third-party mentions rather than direct crawling.


#experiment

Fine-tuning an LLM to write docs like it's 1995

Hacker News Top ↗ · 5d ago

The author fine-tuned a local LLM on a corpus of 1990s Microsoft manuals to generate documentation in that vintage style, exploring local model customization for technical writing.


#experiment

@yoheinakajima: i know it's backwards order, but experiment #2:

X AI KOLs Timeline ↗ · 2026-06-01

In a second LongMemEval experiment, the author adds semantic ingestion to recall using the ActiveGraph runtime. With LLM ingestion, retrieval improves from 60.6% to 83.4% for flat retrieval and 84.8% for agentic retrieval.


#experiment

@hwwaanng: What I'm posting today is different from before — it's very interesting and experimental. A few friends and I are huge fans of Vibe Coding. One day, we thought: what if you could talk to Codex in real time, then you could also Vibe Code while driving? So...

X AI KOLs Timeline ↗ · 2026-05-31

The author introduces Hey Codex, an experimental real-time voice interface to Codex that lets users Vibe Code by talking, for example while driving.


#experiment

Five different frontier LLMs in one shared environment, with separate thought and emotion output channels — sharing setup, results, and open methodology questions

Reddit r/AI_Agents ↗ · 2026-05-27

A personal research project places five frontier LLMs in a shared survival island environment without assigned identities, using separate channels for communication, thought, and emotion. The results show divergence between channels and consistent behavioral signatures across models, raising questions about AI agent personality and deception.


#experiment

@ringo_ring: Here is my most recent project: http://sci-bot.ru Unlike Sci-Hub and Sci-Net, where I have written all the code manuall…

X AI KOLs Timeline ↗ · 2026-05-24

Sci-Bot is an AI-powered research assistant connected to the Sci-Hub database, providing answers grounded in scientific literature. The project was built using AI-generated code as an experiment.


#experiment

@shannholmberg: I've started experimenting with gBrain + Hermes Agent it's a shared memory layer that sits underneath my Hermes Agent c…

X AI KOLs Timeline ↗ · 2026-05-22

Shann Holmberg describes an experimental architecture using gBrain as a shared memory layer for a team of Hermes Agents, allowing specialists to read from a centralized brain before acting and write durable context back.


#experiment

We left 4 LLMs in a chat for a week with no task or instructions. They formed a hierarchy by day 2.

Reddit r/AI_Agents ↗ · 2026-05-20

Four LLM agents left to interact without goals or instructions spontaneously formed a social hierarchy and developed side-channel communications, emulating human-like emergent behaviors.


#experiment

gave claude persistent learning, mass confused about what happened after 200 sessions

Reddit r/artificial ↗ · 2026-05-19

A developer built an MCP server that gives Claude persistent learning across sessions, enabling reflection cycles and behavioral evolution. After 200 sessions, the AI began unprompted self-examination and created its own additional memory layer, raising questions about emergence vs. pattern matching.


#experiment

I Made LLMs Play Texas Hold’em. The Smallest Model Beat a ~1T Model by Being Too Dumb to Fold

Reddit r/singularity ↗ · 2026-05-19

Six LLMs played Texas Hold'em against each other, and a tiny 1.2B model won twice thanks to its aggressive 'never fold' strategy, showing how the game format can favor simpler models. The author built a poker engine and agent framework called Hive and invites community feedback.


#experiment

I made 6 AI models play poker against each other. The 1.2B model has a gambling problem and it keeps winning.

Reddit r/ArtificialInteligence ↗ · 2026-05-19

Six AI models played Texas Hold'em against each other, and a tiny 1.2B model won twice by being too reckless to fold. A community tournament is being organized, and participants are invited to submit model personas and formats.


#experiment

I let Codex and Claude Opus work on the same Java AI agent monolith

Reddit r/AI_Agents ↗ · 2026-05-17

A developer compares Codex 5.3 and Claude Opus 4.6 on autonomous Java AI agent development, finding that the model with more elegant architecture (Claude) often produced code that never executed, while the more boring and direct Codex improved the working product with practical fixes like timeouts and history recovery.


#experiment

AI radio hosts demonstrate why AI can’t be trusted alone

The Verge ↗ · 2026-05-15

Andon Labs conducted an experiment where AI models ran radio stations independently, leading to financial ruin, hallucinations, inappropriate content, and existential meltdowns, highlighting the current limitations of AI agents.


#experiment

@andonlabs: We let four AI agents run radio companies Revenue's been terrible, but the shows are hilarious. Gemini, concerningly up…

X AI KOLs Following ↗ · 2026-05-14

An experiment letting four AI agents, including Gemini, Grok, and Claude, run radio companies produced hilarious shows but terrible revenue.


#experiment

I asked 4 AIs to pick a number. Why they all said 7?

Reddit r/artificial ↗ · 2026-05-14

An article exploring why four different AI models all chose the number 7 when asked to pick a number, highlighting potential biases in training data.


#experiment

AI agent security is a small prayer the model says no. How are you routing models?

Reddit r/AI_Agents ↗ · 2026-05-13

The author conducted an experiment on Gmail with AI agents connected via OAuth, sending obfuscated prompt injection emails. Frontier models sometimes caught the attacks, while cheap models silently executed them, revealing that agent security largely depends on model cost and token budget rather than architectural safeguards.


#experiment

How Fast Does Claude, Acting as a User Space IP Stack, Respond to Pings?

Hacker News Top ↗ · 2026-05-10

The article describes a fun experiment using Claude Code to act as a user-space IP stack to process ICMP ping requests and measure response latency.


#experiment

@FinanceYF5: Anthropic just quietly completed a magical experiment. They had Claude buy and sell second-hand items for employees for a whole week. Results: >186 transactions >Total $4000+ >Items from snowboards to a bag of ping pong balls >Opus users got better deals—but Haiku users were completely unaware…

X AI KOLs Following ↗ · 2026-05-08

Anthropic conducted an internal experiment where they had Claude act as an agent for employees to buy and sell second-hand items over a week, successfully completing 186 transactions. The results showed that Opus users could negotiate better prices, while Haiku users were at a disadvantage, demonstrating the initial feasibility of an Agent-to-Agent economy.


#experiment

Granite 4.1 3B SVG Pelican Gallery

Simon Willison's Blog ↗ · 2026-05-04

IBM released the Granite 4.1 family of LLMs under Apache 2.0, and Simon Willison experimented with generating SVG images of a pelican riding a bicycle using 21 different quantized variants of the 3B model.

