@akshay_pachaar: Karpathy said something you'll regret ignoring: "We have to keep the AI on the leash. I'm still the bottleneck. I have …


Summary

Karpathy's point about keeping AI on a leash still holds even as models improve, because permissions and authorization are separate from correctness. The article demonstrates how AI-generated apps lack identity and audit, and how Retool's platform solves this by providing a governed runtime.


Cached at: 06/23/26, 05:52 PM

Karpathy said something you’ll regret ignoring:

“We have to keep the AI on the leash. I’m still the bottleneck. I have to make sure this thing isn’t introducing bugs and that there’s no security issues.”

He said it at a YC talk last year, when the worry was reliability. The models hallucinated and made mistakes no human would, so the leash meant keeping yourself in the loop and checking the output before trusting it.

The models are far better now, and the line still holds, for a reason he was not focused on back then.

Even a model that writes flawless code today still has no idea who is allowed to run it.

Correctness and authorization are different problems, and only correctness improves as the model improves.

A perfect agent still hands you a tool where anyone can do anything, because permission was never part of the task.

I actually tested this in practice with Claude Code.

I asked it to build a small internal tool with a button that issues account credits. It worked first try, and running it locally, the credit applied the instant I clicked.

Nothing decided who was allowed to click it. The agent wrote the right logic and displayed a success notification.

It never checked whether the caller had the right, whether it should pause for a human, or whether anything was logged.

And this is not a bug a smarter model can outgrow because the leash was never in the code.

Identity, permissions, and audit live in the system that runs the app, not in what the agent generates.
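A minimal TypeScript sketch of that separation, under stated assumptions: everything here is hypothetical. `issueCredit`, `withGovernance`, the `Role` names, and the in-memory `auditLog` are illustrative stand-ins, not Retool's API and not the code Claude Code actually produced.

```typescript
// Hypothetical sketch: none of these names come from Retool or Claude Code.
type Role = "viewer" | "support_agent" | "admin";

interface Caller {
  id: string; // identity the runtime resolved, e.g. through SSO
  roles: Role[];
}

interface AuditEntry {
  callerId: string;
  action: string;
  allowed: boolean;
  at: string; // ISO timestamp
}

// Roughly what an agent might generate: correct logic, no notion of a caller.
const balances = new Map<string, number>();
function issueCredit(accountId: string, amount: number): number {
  const next = (balances.get(accountId) ?? 0) + amount;
  balances.set(accountId, next);
  return next;
}

// What a runtime could add around the unmodified handler.
const auditLog: AuditEntry[] = [];

function withGovernance<A extends unknown[], R>(
  action: string,
  requiredRole: Role,
  handler: (...args: A) => R,
): (caller: Caller | null, ...args: A) => R {
  return (caller, ...args) => {
    const allowed = caller !== null && caller.roles.includes(requiredRole);
    // Every attempt is recorded, whether or not it is allowed.
    auditLog.push({
      callerId: caller?.id ?? "anonymous",
      action,
      allowed,
      at: new Date().toISOString(),
    });
    if (!allowed) {
      throw new Error(`${action}: caller is not permitted`);
    }
    return handler(...args);
  };
}

const governedIssueCredit = withGovernance("issue_credit", "support_agent", issueCredit);
```

The handler itself never changes. The permission check and the audit entry come from the wrapper, which is the sense in which the boundary belongs to the system that runs the app rather than to the generated code.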

To solve this, I took the exact same bundle and hosted it on @retool.

The credit write that fired silently on my laptop now stopped at an approval gate, resolved to a real identity through SSO, and landed in an audit log.
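An approval gate of this kind can be sketched as a queue that parks a write until someone other than the requester approves it. This is an assumption about the general pattern, not a description of how Retool implements approvals; `ApprovalGate` and the example email addresses are made up for illustration.

```typescript
// Hypothetical sketch of an approval gate; all names are illustrative.
interface PendingWrite {
  id: number;
  requestedBy: string; // identity of the requester
  description: string;
  run: () => void; // the deferred write
  status: "pending" | "approved" | "rejected";
}

class ApprovalGate {
  private queue: PendingWrite[] = [];
  private nextId = 1;

  // Instead of executing the write, park it and return a ticket id.
  request(requestedBy: string, description: string, run: () => void): number {
    const id = this.nextId++;
    this.queue.push({ id, requestedBy, description, run, status: "pending" });
    return id;
  }

  // The write executes only when a different identity approves it.
  approve(id: number, approver: string): void {
    const item = this.findPending(id);
    if (approver === item.requestedBy) {
      throw new Error("requester cannot approve their own write");
    }
    item.status = "approved";
    item.run();
  }

  reject(id: number): void {
    this.findPending(id).status = "rejected";
  }

  private findPending(id: number): PendingWrite {
    const item = this.queue.find((w) => w.id === id && w.status === "pending");
    if (!item) throw new Error(`no pending write with id ${id}`);
    return item;
  }
}

// Usage: the credit does not apply when requested, only when approved.
let creditedTotal = 0;
const gate = new ApprovalGate();
const ticket = gate.request("alice@example.com", "credit acct-1 by 50", () => {
  creditedTotal += 50;
});
gate.approve(ticket, "bob@example.com");
```

The write is the same closure in both cases; what changes is that the gate, not the button handler, decides when it runs.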

I wrote none of it.

The app inherited the entire boundary the moment it was deployed, and the video shows the before and after.

You can try it yourself here: https://fandf.co/4ofoO72

I also wrote a detailed breakdown of the whole thing in my recent article, and I worked with the team to put this together.

It walks through the build, the exact moment the credit write went through on my laptop with nobody checking, and then what changed when the same app ran on Retool.

It also covers why this is a property of the runtime and not something a better model fixes, which is why devs typically miss this.

The article is quoted below.


Ship vibe-coded apps to production

Source: https://retool.com/blog/retool-launches-react-ai-app-builder?utm_source=freeman_and_forrest&utm_medium=influencer&utm_campaign=r2_launch&utm_content=akshay_pachaar_x&rcid=701Ql000011zyUSIAY

There’s more software being built inside companies today than at any point in history, and the average quality is lower than ever before. LLMs have made it possible for anyone to generate a working app on localhost in minutes. But localhost isn’t production. Most of these apps have no authentication, no data access controls, no audit trail, and no path to deployment that IT has blessed. The result is shadow IT, but faster, and at 100x the scale.

How do you let people keep building with these tools—without creating ungoverned software at enterprise scale?

We have the answer. Today, we’re launching the new Retool: a platform where you can build software wherever you want: Claude Code, Codex, Replit, Lovable, or Retool’s own editor—and ship it through a single, governed runtime that enforces authentication, permissions, and data governance automatically.

  • **Run anything.** Deploy any React app on Retool, whether it started as a prototype in Replit, a Lovable project, or a Claude Code session. On import, Retool maps the app’s data connections to resources your team has already configured and secured. No rewrite required.
  • **Governed by default.** Every app on Retool inherits centralized authentication, role-based access controls, and data access policies; none of it lives in the app code. Every interaction with your data is auditable. Deploy on Retool Cloud, in your VPC, or fully on-prem. The governance layer is identical everywhere.
  • **Built to operate.** Most AI-generated apps get built and forgotten. Retool handles what comes after: versioning, release management, environment promotion, access changes, and monitoring. One place to manage every app, for its entire lifecycle.

In the early 2020s, low-code and no-code platforms bet that the secret to building software faster was moving away from code. We made that bet, too. And for a long time, it was the right one. Customers built and shipped real internal tools faster than they could have written everything from scratch.

But AI changed the economics of building. Generating code, written by models that get better every month, is now the fastest way to build custom software, and it’s accessible to a far wider group of people than low-code tooling ever was.

When we started Retool, our proprietary abstractions made building fast. But they also meant LLMs couldn’t work in our platform fluently, engineers couldn’t bring their own tools and workflows, and apps built elsewhere couldn’t run on Retool’s platform or inherit its centralized governance.

When the best way to build changed, we changed, too.

Retool’s new app builder is the fastest, most natural way to build internal software, without giving up the control, visibility, and trust teams need once an app connects to real business data.

The new builder is optimized for prompt-first workflows, but it’s also built on a new architecture: apps in Retool are now written in React on the frontend and TypeScript on the backend.


That means the agent is building with the same languages and patterns modern software teams already use. The result is higher-quality app generation, more expressive and customizable UIs, real backend logic, and code your team can inspect, review, and improve. Nothing is trapped in a proprietary format that only exists inside Retool.

We’ve also opened up far more input styles than before. Drag in screenshots, PDFs, design files, or spreadsheets, and the agent works from the same references your team already uses. Builders can pull in every place where work has already happened and hand it to Retool to pick up the torch and carry the project forward.

You can prompt against the whole app, or click into any individual part of the preview to scope a prompt to exactly that component. And when you want to get closer to what’s actually been built, two modes sit alongside the prompt interface.

The Code tab lets you inspect or directly edit the underlying React. Click on any element in the app preview and you’re taken to the exact line that needs to change in the code tree, without hunting through hundreds of lines of AI-generated code.

As AI-generated apps have become more common, it’s gotten harder to know what an app is actually doing with your data—and this is where most builders leave you with the least visibility. Retool isn’t just generating your frontend; we’re generating the backend, too. So instead of asking you to read query logic buried in a large codebase, we kept this layer legible by design.

In the new Data tab, every piece of logic interacting with your data sources is created as a function you can open and understand. For more complex functions with conditional logic, there’s also a visual data graph: a diagram that maps the relationships between inputs and data sources, showing you how user actions and app logic translate into operations against your data.
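One way to picture the relationship between data functions and a data graph is to derive the graph's edges from per-function metadata. The sketch below is purely illustrative: the post does not describe Retool's actual function format, so `DataFunction`, `buildDataGraph`, and the `billing_db` source are all assumptions.

```typescript
// Hypothetical shape; Retool's real function format is not described in the post.
type Operation = "read" | "write";

interface DataFunction {
  name: string;
  inputs: string[]; // user inputs the function consumes
  source: string; // data source it talks to
  operation: Operation;
}

interface Edge {
  from: string;
  to: string;
  label: string;
}

// Each function contributes one edge per input and one edge to its data source.
function buildDataGraph(functions: DataFunction[]): Edge[] {
  const edges: Edge[] = [];
  for (const fn of functions) {
    for (const input of fn.inputs) {
      edges.push({ from: `input:${input}`, to: `fn:${fn.name}`, label: "feeds" });
    }
    edges.push({ from: `fn:${fn.name}`, to: `source:${fn.source}`, label: fn.operation });
  }
  return edges;
}

const graph = buildDataGraph([
  { name: "lookupAccount", inputs: ["accountId"], source: "billing_db", operation: "read" },
  { name: "issueCredit", inputs: ["accountId", "amount"], source: "billing_db", operation: "write" },
]);

// Which functions can modify data? Filter the edges instead of reading the codebase.
const writers = graph.filter((e) => e.label === "write").map((e) => e.from);
```

With the logic declared as named functions, a question like "what writes to this database?" becomes a filter over edges rather than a search through generated code.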

The surface area for how apps come into Retool has expanded significantly.

Now, teams can deploy existing React apps directly on Retool’s platform: everything from prototypes vibe-coded in Replit, Lovable, and v0 to a project from a local repo.

On import, Retool maps the app’s data connections back to resources your team has already defined and secured. Those resources already carry the right data access rules. The app also gets Retool’s app-level permissions, so teams can control who can view, use, and manage it through the same role-based access controls they use everywhere else in Retool.
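The import step described above can be sketched roughly as follows. How Retool actually matches connections is not stated in the post; matching by name, the `Resource` shape, and the `view`/`use`/`manage` permission table are assumptions made only for illustration.

```typescript
// Hypothetical sketch: name-based matching is an assumption, not Retool's documented behavior.
interface Resource {
  name: string;
  kind: "postgres" | "rest";
}

interface ImportResult {
  mapped: Record<string, Resource>;
  unmapped: string[]; // connections with no pre-approved resource
}

// Map each connection the app declares onto a resource the team already configured.
function mapConnections(declared: string[], registry: Resource[]): ImportResult {
  const byName = new Map<string, Resource>();
  for (const resource of registry) byName.set(resource.name, resource);

  const mapped: Record<string, Resource> = {};
  const unmapped: string[] = [];
  for (const name of declared) {
    const match = byName.get(name);
    if (match) {
      mapped[name] = match;
    } else {
      unmapped.push(name);
    }
  }
  return { mapped, unmapped };
}

// App-level permissions: who can view, use, or manage the imported app.
type AppPermission = "view" | "use" | "manage";

const appAccess: Record<string, AppPermission[]> = {
  support: ["view", "use"],
  platform: ["view", "use", "manage"],
};

function can(role: string, permission: AppPermission): boolean {
  return (appAccess[role] ?? []).includes(permission);
}

const registry: Resource[] = [
  { name: "billing_db", kind: "postgres" },
  { name: "crm_api", kind: "rest" },
];
const importResult = mapConnections(["billing_db", "legacy_csv"], registry);
```

In this sketch an unmatched connection such as `legacy_csv` is surfaced in `unmapped` rather than silently connected, so the app only reaches data through resources that already carry access rules.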

Instead of asking—or expecting—every team to figure out hosting, authentication, permissions, secrets, auditability, and data access from scratch, Retool becomes a center of excellence you can apply programmatically to every app you deploy.

A few weeks ago, we launched Retool’s MCP server so admins could manage their Retool environments programmatically (querying resources, inviting users, auditing access) from Claude Code, Cursor, or Codex.

Now, we’re adding app building to that set of tools. You can build and edit Retool apps directly from the agent you’re already working in, and the project context you have there comes with you instead of getting rebuilt on the other side. And, because Retool can host any existing React code, you can keep building in your favorite CodeGen tools and point them towards Retool when you’re ready to ship.

If you’re going to let people build from anywhere (which we think you should!), governance can’t be something each app implements on its own. It has to be a single layer that sits below all of them.

Every interaction with enterprise data goes through Retool’s governance layer. Data access, permissions, approved resources, environments, and deployment rules are configured by the customer and enforced consistently, regardless of how the app was built. The security doesn’t live in the AI-generated layer. It lives underneath it.
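A deny-by-default check is one simple way to express "enforced consistently, regardless of how the app was built". The sketch below is an assumption about the general shape of such a layer, not Retool's implementation; `Policy`, `DataRequest`, and `authorize` are hypothetical names.

```typescript
// Hypothetical sketch of a central policy check; not Retool's actual API.
type Environment = "staging" | "production";
type Operation = "read" | "write";

interface Policy {
  resource: string;
  environment: Environment;
  role: string;
  operations: Operation[];
}

interface DataRequest {
  appId: string; // where the request came from; deliberately not consulted below
  role: string;
  resource: string;
  environment: Environment;
  operation: Operation;
}

// Deny by default: a request passes only if some policy explicitly allows it.
function authorize(request: DataRequest, policies: Policy[]): boolean {
  return policies.some(
    (p) =>
      p.resource === request.resource &&
      p.environment === request.environment &&
      p.role === request.role &&
      p.operations.includes(request.operation),
  );
}

const policies: Policy[] = [
  { resource: "billing_db", environment: "production", role: "support_agent", operations: ["read"] },
  { resource: "billing_db", environment: "staging", role: "support_agent", operations: ["read", "write"] },
];

const importedAppWrite: DataRequest = {
  appId: "imported-prototype",
  role: "support_agent",
  resource: "billing_db",
  environment: "production",
  operation: "write",
};
const nativeAppWrite: DataRequest = { ...importedAppWrite, appId: "built-in-editor" };
```

Because `authorize` never looks at `appId`, an imported prototype and an app built in the platform's own editor get the same answer for the same role, resource, and environment.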

Having permissions and security live inside an individual app works when there are a handful of apps and a handful of builders. But it breaks down when software is being created across teams, tools, and skill levels. When every team has its own path to production, technical leaders lose visibility. Retool creates a single governance checkpoint.

This matters for the long tail of software maintenance, too. Most of a piece of software’s life isn’t the initial build—it’s updates, access changes, debugging, monitoring, deprecating, and keeping the app aligned with how the business changes. Retool handles all of that in one place.

The more people building software, the more critical it becomes to ensure every app is secure before it reaches production. With Retool, your team has ultimate flexibility in how you build—but you don’t have to figure out deployment alone.

We’ve partnered with leading consultancies, system integrators, and agencies who had early access to the new app builder and completed hands-on enablement to help you ship with confidence: Artefact, BitHippie, BoldTech, GxP, Lingaro, Nymbl, Sabai System, SingleWave, Sixth Generation, StackDrop, Stradia Partners, TechRebels, and Paramint.

Today’s launch is about the best way to build and ship vibe-coded apps. But this is the beginning of what’s next. As we’ve come to expect, AI-assisted development is already quickly moving forward. We’re seeing agents that act on data, trigger workflows, and call systems without a human in the loop at every step. The governance problem only gets more urgent as AI gets more autonomous. The resources, permissions, and approved pathways you define in Retool now become the harness that makes more autonomous software safe to run.

We’re excited to share more soon.
