
The Vibe Coding Blind Spot

Filippo Pedrazzini

Where the Value Actually Lives

Every major AI coding tool launched in the past two years has made the same bet: help people create new software from nothing. A blank canvas. A fresh repository. An empty prompt. The entire category is optimized for the moment before a product exists.

But that moment is a tiny fraction of a product's life. The vast majority of software work happens after launch. New features on existing architectures. Bug fixes across interconnected systems. Design updates that need to respect years of accumulated logic. Performance improvements on codebases that serve real traffic.

This is the 1 to 100. It is where most engineering time goes, where most business value is created, and where most teams are struggling to move faster. Yet the entire vibe coding market is fighting over the 0 to 1.

We are focused on everything that comes after.

Code Is Free. Software Is Still Expensive.

Something fundamental has shifted: AI can now generate working code in seconds. The raw material of software is no longer scarce.

But companies are still bottlenecked. Product teams still wait weeks to see their ideas in production, and backlogs compound. Engineering capacity is still the ceiling for every business that runs on software, which is every business.

The paradox is obvious once you see it: code became free, but software stayed expensive, because getting the right people aligned, moving fast, and deploying with confidence is still broken.

That is the $465 billion problem sitting inside the SaaS market today, and the tools claiming to solve it are solving the wrong half of it.

One Brain, One Terminal, One Branch at a Time

Zoom into the developer. The backlog is full of work they are capable of doing. That is not the problem. The problem is physics.

One brain. One terminal. One branch at a time. Context switching is expensive, and working on two features at once is a lie developers tell themselves to feel productive. A senior engineer on a mature product could probably close every ticket in the backlog. They just cannot close them this quarter. Or next quarter. Or the one after that.

Hire more developers and the ceiling moves, but the physics do not change. Each new engineer is still one brain on one branch. Coordination overhead grows faster than throughput. Every team that has scaled past ten engineers knows this: the tenth hire does not give you 10x output, and the hundredth hire barely moves the needle at all.

This is the hard ceiling every engineering team runs into. And it is the ceiling AI has not yet broken, because writing code faster does not matter when a human still has to own every change, one at a time.

LLMs Will Be a Commodity

The raw material of software is commoditizing in front of us. Every major lab is converging on the same capabilities. The gap between the best closed model and the best open one shrinks every quarter. And the moment research hits a plateau (and plateaus are normal in AI, we have seen many summers and winters), a new era begins. Optimization, distillation, smaller models running on cheaper hardware. Token generation becomes background noise in the cost structure of a product.

When that happens, value does not stay at the model layer. It moves up, captured by whoever bet on the right product direction and the right user experience. Cursor did not win because it had a better LLM. It won because the interaction felt native to how developers already work. The model was a commodity. The UX was not.

For the next generation of products, stop pretending the model is the moat. Assume the LLM layer is a commodity. Assume anyone can access frontier capabilities. Then ask the harder question: what does the product look like on top of that?

And here is the part the industry keeps dancing around: hallucinations are here to stay. Autoregressive models are probabilistic by design. No amount of scale has eliminated them, and no credible research path suggests they will disappear soon. If you are building on the assumption that models will one day be 100 percent reliable, you are building on a timeline that does not exist.

That is why the workflow matters more than the model. That is why code still needs to live on a branch, go through review, and ship through the same process engineers already trust. Not out of nostalgia for pull requests, but because the technology underneath is unreliable by design, and the only thing that makes it safe in production is a human in the loop at the right moment.

Why Vibe Coding Breaks in the Real World

The hype peaked in early 2025. By mid-year, traffic to major vibe coding platforms had dropped over 50 percent. People tried, hit the ceiling, and moved on.

The ceiling is predictable. A generated app looks complete to anyone who is not an engineer. The interface renders. Buttons work. Forms submit data. It feels done.

Under the surface, the foundations are missing. Authentication that lives on the client side. Databases that buckle under concurrent users. APIs with no rate limiting or error handling. No logging, no tests, no deployment automation. Code that was produced quickly but structured poorly, making future changes expensive.

These are not edge cases. They are the baseline requirements of any system that serves real users.

But the technical gaps are only half the story. The workflow is fundamentally broken.

AI-generated code still needs to follow engineering process. It needs to live on a branch. It needs to go through code review. It needs to be merged deliberately, not pushed straight to production from a chat window. The tools that skip these steps are not accelerating development. They are creating liability.
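The flow described above can be sketched with plain git. This is a minimal, self-contained illustration using a throwaway repository; the branch name and file are invented for the example, not part of any real workflow or product.

```shell
#!/bin/sh
# Sketch: agent output lands on its own branch and waits for human review.
# It never touches main until a person deliberately merges it.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.email "demo@example.com"
git config user.name "Demo"
echo "baseline" > app.txt
git add app.txt && git commit -qm "initial commit"

# 1. The agent's change goes on a dedicated branch, not on main.
git switch -q -c agent/fix-login-timeout
echo "patched" > app.txt
git add app.txt && git commit -qm "agent: fix login timeout"

# 2. Main is untouched while the change sits in review.
git switch -q main
cat app.txt            # still prints "baseline"

# 3. Merging is a deliberate human action after review, not an auto-push.
git merge -q --no-ff agent/fix-login-timeout -m "merge after review"
cat app.txt            # now prints "patched"
```

The `--no-ff` merge keeps an explicit merge commit, so the history records that the agent's branch was reviewed and accepted rather than silently fast-forwarded.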

Software that lasts is built on fundamentals. Any tool that bypasses them is building on sand.

Current Tools Were Not Built for the Agent Era

Assume for a second that the model problem is solved. The code that comes out is clean. The reasoning is sharp. Generation is no longer the bottleneck.

Now try to actually run what the agent wrote.

Local dev environments are irreproducible. Every engineer has a slightly different setup that took days to get right. CI is too slow to be a feedback loop. Vercel previews give you the frontend and nothing else: no backend, no database, no real environment. Docker Compose files rot the week after they are committed. Spinning up the full stack on a fresh machine is still an onboarding doc, not a command.

Every piece of dev tooling was built around one assumption: a single human, on a single machine, working on a single branch. Editors, terminals, environment managers, preview deployments. The whole stack is designed around that human workflow.

Now multiply the humans by ten agents running in parallel, and nothing scales. You cannot spin up ten isolated copies of your stack. You cannot preview ten branches at once. You cannot verify ten PRs without queues colliding. The tooling shatters the moment you try to run more than one stream of work at a time.

This is the real ceiling. Not the model. The infrastructure around the model. Agents can write the code, but they cannot run it against your codebase, which means you cannot trust it, which means the PR is useless. The environment is the blocker. Until it is solved, agents are demos.

QA Is the Ultimate Bottleneck

Follow the bottleneck forward. Code generation is now essentially free. That pushes the constraint to the next step: code review. For most teams today, review is already the chokepoint.

But that will not last. Tools like CodeRabbit are already surprisingly good at automated review. AI will get there. Maybe not 100 percent coverage, but a reliable 80 percent. Good enough to stop being the constraint.

So what remains? QA.

Among all the AI capabilities being developed, browser and computer use is by far the most broken. Automated testing that actually works like a human tester is still years away from being reliable.

And even when automated web testing improves, there is a deeper truth: nobody will merge pull requests that were both generated and tested entirely by agents. Not for production systems. Not when real users are on the other side.

Quality requires a human in the loop. The question is where that human adds the most value. Writing code is no longer it. Reviewing code is fading. What remains is testing and validating that changes actually work. That is where human judgment will stay essential.

Background Agents Are Broken

Background agents sound good in theory. Assign a task, let it run, come back to a finished pull request. The problem is what happens next.

If you do not have a preview, you cannot QA. And if you cannot QA a change, the pull request is useless.

Review apps were supposed to solve this. In practice, they do not. The back and forth with a Vercel preview is painful. Worse, depending on your stack, you only get a frontend preview. No backend. No database. No real environment. You can only test surface-level UI changes.

This UX is broken.

Kosuke fixes this. Every chat session runs your entire stack in an isolated sandbox. Frontend, backend, database, the full environment, live. You can preview changes from the agent in real time, regardless of complexity. No waiting for deploys. No partial previews. Real QA on real changes.

The Real Problem Nobody Is Solving

And there is another dimension to the bottleneck that the market keeps ignoring. Every company with an existing product has it. Non-technical team members like PMs, designers, and marketers generate ideas constantly. But they have no way to act on them. They write tickets. They wait. The backlog compounds. Changes that could ship in a day sit untouched for months.

Meanwhile, a new generation of AI coding tools promised to unlock software creation for everyone. Lovable, v0, Bolt.new, Replit. They delivered on that promise, but only for greenfield projects. Only for prototypes. Only for the 0 to 1.

These tools are genuinely useful for exploration and validation. Describe what you want, and AI generates a working interface in minutes. That speed matters when the goal is to learn fast.

But here is what none of them can do: work on your existing product.

Most software value lives in codebases that already exist. Products with real users, real revenue, and real complexity. Different stacks, different configurations, different deployment pipelines. You cannot spin up a generic environment and expect it to work against a production system.

That is the hard problem. And the market is ignoring it.

Kosuke: AI-Powered Contribution on Existing Codebases

Kosuke is built for the gap that nobody else is filling. Not new projects. Not prototypes. Existing products, with existing teams, on existing codebases.

Import your repository: web applications, React Native, Flutter, any stack. Kosuke replicates your environment in an isolated cloud sandbox: your full stack running, your conventions respected, your deployment setup matched. Not a generic container. Your actual codebase, running the way it runs in production.

Once the environment is solved, the handoff becomes cheap. That is where everything changes.

If you are a developer, you can finally stop being the bottleneck on your own team. Assign a ticket and walk away. An agent picks it up in its own sandbox, runs your code, iterates against a live preview, and comes back with a pull request you can actually review. Run five in parallel. Five tickets moving at once. Five PRs in review while you focus on the work that genuinely requires your judgment: architecture, ambiguous problems, the changes you would never trust to an agent. The backlog stops being bandwidth-bound.

If you are a PM, a designer, or a marketer, describe the change you want in plain language. Same workflow. The agent generates a PR against the same codebase, respecting the same conventions, reviewed by the same engineers. You stop writing tickets that sit for months. You start seeing your ideas in a preview link the same day.

Both paths converge on the same pull request. Every change lives on its own branch. Nobody bypasses code review. Engineers retain full control over what ships. The workflow is the same one your team already follows: branches, PRs, reviews, merges, with a new class of contributors feeding into it, and developers finally able to hand off the work they should not be doing by hand.

The result so far: 80 percent of AI-generated pull requests get merged. Not on toy projects. On production codebases with real engineering teams reviewing the output.

The Vision: We Will ALL Be Builders

Developers and designers are still working in silos. Not because the technology is not good enough. Because the tools are built that way. Every tool is designed for one or the other. Figma for designers. Cursor for developers. Nobody is building for product teams that want to contribute to the same codebase.

That is the gap we are filling.

We are not replacing developers. We are not skipping pull requests. Code is still the source of truth. Engineers still review everything. But now designers, PMs, and marketers can contribute too. Same codebase. Same workflow. No silos.

We will ALL be builders. Builders with different skill sets, but at the end of the day, all shipping on the same codebase. AI for speed, engineering for strength. The company that unlocks collaboration for non-developers on existing codebases will win. We intend to be that company.

But it starts with developers. Not because the other contributors do not matter. They do, and they are the reason this vision exists. But developers are the ones who have to believe it works before anyone else on the team can safely plug in. The sandbox has to be real. The PRs have to be mergeable. The workflow has to survive contact with a production codebase. Developers are the ones who find out first.

Once a developer runs five agents in parallel and ships a week of work in a day, the question changes. It is no longer "can AI write code?" It is "who else on the team should be contributing?" And the answer is: everyone.

If you are a developer with more backlog than bandwidth, we built Kosuke for you.

kosuke.ai