maybe worth building
June 2026 · Verdicts

Is it worth building an AI coding tool in 2026?

Short answer: yes, but only if you own the harness around the model, not the model call itself. The biggest name in the space hit $2B in annualized revenue and still loses money on its heaviest users.

The verdict

An AI coding tool is worth building when you own the harness: the codebase context, the test-and-regression loop, the review and deploy workflow the model plugs into. It is not worth building as a thin chat box over a model API. Cursor reached $2 billion in ARR in three years and still ran a negative 23% gross margin as of January 2026, because it pays its model bill to Anthropic, the same lab that builds the rival Claude Code. The model is rented from your competitor. The scaffolding is the only part you can own.

The model isn't the product anymore. The harness is.

For two years the bet was that a better model wins. In 2026 that stopped being true for coding tools, and it now has receipts. Benchmark work this month showed a fixed propose-and-regression-gate loop lifted held-out coding pass rates by up to 21.4 points across three different models, with nothing retrained (AlphaSignal, June 2026). The same harness made every model better. The work, and the moat, moved out of the model and into the scaffolding around it. If your coding tool's only ingredient is a model call, you're building the part that just got commoditized.

What the Cursor numbers actually say

Cursor, built by Anysphere, is the fastest B2B company on record to reach $2 billion in annualized revenue, getting there in about three years. It was valued at $29.3 billion in late 2025, and SpaceX agreed to acquire it at $60 billion in June 2026. Total success, by the headline. Here is the part the headline skips: Cursor's gross margin was roughly negative 23% as of January 2026 on about $2.7 billion in annualized revenue (Contrary Research, 2026). It loses money on heavy users because the compute they burn costs more than their subscription, and that compute is bought from Anthropic at prices Anthropic sets. A $60 billion company that loses money on its best customers tells you the value isn't where the revenue is. It's in the model bill.
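To make that margin concrete, here is a minimal sketch of the arithmetic. It assumes the standard definition of gross margin, (revenue - cost of revenue) / revenue. The function names are illustrative, and the implied cost figure is derived from the two numbers quoted above; it is not a reported figure.

```python
def cost_per_revenue_dollar(gross_margin: float) -> float:
    """Cost of revenue per $1 of revenue, given gross margin as a fraction."""
    # gross_margin = (revenue - cost) / revenue  =>  cost / revenue = 1 - gross_margin
    return 1.0 - gross_margin


def implied_cost_of_revenue(revenue: float, gross_margin: float) -> float:
    """Annualized cost of revenue implied by a revenue figure and a gross margin."""
    return revenue * cost_per_revenue_dollar(gross_margin)


if __name__ == "__main__":
    margin = -0.23   # roughly negative 23%, as quoted above
    revenue = 2.7e9  # about $2.7 billion annualized, as quoted above

    print(f"cost per $1 of revenue: ${cost_per_revenue_dollar(margin):.2f}")
    print(f"implied cost of revenue: ${implied_cost_of_revenue(revenue, margin) / 1e9:.2f}B")
```

Under that definition, each dollar of revenue costs about $1.23 to serve, so adding more of the same usage widens the loss instead of closing it.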

Why the labs are both your supplier and your competitor

The trap is structural, and the writer Ed Zitron put it plainly in September 2025:

"Cursor sends 100% of their revenue to Anthropic, who then takes that money and puts it into building out Claude Code, a competitor to Cursor."

That is the position of almost any coding tool that resells a frontier model. You pay your largest bill to the company building the product meant to replace you. Anthropic ships Claude Code. OpenAI ships its own coding agents and agreed to pay about $3 billion for Windsurf in 2025, before that deal collapsed and key members of the team went to Google. The labs sell horizontal capability and then walk straight down into the application layer. If the model call is your product, you are funding your own eviction.

When a coding tool is worth building

Build it when the model is the easy 20% and you own the hard 80% around it. A few shapes that hold up:

  • You own the verification loop. The model proposes a change; your product runs the tests, gates on regressions, and only ships what passes. That loop is where the measured gains live, and it works no matter whose model is underneath.
  • You own the codebase context. Indexing, retrieval, and the memory of how this specific team's repo works. That compounds per customer in a way a generic chat window can't copy.
  • You sit inside a workflow that's painful to leave. The CI pipeline, the review process, the deploy, the system of record. Switching costs the team its whole setup, not just a tab.
  • You go vertical where the generalist won't. One stack, one regulated codebase, one niche framework, deep enough that Cursor and Copilot don't bother. The narrow lane is where a new entrant still has room.
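The verification loop in the first bullet can be sketched in a few lines. This is a minimal illustration, not the loop from the benchmark work cited earlier: the `propose` and `run_tests` callables, the `GateDecision` type, and the stub model and test suite are all hypothetical stand-ins for a real model call and a real test runner.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical types: a patch is just a string, and a test run maps
# each test name to whether it passed.
Patch = str
TestResults = Dict[str, bool]


@dataclass
class GateDecision:
    accepted: bool
    patch: Optional[Patch]
    attempts: int


def regressions(baseline: TestResults, candidate: TestResults) -> List[str]:
    """Tests that passed on the baseline but fail or are missing on the candidate."""
    return sorted(
        name for name, passed in baseline.items()
        if passed and not candidate.get(name, False)
    )


def propose_and_gate(
    propose: Callable[[str], Patch],
    run_tests: Callable[[Patch], TestResults],
    baseline: TestResults,
    max_attempts: int = 3,
) -> GateDecision:
    """Ask for a patch, run the tests, and accept only a regression-free improvement."""
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        patch = propose(feedback)        # the model call; any model fits here
        results = run_tests(patch)       # the part the product owns
        broken = regressions(baseline, results)
        improved = sum(results.values()) > sum(baseline.values())
        if not broken and improved:
            return GateDecision(True, patch, attempt)
        # Feed the failure back so the next proposal can correct it.
        feedback = f"regressions: {broken}" if broken else "no new tests passing"
    return GateDecision(False, None, max_attempts)


# Stub model and test suite, for illustration only.
def fake_model(feedback: str) -> Patch:
    return "patch-b" if feedback else "patch-a"


def fake_suite(patch: Patch) -> TestResults:
    if patch == "patch-a":
        # Fixes the export test but breaks parsing: a regression.
        return {"test_parse": False, "test_render": True, "test_export": True}
    return {"test_parse": True, "test_render": True, "test_export": True}


if __name__ == "__main__":
    baseline = {"test_parse": True, "test_render": True, "test_export": False}
    print(propose_and_gate(fake_model, fake_suite, baseline))
```

In this sketch the first proposal is rejected because it breaks a test that used to pass, and the second is accepted. Nothing in `propose_and_gate` depends on which model sits behind `propose`, which is the sense in which the loop belongs to the product and not to the lab.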

When it isn't

Skip it when the model is the whole product. The tells:

  • A chat box over an API. If your tool is a prompt, the model's answer, and a nice theme, you've built a demo. The lab can ship the same thing for free, and it owns the model.
  • You resell inference with no control over the bill. Cursor's negative margin is the warning. If your unit economics depend on a price your competitor sets, you don't have unit economics.
  • The model gets better and you get redundant. If a smarter model makes your product unnecessary instead of stronger, you're betting against the one thing guaranteed to happen.

The test to run before you build

Before you write a line of code, run two checks. First, the space receipt: is real money already in this space? It is. Cursor at $2 billion ARR, a $3 billion Windsurf bid, Claude Code shipping. The space is alive and brutal. Second, the pain receipt: a real person describing the real problem. Zitron's line about funding your own competitor is the pain in one sentence, and any builder reselling a frontier model lives it on every invoice. Both receipts check out, which means the demand is real and so is the danger.

Then ask the question that decides it: if the underlying model got twice as good tomorrow, does your product get more valuable or less? A chat box over an API gets less valuable, because the model now does its job without you. A harness gets more valuable, because a smarter model running inside your test loop, your context, and your workflow ships better code through the scaffolding only you own. Build the ones where the answer is more. It's the same bar our idea engine uses to kill most of what it generates.

The AI coding tool isn't dead. The thin one is. Build the harness the model can't replace, or you'll spend three years getting to $2 billion in revenue and still run a negative 23% margin paying the lab that's building your replacement.

Related: "Is it worth building an AI agent in 2026?" and "Is it worth building an AI wrapper in 2026?" Both come down to the same thing: do you own the 80% around the model, or just the model call?

Frequently asked questions

Is it worth building an AI coding tool in 2026?

Only if you own the harness, not the model. The model is a rented commodity, and the lab you rent it from is also your competitor. Cursor hit $2 billion in ARR in three years and still ran a negative 23% gross margin as of January 2026 because it pays its model bill to Anthropic, which builds the rival Claude Code. Build the scaffolding around the model and you win when the model gets better. Build a chat box over an API and the lab ships you for free.

Why does Cursor have negative gross margins?

Because it resells model inference it doesn't control. Cursor's gross margin was roughly negative 23% as of January 2026 on about $2.7 billion in annualized revenue. Heavy users run up compute bills larger than what their subscription pays, and that compute is bought from Anthropic at prices Anthropic sets.

Will the AI labs eat my coding tool?

If your tool is a thin layer over their API, yes. Anthropic and OpenAI both ship coding agents directly, and they own the model you depend on. But the labs rarely build your specific harness: your codebase indexing, your review workflow, your CI gates, your integrations. Own that 80% and a smarter model makes you stronger.

What actually makes an AI coding tool defensible?

The harness, the data, and the distribution, not the model call. A fixed propose-and-regression-gate loop lifted held-out coding pass rates by up to 21.4 points across three models with nothing retrained (AlphaSignal, June 2026). The scaffolding is where the gains and the moat now live.

Is the AI coding tool market too crowded to enter?

The horizontal lane is crowded: Cursor, GitHub Copilot, Claude Code, and Windsurf all sit there, and OpenAI offered about $3 billion for Windsurf in 2025 before that deal fell apart. The open lane is vertical: a tool that owns one painful domain end to end, where you accumulate data and switching costs the generalist won't chase.

