The ROI Reality Check: AI Unit Economics and the Moat Problem

Why AI products need hard ROI, healthy unit economics, and real defensibility, and how coding agents are reshaping the build-vs-buy line.

Takeaway: A widely used AI feature is not the same as a profitable one. A clever AI feature is not the same as a defensible one.

The first one bleeds margin. The second one gets cloned.

Most AI products fail one of those two tests. Many fail both. They get usage. They generate excitement. And then someone in finance asks the question nobody wants to answer: what does it cost us to complete one of these workflows, and what does the customer actually pay for it?

This is the third real challenge in AI-native adoption: the economics between what the feature does and what the business gets to keep.

The Signal

The market is starting to separate the AI bets that compound from the ones that just burn.

BCG’s work on the widening AI value gap found only about 5% of companies were achieving meaningful, scaled AI value, while roughly 60% reported little or no material financial impact despite real investment. The differentiator was less about who used AI and more about who redesigned operations around it. (BCG, 2025)

Gartner predicted that over 40% of agentic AI projects will be canceled by the end of 2027, citing escalating costs, unclear business value, and inadequate risk controls. (Gartner, 2025)

Meanwhile, coding agents are quietly redrawing the build-vs-buy line. Retool’s annual build-vs-buy research has shown a steady drift toward in-house development as AI-assisted coding gets more capable, and Gartner’s market guidance has flagged enterprise AI coding agents as a category that will materially change how internal software gets made. (Retool, 2026; Gartner)

On the pricing side, Bessemer’s AI pricing playbook makes the same point in a friendlier tone: seat-based pricing was built for a world where each additional user cost the vendor almost nothing. AI broke that assumption, and the pricing model has to change with it. (Bessemer, 2026)

Add it up: most AI projects are not producing durable value, many will be killed for cost, the customer’s build option is getting better every quarter, and the dominant pricing model is misaligned with the underlying economics.

That is the ROI reality check.

The Buyer Shift

A few years ago, a CTO at a 200-person company had a clear default: buy the platform, configure it, move on. Building internal tools meant engineering time, security review, ongoing maintenance, and an opportunity cost no one wanted to defend.

That default is cracking.

A friend of mine runs sales engineering at a B2B SaaS vendor in an adjacent space. He told me about a deal they lost last quarter that he is still thinking about.

The buyer was a mid-market company, deep into a procurement cycle for his product — somewhere north of $90K per year for the seats they needed. The eval went well. The pilot went well. Legal was almost done.

Then, halfway through procurement, one of the buyer’s engineers spent a weekend wiring up the core workflow with Claude Code, a couple of MCP servers, and their internal database.

It was rough. It was 30% of what the vendor offered. But it was the 30% they actually used.

The deal did not close.

My friend was not bitter about it. He was unsettled. His product was better. The customer agreed his product was better. The question the buyer asked was new, and the answer was no longer obvious:

“Why buy the whole platform if my team can build the 30% we actually need with coding agents?”

That question will get asked more often, in more rooms, by more buyers. BCG’s AI-first SaaS analysis frames the same shift in less direct language: the products that survive will be the ones offering something a coding agent cannot rebuild in a weekend. (BCG, 2026)

If your product is mostly a wrapper around a model plus a UI, the buyer is increasingly capable of building that themselves.

Activity Is Not Value

Most AI dashboards measure the wrong things.

prompts run
tokens consumed
summaries generated
agents deployed
copilot seats issued

These are activity metrics. They prove the feature exists and gets used. They prove nothing about value.

Durable AI value connects to a number finance already cares about:

cycle time reduced (days to hours)
support cost per ticket
conversion or retention lift
gross margin per customer
revenue per employee
onboarding or implementation time
error rate reduction

If you cannot draw a line from the AI workflow to one of these, you have a feature, not a business case. McKinsey’s State of AI work has been consistent on this: high performers redesign workflows and measure outcomes; everyone else measures usage. (McKinsey, 2025)

A useful rule: if your AI metric does not eventually show up on a P&L, an operations dashboard, or a customer renewal slide, it is probably activity dressed up as progress.

Not to say that speeding up engineering time or generating summaries is not valuable. It can be. But the value is not in the activity itself. The value is in what that activity enables — faster decisions, cheaper support, more sales, better retention.

Pro Tip: Map your AI features to the business outcomes and put a dollar value on them.

The Margin Trap - Cost-to-Serve Problem

AI Cost-to-Serve: What is the cost of delivering the AI feature to the customer?

This is where popular AI features quietly go negative.

A normal SaaS feature had near-zero marginal cost. One more user, one more action — the vendor barely notices.

Traditional SaaS pricing models were built on that assumption. AI features do not behave that way.

A single “successful” customer feature includes:

a long-context model call
a retrieval step over embeddings
a reranker pass
one or two tool calls
a validation or self-check call
a retry when the first attempt was malformed
an eval and observability trace
occasional human review

That is not one model call. That is a small cost stack, run thousands of times a day, against a price the customer pays once per seat per month.

It is hard to build a fixed monthly user pricing model on top of a variable cost stack that can easily run into the double digits.

Here is the shape of the trap:

The team ships an AI feature inside an existing product.
Usage explodes because the feature is genuinely useful.
The cost per active user quietly triples.
Pricing stays seat-based because changing it mid-contract is hard.
The CFO walks in with a chart.
The feature gets rate-limited, downgraded to a cheaper model, or wrapped in usage caps.
Customer experience degrades. Trust takes the hit.

The feature was not a mistake. The economics were.

The fix is operational, not philosophical:

Know the cost per completed AI-workflow, not just the cost per token.
Route by difficulty. Cheap model first, escalate only when the cheap model is not enough.
Cache aggressively. Identical prompts, retrievals, and tool outputs should not be re-paid for.
Use deterministic code wherever AI judgment is not adding value.
Cap retries and context length. Unbounded loops turn a $0.04 workflow into a $4 workflow.
Align pricing with cost shape. Seat pricing is fine for predictable usage; outcome or hybrid pricing is more honest when cost scales with consumption. Bessemer’s playbook lays out workable hybrids. (Bessemer)

A profitable AI product knows its cost per completed AI-workflow the way a SaaS company knows its CAC.

Pro Tip #1:

Instrument cost at the workflow level, not the API level.
Higher instrumentation on metrics is essential.
A direct-path from API costs to real business value improves decision-making.
Hiring professionals to optimize the cost per workflow can change the ROI game.

Pro Tip #2:

Tag every model call, retrieval, tool call, and human review with a workflow ID. Sum the total at the end of each successful outcome. The number that matters is “cost to finish this job for this customer,” not “monthly OpenAI bill.”

A Personal Example

A few months ago, I helped a portfolio team review their flagship AI feature.

The feature was loved. Usage was up and to the right. Internal demos were great. The team wanted to expand it.

We sat down and reconstructed one “successful customer outcome” from the production logs. Not a prompt. The whole workflow — trigger, retrievals, tool calls, retries, the second model call that validated the first, the eval trace, the occasional human nudge.

The fully loaded cost per outcome was roughly 9x what the team had assumed when they set the price.

At their seat price, heavy users were unprofitable. Light users were subsidizing the heavy ones. Margin was a polite fiction.

It took four weeks of work to fix. Cheaper default model. Aggressive caching on retrieval. Bounded retries. Deterministic code where the AI was adding nothing. A possible pricing change for the next renewal cohort.

The feature stayed popular. Margin moved from negative to healthy. Nothing about the customer experience got worse.

The lesson I took with me: most AI features are mispriced because nobody calculated the real cost per outcome before shipping. Not from carelessness. The cost is hidden across different bills and is truly hard to track — model, embeddings, vector DB, observability, infrastructure, human review — and how to divide it across number of users/seats is not obvious.

If you do not measure cost per outcome, you do not have a unit economics problem.

You have a unit economics blind spot, which is worse.

The Moat Problem

Even with healthy unit economics, there is a second question:

What stops someone else from doing this?

For the first wave of AI products, novelty was enough. Being early was the moat. That window is closing.

Models are converging. Agents used with MCPs and computer-control complete complex workflows that many thought were the domain of “products.” - The barrier to entry is dropping fast.

Would one of the flagship AI Agents just do the same thing with a simple wrapper and a UI?

If your product is one of these:

“AI summarizes meetings”
“AI writes emails”
“AI chats with your docs”
“AI generates SQL”
“AI drafts reports”

…your product is not defensible. You probably do not have a moat.

Defensibility moves elsewhere. The patterns that hold:

Proprietary workflow data. Logs of what got done, by whom, under what constraints. A competitor with the same model cannot rebuild your customer’s history.
Deep system integrations. Real, permissioned, audited connections to the systems where work happens. Integrations are slow to build and hard to copy.
Operational memory. What was approved last time, what failed, what the customer prefers. Memory compounds; demos do not.
Eval datasets. A curated set of real-world cases that lets you ship faster and safer than competitors guessing at quality.
Trust and accountability. Security review, audit trails, customer references, SLAs. Boring on the surface. Brutal to clone.
Distribution. Existing customer relationships, channel partners, embedded usage in critical workflows.
Process ownership. Owning a measurable business outcome end-to-end — not just a step inside it.

Notice the pattern. None of these are about the model. All of them are about the workflow, the data, the trust, and the operating layer around the model.

This is why the strongest AI-native companies are starting to look less like model companies and more like workflow infrastructure companies that happen to use AI heavily.

The AI Value Equation

Before defending an AI product or feature, run it through five questions:

Business outcome. What specific number — cycle time, support cost, conversion, retention, margin, revenue per employee — moves because of this feature, and by how much?
Cost per completed workflow. What is the fully loaded cost of one successful outcome, including inference, retrieval, retries, eval, observability, and human review?
Pricing model alignment. Does the way you charge customers track the way the feature consumes cost? If usage doubles, does revenue?
Switching cost. What real friction stops a customer from leaving — data, integrations, workflow ownership, trust, or just inertia?

If you cannot answer 1 and 2 in real numbers, the ROI case is a story.

If you fail 3, the feature leaks money the more successful it gets.

If you fail 4, the moat is rented.

The Window for New AI-Native Apps

There is still a window for new AI-native products — but it is closing fast.

My crystal ball says…

The shape that wins is narrow and specific: take a judgment-heavy, multi-tool, manual process — the kind that needed an expert switching between five systems and making careful calls — and turn it into something accurate, trustworthy, and largely automated. Wrap it in predictable end-user pricing so adoption is frictionless, while the enterprise contract reflects the real consumption shape. Make onboarding and integration into existing tools so easy that a team can go live in minutes, not weeks.

That gets you to the moat. It is not the moat.

The moat is what accumulates after: proprietary workflow data and operational memory that compound with every customer, and a trusted brand that owns the category in the buyer’s head. The wrong instinct is to defend that moat by making it hard to leave — locked data exports, proprietary formats, switching friction as policy. That works for a quarter or two, then it becomes the reason a competitor wins on principle.

Keep customers by being useful, not by being sticky.

The window rewards teams that combine all of it:

a real process to replace
a price the user can predict
an onboarding path that gets out of the way
and a moat that builds quietly in the background while everyone else is still demoing.

The Point

AI adoption gets serious when the conversation shifts from features to economics.

Not from features to “ROI” in a slide deck — from features to cost per outcome, pricing aligned to cost, and defensibility that survives a weekend with Claude Code.

The AI products that compound share a profile: a clear business outcome, a known cost per completed workflow, a pricing model that tracks the cost shape, and a moat made of workflow ownership, data, integrations, and trust — not novelty.

The ones that get canceled in 2027 share the opposite profile: popular, expensive, copyable, and proud of their activity metrics.

Activity is not value.

Usage is not margin.

Novelty is not a moat.

The workflow is the product. The economics is the business.

Related reading: