MCP · AI · Architecture · Cost

Enterprise AI Without an Enterprise Budget

2026-05-24·11 min read

Enterprise AI gets sold as something only large companies can afford. It doesn't have to be. The reason is structural: the protocol layer underneath is absorbing the work that used to require an AI platform.

Over three months at one mid-sized engineering firm (not a software company), I operationalised AI across the entire business. Nine production MCP servers covering ERP, BIM, fleet, calculations, building automation, energy, and operational logs. No platform team, no framework dependencies, no custom chat UI. Project managers, field engineers, and operations colleagues query the whole company's data in natural language, every day. The running cost is roughly €19/seat/month for the staff who use it, plus the engineering time to build the MCP layer.

This piece is about the architecture that made that reachable, and why it's reachable for small and mid-sized companies that have been told enterprise AI is out of their league.

The standard path is expensive

When a mid-market or smaller company decides to bring AI in, the path they're usually pointed toward looks the same: build a branded chat UI on the API, wire it to internal auth, manage prompts in-house, maybe add a RAG pipeline against company documents, optionally an orchestration framework for "agent workflows," and increasingly an observability platform to monitor it all. That path is real, and at sufficient scale it may be the right call. I haven't run this architecture at the scale of a large multinational with hundreds of heterogeneous systems and complex tenancy requirements, so I can't tell you whether the same posture holds there. My guess is that the MCP layer itself stretches further than the framework industry assumes (more servers, deeper hierarchies of tools, more careful schemas) rather than needing a different architecture entirely. But that's a hypothesis from one scale, not a report from another.

For a company with twenty important systems and a few hundred employees, the standard path doesn't pay back. It's expensive in three coupled ways, and pulling the three apart reveals a simpler architecture underneath.

The chat client. A branded chat UI is a frontend team, a design effort, conversation history infrastructure, attachment handling, multi-modal input support, an admin panel, model routing, prompt management. The vendor (Anthropic, OpenAI, Google) ships all of this as part of the subscription and continues to ship new affordances every quarter. Building it in-house means investing engineering hours into a commodity layer where the vendor has structural advantages no internal team can match.

The model. A custom chat UI almost always pins to a specific model version for stability. Six months later the frontier moves, and the pinned model is now a generation behind. Upgrading means re-validating every prompt and every tool, so most teams don't. Meanwhile every Claude Team subscriber got Opus 4.7 the morning it shipped, with zero engineering work.

The billing. API billing is per-token, which means the cost is a function of user behaviour that hasn't happened yet. In a workforce of 50-500 people, roughly 20% of users drive 80% of consumption once adoption stabilises. Published analyses of heavy usage put the API-vs-subscription cost ratio at 15-30x; for average users the multiple is smaller (3-10x), and for light users API can come out ahead. The relevant number isn't the average. It's the variance.
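To make the variance point concrete, here is a small arithmetic sketch. Only the €19 seat price comes from this article; the headcount and the per-user API spend multiples are hypothetical values, chosen so that 20% of users drive 80% of consumption. Treat it as an illustration of the shape of the problem, not as measured data.

```python
# Hypothetical illustration of skewed usage under per-token billing versus a
# flat per-seat subscription. Every input except SEAT_PRICE_EUR is an assumption.

SEAT_PRICE_EUR = 19.0  # per seat per month, the figure quoted in this article


def monthly_costs(heavy_users: int, light_users: int,
                  heavy_multiple: float, light_multiple: float) -> dict:
    """Compare total monthly cost for API billing and per-seat billing.

    heavy_multiple and light_multiple express each group's assumed API spend
    per user as a multiple of the seat price.
    """
    heavy_api = heavy_users * heavy_multiple * SEAT_PRICE_EUR
    light_api = light_users * light_multiple * SEAT_PRICE_EUR
    api_total = heavy_api + light_api
    seat_total = (heavy_users + light_users) * SEAT_PRICE_EUR
    return {
        "api_total": api_total,
        "seat_total": seat_total,
        "heavy_share_of_api": heavy_api / api_total,
    }


# Assumed workforce of 100: 20 heavy users at 16x the seat price in API spend,
# 80 light users at roughly break-even (1x).
result = monthly_costs(heavy_users=20, light_users=80,
                       heavy_multiple=16.0, light_multiple=1.0)
print(result)
```

With these assumed inputs the API bill comes to €7,600 a month against €1,900 in seats, and the heavy fifth of users accounts for 80% of the API total. Change `heavy_multiple` and the API total swings with it while the seat total stays put, which is the variance the subscription absorbs.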

These three costs aren't independent. They flow from the same root decision: build our own chat UI, or use the vendor's. Building locks all three together. The standard path commits a smaller company to a build budget, a maintenance team, an aging model, and an unpredictable bill, all at once.

The simpler path

Use the vendor's chat client. Pay per seat. Connect your existing identity provider. Then put your engineering hours where the value compounds: in MCP servers that encode your domain.

This is the architecture in production. Employees use Claude through the standard client (web, desktop, mobile). The client authenticates against the corporate identity provider, same as every other corporate system. Through that client, employees have access to nine MCP servers covering the full operational chain. Each MCP tool is RBAC-gated against the same IdP roles that govern every other system. A field engineer sees their own time bookings; a controller sees aggregated financials; a guest user sees nothing.

What I didn't build: a chat UI. A model gateway. A prompt management platform. A vector database. A retrieval pipeline. An agent observability stack. An "AI platform" of any kind. None of those layers exist in the stack, because the subscription client provides everything above the MCP layer, and the IdP provides everything around it.

A concrete example. The most common AI project a mid-sized company is pitched is "RAG over our documents": chunk all the SharePoint or Google Drive content, embed it, build a vector store, wire it to a retrieval layer, host and maintain the whole pipeline. Even when that pipeline is cheap to build, it's a layer you have to keep alive. Re-index when documents change, re-tune when retrieval quality drops, re-permission when access rules shift, re-host when the embedding model is deprecated. Meanwhile, the Microsoft 365 and Google Workspace integrations that ship with Claude, ChatGPT, and Gemini are themselves MCP servers, published by the vendors, maintained by the vendors, with permissions inherited from the existing IdP and freshness handled upstream. The "document search" capability that makes a custom RAG pipeline sound necessary is already an MCP server you can turn on in your admin console. The choice isn't cheap vs. expensive. It's an extra layer you maintain vs. no extra layer at all. And once you stop blaming the data and start fixing the meaning layer, the case for a custom RAG pipeline gets thinner still.

That sharpens the architectural rule. MCP isn't a category that lives only inside your perimeter; it's the protocol the entire ecosystem speaks, and vendors are already publishing servers for the commodity layer: productivity suites, code hosts, ticketing systems, design tools. Your engineering investment goes into the MCP servers that only you can write: your ERP, your BIM data, your operational logs, your calculation history. These are the systems unique to your business, the ones no vendor has, or will ever have, a connector for. Everything else, you consume. Investment compounds in the servers you own; it doesn't in the layers you could have rented.

The cost picture inverts. Claude Team is about €19/seat/month on the annual plan; ChatGPT Enterprise and Gemini for Workspace are in similar territory. The vendor eats the billing variance. Every seat gets the latest frontier model the day it ships. No frontend to maintain, no platform team to staff, no orchestration platform to buy.

What this opens up

The interesting part of this architecture, and the part that makes it reachable for small companies, is the shape of the MCP layer.

The domain layer is model-agnostic. MCP servers are typed interfaces with tool descriptions and schemas. The same servers work with Claude today, Gemini tomorrow, GPT next quarter. None of the domain logic is locked to a model vendor. The engineering investment is portable across the entire frontier-model market.
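As a rough sketch of what "typed interfaces with tool descriptions and schemas" means in practice, an MCP tool is advertised as a name, a natural-language description, and a JSON Schema for its input. The tool below is hypothetical: `get_time_bookings`, its fields, and the helper function are invented for illustration and are not the production servers described here.

```python
import json

# Hypothetical tool definition for an ERP time-booking lookup. The keys
# "name", "description" and "inputSchema" follow the MCP tool shape; the
# tool itself and its parameters are assumptions made for this sketch.
TIME_BOOKINGS_TOOL = {
    "name": "get_time_bookings",
    "description": (
        "Return the hours an employee booked per project in a given ISO week. "
        "Use this for questions about who worked on what, and when."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "employee_id": {"type": "string", "description": "Internal employee ID"},
            "week": {"type": "string", "description": "ISO week, e.g. 2026-W21"},
        },
        "required": ["employee_id", "week"],
    },
}


def missing_arguments(tool: dict, arguments: dict) -> list:
    """List the required arguments that a tool call failed to supply."""
    required = tool["inputSchema"].get("required", [])
    return [name for name in required if name not in arguments]


# Nothing in the definition names a model vendor: it is plain JSON.
print(json.dumps(TIME_BOOKINGS_TOOL, indent=2))
print(missing_arguments(TIME_BOOKINGS_TOOL, {"employee_id": "E-1042"}))
```

The domain knowledge sits in the description and the schema, which is why the same definition can be served to any client that speaks the protocol.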

Identity lives at the tool boundary. RBAC is enforced inside the MCP server, against your existing IdP. A model that's been prompt-injected can only call tools the authenticated user is already authorised to call. The security model is the same one your company already runs for every other system. No AI-specific identity layer, no prompt firewall, no model gateway. The tool boundary is the trust boundary.
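A minimal sketch of that boundary, assuming the server already holds a validated set of IdP roles for the caller. The role names, tool names, and the `User` type are hypothetical; a real server would derive roles from the IdP token rather than construct them by hand.

```python
from dataclasses import dataclass

# Hypothetical mapping from tool name to the IdP role required to call it.
TOOL_REQUIRED_ROLE = {
    "get_own_time_bookings": "field_engineer",
    "get_aggregated_financials": "controller",
}


@dataclass(frozen=True)
class User:
    name: str
    roles: frozenset  # roles asserted by the IdP (assumed already validated)


class NotAuthorised(Exception):
    """Raised when the authenticated user lacks the role a tool requires."""


def is_authorised(user: User, tool_name: str) -> bool:
    """Allow a call only for known tools whose required role the user holds."""
    required = TOOL_REQUIRED_ROLE.get(tool_name)
    return required is not None and required in user.roles


def call_tool(user: User, tool_name: str) -> str:
    # The check runs inside the server, so it applies no matter what the
    # model was persuaded to request on the user's behalf.
    if not is_authorised(user, tool_name):
        raise NotAuthorised(f"{user.name} may not call {tool_name}")
    return f"{tool_name} executed for {user.name}"


engineer = User("engineer", frozenset({"field_engineer"}))
guest = User("guest", frozenset())
print(call_tool(engineer, "get_own_time_bookings"))
print(is_authorised(engineer, "get_aggregated_financials"))
print(is_authorised(guest, "get_own_time_bookings"))
```

The check depends only on the authenticated user and the tool name, never on the prompt, so a prompt-injected request for a tool outside the user's roles fails the same way a direct one would.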

The engineering shape is small. An MCP server, in my experience, is one engineer working with one domain expert for a few weeks per domain. That's the whole staffing model. Nine production servers over three months, no platform team, no specialists. The people who own the underlying business systems can do most of the work themselves, with engineering support.

This is what makes the approach reachable. A small or mid-sized company doesn't need to hire an AI platform team to run this stack. The chat client is rented from a vendor at a price comparable to a productivity-suite seat. The MCP layer is built incrementally by the engineers and domain experts already on the payroll. There's no procurement cycle for an "AI platform," no consulting engagement to size the rollout, no infrastructure to provision.

What you rent, what you own

| What you rent | What you own |
| --- | --- |
| The chat client (Claude, ChatGPT, Gemini, your choice) | The MCP servers that encode your domain |
| Conversation history, attachments, multi-modal UI | Tool descriptions, query strategies, business logic |
| Enterprise SSO, audit logs, retention policies | Identity-gated tool access through your existing IdP |
| Frontend updates, model upgrades, security patches | Domain knowledge written into schemas |
| The model itself, latest version on day one | The integration with ERP, BIM, fleet, energy, calc |
| Predictable per-seat billing | One engineer, one domain expert, per server |

What you rent is the commodity layer: the stuff vendors compete on and ship continuously. What you own is the part no vendor can build for you, because no vendor knows what your data means.

The same pattern, one floor down

The reason this is reachable goes one layer deeper than the chat client. The same logic applies to the framework layer underneath: LangChain, LangGraph, CrewAI, RAG pipelines, vector DBs, agent observability stacks. Building on top of those is one shape of architecture; building MCP servers directly against a frontier model is the same architectural posture without the additional layer.

Each MCP spec release closes another category of problem that previously required a framework layer: tool calling, resource management, prompts, sampling, elicitation, UI primitives via MCP Apps, and, on the 2026 roadmap, enterprise identity integration. Each one used to live in a framework above MCP, and each one is now in the protocol itself or on its way there. The framework layer isn't being argued against; it's being absorbed. Companies building on top of frameworks today are building on top of a layer the protocol is in the process of swallowing.

Both layers reward the same answer: rent the surface, own the domain. Use the vendor's client. Use the vendor's identity integrations. Use the vendor's model upgrades. Skip the framework layer. Spend your engineering on the part that's actually yours: the MCP servers that turn your operational data into something an agent can reason about.

That's what "MCP is the platform" actually looks like in practice. Not a stack you build, a perimeter you draw. Inside the perimeter: your domain, your tools, your IdP, your data. Outside: the model, the client, the vendor. The line between them is MCP.

If you're starting

Subscribe to Claude Team, ChatGPT Enterprise, or Gemini for Workspace. Connect your existing identity provider. Then pick the one system whose data your colleagues most want to query in natural language, and write one MCP server against it. Ship that. Watch them use it. Build the next one. The practitioner's guide walks through the full seven-step recipe.

That's the entire starting move. No vendor selection process for an AI platform. No headcount plan for a platform team. No procurement cycle for orchestration software. A subscription, an identity wire-up, and one MCP server is enough to be in production with real users by the end of a month.
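To give a sense of how small "one MCP server" can start, here is a sketch of the request-handling core for a single tool. It covers only the `tools/list` and `tools/call` JSON-RPC methods and leaves out the initialisation handshake, the transport, and authentication, so it is not a complete server. The fleet data and the `get_vehicle_location` tool are hypothetical stand-ins for a real business system; in practice you would more likely build on an official SDK than hand-roll the dispatch.

```python
import json

# Hypothetical in-memory stand-in for a real fleet system.
FLEET = {"VAN-01": "Rotterdam depot", "VAN-02": "on site"}

TOOLS = [{
    "name": "get_vehicle_location",
    "description": "Return the last known location of a fleet vehicle by its ID.",
    "inputSchema": {
        "type": "object",
        "properties": {"vehicle_id": {"type": "string"}},
        "required": ["vehicle_id"],
    },
}]


def handle(request: dict) -> dict:
    """Answer one JSON-RPC 2.0 request for the two tool methods."""
    method = request.get("method")
    if method == "tools/list":
        result = {"tools": TOOLS}
    elif method == "tools/call":
        params = request.get("params", {})
        vehicle_id = params.get("arguments", {}).get("vehicle_id")
        known = params.get("name") == "get_vehicle_location" and vehicle_id in FLEET
        text = FLEET[vehicle_id] if known else "unknown tool or vehicle"
        # Tool-level failures are reported in the result, flagged with isError.
        result = {"content": [{"type": "text", "text": text}], "isError": not known}
    else:
        # -32601 is the standard JSON-RPC code for "Method not found".
        return {"jsonrpc": "2.0", "id": request.get("id"),
                "error": {"code": -32601, "message": "Method not found"}}
    return {"jsonrpc": "2.0", "id": request.get("id"), "result": result}


response = handle({"jsonrpc": "2.0", "id": 1, "method": "tools/call",
                   "params": {"name": "get_vehicle_location",
                              "arguments": {"vehicle_id": "VAN-01"}}})
print(json.dumps(response, indent=2))
```

Most of the real work in a production server is in the lookup behind the tool and in the wording of its description, not in this dispatch code.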

Three months of that across a real business produces what I have now: nine production servers, no framework dependencies, no custom chat UI, no API bill, no platform team, and a company that gets every frontier model upgrade for free the day it ships.

This architecture isn't a clever workaround for the moment. It's an early version of what enterprise AI is going to look like once the protocol layer finishes absorbing the platform layer above it. The companies that build this way now are getting a head start on a stack that won't look unusual in two years. It will look obvious.

Enterprise AI doesn't require an enterprise budget. It requires picking the right perimeter to draw. The full method is in the white paper.

David Golverdingen

AI Engineering & Technical Leadership

Based in the Netherlands

© 2026 David Golverdingen. All rights reserved.

Posts here are drafted with Claude and validated by the author against production experience.