The question isn’t whether to self-host or use a managed service. The question is whether your architecture lets you choose (and change your mind later). Most organizations frame this as a binary decision: run it yourself or pay someone else to run it. The real decision is about building infrastructure where the deployment model is a configuration choice, not a structural constraint. Get the architecture right and you can start managed, migrate to self-hosted, or run a hybrid without rebuilding anything.
Get it wrong and you’re locked into whatever you chose on day one. Every month of usage makes the switch more expensive until the deployment model becomes permanent by default, not by design.
The Three Models
AI infrastructure deployment exists on a spectrum. Each model has distinct characteristics, and understanding the tradeoffs requires looking beyond the surface-level “control vs. convenience” framing.
Fully managed. The vendor hosts everything. Agent runtime, data pipelines, MCP server infrastructure, monitoring, scaling, all operated by the vendor. You interact through APIs and interfaces. You deploy agents by pushing configuration. The vendor handles operations, uptime, patches, and capacity.
The advantage is speed. A managed service can have your first agent in production within days. No infrastructure provisioning. No Kubernetes cluster to manage. No on-call rotation for the agent platform. The vendor absorbs the operational burden.
The cost is control. Your data transits the vendor’s infrastructure. Your agents run on the vendor’s runtime. Your monitoring depends on the vendor’s tooling. If the vendor has an outage, your agents are down. If the vendor changes pricing, you pay it. If the vendor changes terms, you accept them.
Fully self-hosted. You run everything on your own infrastructure. The agent runtime, the MCP servers, the data pipelines, the monitoring stack, all deployed to hardware or cloud accounts you control. No data leaves your network. No vendor has access to your systems. No external dependency in the critical path.
The advantage is control. Data residency is absolute: your data stays where you put it. Compliance is demonstrable: your auditors can inspect every component. Performance tuning is granular: you choose the hardware, the configuration, the scaling policies. Cost at scale is lower: no per-seat or per-action SaaS pricing compounding as usage grows.
The cost is operational responsibility. You need an engineering team that can manage Kubernetes, troubleshoot container networking, monitor distributed systems, handle security patches, and respond to incidents. The infrastructure doesn’t maintain itself. Self-hosting without the team to support it creates a different kind of risk: not vendor dependency, but operational fragility.
Hybrid. You own the application layer and the critical data paths. The vendor manages underlying infrastructure services. Your agents, your MCP servers, and your business logic run on your infrastructure. Shared services (LLM APIs, embedding endpoints, non-sensitive tooling) use managed offerings.
This is the most common model in practice. It captures the control benefits for sensitive operations while offloading commodity infrastructure to managed services. The boundary is drawn by data sensitivity: anything touching customer data, financial records, or proprietary business logic runs self-hosted. General-purpose services that don’t handle sensitive data use managed offerings.
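One way to make that boundary operational is to encode it in the deployment layer itself, so routing is driven by data classification rather than ad hoc decisions. A minimal sketch, assuming hypothetical endpoint URLs and classification labels (none of these names come from a real platform):

```python
from dataclasses import dataclass

# Hypothetical endpoints; real values would come from your infrastructure config.
SELF_HOSTED_RUNTIME = "https://agents.internal.example.com"
MANAGED_RUNTIME = "https://agents.vendor.example.com"

# Data classifications that must never leave self-hosted infrastructure (assumed labels).
SENSITIVE_LABELS = {"customer-pii", "financial", "phi"}

@dataclass
class AgentSpec:
    name: str
    data_labels: set[str]  # classifications of the data this agent touches

def runtime_for(agent: AgentSpec) -> str:
    """Route by data sensitivity: regulated data stays on self-hosted infra."""
    if agent.data_labels & SENSITIVE_LABELS:
        return SELF_HOSTED_RUNTIME
    return MANAGED_RUNTIME

print(runtime_for(AgentSpec("risk-model", {"financial"})))  # self-hosted endpoint
print(runtime_for(AgentSpec("doc-assistant", set())))       # managed endpoint
```

The point of the sketch is that the boundary lives in one function: tightening or loosening the sensitivity policy changes routing everywhere without touching individual agents.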
The Decision Matrix
The right model depends on four factors. Each one can independently drive the decision, and the combination determines which model fits.
Regulatory requirements. If your industry mandates data residency (healthcare under HIPAA, financial services under SOX and PCI, defense under ITAR and FedRAMP, government under StateRAMP), self-hosting is likely required for any component that touches regulated data. “Likely” because some managed services achieve the required compliance certifications. But the compliance burden falls on you to verify, and the attack surface of a managed service is harder to audit than infrastructure you control.
A healthcare organization deploying AI agents that access patient records needs to demonstrate where that data lives, who can access it, and how it’s protected. Self-hosting provides clear answers. A managed service requires trusting the vendor’s compliance claims and maintaining that trust through every vendor update, infrastructure change, and subprocessor relationship.
Engineering capability. Self-hosting requires an operations team. Not a developer who read a Kubernetes tutorial, but a team that can manage production infrastructure, respond to incidents at 2 AM, handle security patches within hours, and debug distributed systems under pressure. If you don’t have that team and can’t hire it, self-hosting creates more risk than it mitigates.
Most mid-market companies (200-2,000 employees) have some infrastructure capability but not a dedicated platform team. Hybrid is typically the right model: self-host the critical components where the team can focus their attention, use managed services for everything else.
Cost structure. Managed services price by usage: per seat, per action, per agent, per API call. At low volumes, this is cheaper than running infrastructure. At scale, it compounds. An organization running 50 agents processing thousands of actions daily will find managed pricing significantly more expensive than the infrastructure cost of self-hosting.
The breakeven depends on usage patterns. For most organizations, the crossover happens between 10 and 30 agents with moderate activity. Below that, managed wins on simplicity and total cost. Above that, self-hosting wins on cost predictability and elimination of per-unit pricing.
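A toy cost model makes the crossover concrete. All numbers below are illustrative assumptions, not any vendor’s actual pricing; the point is the shape of the curves, not the specific figures:

```python
# Hypothetical pricing assumptions for illustration only.
MANAGED_PER_AGENT_MONTHLY = 500.0   # per-agent platform fee (assumed)
MANAGED_PER_ACTION = 0.02           # per-action charge (assumed)
SELF_HOSTED_FIXED_MONTHLY = 6000.0  # cluster + amortized ops cost (assumed)
SELF_HOSTED_PER_ACTION = 0.001      # marginal compute cost (assumed)

def monthly_cost_managed(agents: int, actions_per_agent: int) -> float:
    """Usage-based pricing: compounds linearly with both agents and actions."""
    return agents * MANAGED_PER_AGENT_MONTHLY + \
           agents * actions_per_agent * MANAGED_PER_ACTION

def monthly_cost_self_hosted(agents: int, actions_per_agent: int) -> float:
    """Mostly fixed cost: the per-action marginal cost is small."""
    return SELF_HOSTED_FIXED_MONTHLY + \
           agents * actions_per_agent * SELF_HOSTED_PER_ACTION

# Find the crossover point at a moderate activity level (3,000 actions/agent/month).
for agents in range(1, 51):
    if monthly_cost_self_hosted(agents, 3000) < monthly_cost_managed(agents, 3000):
        print(f"Self-hosting breaks even at {agents} agents")  # → 11 under these assumptions
        break
```

Under these assumed numbers the crossover lands at 11 agents, inside the 10-to-30 range cited above; change the assumptions and the crossover moves, which is exactly why the breakeven depends on usage patterns.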
Strategic independence. If AI infrastructure is a competitive advantage (if the way you deploy agents, the domain knowledge encoded in your systems, and the operational patterns you’ve developed are differentiators) then owning the infrastructure is a strategic decision. Managed services commoditize the infrastructure layer. Self-hosting preserves optionality and prevents a vendor from sitting between you and your competitive advantage.
Industry Examples
Healthcare: Self-host required. A regional hospital network deploying AI agents for clinical documentation, patient triage support, and administrative workflow. Patient data is subject to HIPAA. The agents connect to Epic EHR, internal lab systems, and scheduling platforms. Every component touching patient data must run on the hospital’s infrastructure with demonstrable compliance. Self-hosting isn’t a preference; it’s a regulatory requirement. The hospital already runs Kubernetes for other clinical applications. The operations team exists. The infrastructure cost is incremental.
Marketing agency: Managed makes sense. A 50-person agency deploying AI agents for content generation, campaign analysis, and client reporting. No regulated data. No data residency requirements. No dedicated infrastructure team. The agency needs agents producing value within weeks, not months of infrastructure buildout. Managed services deliver speed to production without operational overhead. The data sensitivity is low enough that the managed model’s tradeoffs are acceptable.
Financial services: Hybrid is the answer. A mid-market fintech deploying AI agents for risk assessment, compliance monitoring, and customer operations. Transaction data and customer PII are subject to SOX and PCI requirements. The compliance-sensitive agents (risk models, transaction analysis, customer data processing) run self-hosted on the company’s infrastructure. Non-sensitive agents (market data aggregation, internal productivity tools, documentation assistants) use managed services. The boundary is clear: if it touches regulated data, it runs on-premise. Everything else optimizes for speed.
The Migration Path
The decision between self-hosted and managed isn’t permanent if the architecture supports migration. This is the critical insight most organizations miss: they treat the deployment model as a foundation rather than a configuration.
Business-as-Code artifacts are inherently portable. JSON schemas, markdown skills, YAML context files: these are plain text in a git repository. They don’t care where the runtime lives. Move them from a managed service to a self-hosted cluster and they work the same way. No format conversion. No data migration. No re-implementation.
MCP servers are containers. They run on any Kubernetes cluster, any container runtime, any infrastructure that supports OCI images. An MCP server deployed on a managed platform moves to self-hosted infrastructure by changing the deployment target. The server code doesn’t change. The configuration doesn’t change. The integrations don’t change.
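To make “moves by changing the deployment target” concrete, here is a sketch of what stays fixed and what varies between models. The image name and cluster context strings are hypothetical, and the manifest is a simplified Kubernetes-style dict rather than a complete spec:

```python
# The MCP server is one immutable OCI image; only the deployment target varies.
MCP_SERVER_IMAGE = "registry.example.com/mcp/crm-server:1.4.2"  # hypothetical image

def deployment_manifest(target_context: str) -> dict:
    """Build a simplified deployment spec; identical for managed or self-hosted."""
    return {
        "context": target_context,  # the only field that changes between models
        "kind": "Deployment",
        "spec": {
            "replicas": 2,
            "containers": [{"name": "crm-mcp", "image": MCP_SERVER_IMAGE}],
        },
    }

managed = deployment_manifest("vendor-managed-cluster")
self_hosted = deployment_manifest("on-prem-cluster")

# Everything except the target context is identical: no code or config rewrite.
assert managed["spec"] == self_hosted["spec"]
```

The design choice this illustrates: when the workload spec is independent of the target, migration is a pointer change, not a port.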
This portability is by design. NimbleBrain builds every component of the stack (Upjack, mpak, Synapse, the MCP servers) to be deployment-agnostic. The architecture doesn’t assume a specific infrastructure provider, a specific cloud vendor, or a specific runtime environment. The same code runs in all three models because the abstraction boundaries are in the right place.
The common progression: start managed for speed. Deploy first agents, validate the use case, prove ROI. As the organization builds confidence and capability, migrate sensitive workloads to self-hosted. Keep non-sensitive workloads on managed services where the operational simplicity is worth the cost. Evolve the boundary over time as the internal team grows.
This is the Escape Velocity trajectory applied to infrastructure. You start with external support: managed services handling operations. You build internal capability through the engagement. You migrate to self-hosted as the team can absorb the operational responsibility. Eventually, you’re running independently. The managed service was an accelerator, not a dependency.
The Portable Architecture Requirement
None of this works if the architecture isn’t portable. And portability isn’t something you add later; it’s a property of the initial design.
Three requirements for portable AI infrastructure:
Open protocols. MCP for agent-to-tool communication. Standard APIs for LLM access. Open formats for data exchange. When your integration layer uses open protocols, swapping any component (the agent runtime, the tool servers, the LLM provider) is a configuration change. Proprietary integration layers make every component swap a rebuild.
Owned artifacts. Your domain knowledge, agent skills, business logic, and operational context live in your git repository in standard formats. Not in a vendor’s database. Not in a proprietary format. Not behind an API that only works with one platform. Business-as-Code is the implementation: plain files, version controlled, diffable, portable. Your business logic travels with you regardless of where the infrastructure runs.
Standard deployment. Containers and Kubernetes. Not because K8s is simple (it isn’t) but because it’s the universal deployment target. Every cloud provider supports it. Every on-premise data center can run it. Every operations team has experience with it. When your AI infrastructure deploys as standard container workloads, the migration from managed to self-hosted is an infrastructure operation, not an application rewrite.
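The three requirements above can be reduced to a small sketch of what “swapping the LLM provider is a configuration change” looks like in practice. The endpoint URLs and the `LLM_DEPLOYMENT` environment variable are hypothetical; the assumption is that both backends expose the same standard API surface:

```python
import os

# Because the wire protocol is standard, the provider is just configuration.
# Both URLs below are hypothetical placeholders.
LLM_ENDPOINTS = {
    "managed": "https://api.vendor.example.com/v1",
    "self_hosted": "http://llm.internal.example.com/v1",
}

def llm_base_url() -> str:
    """Select the LLM backend from configuration, defaulting to managed."""
    mode = os.environ.get("LLM_DEPLOYMENT", "managed")
    try:
        return LLM_ENDPOINTS[mode]
    except KeyError:
        raise ValueError(f"unknown deployment mode: {mode!r}") from None

# Agent code never hardcodes a provider; migration is an environment change.
print(llm_base_url())
```

The same pattern applies to the agent runtime and the tool servers: each sits behind an open protocol, so the binding lives in configuration rather than in application code.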
NimbleBrain builds on all three because we’ve seen the alternative. Organizations that chose the fastest initial deployment path without considering portability are the ones calling us 12 months later, asking how to escape a vendor that raised prices, changed terms, or got acquired. The answer is always the same: rebuild on portable architecture. The cost is always higher than building portably in the first place.
Choose managed or self-hosted based on your current needs. Build the architecture so the choice is reversible. The deployment model should be the easiest thing to change in your AI stack, not the hardest.
Frequently Asked Questions
When should we self-host AI infrastructure?
When data residency requirements demand it (regulated industries), when you need full control over the model and data pipeline, or when you've built the internal engineering capability to operate it. Self-hosting is more powerful but requires operational investment.
When is managed better?
When you're getting started, when speed-to-production matters more than full control, or when you lack the engineering team to operate infrastructure. Managed services handle the operational burden while you focus on the business application.
Can we switch from managed to self-hosted later?
Only if the architecture is designed for it. If you build on open standards (MCP, Business-as-Code) and own your artifacts, switching is a deployment configuration change. If you build on proprietary managed services, switching is a rebuild. This is why architecture decisions in month one matter.