MCP servers run with access to your business systems. A CRM server reads your customer data. A database server executes queries against your production schema. An email server sends messages on behalf of your organization. The protocol is designed for broad capability, and that’s what makes it powerful. It’s also what makes trust non-optional.
The MCP Trust Framework (MTF) is the security standard that evaluates every MCP server published to mpak.dev. Published at mpaktrust.org as an open specification, MTF defines five evaluation dimensions, automated scoring, and trust levels that enterprises use to set deployment policy. This is how it works at the technical level.
Why Trust Must Be Evaluated
Installing an MCP server is not like installing a frontend library. A React component can, at worst, render incorrectly. An MCP server can read credentials, make network requests, access filesystems, and interact with external APIs. The attack surface is fundamentally different.
The MCP ecosystem has grown to thousands of servers. Some are built by security-conscious teams with production experience. Some are weekend projects with hardcoded API keys and unaudited dependencies. Some are actively malicious: tools that appear legitimate but exfiltrate data through undeclared network connections. You cannot tell the difference by looking at a README.
Manual code review doesn’t scale. An enterprise deploying ten MCP servers for a Deep Agent system would need to review ten codebases, audit ten dependency trees, and monitor ten runtime behaviors. Per environment. Per update. Without a framework, this evaluation is inconsistent, incomplete, and unsustainable.
MTF replaces ad hoc evaluation with systematic measurement. Five dimensions. Automated scoring. Transparent results. The evaluation runs on every publish, so trust scores reflect the current state of the software, not a one-time review that decays over time.
Dimension 1: Permission Declarations
The first thing MTF evaluates is whether the server declares what it accesses and whether those declarations are honest.
Every MCPB bundle includes a permissions block in its mcpb.json: the network endpoints the server connects to, the filesystem paths it reads or writes, the environment variables it requires. Permission declarations are the server’s contract with the deployer: “Here is exactly what I need access to. Nothing more.”
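To make the contract concrete, here is a sketch of what a declaration and a least-privilege check might look like. The field names are illustrative assumptions, not the normative mcpb.json schema, and the declaration is shown as a Python mapping for readability:

```python
# Illustrative permissions declaration (field names are assumptions,
# not the normative mcpb.json schema), shown as a Python mapping.
declared_permissions = {
    "network": ["calendar.google.com"],                      # exact endpoints only
    "filesystem": {"read": ["./config.json"], "write": []},  # exact paths only
    "env": ["GOOGLE_CALENDAR_TOKEN"],                        # only required secrets
}

def has_wildcard_network(permissions: dict) -> bool:
    """Flag overly broad network declarations like '*' or '*.example.com'."""
    return any("*" in host for host in permissions.get("network", []))

print(has_wildcard_network(declared_permissions))  # False: endpoints are exact
```

A declaration this specific is what the scoring model rewards; a `"network": ["*"]` entry would trip the wildcard check above.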
The MTF scanner compares declared permissions against actual code behavior. The evaluation catches three patterns.
Undeclared access. The server accesses resources not listed in its permissions. A Slack notification server that reads files from the local filesystem without declaring filesystem access has undeclared permissions. This is the most common finding: developers prototype with broad access and never narrow it for distribution.
Overpermissioned scope. The server declares more access than it uses. A calendar server that declares network access to * (all endpoints) when it only connects to calendar.google.com is overpermissioned. The server may not be malicious, but the broad declaration means it could connect anywhere without triggering a violation. Principle of least privilege applies.
Credential exposure. The server reads all environment variables instead of only the ones it needs. A server that iterates over os.environ or process.env without filtering has access to every secret in the runtime environment: database passwords, API keys for other services, infrastructure credentials. MTF flags this pattern specifically because the blast radius of a compromised server with full environment access is catastrophic.
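The contrast between the flagged pattern and the least-privilege alternative can be sketched in a few lines of Python. The declared variable name is a hypothetical example:

```python
import os

# Pattern MTF flags: sweeping the entire environment hands the server
# every secret in the runtime, not just its own.
def harvest_all() -> dict:
    return dict(os.environ)  # blast radius: every credential present

# Least-privilege pattern: read only the variables the manifest declares.
DECLARED_ENV = ["SALESFORCE_ACCESS_TOKEN"]  # hypothetical declaration

def read_declared(declared=DECLARED_ENV) -> dict:
    return {name: os.environ[name] for name in declared if name in os.environ}
```

The second function's access is bounded by the declaration, so a scanner can verify it; the first function's access is bounded only by whatever happens to be in the environment.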
Permission scoring rewards specificity. A server that declares exactly which endpoints it connects to, exactly which files it reads, and exactly which environment variables it uses scores highest. Broad or missing declarations lower the score proportionally.
Dimension 2: Dependency Audit
Dependencies are inherited risk. Your MCP server might be clean, but if it imports a library with a known vulnerability, the vulnerability is yours. The dependency dimension audits the full dependency tree, direct and transitive, against vulnerability databases.
The scanner checks every dependency against CVE databases, the OSV (Open Source Vulnerability) database, and language-specific advisory feeds. It identifies:
Known vulnerabilities. A dependency with a published CVE that has a patched version available. The scanner flags the vulnerable version and the remediation path (which version to upgrade to). Servers with unresolved critical CVEs receive significant score reductions.
Abandoned dependencies. Libraries that haven’t been updated in over a year and have open security issues. An abandoned dependency with known vulnerabilities and no maintainer to patch them is a ticking risk. MTF treats abandonment as a risk multiplier; a vulnerability in an actively maintained library is lower risk than the same vulnerability in an abandoned one.
Unpinned versions. Dependencies specified with loose version ranges (e.g., >=1.0.0) instead of exact pins. Loose ranges mean the actual installed version varies by machine and time. A dependency that was safe when the server was published might resolve to a compromised version later. MCPB bundles vendor dependencies (eliminating this risk at install time), but the scanner still evaluates the source manifest because loose pinning signals development practices that create risk in other contexts.
Dependency depth. A server with 5 direct dependencies and 200 transitive dependencies has a larger attack surface than one with 5 direct and 20 transitive. MTF doesn’t penalize large dependency trees per se; it factors depth into the risk assessment. More dependencies means more surface area to audit, monitor, and patch.
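The unpinned-versions check above is the simplest of these to illustrate. A minimal sketch, using a deliberately strict definition of "pinned" (exact `==` only); a real scanner would parse full PEP 508 specifiers and cross-reference OSV:

```python
import re

# A spec counts as "pinned" only if it uses ==<exact version>.
PINNED = re.compile(r"^[A-Za-z0-9_.\-]+==[\w.]+$")

def unpinned(requirements: list) -> list:
    """Return requirement lines that let the resolver float over time."""
    return [r for r in requirements if not PINNED.match(r.strip())]

reqs = ["requests==2.31.0", "pydantic>=1.0.0", "httpx"]
print(unpinned(reqs))  # ['pydantic>=1.0.0', 'httpx']
```

Both flagged lines could resolve to a different version tomorrow than they did at publish time, which is exactly the risk the dimension measures.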
Dimension 3: Code Analysis
Static analysis scans the server source code for patterns that indicate malicious behavior, security weaknesses, or quality issues that correlate with risk.
The scanner runs language-specific analysis (Python AST analysis for Python servers, JavaScript/TypeScript AST analysis for Node servers) looking for specific patterns.
Data exfiltration patterns. Code that reads tool response data and sends it to endpoints not declared in the server’s permissions. A CRM server that copies contact records to a third-party URL is exfiltrating data. The scanner traces data flow from MCP tool handlers to network outputs and flags flows that route data outside the declared permission scope.
Credential harvesting. Code that accesses credentials beyond its declared needs, stores them in persistent state, or transmits them over the network. A server that reads SALESFORCE_ACCESS_TOKEN from the environment and writes it to a local file or sends it as part of a request body to an external endpoint is harvesting credentials. MTF flags any credential access that doesn’t terminate at the declared integration endpoint.
Obfuscated code. Encoded strings, dynamic code execution (eval, exec, Function()), and packed/minified source that can’t be statically analyzed. Obfuscation isn’t always malicious (some legitimate tools minify for distribution), but it prevents inspection. MTF lowers the code analysis score for any code that resists static analysis, because the framework can’t verify what it can’t read.
Input validation. Does the server validate inputs from the MCP client before using them in API calls, database queries, or system commands? Missing input validation opens injection vulnerabilities. A server that passes unsanitized user input directly into a SQL query or shell command has a code quality issue that MTF catches and scores.
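The dynamic-execution check is straightforward to sketch with Python's own ast module. This is one narrow pattern among the many a production scanner would cover:

```python
import ast

DYNAMIC_EXEC = {"eval", "exec"}

def dynamic_exec_calls(source: str) -> list:
    """Return line numbers of bare eval/exec calls in Python source."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in DYNAMIC_EXEC):
            hits.append(node.lineno)
    return hits

sample = "x = 1\neval('2 + 2')\n"
print(dynamic_exec_calls(sample))  # [2]
```

Because the check walks the syntax tree rather than grepping text, it ignores the string `"eval"` in a comment but catches the call itself.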
Code analysis is not a substitute for manual review. It catches known patterns at scale. Sophisticated attacks that avoid known patterns may pass static analysis, which is why Dimension 4 (runtime behavior) exists as a complementary check.
Dimension 4: Runtime Behavior
Static analysis examines what the code looks like. Runtime analysis examines what it does. The gap between the two is where the most dangerous attacks live.
MTF runs the server in an instrumented sandbox and monitors its actual behavior against its declared capabilities.
Network monitoring. Every outbound connection is logged and compared against the declared network permissions. A server that declares network: ["*.salesforce.com"] but makes a request to telemetry.suspicious-domain.com at startup has undeclared network behavior. The scanner catches this regardless of what the source code shows, even if the network call is triggered by a dependency rather than the server code itself.
Filesystem monitoring. All file reads and writes are logged. A server that declares filesystem: "none" but reads /etc/hosts or writes to /tmp/data.json has behavior that contradicts its declarations. The sandbox environment is fully instrumented; no filesystem access goes unrecorded.
Resource consumption. CPU usage, memory allocation, and disk writes are tracked during the evaluation run. A server that consumes 2GB of memory during startup, spawns child processes, or writes large files to disk is behaving atypically. High resource consumption isn’t necessarily malicious, but it’s a signal that something unexpected is happening.
Protocol compliance. The server must respond correctly to MCP protocol initialization: the initialize handshake, capability negotiation, and tool listing. A server that fails basic protocol compliance, or responds to protocol messages with unexpected side effects (like making network calls during tool listing), gets flagged.
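The comparison step in network monitoring, matching observed hosts against declared wildcard patterns, can be sketched with fnmatch. Hooking the sandbox's network layer to produce the observed list is the hard part and is not shown:

```python
from fnmatch import fnmatch

def undeclared_hosts(observed: list, declared: list) -> list:
    """Hosts the sandbox observed that no declared pattern covers."""
    return [host for host in observed
            if not any(fnmatch(host, pattern) for pattern in declared)]

declared = ["*.salesforce.com"]
observed = ["api.salesforce.com", "telemetry.suspicious-domain.com"]
print(undeclared_hosts(observed, declared))
# ['telemetry.suspicious-domain.com']
```

The telemetry host is flagged regardless of whether the connection originated in the server's own code or in a dependency, which is the point of evaluating behavior rather than source.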
Runtime evaluation catches the “rug pull” attack pattern: servers that pass static analysis because the source code looks clean, but deviate at runtime because the malicious behavior is triggered by external conditions, encoded in dependencies, or activated after a delay. If the behavior happens during the evaluation run, MTF sees it.
Dimension 5: Author Verification
The identity of the publisher matters. A server from a known organization with a traceable development history carries different risk than an anonymous upload with no linked source repository.
Author verification checks several signals:
Identity confirmation. Is the publisher a verified GitHub account with a history of public contributions? Is the publishing organization known and contactable? Anonymous publishers aren’t blocked from the registry, but they receive a lower author verification score.
Source traceability. Is the published bundle linked to a public source repository? Can the published artifact be reproduced from the source? A server with full source traceability, where you can clone the repository, run the build, and produce an identical bundle, scores highest. Servers without linked source repositories score lowest.
Publishing history. Has the publisher maintained other packages with consistent quality? A publisher with five well-maintained, high-trust servers has a different risk profile than a first-time publisher. History doesn’t guarantee quality, but it establishes a track record.
Organization verification. For organizational publishers, is the organization identity verified? Does the publisher represent themselves accurately? NimbleBrain’s MCP servers are published under a verified organization account with a traceable identity, which is part of why they carry high provenance scores.
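The source-traceability check reduces, at its core, to a digest comparison: rebuild the bundle from the linked repository and compare it byte-for-byte with the published artifact. A minimal sketch (deterministic builds, which make this comparison meaningful in practice, are assumed):

```python
import hashlib

def digest(artifact: bytes) -> str:
    """SHA-256 hex digest of a bundle's bytes."""
    return hashlib.sha256(artifact).hexdigest()

def is_reproducible(published: bytes, rebuilt: bytes) -> bool:
    """A bundle is fully traceable when rebuilding from source
    yields a byte-identical artifact."""
    return digest(published) == digest(rebuilt)
```

Any divergence, even a single byte, fails the check, which is why reproducibility scores require a deterministic build pipeline.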
Scoring and Trust Levels
Each dimension produces an independent score. The scores combine into an overall trust rating from 0 to 100. MTF maps these ratings to trust levels that have clear operational meanings.
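The combination step can be sketched as a weighted sum. The weights below are placeholders for illustration; the MTF specification defines the actual weighting:

```python
# Hypothetical weights (illustrative only; the MTF spec defines the real ones).
WEIGHTS = {
    "permissions": 0.25, "dependencies": 0.20, "code": 0.20,
    "runtime": 0.20, "author": 0.15,
}

def overall_score(dimension_scores: dict) -> float:
    """Combine per-dimension scores (each 0-100) into an overall 0-100 rating."""
    return round(sum(WEIGHTS[d] * dimension_scores[d] for d in WEIGHTS), 1)

scores = {"permissions": 90, "dependencies": 80, "code": 85,
          "runtime": 95, "author": 70}
print(overall_score(scores))  # 85.0
```

Because each dimension is scored independently, a single weak dimension (here, author verification at 70) pulls the overall rating down without hiding where the weakness is.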
Untrusted (no evaluation). Default state for any server not scanned through MTF. Most servers in unofficial directories sit here. Deploying untrusted servers in production is running unaudited code with access to your enterprise systems.
Scanned (automated only). Passed automated evaluation: dependency audit, static analysis, permission surface analysis. No known CVEs, no obvious malicious patterns, permissions roughly match stated purpose. Appropriate for development environments and non-sensitive use cases.
Reviewed (automated + manual). Passed automated scanning plus manual code review. Architecture verified, code matches declared behavior, overall quality assessed. Appropriate for production use with moderate data sensitivity.
Verified (full evaluation). Passed all automated checks, manual review, behavioral analysis, and author verification. The highest trust level. Appropriate for production environments with sensitive data and compliance requirements (SOC 2, HIPAA).
Setting Organizational Policy
The practical value of MTF is policy enforcement. Trust levels are structured data. They integrate into deployment pipelines, procurement workflows, and governance frameworks.
Deployment gates. “Only Verified servers deploy to production. Reviewed or above for staging. Scanned or above for development.” These policies are enforceable because the trust level is machine-readable metadata on every mpak listing. The NimbleBrain Platform’s MCP operator enforces trust policies automatically; a server below the required level doesn’t deploy.
Procurement criteria. When evaluating AI vendors or consultancies, require MTF compliance. “All MCP servers deployed in our environment must carry a trust level of Reviewed or above.” This gives procurement teams a concrete, measurable standard instead of vague assurances.
Risk-based thresholds. Different integrations carry different risk. An MCP server that reads public weather data might be acceptable at Scanned. An MCP server that accesses customer PII in your CRM requires Verified. MTF scores let you set thresholds proportional to the data sensitivity of each integration.
Continuous monitoring. Trust scores are dynamic. A server that scored 85 at initial evaluation can degrade if dependencies accumulate unpatched CVEs and the maintainer stops responding. Organizations that consume MTF data programmatically can set alerts when scores drop below their thresholds, catching regression before it becomes a production incident.
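Because trust levels form an ordered scale, a deployment gate reduces to an index comparison. A minimal sketch; the environment-to-minimum-level mapping is an example policy, not a prescribed one:

```python
# Trust levels ordered from least to most trusted, per the framework.
LEVELS = ["Untrusted", "Scanned", "Reviewed", "Verified"]

# Example policy (the environment -> minimum-level mapping is an assumption).
POLICY = {"development": "Scanned", "staging": "Reviewed", "production": "Verified"}

def may_deploy(server_level: str, environment: str) -> bool:
    """Enforce 'minimum trust level per environment' as a deployment gate."""
    return LEVELS.index(server_level) >= LEVELS.index(POLICY[environment])

print(may_deploy("Reviewed", "production"))  # False: production requires Verified
print(may_deploy("Reviewed", "staging"))     # True
```

Because the trust level is machine-readable metadata on each listing, a check like this can run in CI or in an operator before any server process starts.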
An Open Standard
MTF is published at mpaktrust.org as an open specification. NimbleBrain built it, but the standard belongs to the ecosystem. Any registry, scanning tool, or organization can adopt MTF’s evaluation criteria. The specification is versioned, extensible, and designed to evolve as the MCP ecosystem matures and new threat patterns emerge.
Building a security standard and publishing it openly is how The Anti-Consultancy operates. The trust framework is infrastructure. It works whether NimbleBrain is involved or not. Enterprise clients implement MTF policies, enforce trust levels, and evaluate servers independently. The methodology compounds. The dependency doesn’t.
For how mpak uses MTF to secure its registry, see What Is mpak?. For a developer’s guide to publishing servers that score well on MTF, see Publishing to mpak.
Frequently Asked Questions
Why do MCP servers need a trust framework?
MCP servers have broad capabilities: reading files, executing code, accessing APIs, modifying databases. An untrusted server can exfiltrate data, make unauthorized API calls, or corrupt systems. Trust must be evaluated, not assumed. The MTF provides the evaluation structure.
What does an MTF evaluation look like?
Five dimensions scored individually: (1) Permission declarations: does the server declare what it accesses? (2) Dependency audit: are dependencies known and clean? (3) Code analysis: any malicious patterns? (4) Runtime behavior: does observed behavior match declared capabilities? (5) Author verification: is the publisher who they claim to be?
Is the MTF a certification or a score?
A score with transparency. Each dimension is evaluated independently and the results are public. There’s no binary “certified” or “not certified”; you see exactly what was assessed and what the findings were. This lets organizations set their own trust thresholds.