NimbleBrain does not recommend tools it has not used in production. Every MCP server, every Upjack application, every Business-as-Code schema we deploy on client engagements runs internally first. Our operations are the proving ground. If a tool breaks, it breaks for us before it breaks for you. If a pattern is fragile, we discover the fragility in our own workflows before deploying it into yours.
This is not a philosophy statement. It is an operational fact you can verify by reading the code.
What Our Internal Stack Looks Like
NimbleBrain runs 20+ MCP servers managing daily operations. These are not demo instances or lab environments. They are production tools that the team uses every day to run the business.
CRM and lead management. MCP servers connect our AI agents to GoHighLevel for contact management, pipeline tracking, and outreach coordination. The same integration pattern (MCP server connecting agent to CRM through structured tool calls) is what we deploy for clients managing their own sales operations. When a client asks how the CRM integration handles duplicate contacts, rate limits, or field mapping edge cases, we answer from direct experience. We have encountered those problems in our own pipeline.
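To make that pattern concrete, here is a minimal sketch of what a CRM tool on an MCP server can look like. It assumes the reference MCP Python SDK's FastMCP helper; the in-memory contact store and the field names are illustrative stand-ins, not the actual GoHighLevel integration.

```python
# A minimal sketch, assuming the MCP Python SDK's FastMCP helper.
# The in-memory store and field names are illustrative, not the real
# GoHighLevel integration.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("crm-tools")

CONTACTS: dict[str, dict] = {}  # keyed by lowercased email to avoid duplicates


@mcp.tool()
def upsert_contact(email: str, name: str, pipeline_stage: str) -> dict:
    """Create or update a contact, using the email address as the dedupe key."""
    key = email.strip().lower()
    record = CONTACTS.setdefault(key, {"email": key})
    record.update({"name": name, "pipeline_stage": pipeline_stage})
    return record


if __name__ == "__main__":
    mcp.run()  # exposes the tool to any MCP-capable agent over stdio
```

The structured tool call is the whole interface: the agent never touches the CRM API directly, which is what lets the server own deduplication, rate limiting, and field mapping in one place.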
Email and communication. Gmail MCP servers handle email drafting, thread summarization, and follow-up tracking. Meeting summarization runs through Granola integration, converting conversations into structured action items. These are not theoretical capabilities from a vendor demo. They are tools the team uses daily, with operational logs showing real usage patterns, real failure rates, and real recovery procedures.
Content and marketing operations. The NimbleBrain website, blog content, newsletter generation, and social media coordination all run through AI agents operating on Business-as-Code artifacts. Content creation follows structured skills that encode editorial voice, brand guidelines, and publishing workflows. When we tell clients that Business-as-Code can govern content operations, we are describing our own content pipeline.
Code review and engineering workflows. CLAUDE.md skills (the same skill format we teach clients to write) govern our internal development practices. Code review standards, deployment procedures, architecture decisions, and documentation conventions are all encoded as structured skills that AI agents follow. The engineering team does not memorize process documents. They operate through agents that have the process documents encoded as executable context.
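The mechanics behind "executable context" are unglamorous. The sketch below is illustrative only, with hypothetical file paths and a hypothetical compose_context helper: skill documents live in the repository and get injected into the agent's context for each task, so nobody has to remember the process document.

```python
# Illustrative only: the paths and helper are hypothetical, not the actual
# NimbleBrain tooling. The idea is that process documents become agent context.
from pathlib import Path

SKILL_FILES = [
    Path("skills/code-review.md"),            # hypothetical skill documents
    Path("skills/deployment-procedure.md"),
    Path("skills/architecture-decisions.md"),
]


def compose_context(task: str) -> str:
    """Concatenate skill documents into the context an agent receives for a task."""
    sections = [path.read_text() for path in SKILL_FILES if path.exists()]
    return "\n\n---\n\n".join(sections + [f"Current task:\n{task}"])


if __name__ == "__main__":
    print(compose_context("Review PR: tighten error handling in the CRM server"))
```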
Lead generation and prospecting. MCP servers for LinkedIn Sales Navigator, Yelp data extraction, and website assessment connect to agent workflows that identify, qualify, and score prospects. The pipeline from raw lead data to qualified opportunity runs through the same agent patterns we deploy for client sales operations.
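As an illustration of that pipeline's shape (the weights, thresholds, and fields below are hypothetical, not our production criteria), the qualification step reduces to a small, auditable function the agent calls on every raw lead:

```python
# Hypothetical scoring criteria for illustration; real criteria are refined
# continuously as the team learns what "qualified" means for the business.
from dataclasses import dataclass


@dataclass
class Lead:
    company: str
    employee_count: int
    has_existing_crm: bool
    website_score: float  # 0.0-1.0, from an automated website assessment


def score_lead(lead: Lead) -> float:
    """Combine firmographic and assessment signals into a 0-100 score."""
    score = 0.0
    score += 30 if 10 <= lead.employee_count <= 500 else 10
    score += 25 if lead.has_existing_crm else 5
    score += 45 * lead.website_score
    return round(score, 1)


def qualify(leads: list[Lead], threshold: float = 60.0) -> list[Lead]:
    """Keep only the leads that clear the qualification threshold."""
    return [lead for lead in leads if score_lead(lead) >= threshold]
```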
This is not a curated list of showcase examples. It is a representative sample of what actually runs. The stack is broad because the problems it solves are broad, and breadth matters. An MCP server that works for CRM integration might fail when applied to email workflows. An agent pattern that handles content generation might break when applied to lead scoring. Running the full stack internally means we encounter failures across domains, not just within the narrow scope of a single use case.
The Feedback Loop
Internal usage creates a feedback loop that no amount of external testing can replicate.
Bugs surface before client deployment. When an MCP server has a memory leak under sustained use, it shows up in our monitoring first. When a schema definition fails to handle a legitimate data variant, it fails on our data first. When an agent skill produces inconsistent results under certain context conditions, our team notices it in their daily work before any client encounters it. Every bug we fix internally is a bug no client ever sees.
The timeline matters. Traditional software companies test in staging environments that approximate production. NimbleBrain tests in actual production: our own. The difference is that staging environments have synthetic data, predictable load patterns, and known edge cases. Real production has messy data, variable load, and edge cases nobody anticipated. Our internal usage is real production, with real operational consequences when something fails. That level of pressure-testing produces more reliable tools.
Usability issues become visible. A tool can pass every functional test and still be unusable in daily operations. Configuration that requires twelve steps. Error messages that describe the symptom but not the cause. Agent behaviors that are technically correct but operationally confusing. These issues only surface through sustained use by a team that depends on the tool to get work done.
When the NimbleBrain team encounters a usability problem with an MCP server, the fix happens the same week, because the team feeling the problem is the team that can fix it. There is no ticket filed with a vendor. No feature request queued for the next quarter. The person frustrated by the tool’s behavior walks to the codebase and improves it. That tight feedback cycle is why our tools improve faster than tools built by teams that do not use them.
Operational patterns emerge. Daily use reveals patterns that testing cannot predict. Which agent workflows get used most frequently. Which skills get refined most often. Which schemas need the most updates. Which MCP servers are the most reliable and which need the most attention. This operational intelligence, gathered from months of daily internal use, directly informs how we prioritize and deploy tools on client engagements.
We know, from our own data, that CRM integration MCP servers need the most careful error handling because CRM APIs are the most inconsistent. We know that meeting summarization agents perform best with structured output schemas constraining the format. We know that lead scoring skills need weekly refinement in the first month because the scoring criteria shift as the team learns what “qualified” actually means for their business. This is not consultant wisdom. It is operator wisdom.
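"Structured output schemas constraining the format" can be as plain as a typed model the summarization agent must fill in. The fields below are an illustrative guess at such a schema using Pydantic, not the exact one behind our Granola-backed pipeline:

```python
# Illustrative schema only; the production fields differ.
from pydantic import BaseModel, Field


class ActionItem(BaseModel):
    owner: str
    description: str
    due_date: str | None = None  # ISO date, if one was stated in the meeting


class MeetingSummary(BaseModel):
    title: str
    attendees: list[str]
    decisions: list[str] = Field(default_factory=list)
    action_items: list[ActionItem] = Field(default_factory=list)


# The agent is asked to emit JSON matching this model; validation rejects
# free-form prose, which is what keeps the output format consistent.
summary = MeetingSummary.model_validate_json(
    '{"title": "Weekly pipeline review", "attendees": ["Ana", "Ben"],'
    ' "decisions": ["Ship the CRM dedupe fix"],'
    ' "action_items": [{"owner": "Ana", "description": "Update the skill doc"}]}'
)
print(summary.action_items[0].owner)  # -> "Ana"
```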
The Shoemaker’s Children Problem
The technology industry has a well-known pattern: companies build tools they do not use themselves. The shoemaker’s children go barefoot. The monitoring company’s own dashboard is broken. The productivity app company uses email for internal coordination. The AI advisory firm recommends agent deployments but runs its own operations on spreadsheets and manual processes.
NimbleBrain rejects this pattern completely. If a tool is good enough to recommend to a client, it is good enough to run internally. If a tool is not good enough to run internally, it is not good enough to recommend.
This constraint is more rigorous than it sounds. It means we cannot recommend a half-finished integration. We cannot deploy a schema pattern we have not validated through months of internal operation. We cannot suggest a workflow architecture we have not stress-tested on our own work. The recommendation set is limited to what we have actually operated, which means every recommendation comes with operational data, not just architectural theory.
The Recursive Loop is visible in our own operations. We build tools. We operate them internally. We learn from the friction, the failures, and the patterns that emerge. We encode those learnings back into the tools. The tools improve. Operations improve. New patterns surface. The loop continues. By the time a tool reaches a client engagement, it has been through dozens of improvement cycles driven by real operational feedback.
Transparency as Verification
Most of NimbleBrain’s stack is open-source. The MCP servers, the Upjack framework, the mpak registry, the CLAUDE.md skill format: the code is public, the commit history is visible, and the development is ongoing. This is not selective transparency. It is structural transparency.
You can read the code that powers our internal operations. You can see the commit frequency, the issue resolution patterns, and the feature development timeline. You can verify that the tools we recommend are tools we actively maintain and improve. You can check the last commit date on any repository and see whether it was updated this week or abandoned six months ago.
This level of transparency is impossible for advisory firms that use proprietary tools or vendor platforms. They can tell you about their methodology. They cannot show you the code behind it. They can describe their operational practices. They cannot let you audit them.
NimbleBrain’s position is simple: everything we deploy on client engagements, we run internally first. Everything we run internally, you can inspect externally. The claim and the evidence are the same thing.
When you evaluate an AI advisory firm, ask them one question: do you use your own tools? Then ask them to prove it. If the answer requires a slide deck instead of a repository link, you have your signal.
Frequently Asked Questions
What does NimbleBrain's internal stack look like?
We run 20+ MCP servers for our own operations: CRM integration, email management, lead generation, content creation, meeting summarization, code review. All built on the same patterns we deploy for clients. Our internal operations are our best test environment.
Why does internal usage matter?
Two reasons. First, we find bugs before you do. If an MCP server fails under load, it fails for us first, and we fix it before it reaches your deployment. Second, we understand operational reality. We know what it's like to run these systems day-to-day, not just deploy them.
Can clients see how you use your own tools?
Yes. Most of our stack is open-source. You can read the code, see the patterns, and verify that we practice what we preach. Transparency is not a marketing claim; it's an auditable fact.