Most AI agent setups that disappoint their teams don’t fail because the model is weak. They fail because the agent was asked to reason about systems it can’t see.

A triage agent without PagerDuty access produces a vague analysis. An on-call agent without metrics hallucinates a root cause from alert titles. The agent isn’t bad; it’s undercontexted.

Context engineering is the long game, and it has a structure.

Specifically, it has four techniques — not a staircase. Each one matches a different shape of context source, and which ones apply depends on what you already have. A team with a CLI-heavy internal stack will spend most of its effort on technique 3. A team whose vendors all expose public MCPs might never touch technique 2. What remains true, regardless, is that technique 4 — bundling — is what turns any subset of the others into a team asset.

At a glance

Technique   What you do                             Addresses
1           Use existing public MCPs                Vendor context, quick wins
2           Build your own MCP                      Internal APIs that nobody else has wrapped
3           Teach the agent to use existing CLIs    Tools that don’t justify a full MCP
4           Bundle everything as a team plugin      The one-laptop problem; making the setup a team asset

Techniques 1–3 are the ones to pick from based on your context sources. Technique 4 applies whenever a team — rather than an individual — is the user.

Technique 1: use existing public MCPs

Start with what’s already written. Most popular vendors now expose an MCP server: PagerDuty, Grafana, GitHub, Linear, Slack, and more. For these, context engineering is a package install and a config entry.

This technique produces high leverage per minute invested. An on-call agent with just PagerDuty and a metrics MCP already outperforms a model with no tools, because the previous bottleneck was “the agent can’t see the alert it’s being asked about.”

Caveats worth knowing before you start:

  • Public MCPs are designed for general use, which means they sometimes expose too much or too little for a specific workflow
  • Auth stories vary wildly across vendors; budget time for this, not the install
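To give a sense of how small the footprint is, a typical client config entry has roughly this shape. The package name, env var, and server key here are illustrative, not real vendor values; check each vendor’s README for the actual install instructions:

```json
{
  "mcpServers": {
    "pagerduty": {
      "command": "npx",
      "args": ["-y", "@example/pagerduty-mcp"],
      "env": { "PAGERDUTY_API_KEY": "..." }
    }
  }
}
```

That’s the whole integration on the client side; the time sink, per the caveat above, is almost always obtaining and scoping the credential, not writing this entry.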

When technique 1 is enough: if the agent only needs vendor-managed data (tickets, alerts, metrics, repos), this one technique covers a lot of ground.

Technique 2: build your own MCP

The ceiling of technique 1 is “the public internet has an MCP for this.” That ceiling shows up fast for anything internal — your admin portal, your internal billing API, your custom deployment tool, your service registry. Nobody is going to write MCPs for these.

So you write them. Two years ago, this would have been a week of integration work per service. Today it’s 30 minutes with an LLM.

This cost collapse is the single biggest reason technique 2 is practical today. It wasn’t practical when each integration cost a week. It is practical when each one costs thirty minutes.

Design notes that matter:

  • Default to read-only. An MCP that can only query is safe to give the agent at full trust. An MCP that can write needs a confirmation flow, scope narrowing, or both.
  • Validate inputs. If the tool takes a hostname, validate it against an inventory list. This matters more the more powerful the tool is. A tool that runs arbitrary shell on any host is a security incident waiting to be discovered by an agent’s bad day.
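The two design notes above can be sketched as the body of a single tool handler. This is illustrative Python, deliberately independent of any particular MCP SDK; `INVENTORY` and `lookup_host` are hypothetical names standing in for your real registry and tool:

```python
# Hypothetical read-only tool handler for a custom MCP server.
# The validation pattern is the point, not the SDK wiring around it.

INVENTORY = {"web-01", "web-02", "db-01"}  # assumed: loaded from your inventory source


def lookup_host(hostname: str) -> dict:
    """Read-only: returns metadata for a known host, never mutates anything."""
    # Validate against a known inventory before doing anything else,
    # so the agent cannot point this tool at an arbitrary machine.
    if hostname not in INVENTORY:
        raise ValueError(f"unknown host: {hostname!r}")
    # A real handler would query the service registry here (still read-only).
    return {"host": hostname, "known": True}
```

Because the handler only queries, it can be exposed to the agent without a confirmation flow; a write-capable variant would need one.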

When technique 2 is enough: most of your internal systems. This technique typically closes 70–80% of the remaining context gap after technique 1.

Technique 3: teach the agent to use existing CLIs

Not every tool deserves an MCP. Some internal CLIs are simple, rare, or niche enough that wrapping them in a server is overkill.

The lighter alternative: document the CLI well enough that the agent can just run it. A short skill or instruction that says, “when you need to look up a resource, run our-cli lookup <name> and parse the JSON output,” gives the agent access without the overhead of a server.

This is the pattern coding agents already use for most internal tooling: a bash tool plus knowledge of which commands exist. It’s less structured than MCP, but that’s the point. For one-off or low-frequency tools, the structure isn’t worth the setup.
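Written out, the instruction from above becomes a short skill file. The format, and the `owner` and `endpoint` field names, are illustrative; adapt them to whatever skill format your agent reads and whatever your CLI actually prints:

```markdown
# Skill: look up internal resources

When you need to look up an internal resource, run:

    our-cli lookup <name>

The command prints JSON to stdout. Parse it; the `owner` and `endpoint`
fields are usually the ones you need. If the command exits non-zero,
report the stderr message instead of guessing at a result.
```

A few sentences of documentation like this is the entire integration, which is why the technique wins for the long tail.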

When technique 3 is the right call:

  • The CLI is simple and stable.
  • Output is already machine-parseable (JSON, TSV).

When technique 3 is enough: for a long tail of internal utilities that would otherwise clutter technique 2 with one-off MCP servers.

Technique 4: bundle and share

Up through technique 3, everything works — on your laptop. When someone else on the team tries to onboard, they have to:

  • Install several MCP servers
  • Configure auth for each
  • Clone a CLI wrapper
  • Replicate your skill files

Most people stop halfway. What worked for you dies as soon as it tries to scale.

Technique 4 packages the entire setup — MCPs, CLI wrappers, skills — into a single installable plugin. One install command, one auth flow, the whole team on the same context layer. Two practices keep that layer healthy:

  • Version the plugin. Context layers drift. If the team is on v2 and you’re on v1, the agents give different answers, and debugging that is miserable.
  • Make secrets the only manual step. Everything else should be automated. If onboarding is “run install, paste your API key,” people use it. If it’s “follow these fourteen steps,” they don’t.
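What a bundle contains varies by agent platform, and no standard manifest is implied here. As an illustrative shape only (every key, name, and path hypothetical), a plugin manifest might declare:

```json
{
  "name": "oncall-context",
  "version": "2.1.0",
  "mcpServers": ["pagerduty", "grafana", "internal-registry"],
  "skills": ["skills/our-cli.md"],
  "secrets": ["PAGERDUTY_API_KEY", "GRAFANA_TOKEN"]
}
```

The `version` field is what makes drift debuggable, and the `secrets` list is the one manual step the installer surfaces; everything else installs unattended.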

When technique 4 matters: always, once more than one person on the team is using the agent.

The emergent property

Each technique addresses a specific context gap, but the interesting thing happens once bundling is in place: context becomes a team asset, not a personal trick.

A new engineer joining the team installs the plugin, and their AI agent immediately has access to the same observability surface as the senior engineers. Seven months of accumulated integration work transfer in a single install. The new engineer’s agent, on day one, is about as well-contexted as the agent of the person who wrote the plugin.

This is a different kind of leverage than individual prompt tuning. Prompt engineering compounds within a session and disappears at the next one. Context engineering compounds across the team and persists indefinitely.

That’s the payoff of bundling specifically. The techniques that address your context gaps make your work faster. Bundling extends that speedup to everyone else, and keeps doing so after you’ve moved on.

Closing

Prompt engineering is the thing people optimize when they haven’t yet realized the problem is context. Context engineering is the thing they optimize once they do.

The four techniques are the same for everyone:

  1. Use what already exists (public MCPs)
  2. Build what doesn’t yet exist (custom MCPs)
  3. Teach the agent what’s already installed (CLIs)
  4. Ship the whole thing (bundle + share)

Apply the ones that match what you have. Not every team needs every technique — what’s worth doing for everyone is technique 4, because that’s what turns any subset of the others into something the team shares. Once that layer exists, the question isn’t “how do I make my agent smarter?” It’s “what else should the team’s agent see?” — which is a much better question to be answering.