
Brex open-sources CrabTrap, a new control layer for AI agents in production

Brex has open-sourced CrabTrap, an HTTP proxy that intercepts and audits AI agent traffic in real time. The bigger signal is not the proxy itself, but what it says about the next phase of enterprise AI: useful agents need network-level controls, audit trails, and policy enforcement around real actions.

Source

Brex


What happened

Brex has open-sourced CrabTrap, an HTTP and HTTPS proxy designed to sit between an AI agent and the outside world. Instead of trusting the agent to call tools and APIs safely on its own, CrabTrap intercepts every outbound network request, checks it against policy, and then either allows or blocks it in real time. The idea is simple but important: once an agent has real credentials, every hallucinated or prompt-injected request can become an operational incident.

The system combines two layers of control. First, it applies static rules for known safe traffic patterns, such as specific domains, paths, or HTTP methods. If a request falls outside those rules, CrabTrap sends the request context to an LLM judge that decides whether the action matches the agent's natural-language policy. According to Brex, the company built this because existing approaches were either too narrow, too model-specific, or too custom to scale across real production agents.
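The two-layer decision described above can be sketched in a few lines. Everything here is an illustrative assumption, not CrabTrap's actual API: the rule tuples, the stubbed `llm_judge`, and the `decide` function are hypothetical names chosen to show the shape of static-rules-first, judge-as-fallback.

```python
from dataclasses import dataclass

@dataclass
class Request:
    method: str
    host: str
    path: str

# Hypothetical static rules: (method, host, path prefix) triples known to be safe.
STATIC_ALLOW = [
    ("GET", "api.vendor.example", "/v1/invoices"),
    ("POST", "api.vendor.example", "/v1/invoices/search"),
]

def matches_static_rules(req: Request) -> bool:
    return any(
        req.method == m and req.host == h and req.path.startswith(p)
        for m, h, p in STATIC_ALLOW
    )

def llm_judge(req: Request, policy: str) -> bool:
    # Stand-in for a call to an LLM judge that reads the natural-language
    # policy plus the request context; here we only demonstrate the interface
    # with a trivial rule.
    return "read-only" in policy and req.method == "GET"

def decide(req: Request, policy: str) -> str:
    # Layer 1: cheap static rules for known safe traffic patterns.
    if matches_static_rules(req):
        return "allow"
    # Layer 2: escalate everything else to the judge.
    return "allow" if llm_judge(req, policy) else "block"
```

The economics follow from the ordering: the cheap deterministic check handles the bulk of traffic, and the expensive judge only sees the long tail.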

What makes the release more interesting than a standard security announcement is that Brex is not talking about lab experiments. The company says it already runs CrabTrap in production, with agents doing real work in its corporate environment. It also describes a policy builder that learns from historical network traffic, an eval system that replays old requests against draft policies, and an audit trail stored in PostgreSQL. In one production use case, Brex says the LLM judge was needed for fewer than 3 percent of requests, because the common patterns quickly hardened into static rules.
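The replay idea is worth pausing on, because it is ordinary regression testing applied to policy: run a draft policy over historical traffic and diff its decisions against what actually happened. A minimal sketch, with an assumed log format and a draft policy expressed as a plain function, not CrabTrap's real schema:

```python
def replay(log, draft_policy):
    """Replay logged requests against a draft policy.

    log: list of (request, original_decision) pairs.
    draft_policy: callable mapping a request to "allow" or "block".
    Returns the list of requests where the draft disagrees with history.
    """
    diffs = []
    for req, old_decision in log:
        new_decision = draft_policy(req)
        if new_decision != old_decision:
            diffs.append((req, old_decision, new_decision))
    return diffs

# Illustrative historical traffic with the decisions that were made at the time.
log = [
    ({"method": "GET", "host": "api.example"}, "allow"),
    ({"method": "DELETE", "host": "api.example"}, "block"),
]

# Candidate policy: read-only methods are allowed, everything else is blocked.
draft = lambda r: "allow" if r["method"] in {"GET", "HEAD"} else "block"
```

An empty diff means the draft reproduces history exactly; a non-empty diff tells you precisely which requests a tightened rule would newly block before you ship it.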

Why it matters

This matters because the enterprise AI conversation is finally moving past the question of whether agents can complete tasks, and toward the more serious question of how they can be contained when they do. It is now relatively easy to give an agent access to a browser, a shell, a file system, and a set of API keys. The hard part is making sure that access does not turn into silent data leakage, destructive API calls, or expensive side effects when the model misreads a prompt or encounters hostile input.

Most current guardrail discussions stay too close to the model layer. Teams talk about better prompts, safer tool descriptions, or provider-level policies, but that still leaves a large blind spot. If an agent can reach external services over the network, the transport layer becomes part of the security boundary. A proxy like CrabTrap matters because it operates below the framework level. It does not care whether the agent was built with LangGraph, OpenAI Agents SDK, a homegrown harness, or something else. If the agent tries to send a request, the proxy gets a vote.

That has direct implications for integration-heavy enterprise environments, which is where the real money is. The most valuable agents are not chat windows. They are invoice workflows touching ERP systems, service agents reading shared inboxes, procurement flows calling supplier APIs, and knowledge agents that move between browser sessions, internal tools, and document stores. In those settings, prompt injection is not just an amusing demo failure. It can become a compliance issue, a data residency problem, or a costly operational mistake.

Laava perspective

At Laava, we see this release as confirmation that production-grade agent systems need more than a strong reasoning model. They need a disciplined action layer. Our own three-layer architecture separates context, reasoning, and action for exactly this reason. The action layer is where AI stops being interesting and starts being risky, because that is where a model can touch ERP, CRM, email, or internal knowledge systems. Guardrails at that boundary are not optional extras. They are part of the core architecture.

We also like the fact that CrabTrap treats policy as something that can be observed, tested, and improved, rather than something declared once in a prompt and then forgotten. That mindset matches how enterprise systems should be built. Real workflows evolve. New endpoints appear. Old rules become too broad. The right response is not blind trust in the model, but a loop of logging, replaying, evaluating, and tightening. That is much closer to boring engineering than to AI theater, and that is a good thing.

At the same time, a proxy is not the whole answer. It can reduce risk at the network boundary, but it does not replace process design, human approval gates, metadata quality, or deterministic integration code. An agent that should never approve a payment above a threshold or update a customer record without validation still needs those business rules enforced outside the model. The lesson is not that one new open-source tool solves agent security. The lesson is that the industry is finally building the missing operational layers around agents, and that is exactly where enterprise adoption will be won or lost.

What you can do

If you are experimenting with AI agents today, start by mapping outbound access before you add more intelligence. Which APIs can the agent call, with which credentials, and what happens if it sends the wrong request? If you cannot answer that clearly, you do not yet have a production architecture. A simple traffic inventory and approval model will usually reduce more risk than another round of prompt tuning.
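A traffic inventory does not need special tooling to start. A minimal sketch, assuming egress logs with host, method, and credential fields; the field names are illustrative and should be adapted to whatever your logs actually contain:

```python
from collections import Counter

def inventory(log_entries):
    # Count outbound calls per (host, method, credential) triple. The result
    # answers the basic question: which APIs can the agent reach, with which
    # credentials, and how often does it actually use them?
    return Counter(
        (entry["host"], entry["method"], entry["credential"])
        for entry in log_entries
    )

# Illustrative egress log for a single agent service account.
log = [
    {"host": "erp.internal", "method": "GET", "credential": "svc-agent"},
    {"host": "erp.internal", "method": "GET", "credential": "svc-agent"},
    {"host": "mail.example.com", "method": "POST", "credential": "svc-agent"},
]
```

Sorting the counter with `most_common()` surfaces the high-volume patterns worth turning into explicit rules, and the rare triples worth questioning.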

From there, pick one narrow workflow and wrap it with explicit controls. Log every outbound request. Define allowlists for the predictable traffic. Add human approval for high-impact actions. If you need more flexibility, evaluate whether a transport-layer policy system like CrabTrap belongs in your stack. The goal is not to make agents feel autonomous. The goal is to make them trustworthy enough to do useful work inside real business systems.
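The controls above can be combined into a single default-deny router: allowlisted traffic passes, high-impact actions wait for a human, and everything else is blocked. The `ALLOWLIST` and `HIGH_IMPACT` sets and the `route` function are hypothetical names for illustration:

```python
ALLOWLIST = {("GET", "erp.internal")}          # predictable read traffic
HIGH_IMPACT = {("POST", "payments.internal")}  # always requires a human

def route(method: str, host: str, approved_by_human: bool = False) -> str:
    key = (method, host)
    if key in ALLOWLIST:
        return "allow"
    if key in HIGH_IMPACT:
        # High-impact actions are held until a person signs off.
        return "allow" if approved_by_human else "pending-approval"
    # Default-deny: anything unlisted is blocked and should be logged.
    return "block"
```

Default-deny is the important design choice here: new endpoints show up as blocked requests in the log, forcing a deliberate decision rather than silently widening the agent's reach.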

Translate this to your operation

Determine where this first affects you in practice

The practical question is not whether this news is interesting, but where it directly changes your process, tooling, risk, or commercial approach.

First serious step

From news to a concrete first route

Use market developments as context, but make decisions based on your own operation, systems, and risk trade-offs.

Included in the first conversation

Assess operational impact
Separate relevant risks from noise
Define the first route
Start with one process. Leave with a sharper first route.