The MCP spec says each server needs its own audience-scoped token. Every blog post explains the requirement. None of them show how to actually build it. We did—with RFC 8693 token exchange, Okta, SpiceDB, and Snowflake. This is the implementation guide that doesn’t exist yet.
In our last post, we open-sourced fastmcp-gateway and solved MCP’s context collapse problem—replacing 150+ tool schemas with 3 meta-tools for progressive discovery. That post ended with a “what’s next” section that mentioned access control.
This is that post. And it turned out to be a much harder problem than tool discovery.
Here’s the uncomfortable truth about MCP authentication: the spec tells you what to do, but not how to do it. Section 7.3 of the MCP authorization specification is clear—each MCP server acts as a separate Resource Server with its own audience. Clients must not forward tokens issued for one resource server to another. That’s the requirement.
Now go look at every MCP authentication guide on the internet. Auth0, Stytch, Permit.io, AWS—they all explain the spec requirement. They show you how OAuth 2.1 fits into MCP. They draw architecture diagrams with boxes and arrows. What none of them show is what happens when your MCP server needs to talk to a platform like Snowflake that requires its own OAuth token scoped to the individual user making the request. That’s the gap we’re filling today.
Most MCP servers in the wild use one of two authentication patterns. Both break in production.
Pattern 1: The shared service account

The MCP server holds a single API key or service account credential. Every request—regardless of which user initiated it—executes as the same identity. A 2025 security report found that over 53% of MCP servers rely on static secrets like API keys and personal access tokens.
This means you can’t enforce row-level security, can’t create per-user audit trails, and can’t revoke access for a single user without rotating the key for everyone. For a Snowflake MCP server, CURRENT_USER() returns the service account for every query. Your CISO will love that.
Pattern 2: Token passthrough

Pass the user’s token straight through from the client to the MCP server to the downstream platform. Simple. And it violates the MCP spec.
The spec is explicit: tokens must be audience-scoped per server. A token issued for your MCP gateway (audience: axon-gateway) should not be accepted by Snowflake (audience: https://myaccount.snowflakecomputing.com). Token passthrough conflates authentication boundaries. One compromised MCP server can replay tokens against every downstream platform. This isn’t theoretical—it’s the exact attack vector the spec was designed to prevent.
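To make the audience boundary concrete, here is a minimal sketch (illustrative names, not fastmcp-gateway's API) of the check a compliant server runs before trusting a token. A real gateway also verifies the signature against the IDP's JWKS keys; this sketch shows only the audience claim check:

```python
import base64
import json

def decode_claims(jwt: str) -> dict:
    """Decode a JWT's payload WITHOUT verifying the signature.

    Illustration only: a production gateway validates the signature
    via the IDP's JWKS endpoint before trusting any claim.
    """
    payload_b64 = jwt.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))

def assert_audience(jwt: str, expected: str) -> dict:
    """Reject any token whose aud claim does not match this server."""
    claims = decode_claims(jwt)
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if expected not in audiences:
        # A gateway token must never be replayed against Snowflake,
        # and vice versa: this is the boundary the spec mandates.
        raise PermissionError(f"token audience {audiences} != {expected}")
    return claims
```

A token minted for `axon-gateway` fails this check at any other resource server, which is exactly the behavior the spec requires.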
What’s needed is a third pattern: the gateway validates the user’s identity, checks their permissions, then exchanges their token for a platform-specific token scoped to the correct audience—on every request. The user’s identity flows through. The token boundaries stay clean. The downstream platform sees the real user, not a service account.
We updated fastmcp-gateway with a hook-based middleware system. Hooks intercept every request at defined lifecycle points—on_authenticate and before_execute—letting you compose authentication, authorization, and token exchange as independent, stackable layers.
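A minimal sketch of what such a hook interface can look like. The method names mirror the lifecycle points above, but the classes here are illustrative, not fastmcp-gateway's exact API:

```python
from dataclasses import dataclass, field
from typing import Protocol

@dataclass
class UserIdentity:
    subject: str
    email: str
    roles: list[str]

@dataclass
class RequestContext:
    tool_name: str
    identity: UserIdentity
    # Headers that will be forwarded to the upstream MCP server.
    headers: dict[str, str] = field(default_factory=dict)

class GatewayHook(Protocol):
    """Lifecycle points a hook may implement."""
    async def on_authenticate(self, headers: dict[str, str]) -> UserIdentity: ...
    async def before_execute(self, context: RequestContext) -> None: ...

class HeaderInjectionHook:
    """Example hook: propagate the authenticated identity as headers."""
    async def before_execute(self, context: RequestContext) -> None:
        context.headers["X-User-Subject"] = context.identity.subject
        context.headers["X-User-Email"] = context.identity.email
        context.headers["X-User-Roles"] = ",".join(context.identity.roles)
```

Because each hook only sees the context object, layers stay independent and stack in whatever order the deployment needs.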
Here’s the architecture we deploy on the Ultrathink Axon™ platform:
Authentication Call Flow
Client Request
Bearer JWT — audience: axon-gateway
↓
fastmcp-gateway
OIDC Authentication
→ Okta JWKS: Fetch /keys · Validate JWT signature · Extract sub, email, groups
↓
Authorization
→ SpiceDB: Check mcp_tool#invoke · Check data_platform#impersonate · Both must ALLOW
↓
RFC 8693 Token Exchange
→ Okta /token: subject_token=user_jwt · audience=snowflake · Returns platform-scoped OAuth token
↓
X-User-Token: <snowflake_oauth_token>
↓
Snowflake MCP Server
Read X-User-Token header · OAuth connect to Snowflake
↓
Snowflake
CURRENT_USER() = nick@ultrathinksolutions.com
Hook lifecycle detail
Client Request (Bearer JWT)
↓
fastmcp-gateway
↓
Layer 1: AuthorizationHook.on_authenticate(headers)
• Extract Bearer token from Authorization header
• Validate JWT signature via OIDC JWKS endpoint
• Return UserIdentity { subject, email, org_id, roles }
↓
Layer 2: AuthorizationHook.before_execute(context)
• Check SpiceDB: mcp_tool:<tool_name>#invoke@user:<subject>
• Inject X-User-Subject, X-User-Email, X-User-Roles headers
↓
Layer 3: TokenExchangeHook.before_execute(context)
• Resolve target platform from tool domain
• Check SpiceDB: data_platform:snowflake#impersonate@user:<subject>
• RFC 8693 token exchange: user JWT → Snowflake-scoped token
• Inject X-User-Token header with exchanged token
↓
Upstream Snowflake MCP Server
• Read X-User-Token header
• Connect with authenticator='oauth', token=exchanged JWT
• CURRENT_USER() = nick@ultrathinksolutions.com

Each layer is independent. You can run Layer 1 alone for basic OIDC validation. Add Layer 2 for fine-grained tool permissions. Add Layer 3 only for platforms that support user impersonation. The Snowflake MCP server doesn’t implement any auth—it just reads a header. If the header is missing, it falls back to the service account. Zero auth code in the upstream server.
The gateway communicates user identity to upstream MCP servers through a set of injected HTTP headers:
| Header | Source | Purpose |
|---|---|---|
| X-User-Subject | OIDC sub claim | Unique user identifier |
| X-User-Email | OIDC email claim | Maps to Snowflake LOGIN_NAME |
| X-User-Roles | Custom OIDC claim | Comma-separated role list |
| X-User-Token | Token exchange | Platform-scoped OAuth token |
This is an MCP security best practice that emerged from our production deployments: decouple identity propagation from authentication. The gateway handles the hard part (OIDC, JWKS, token exchange). Upstream servers just read headers. This keeps MCP servers simple, testable, and free from auth library dependencies.
RFC 8693 defines a standard OAuth grant type for exchanging one token for another. It’s been around since 2020. Identity providers like Okta, Azure AD, and Auth0 all support it. Yet in the MCP ecosystem, almost nobody is using it. The June 2025 MCP spec update even references Resource Indicators (RFC 8707) to prevent token misuse across servers—token exchange is the production mechanism for enforcing that boundary.
Here’s the concrete flow for a Snowflake MCP server. A user sends a request to the gateway. The gateway needs to swap their JWT (audience: the gateway) for a Snowflake-scoped token (audience: the Snowflake account URL).
POST /oauth2/<auth-server-id>/v1/token HTTP/1.1
Host: your-org.okta.com
Content-Type: application/x-www-form-urlencoded
Authorization: Basic <base64(client_id:client_secret)>
grant_type=urn:ietf:params:oauth:grant-type:token-exchange
&subject_token=eyJhbGciOiJSUzI1NiIs... // User's JWT
&subject_token_type=urn:ietf:params:oauth:token-type:jwt
&audience=https://myaccount.snowflakecomputing.com
&requested_token_type=urn:ietf:params:oauth:token-type:access_token

{
"access_token": "eyJhbGciOiJSUzI1NiIs...", // New token
"token_type": "Bearer",
"expires_in": 3600,
"issued_token_type": "urn:ietf:params:oauth:token-type:access_token"
}
// Decoded payload of the new token:
{
"iss": "https://your-org.okta.com/oauth2/<auth-server-id>",
"aud": "https://myaccount.snowflakecomputing.com", // Snowflake-scoped
"sub": "nick@ultrathinksolutions.com", // User identity preserved
"scp": ["session:role-any"], // Snowflake role switching
"exp": 1740528000
}
The input token is scoped to the gateway. The output token is scoped to Snowflake. The user’s identity (sub claim) is preserved. The gateway injects this exchanged token as an X-User-Token header. The Snowflake MCP server reads it and connects with authenticator='oauth'.

The result: CURRENT_USER() in Snowflake returns nick@ultrathinksolutions.com—not a service account. Row-level security policies, access history, and audit logs all resolve to the actual human who made the request through the AI agent.
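The same exchange, sketched as the request a gateway would construct (stdlib only; the endpoint and credentials are placeholders, and the network call is exactly the POST shown above):

```python
import base64
import urllib.parse
import urllib.request

# RFC 8693 grant and token-type URIs.
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
JWT_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:jwt"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"

def build_exchange_request(
    token_endpoint: str,
    client_id: str,
    client_secret: str,
    subject_token: str,
    audience: str,
) -> urllib.request.Request:
    """Construct an RFC 8693 token-exchange POST for the IDP."""
    form = urllib.parse.urlencode({
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": subject_token,           # the user's gateway JWT
        "subject_token_type": JWT_TOKEN_TYPE,
        "audience": audience,                     # e.g. the Snowflake account URL
        "requested_token_type": ACCESS_TOKEN_TYPE,
    }).encode()
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return urllib.request.Request(
        token_endpoint,
        data=form,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Authorization": f"Basic {basic}",
        },
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen` (or any HTTP client) returns the JSON response shown above, whose `access_token` becomes the X-User-Token header.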
RFC 8693 is elegant on paper. The implementation has sharp edges. Here’s what we learned deploying this with Okta — the gotchas the docs don’t cover.
This is the non-obvious part. RFC 8693’s on-behalf-of pattern involves two distinct parties: the user whose JWT is being exchanged (the subject_token) and the application performing the exchange, whose client_id and client_secret the gateway uses to authenticate the exchange request.

The gotcha: Okta rejects token exchanges where both parties are Web Applications. The actor must be an API Services type. We spent a day debugging invalid_grant errors before discovering this requirement buried in Okta’s docs.
You need a dedicated Authorization Server in Okta with the audience set to your Snowflake account URL. This is what makes the exchanged token valid for Snowflake:
Authorization Server:
Name: "Snowflake External Auth"
Audience: "https://<account>.snowflakecomputing.com"
Scopes:
session:role-any // Allows Snowflake role switching
Claims:
email → user.email (Include in: Access Token)
Access Policy:
Grant types: Client Credentials, Authorization Code
Token lifetime: 60 min (access), 7 days (refresh)

On the Snowflake side, you create a security integration that trusts your Okta Authorization Server:
CREATE SECURITY INTEGRATION external_oauth_okta
TYPE = EXTERNAL_OAUTH
ENABLED = TRUE
EXTERNAL_OAUTH_TYPE = OKTA
EXTERNAL_OAUTH_ISSUER = 'https://<domain>.okta.com/oauth2/<server-id>'
EXTERNAL_OAUTH_JWS_KEYS_URL = 'https://<domain>.okta.com/oauth2/<server-id>/v1/keys'
EXTERNAL_OAUTH_AUDIENCE_LIST = ('https://<account>.snowflakecomputing.com')
EXTERNAL_OAUTH_TOKEN_USER_MAPPING_CLAIM = 'sub'
EXTERNAL_OAUTH_SNOWFLAKE_USER_MAPPING_ATTRIBUTE = 'LOGIN_NAME'
EXTERNAL_OAUTH_ANY_ROLE_MODE = 'ENABLE';
The critical parameters: TOKEN_USER_MAPPING_CLAIM = 'sub' maps the JWT’s sub claim to LOGIN_NAME in Snowflake. This means your Snowflake users must have LOGIN_NAME set to their email address. ANY_ROLE_MODE = 'ENABLE' allows the session:role-any scope to switch roles per session.
Not every request needs user impersonation. Internal batch jobs, health checks, and admin tools may run as a service account. Our Snowflake MCP server handles both modes automatically based on what headers the gateway provides:
def resolve_connection_identity() -> ConnectionIdentity:
ctx = get_user_context() # Reads X-User-* headers
if (ctx.platform_token or "").strip():
return ConnectionIdentity(
mode=IdentityMode.USER_IMPERSONATION,
user_subject=ctx.subject,
platform_token=ctx.platform_token,
)
return ConnectionIdentity(
mode=IdentityMode.SERVICE_ACCOUNT,
user_subject=None,
platform_token=None,
)
If X-User-Token is present, the server connects via OAuth as the user. If it’s absent, it falls back to the configured service account. This dual-mode pattern means the same MCP server works in every context—authenticated user requests, background jobs, local development—without configuration changes.
Authentication tells you who the user is. Authorization tells you what they can do. We use SpiceDB for relationship-based access control (ReBAC) at two checkpoints:
mcp_tool:snowflake__query_run_query#invoke@user:<subject>—Can this user call this specific tool?
data_platform:snowflake#impersonate@user:<subject>—Can this user’s identity be propagated to Snowflake?
Both checks happen before the request reaches the upstream MCP server. If either fails, the gateway returns a denial immediately. The upstream server never sees the request.
ReBAC over traditional RBAC is deliberate. Relationships like “user X can impersonate on platform Y” are more expressive than flat role assignments, and SpiceDB evaluates them in under 10ms. As the number of MCP servers, tools, and users grows, the permission model scales without exponential role explosion.
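How the two checkpoints compose in the gateway can be sketched as follows. The check function is injected here; in production it is a CheckPermission gRPC call against SpiceDB, and the helper names are illustrative:

```python
from typing import Callable

# A checker maps a SpiceDB-style relationship tuple to allow/deny.
PermissionChecker = Callable[[str], bool]

def authorize_request(
    check: PermissionChecker,
    tool_name: str,
    platform: str,
    subject: str,
) -> None:
    """Deny the request unless BOTH ReBAC checks pass."""
    checks = [
        f"mcp_tool:{tool_name}#invoke@user:{subject}",
        f"data_platform:{platform}#impersonate@user:{subject}",
    ]
    for rel in checks:
        if not check(rel):
            # Fail closed: the upstream MCP server never sees the request.
            raise PermissionError(f"denied: {rel}")
```

The gateway calls this before forwarding anything upstream, so a denial costs one sub-10ms SpiceDB round-trip and nothing else.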
The entire auth stack is configured through environment variables. No auth code in your MCP servers. No custom middleware. Just set the gateway environment and your upstream servers get identity propagation for free.
# OIDC Authentication (Layer 1)
OIDC_ISSUER=https://your-org.okta.com/oauth2/<auth-server-id>
OIDC_JWKS_URI=https://your-org.okta.com/oauth2/<auth-server-id>/v1/keys
OIDC_AUDIENCE=axon-gateway
# SpiceDB Authorization (Layer 2)
SPICEDB_ENDPOINT=spicedb.default.svc.cluster.local:50051
SPICEDB_TOKEN=<preshared-key>
# Token Exchange (Layer 3)
TOKEN_EXCHANGE_ENABLED=true
IDP_TOKEN_ENDPOINT=https://your-org.okta.com/oauth2/<auth-server-id>/v1/token
SNOWFLAKE_EXCHANGE_AUDIENCE=https://<account>.snowflakecomputing.com
SNOWFLAKE_EXCHANGE_CLIENT_ID=<api-services-app-client-id>
SNOWFLAKE_EXCHANGE_CLIENT_SECRET=<secret>
SNOWFLAKE_EXCHANGE_ACCOUNT=<account-locator>
Adding token exchange for a new platform means adding a few environment variables. No code changes to the gateway. No code changes to the upstream MCP server. The hook system resolves the target platform from the tool’s domain name (e.g., tools prefixed with snowflake__ route to the Snowflake exchange handler) and selects the correct credentials automatically.
Exchanged tokens are cached by user + platform. If the same user makes 10 Snowflake queries in a session, only the first triggers a round-trip to Okta. The rest reuse the cached token until it expires. In practice, this reduces the per-request auth overhead to near-zero after the initial exchange.
AI agents are getting access to production data platforms. Snowflake, BigQuery, Databricks—these aren’t sandboxes. When an AI agent runs a SQL query against your data warehouse, the question isn’t whether it needs authentication. The question is whether you can trace that query back to the specific human who asked for it.
Shared service accounts make that impossible. Token passthrough violates the MCP spec and creates replay attack vectors. The only pattern that satisfies the spec, enables per-user audit trails, and preserves token boundary security is token exchange. An emerging IETF standard—the Identity Assertion Authorization Grant (ID-JAG)—is now formalizing this exact pattern with enterprise IDP governance built in.
This is the same pattern we see across the AI Execution Gap. The technology exists (RFC 8693, OIDC, SpiceDB, Snowflake External OAuth). The spec defines the requirement. But the production implementation—the wiring, the gotchas, the dual-app Okta configuration, the header contract between gateway and upstream—is where the real work lives.
Production AI needs production identity. Not API keys duct-taped to MCP servers. Not tokens passed through without audience validation. Real identity propagation—from the human asking the question, through the AI agent, through the gateway, to the data platform executing the query. Every step auditable. Every token scoped. Every permission checked.
fastmcp-gateway handles the discovery layer. The hook system handles authentication, authorization, and token exchange. If you need the full stack—strategy, platform, and the Outcome Partnership to run it over time—start the conversation.
This is Part 2 of our MCP gateway series. See Part 1: Why Your AI Application Needs an MCP Gateway. For more on building production-grade AI systems, see The Modern AI Application Stack and Production Observability with Logfire.
Everything above describes the production pattern we shipped. But the standards world is catching up. The IETF’s Identity Assertion Authorization Grant (ID-JAG) draft (revision 01 at time of writing—check the datatracker for the latest) formalizes what we built—RFC 8693 exchange at the IDP—but adds an enterprise governance layer on top. The IDP doesn’t just issue tokens. It decides whether the exchange is allowed, based on app assignments and enterprise policy. The broader framework around this is called Cross-App Access (XAA).
The ID-JAG flow has two steps. First, the client performs an RFC 8693 token exchange at the IDP—exactly what our TokenExchangeHook does today—which produces an Identity Assertion (IA) JWT. The IDP evaluates whether the requesting app is authorized to act on behalf of the user for the target resource. Second, the IA is presented to the resource server via a JWT-bearer authorization grant. The IDP becomes the policy decision point—not just the token issuer.
Our implementation is forward-compatible with XAA. The RFC 8693 exchange we perform at the gateway is the first half of the ID-JAG flow. Our SpiceDB authorization layer mirrors the governance role that the IDP plays in XAA—checking whether a user has impersonate permission on a data_platform before the exchange proceeds. When XAA reaches GA in identity providers, the migration path is to move that permission check into the IDP’s app-assignment policy—not to rewrite the token exchange.
The MCP specification itself incorporated XAA as “Enterprise-Managed Authorization” in its November 2025 extension. Okta’s XAA Phase 1 is in Early Access (check Okta’s XAA page for current availability). No MCP client SDKs implement it yet. The spec is still early, but the direction is clear: enterprise identity providers will govern cross-application token exchange, and RFC 8693 is the mechanism underneath.
Track the spec at the IETF datatracker and explore the reference playground at xaa.dev.
Take the next step from insight to action.
No sales pitches. No buzzwords. Just a straightforward discussion about your challenges.