Architecture · February 25, 2026

MCP Authentication in Production: Token Exchange, User Impersonation, and the Pattern Nobody Is Talking About

The MCP spec says each server needs its own audience-scoped token. Every blog post explains the requirement. None of them show how to actually build it. We did—with RFC 8693 token exchange, Okta, SpiceDB, and Snowflake. This is the implementation guide that doesn’t exist yet.

Nick Amabile
Founder & CEO

In our last post, we open-sourced fastmcp-gateway and solved MCP’s context collapse problem—replacing 150+ tool schemas with 3 meta-tools for progressive discovery. That post ended with a “what’s next” section that mentioned access control.

This is that post. And it turned out to be a much harder problem than tool discovery.

Here’s the uncomfortable truth about MCP authentication: the spec tells you what to do, but not how to do it. Section 7.3 of the MCP authorization specification is clear—each MCP server acts as a separate Resource Server with its own audience. Clients must not forward tokens issued for one resource server to another. That’s the requirement.

Now go look at every MCP authentication guide on the internet. Auth0, Stytch, Permit.io, AWS—they all explain the spec requirement. They show you how OAuth 2.1 fits into MCP. They draw architecture diagrams with boxes and arrows. What none of them show is what happens when your MCP server needs to talk to a platform like Snowflake that requires its own OAuth token scoped to the individual user making the request. That’s the gap we’re filling today.

The identity problem nobody is solving

Most MCP servers in the wild use one of two authentication patterns. Both break in production.

Pattern 1: Shared API keys

The MCP server holds a single API key or service account credential. Every request—regardless of which user initiated it—executes as the same identity. A 2025 security report found that over 53% of MCP servers rely on static secrets like API keys and personal access tokens.

This means you can’t enforce row-level security, can’t create per-user audit trails, and can’t revoke access for a single user without rotating the key for everyone. For a Snowflake MCP server, CURRENT_USER() returns the service account for every query. Your CISO will love that.

Pattern 2: Token passthrough

Pass the user’s token straight through from the client to the MCP server to the downstream platform. Simple. And it violates the MCP spec.

The spec is explicit: tokens must be audience-scoped per server. A token issued for your MCP gateway (audience: axon-gateway) should not be accepted by Snowflake (audience: https://myaccount.snowflakecomputing.com). Token passthrough conflates authentication boundaries. One compromised MCP server can replay tokens against every downstream platform. This isn’t theoretical—it’s the exact attack vector the spec was designed to prevent.

What’s needed is a third pattern: the gateway validates the user’s identity, checks their permissions, then exchanges their token for a platform-specific token scoped to the correct audience—on every request. The user’s identity flows through. The token boundaries stay clean. The downstream platform sees the real user, not a service account.
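The boundary that token passthrough violates comes down to a strict audience check. A minimal sketch of that check on a decoded claims dict (claim shapes follow RFC 7519; in production the signature must be verified first, e.g. against the JWKS endpoint, before any claim is trusted):

```python
# Audience check on already-verified JWT claims (a sketch; signature
# verification via JWKS must happen before this point).
EXPECTED_AUDIENCE = "axon-gateway"  # this post's example gateway audience

def audience_allows(claims: dict, expected: str = EXPECTED_AUDIENCE) -> bool:
    """Accept the token only if `expected` appears in its aud claim."""
    aud = claims.get("aud")
    if aud is None:
        return False
    # Per RFC 7519, aud may be a single string or a list of strings.
    audiences = [aud] if isinstance(aud, str) else list(aud)
    return expected in audiences

# A token minted for Snowflake must be rejected by the gateway, and vice versa:
assert audience_allows({"aud": "axon-gateway", "sub": "nick@example.com"})
assert not audience_allows({"aud": "https://myaccount.snowflakecomputing.com"})
```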

The pattern: 3-layer gateway authentication

We updated fastmcp-gateway with a hook-based middleware system. Hooks intercept every request at defined lifecycle points—on_authenticate and before_execute—letting you compose authentication, authorization, and token exchange as independent, stackable layers.

Here’s the architecture we deploy on the Ultrathink Axon™ platform:

Authentication Call Flow

Client Request (Bearer JWT · audience: axon-gateway)
  ↓
fastmcp-gateway
  1. OIDC Authentication → Okta JWKS
     Fetch /keys · Validate JWT signature · Extract sub, email, groups
  2. Authorization → SpiceDB
     Check mcp_tool#invoke · Check data_platform#impersonate · Both must ALLOW
  3. RFC 8693 Token Exchange → Okta /token
     subject_token=user_jwt · audience=snowflake · Returns platform-scoped OAuth token
  ↓  X-User-Token: <snowflake_oauth_token>
Snowflake MCP Server
  Read X-User-Token header · OAuth connect to Snowflake
  ↓
Snowflake
  CURRENT_USER() = nick@ultrathinksolutions.com

Hook lifecycle detail

Client Request (Bearer JWT)
  ↓
fastmcp-gateway
  ↓
Layer 1: AuthorizationHook.on_authenticate(headers)
  • Extract Bearer token from Authorization header
  • Validate JWT signature via OIDC JWKS endpoint
  • Return UserIdentity { subject, email, org_id, roles }
  ↓
Layer 2: AuthorizationHook.before_execute(context)
  • Check SpiceDB: mcp_tool:<tool_name>#invoke@user:<subject>
  • Inject X-User-Subject, X-User-Email, X-User-Roles headers
  ↓
Layer 3: TokenExchangeHook.before_execute(context)
  • Resolve target platform from tool domain
  • Check SpiceDB: data_platform:snowflake#impersonate@user:<subject>
  • RFC 8693 token exchange: user JWT → Snowflake-scoped token
  • Inject X-User-Token header with exchanged token
  ↓
Upstream Snowflake MCP Server
  • Read X-User-Token header
  • Connect with authenticator='oauth', token=exchanged JWT
  • CURRENT_USER() = nick@ultrathinksolutions.com

Each layer is independent. You can run Layer 1 alone for basic OIDC validation. Add Layer 2 for fine-grained tool permissions. Add Layer 3 only for platforms that support user impersonation. The Snowflake MCP server doesn’t implement any auth—it just reads a header. If the header is missing, it falls back to the service account. Zero auth code in the upstream server.

The header contract

The gateway communicates user identity to upstream MCP servers through a set of injected HTTP headers:

Header          Source             Purpose
X-User-Subject  OIDC sub claim     Unique user identifier
X-User-Email    OIDC email claim   Maps to Snowflake LOGIN_NAME
X-User-Roles    Custom OIDC claim  Comma-separated role list
X-User-Token    Token exchange     Platform-scoped OAuth token

This is an MCP security best practice that emerged from our production deployments: decouple identity propagation from authentication. The gateway handles the hard part (OIDC, JWKS, token exchange). Upstream servers just read headers. This keeps MCP servers simple, testable, and free from auth library dependencies.

RFC 8693: The token exchange nobody implements

RFC 8693 defines a standard OAuth grant type for exchanging one token for another. It’s been around since 2020. Identity providers like Okta, Azure AD, and Auth0 all support it. Yet in the MCP ecosystem, almost nobody is using it. The June 2025 MCP spec update even references Resource Indicators (RFC 8707) to prevent token misuse across servers—token exchange is the production mechanism for enforcing that boundary.

Here’s the concrete flow for a Snowflake MCP server. A user sends a request to the gateway. The gateway needs to swap their JWT (audience: the gateway) for a Snowflake-scoped token (audience: the Snowflake account URL).

The exchange request

POST /oauth2/<auth-server-id>/v1/token HTTP/1.1
Host: your-org.okta.com
Content-Type: application/x-www-form-urlencoded
Authorization: Basic <base64(client_id:client_secret)>

grant_type=urn:ietf:params:oauth:grant-type:token-exchange
&subject_token=eyJhbGciOiJSUzI1NiIs...   // User's JWT
&subject_token_type=urn:ietf:params:oauth:token-type:jwt
&audience=https://myaccount.snowflakecomputing.com
&requested_token_type=urn:ietf:params:oauth:token-type:access_token

What comes back

{
  "access_token": "eyJhbGciOiJSUzI1NiIs...",  // New token
  "token_type": "Bearer",
  "expires_in": 3600,
  "issued_token_type": "urn:ietf:params:oauth:token-type:access_token"
}

// Decoded payload of the new token:
{
  "iss": "https://your-org.okta.com/oauth2/<auth-server-id>",
  "aud": "https://myaccount.snowflakecomputing.com",  // Snowflake-scoped
  "sub": "nick@ultrathinksolutions.com",               // User identity preserved
  "scp": ["session:role-any"],                          // Snowflake role switching
  "exp": 1740528000
}

The input token is scoped to the gateway. The output token is scoped to Snowflake. The user’s identity (sub claim) is preserved. The gateway injects this exchanged token as an X-User-Token header. The Snowflake MCP server reads it and connects with authenticator='oauth'.

The result: CURRENT_USER() in Snowflake returns nick@ultrathinksolutions.com—not a service account. Row-level security policies, access history, and audit logs all resolve to the actual human who made the request through the AI agent.
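The same exchange from Python, split into a pure form builder plus the network call (a sketch using only the standard library; the token URL and client credentials are placeholders, and error handling is elided):

```python
import base64
import urllib.parse
import urllib.request

TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
JWT_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:jwt"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"

def build_exchange_form(subject_token: str, audience: str) -> dict[str, str]:
    """RFC 8693 form body: swap the user's JWT for an audience-scoped token."""
    return {
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": subject_token,
        "subject_token_type": JWT_TOKEN_TYPE,
        "audience": audience,
        "requested_token_type": ACCESS_TOKEN_TYPE,
    }

def exchange_token(token_url: str, client_id: str, client_secret: str,
                   subject_token: str, audience: str) -> bytes:
    """POST the exchange to the IDP token endpoint (network call; sketch only)."""
    body = urllib.parse.urlencode(build_exchange_form(subject_token, audience)).encode()
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    req = urllib.request.Request(
        token_url,
        data=body,
        headers={"Authorization": f"Basic {basic}",
                 "Content-Type": "application/x-www-form-urlencoded"})
    with urllib.request.urlopen(req) as resp:
        return resp.read()  # JSON body with access_token, expires_in, ...
```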

The Okta gotchas nobody warns you about

RFC 8693 is elegant on paper. The implementation has sharp edges. Here’s what we learned deploying this with Okta — the gotchas the docs don’t cover.

You need two Okta apps, not one

This is the non-obvious part. RFC 8693’s on-behalf-of pattern requires two distinct actors:

  • Web Application (the “subject client”)—This is the app the user logs into. Grant types: Authorization Code + Refresh Token. This is your MCP gateway’s user-facing identity.
  • API Services App (the “actor client”)—This is the machine identity that performs the exchange. Grant type: Token Exchange (you have to explicitly enable this in Okta). Its client_id and client_secret are what the gateway uses to authenticate the exchange request.

The gotcha: Okta rejects token exchanges where both parties are Web Applications. The actor must be an API Services type. We spent a day debugging invalid_grant errors before discovering this requirement buried in Okta’s docs.

Custom Authorization Server

You need a dedicated Authorization Server in Okta with the audience set to your Snowflake account URL. This is what makes the exchanged token valid for Snowflake:

Authorization Server:
  Name: "Snowflake External Auth"
  Audience: "https://<account>.snowflakecomputing.com"

Scopes:
  session:role-any    // Allows Snowflake role switching

Claims:
  email → user.email  (Include in: Access Token)

Access Policy:
  Grant types: Client Credentials, Authorization Code
  Token lifetime: 60 min (access), 7 days (refresh)

Snowflake External OAuth integration

On the Snowflake side, you create a security integration that trusts your Okta Authorization Server:

CREATE SECURITY INTEGRATION external_oauth_okta
    TYPE = EXTERNAL_OAUTH
    ENABLED = TRUE
    EXTERNAL_OAUTH_TYPE = OKTA
    EXTERNAL_OAUTH_ISSUER = 'https://<domain>.okta.com/oauth2/<server-id>'
    EXTERNAL_OAUTH_JWS_KEYS_URL = 'https://<domain>.okta.com/oauth2/<server-id>/v1/keys'
    EXTERNAL_OAUTH_AUDIENCE_LIST = ('https://<account>.snowflakecomputing.com')
    EXTERNAL_OAUTH_TOKEN_USER_MAPPING_CLAIM = 'sub'
    EXTERNAL_OAUTH_SNOWFLAKE_USER_MAPPING_ATTRIBUTE = 'LOGIN_NAME'
    EXTERNAL_OAUTH_ANY_ROLE_MODE = 'ENABLE';

The critical parameters: TOKEN_USER_MAPPING_CLAIM = 'sub' maps the JWT’s sub claim to LOGIN_NAME in Snowflake. This means your Snowflake users must have LOGIN_NAME set to their email address. ANY_ROLE_MODE = 'ENABLE' allows the session:role-any scope to switch roles per session.

Dual-mode identity: graceful degradation built in

Not every request needs user impersonation. Internal batch jobs, health checks, and admin tools may run as a service account. Our Snowflake MCP server handles both modes automatically based on what headers the gateway provides:

def resolve_connection_identity() -> ConnectionIdentity:
    ctx = get_user_context()  # Reads X-User-* headers

    if (ctx.platform_token or "").strip():
        return ConnectionIdentity(
            mode=IdentityMode.USER_IMPERSONATION,
            user_subject=ctx.subject,
            platform_token=ctx.platform_token,
        )

    return ConnectionIdentity(
        mode=IdentityMode.SERVICE_ACCOUNT,
        user_subject=None,
        platform_token=None,
    )

If X-User-Token is present, the server connects via OAuth as the user. If it’s absent, it falls back to the configured service account. This dual-mode pattern means the same MCP server works in every context—authenticated user requests, background jobs, local development—without configuration changes.

Authorization: SpiceDB for relationship-based access control

Authentication tells you who the user is. Authorization tells you what they can do. We use SpiceDB for relationship-based access control (ReBAC) at two checkpoints:

  • Tool-level: mcp_tool:snowflake__query_run_query#invoke@user:<subject>—Can this user call this specific tool?
  • Platform-level: data_platform:snowflake#impersonate@user:<subject>—Can this user’s identity be propagated to Snowflake?

Both checks happen before the request reaches the upstream MCP server. If either fails, the gateway returns a denial immediately. The upstream server never sees the request.

ReBAC over traditional RBAC is deliberate. Relationships like “user X can impersonate on platform Y” are more expressive than flat role assignments, and SpiceDB evaluates them in under 10ms. As the number of MCP servers, tools, and users grows, the permission model scales without exponential role explosion.
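A SpiceDB schema supporting both checkpoints might look like this (a sketch: the definition names match the permission strings above, but the relation names are illustrative, not our production schema):

```
definition user {}

definition mcp_tool {
    relation invoker: user
    permission invoke = invoker
}

definition data_platform {
    relation impersonator: user
    permission impersonate = impersonator
}
```

In practice the relations would point at groups or roles rather than individual users, which is where ReBAC's expressiveness over flat RBAC pays off.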

Putting it together: gateway configuration

The entire auth stack is configured through environment variables. No auth code in your MCP servers. No custom middleware. Just set the gateway environment and your upstream servers get identity propagation for free.

# OIDC Authentication (Layer 1)
OIDC_ISSUER=https://your-org.okta.com/oauth2/<auth-server-id>
OIDC_JWKS_URI=https://your-org.okta.com/oauth2/<auth-server-id>/v1/keys
OIDC_AUDIENCE=axon-gateway

# SpiceDB Authorization (Layer 2)
SPICEDB_ENDPOINT=spicedb.default.svc.cluster.local:50051
SPICEDB_TOKEN=<preshared-key>

# Token Exchange (Layer 3)
TOKEN_EXCHANGE_ENABLED=true
IDP_TOKEN_ENDPOINT=https://your-org.okta.com/oauth2/<auth-server-id>/v1/token
SNOWFLAKE_EXCHANGE_AUDIENCE=https://<account>.snowflakecomputing.com
SNOWFLAKE_EXCHANGE_CLIENT_ID=<api-services-app-client-id>
SNOWFLAKE_EXCHANGE_CLIENT_SECRET=<secret>
SNOWFLAKE_EXCHANGE_ACCOUNT=<account-locator>

Adding token exchange for a new platform means adding a few environment variables. No code changes to the gateway. No code changes to the upstream MCP server. The hook system resolves the target platform from the tool’s domain name (e.g., tools prefixed with snowflake__ route to the Snowflake exchange handler) and selects the correct credentials automatically.

Token caching

Exchanged tokens are cached by user + platform. If the same user makes 10 Snowflake queries in a session, only the first triggers a round-trip to Okta. The rest reuse the cached token until it expires. In practice, this reduces the per-request auth overhead to near-zero after the initial exchange.

Why this matters for enterprise AI

AI agents are getting access to production data platforms. Snowflake, BigQuery, Databricks—these aren’t sandboxes. When an AI agent runs a SQL query against your data warehouse, the question isn’t whether it needs authentication. The question is whether you can trace that query back to the specific human who asked for it.

Shared service accounts make that impossible. Token passthrough violates the MCP spec and creates replay attack vectors. The only pattern that satisfies the spec, enables per-user audit trails, and preserves token boundary security is token exchange. An emerging IETF standard—the Identity Assertion Authorization Grant (ID-JAG)—is now formalizing this exact pattern with enterprise IDP governance built in.

This is the same pattern we see across the AI Execution Gap. The technology exists (RFC 8693, OIDC, SpiceDB, Snowflake External OAuth). The spec defines the requirement. But the production implementation—the wiring, the gotchas, the dual-app Okta configuration, the header contract between gateway and upstream—is where the real work lives.

Production AI needs production identity. Not API keys duct-taped to MCP servers. Not tokens passed through without audience validation. Real identity propagation—from the human asking the question, through the AI agent, through the gateway, to the data platform executing the query. Every step auditable. Every token scoped. Every permission checked.

fastmcp-gateway handles the discovery layer. The hook system handles authentication, authorization, and token exchange. If you need the full stack—strategy, platform, and the Outcome Partnership to run it over time—start the conversation.

This is Part 2 of our MCP gateway series. See Part 1: Why Your AI Application Needs an MCP Gateway. For more on building production-grade AI systems, see The Modern AI Application Stack and Production Observability with Logfire.

Where the spec is heading: XAA and ID-JAG

Everything above describes the production pattern we shipped. But the standards world is catching up. The IETF’s Identity Assertion Authorization Grant (ID-JAG) draft (revision 01 at time of writing—check the datatracker for the latest) formalizes what we built—RFC 8693 exchange at the IDP—but adds an enterprise governance layer on top. The IDP doesn’t just issue tokens. It decides whether the exchange is allowed, based on app assignments and enterprise policy. The broader framework around this is called Cross-App Access (XAA).

The ID-JAG flow has two steps. First, the client performs an RFC 8693 token exchange at the IDP—exactly what our TokenExchangeHook does today—which produces an Identity Assertion (IA) JWT. The IDP evaluates whether the requesting app is authorized to act on behalf of the user for the target resource. Second, the IA is presented to the resource server via a JWT-bearer authorization grant. The IDP becomes the policy decision point—not just the token issuer.

Our implementation is forward-compatible with XAA. The RFC 8693 exchange we perform at the gateway is the first half of the ID-JAG flow. Our SpiceDB authorization layer mirrors the governance role that the IDP plays in XAA—checking whether a user has impersonate permission on a data_platform before the exchange proceeds. When XAA reaches GA in identity providers, the migration path is to move that permission check into the IDP’s app-assignment policy—not to rewrite the token exchange.

The MCP specification itself incorporated XAA as “Enterprise-Managed Authorization” in its November 2025 extension. Okta’s XAA Phase 1 is in Early Access (check Okta’s XAA page for current availability). No MCP client SDKs implement it yet. The spec is still early, but the direction is clear: enterprise identity providers will govern cross-application token exchange, and RFC 8693 is the mechanism underneath.

Track the spec at the IETF datatracker and explore the reference playground at xaa.dev.
