Authenticating Consumers

Introduction

Envoy AI Gateway authenticates every inference request at the edge and propagates the caller's identity to downstream policies. Authentication is configured with the Envoy Gateway SecurityPolicy resource, which attaches to the HTTPRoute generated by an AIGatewayRoute. After the caller is identified, selected claims are copied into request headers that token quotas and usage metering consume as the per-tenant key.

This turns a per-consumer credential, such as an SSO (Single Sign-On) token or an API key, into an identity that quota and metering can act on. It is the foundation of multi-tenant model serving on a shared gateway.

Use Cases

  • A developer obtains a JWT (JSON Web Token) from the platform identity provider and calls the gateway with it, so the gateway enforces per-user token quotas.
  • A CI job presents a service-account token so that automated traffic is attributed to a team rather than an individual.
  • A machine consumer that cannot run an interactive login presents a static API key that maps to a known tenant.

Prerequisites

  1. Envoy AI Gateway is installed. See Install Envoy AI Gateway.
  2. An AIGatewayRoute already routes requests to one or more backends.
  3. For the OIDC/JWT path: an OIDC issuer with a reachable JWKS endpoint. The platform's built-in identity provider, Dex, is the default; any other OIDC issuer (Keycloak, Auth0, Okta, GitHub OIDC, an enterprise Entra ID tenant) also works as long as the gateway can reach its /.well-known/openid-configuration and JWKS URL.
  4. For the API-key path: cluster permission to create Secret objects in the gateway's namespace.
NOTE

Create the Gateway and AIGatewayRoute in a dedicated namespace (for example maas-system), not in the Envoy Gateway control-plane namespace envoy-gateway-system. A gateway placed in the control-plane namespace may not have the AI Gateway request-processing filter and SecurityPolicy applied to its listener, which silently breaks routing and policy enforcement. See Envoy AI Gateway.

Steps

Authenticate with OIDC or JWT

Validate tokens issued by an OIDC issuer. The platform's built-in Dex is the default issuer; it can also broker external identity sources, such as LDAP or another OIDC provider, so their users obtain platform tokens. Those connectors are configured in platform IdP (Identity Provider) management; see Identity Providers.

Any OIDC issuer with a reachable JWKS endpoint can be used. When consumers are not platform users (for example, they authenticate against an enterprise Keycloak realm or a SaaS IdP), replace issuer and remoteJWKS.uri below with the values for that issuer so the gateway accepts their tokens without requiring a platform account.

Point the gateway at the OIDC issuer and map its claims to identity headers:

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: SecurityPolicy
metadata:
  name: maas-oidc-auth
  namespace: <your-namespace>
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      name: <aigatewayroute-name>  # HTTPRoute generated by your AIGatewayRoute
  jwt:
    providers:
      - name: platform-idp
        issuer: https://<platform-address>/dex
        remoteJWKS:
          uri: https://<platform-address>/dex/keys
        claimToHeaders:
          - claim: sub      # caller identity, used as the per-user quota and metering key
            header: x-user-id
          - claim: groups    # platform groups, used for per-department aggregation and tiers
            header: x-user-group
          - claim: email
            header: x-user-email
  • <platform-address>: the platform access address. Dex publishes its issuer at /dex and its JWKS at /dex/keys.
  • <aigatewayroute-name>: the name of the HTTPRoute generated by your AIGatewayRoute.
  • claimToHeaders: the bridge between identity and policy. The emitted headers (x-user-id, x-user-group) become the selector keys for token quotas and the label values for usage metering.
NOTE

The groups claim reflects the caller's platform groups, populated by the configured IdP connector. To key policies on an attribute the platform does not emit by default, such as a subscription tier, add the claim in the upstream connector and map it with an extra claimToHeaders entry.
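Before adding a claimToHeaders entry, it can help to check which claims a token actually carries. The following is a minimal sketch, not part of the gateway: jwt_payload is a hypothetical helper name, and it only decodes the payload segment without verifying the signature, so it is suitable for inspection only.

```shell
# Hypothetical helper: print the decoded payload (claims) of a JWT.
# It does NOT verify the signature; use it only to inspect claims.
jwt_payload() {
  local seg
  # The payload is the second dot-separated segment; convert base64url to base64.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url drops the trailing padding; restore it so base64 -d accepts the input.
  case $(( ${#seg} % 4 )) in
    2) seg="${seg}==" ;;
    3) seg="${seg}=" ;;
  esac
  printf '%s' "$seg" | base64 -d
}
```

Calling jwt_payload "$TOKEN" prints the claims as JSON, which shows whether sub, groups, and email are present before they are mapped to headers.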

TIP

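To roll out without blocking traffic, set jwt.optional: true first and observe. With this setting, requests without a token pass through unauthenticated, while requests that carry an invalid token are still rejected. Remove it once all consumers present valid tokens.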
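One way to toggle the setting without editing the manifest is a merge patch against the existing policy. This is a sketch under assumptions: set_jwt_optional is a hypothetical helper name, and the namespace and policy name are passed in by the caller.

```shell
# Hypothetical helper: toggle spec.jwt.optional on an existing SecurityPolicy.
# $1 = namespace, $2 = SecurityPolicy name, $3 = true or false
set_jwt_optional() {
  kubectl -n "$1" patch securitypolicy "$2" --type merge \
    -p "{\"spec\":{\"jwt\":{\"optional\":$3}}}"
}
```

For example, set_jwt_optional <your-namespace> maas-oidc-auth true starts the observation phase, and the same call with false restores strict enforcement.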

Authenticate with an API key

WARNING

Known issue on Envoy Gateway (EG) ≤ v1.5.x. SecurityPolicy.apiKeyAuth is translated correctly (the api_key_auth filter is added to the listener with credentials, and an ApiKeyAuthPerRoute config is attached to the route), but Envoy Gateway does not enable the filter per route. As a result, the policy reports Accepted=True, requests with a wrong or missing key still return 200, and no x-user-id is injected. Until a fixed EG release is in place, apply the EnvoyPatchPolicy shown at the end of this section to enable the filter at the listener level. To verify on your cluster, send one request with no key after applying the SecurityPolicy; if it returns 200 instead of 401, the patch is required.

For machine consumers that cannot perform an OIDC flow, validate a static API key instead. There is no issuance service: the cluster administrator generates a random string per consumer, stores it in a Secret, and shares it out of band. The gateway's data plane validates each request by looking the presented value up in that Secret.

Generate one key per consumer and store them in a single Opaque Secret. Each data-map key is the client identifier that downstream policies see; each value is the API key the consumer presents:

kubectl -n <your-namespace> create secret generic maas-api-keys \
  --from-literal=alice="$(openssl rand -hex 32)" \
  --from-literal=ci-runner="$(openssl rand -hex 32)"

Bind the Secret to the route with a SecurityPolicy:

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: SecurityPolicy
metadata:
  name: maas-apikey-auth
  namespace: <your-namespace>
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      name: <aigatewayroute-name>  # HTTPRoute generated by your AIGatewayRoute
  apiKeyAuth:
    credentialRefs:
      - name: maas-api-keys     # Secret whose data keys are the client identifiers
    extractFrom:
      - headers:
          - X-API-Key           # dedicated header avoids the "Bearer " prefix problem of Authorization
    forwardClientIDHeader: x-user-id  # matched client identifier is injected as this header for downstream policies
    sanitize: true              # strip the raw API key from the request before it reaches the model backend
  • credentialRefs: one or more Opaque Secrets holding the credentials. Each data-map key is the client identifier, each value is the literal API key. Adding a consumer is a kubectl patch of one entry; revoking is a single key deletion.
  • extractFrom: where Envoy reads the presented key from. The filter does a literal-string compare, so prefer a dedicated header such as X-API-Key. Reusing Authorization requires storing the value with its Bearer prefix, which mixes badly with the OIDC path on the same gateway.
  • forwardClientIDHeader: the header that carries the matched client identifier to the upstream and to later filters. Use the same name as the OIDC claimToHeaders target (x-user-id) so token quotas and usage metering see one consistent key across both auth paths.
  • sanitize: prevents the raw API key from leaking to the model backend or being logged downstream.
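Adding and revoking a consumer, as described for credentialRefs, could look like the following sketch. The helper names add_consumer and revoke_consumer are hypothetical; the Secret name maas-api-keys matches the example above.

```shell
# Hypothetical helpers for managing entries in the maas-api-keys Secret.

# $1 = namespace, $2 = client identifier.
# Prints the generated key once so it can be shared out of band.
add_consumer() {
  local key
  key=$(openssl rand -hex 32)
  # stringData lets the API server handle base64 encoding of the value.
  kubectl -n "$1" patch secret maas-api-keys --type merge \
    -p "{\"stringData\":{\"$2\":\"$key\"}}" >/dev/null && printf '%s\n' "$key"
}

# $1 = namespace, $2 = client identifier to revoke.
revoke_consumer() {
  # A JSON patch "remove" deletes exactly one entry from the Secret's data map.
  kubectl -n "$1" patch secret maas-api-keys --type json \
    -p "[{\"op\":\"remove\",\"path\":\"/data/$2\"}]"
}
```

For example, add_consumer <your-namespace> bob prints the new key for bob, and revoke_consumer <your-namespace> bob removes it again.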

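Workaround for the EG ≤ v1.5.x enforcement gap (see the warning above). Apply this EnvoyPatchPolicy once per Gateway to enable the api_key_auth filter at the listener level. EnvoyPatchPolicy is disabled by default, so enable it first in the EnvoyGateway configuration (extensionApis.enableEnvoyPatchPolicy: true). The JSON path below targets the first HTTP filter on the listener (http_filters/0); if other filters precede api_key_auth on your Gateway, adjust the index. The patch is verified end-to-end on EG v1.5.0: with it in place, wrong or missing keys return 401 and forwardClientIDHeader injection works as documented. Remove it once you upgrade to a fixed EG release.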

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyPatchPolicy
metadata:
  name: enable-apikey-auth
  namespace: <gateway-namespace>   # must be the same namespace as the Gateway
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: <gateway-name>
  type: JSONPatch
  jsonPatches:
  - type: type.googleapis.com/envoy.config.listener.v3.Listener
    name: <gateway-namespace>/<gateway-name>/<listener-name>   # e.g. maas-system/ai-gw/http
    operation:
      op: replace
      path: /default_filter_chain/filters/0/typed_config/http_filters/0/disabled
      value: false

Confirm the patch took effect:

kubectl get envoypatchpolicy enable-apikey-auth -n <gateway-namespace> \
  -o jsonpath='{.status.ancestors[*].conditions[?(@.type=="Programmed")].status}'
# expect: True

Verification

Confirm the policy is accepted. SecurityPolicy status is ancestor-scoped, so the jsonpath looks one level deeper than for most resources:

kubectl get securitypolicy <policy-name> -n <your-namespace> \
  -o jsonpath='{.status.ancestors[*].conditions[?(@.type=="Accepted")].status}'

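The command returns True when the policy is accepted. For the API-key path on Envoy Gateway ≤ v1.5.x, Accepted=True does not by itself prove that keys are enforced (see the warning above), so also run the request checks that follow.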

For the OIDC path, send a request with a valid token and confirm the upstream service receives the x-user-id, x-user-group, and x-user-email headers.
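The OIDC request can be sketched as follows, assuming the policy reads the token from its default location, the Authorization header with a Bearer prefix. The helper name oidc_status is hypothetical, and <model> is a placeholder as elsewhere on this page. The function reports only the HTTP status; confirming the injected headers still requires inspecting what the upstream service receives.

```shell
# Hypothetical helper: send one chat completion request with a bearer token
# and print only the HTTP status code.
# $1 = gateway base URL (e.g. https://<gateway-host>), $2 = JWT
oidc_status() {
  curl -sS -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer $2" \
    -H 'Content-Type: application/json' \
    -d '{"model":"<model>","messages":[{"role":"user","content":"ping"}]}' \
    "$1/v1/chat/completions"
}
```

A valid token should yield 200 once the route and model are correct, and an expired or tampered token should yield 401.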

For the API-key path, send the matching X-API-Key and confirm the upstream sees x-user-id set to the matched client identifier:

curl -sS -H "X-API-Key: <alice-key>" \
  -H "Content-Type: application/json" \
  https://<gateway-host>/v1/chat/completions \
  -d '{"model":"<model>","messages":[{"role":"user","content":"ping"}]}'

A wrong or missing key returns 401 Unauthorized from the gateway before the request reaches any backend.
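The negative check can be scripted as a sketch like the one below, which sends a request with no key and reports whether the gateway rejected it. The helper name check_apikey_enforcement is hypothetical, and <model> is a placeholder.

```shell
# Hypothetical helper: verify that a request WITHOUT an API key is rejected.
# $1 = gateway base URL (e.g. https://<gateway-host>)
check_apikey_enforcement() {
  local code
  code=$(curl -sS -o /dev/null -w '%{http_code}' \
    -H 'Content-Type: application/json' \
    -d '{"model":"<model>","messages":[{"role":"user","content":"ping"}]}' \
    "$1/v1/chat/completions")
  if [ "$code" = "401" ]; then
    echo "enforced: request without a key was rejected (401)"
  else
    # Any other status means the key check did not stop the request.
    echo "NOT enforced: request without a key returned $code"
    return 1
  fi
}
```

If the function reports that enforcement is missing on Envoy Gateway ≤ v1.5.x, that is the symptom described in the warning for the API-key path.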

Learn More

Next Steps

After identity headers are propagated, see Configuring Token Quotas to enforce per-tenant token budgets.