All three models (local + OpenAI + Anthropic) work through the MaaS gateway
using the same sk-oai-* API key minted via the MaaS API.
Demo: External Model Routing with Istio ServiceEntry & DestinationRule
I didn't add the model listing to this validation, but you can see an example of the MaaS modifications required in egress-ai-gateway-poc/patches/maas-api-external-model-listing.patch. This patch adds ConfigMap-based external model listing to the MaaS API: it reads from an external-model-registry ConfigMap in the MaaS namespace and merges those models into the GET /v1/models response. I tested that a couple of weeks ago with ghcr.io/nerdalert/maas-api:external-models.
| Component | Version / Detail |
|---|---|
| Platform | OpenShift 4.20.6 (ROSA on AWS, us-east-1) |
| Istio | v1.29-latest via Sail Operator 1.29.0 |
| MaaS | Deployed via quickstart with sample model |
| Gateway | maas-default-gateway in openshift-ingress |
| Auth | Kuadrant AuthPolicy (MaaS API key -> provider key injection) |
export MAAS_HOST=$(kubectl get gateway maas-default-gateway -n openshift-ingress -o jsonpath='{.spec.listeners[0].hostname}')
export MAAS_URL=$(kubectl get gateway maas-default-gateway -n openshift-ingress -o jsonpath='{.status.addresses[0].value}')
export API_KEY=$(curl -sSk -X POST "https://${MAAS_HOST}/maas-api/v1/api-keys" -H "Authorization: Bearer $(oc whoami -t)" -H "Content-Type: application/json" -d '{"name":"test"}' | jq -r '.key')

curl -sk "https://${MAAS_HOST}/llm/facebook-opt-125m-simulated/v1/chat/completions" -H "Authorization: Bearer ${API_KEY}" -H "Content-Type: application/json" -d '{"model":"facebook/opt-125m","messages":[{"role":"user","content":"Hello"}],"max_tokens":10}'

{
"id": "chatcmpl-450d1d11-7c3d-5aae-a4a7-eea43ea5b250",
"model": "facebook/opt-125m",
"choices": [
{
"index": 0,
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": "To be "
}
}
],
"usage": {
"prompt_tokens": 1,
"completion_tokens": 2,
"total_tokens": 3
}
}

curl -s "http://${MAAS_URL}/external/openai/v1/chat/completions" -H "Host: ${MAAS_HOST}" -H "Authorization: Bearer ${API_KEY}" -H "Content-Type: application/json" -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Hello"}],"max_tokens":5,"temperature":0}'

{
"id": "chatcmpl-DJD5AMFiNeS3A6dcabz97xyLhJsUL",
"model": "gpt-4o-mini-2024-07-18",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! How can I"
},
"finish_reason": "length"
}
],
"usage": {
"prompt_tokens": 8,
"completion_tokens": 5,
"total_tokens": 13
},
"system_fingerprint": "fp_a1681c17ec"
}

curl -s "http://${MAAS_URL}/external/anthropic/v1/messages" -H "Host: ${MAAS_HOST}" -H "Authorization: Bearer ${API_KEY}" -H "Content-Type: application/json" -d '{"model":"claude-sonnet-4-20250514","messages":[{"role":"user","content":"Hello"}],"max_tokens":5}'

{
"model": "claude-sonnet-4-20250514",
"id": "msg_01KH2kuMRhB99ZMZ26mkXaiZ",
"content": [
{
"type": "text",
"text": "Hello! How are you"
}
],
"stop_reason": "max_tokens",
"usage": {
"input_tokens": 8,
"output_tokens": 5
}
}

| Model | Provider | Token | Response | Status |
|---|---|---|---|---|
| facebook/opt-125m | Local (KServe) | sk-oai-* | "To be " | PASS |
| gpt-4o-mini | OpenAI | sk-oai-* | "Hello! How can I" | PASS |
| claude-sonnet-4-20250514 | Anthropic | sk-oai-* | "Hello! How are you" | PASS |

All three models use the same MaaS API key. The client never provides the provider API key — the gateway injects it.
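
Since every check depends on the one minted key, it may be worth confirming that `API_KEY` looks sane before sending requests: `jq -r '.key'` prints the literal string `null` when the response has no `key` field, for example if minting was rejected. A minimal sketch, assuming the `sk-oai-` prefix seen in the results is stable; `is_maas_key` is a hypothetical helper name, not part of MaaS:

```shell
# Hypothetical helper (not part of MaaS): succeeds only if the argument
# starts with "sk-oai-" and has at least one character after the prefix.
is_maas_key() {
  case "$1" in
    sk-oai-?*) return 0 ;;
    *)         return 1 ;;
  esac
}
```

Typical use would be something like `is_maas_key "${API_KEY}" || echo "API key was not minted" >&2` right after the `export API_KEY=...` step.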
1. Client sends: Authorization: Bearer sk-oai-* (MaaS API key)
2. AuthPolicy: Validates API key via maas-api callback, then
   replaces Authorization with provider key
3. HTTPRoute: Matches path, rewrites URL, sets Host header
4. Istio resources: ExternalName Svc -> ServiceEntry -> DestinationRule (TLS)
5. Provider: Receives HTTPS request with valid credentials
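
To make steps 2 and 3 concrete, the sketch below prints the headers each provider would end up receiving after the gateway's rewrite. It is an illustration in plain shell, not the AuthPolicy or HTTPRoute configuration itself. `upstream_headers` is a hypothetical name, and the `anthropic-version` value is passed in as an argument because the value used in this setup is not stated here.

```shell
# Illustrative only: mimics the header rewrite performed by the gateway.
# Usage: upstream_headers PROVIDER PROVIDER_KEY [ANTHROPIC_VERSION]
# Prints the headers the provider would receive, one per line.
upstream_headers() {
  local provider="$1" key="$2" version="${3:-UNSPECIFIED}"
  case "$provider" in
    openai)
      # Host is set by the HTTPRoute; Authorization is replaced by AuthPolicy.
      printf 'Host: api.openai.com\n'
      printf 'Authorization: Bearer %s\n' "$key"
      ;;
    anthropic)
      # Anthropic expects the key in x-api-key plus a version header.
      printf 'Host: api.anthropic.com\n'
      printf 'x-api-key: %s\n' "$key"
      printf 'anthropic-version: %s\n' "$version"
      ;;
    *)
      printf 'unknown provider: %s\n' "$provider" >&2
      return 1
      ;;
  esac
}
```

In both cases the client's `sk-oai-*` key never appears in the output, which is the point of the flow: the provider only ever sees its own credential.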
| Aspect | Local | OpenAI | Anthropic |
|---|---|---|---|
| Endpoint | `/llm/.../v1/chat/completions` | `/external/openai/v1/chat/completions` | `/external/anthropic/v1/messages` |
| Protocol | HTTPS (in-cluster) | HTTP -> HTTPS (TLS origination) | HTTP -> HTTPS (TLS origination) |
| Auth injected | N/A (in-cluster) | `Authorization: Bearer <openai-key>` | `x-api-key: <anthropic-key>` |
| Extra headers | None | `Host: api.openai.com` | `Host: api.anthropic.com`, `anthropic-version` |
| Istio resources | None needed | ServiceEntry + DestinationRule + ExternalName Svc | Same |
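
The only client-visible difference between the three targets is the URL, so the validation commands can be folded into one wrapper. This is a rough sketch, assuming `MAAS_HOST`, `MAAS_URL`, and `API_KEY` are exported as in the setup commands; `maas_endpoint` and `maas_post` are hypothetical helper names, and the local path is the one used for the sample model in this validation.

```shell
# Hypothetical helper: map a target name to the URL used in this validation.
maas_endpoint() {
  case "$1" in
    local)     printf 'https://%s/llm/facebook-opt-125m-simulated/v1/chat/completions\n' "$MAAS_HOST" ;;
    openai)    printf 'http://%s/external/openai/v1/chat/completions\n' "$MAAS_URL" ;;
    anthropic) printf 'http://%s/external/anthropic/v1/messages\n' "$MAAS_URL" ;;
    *)         printf 'unknown target: %s\n' "$1" >&2; return 1 ;;
  esac
}

# Hypothetical wrapper: POST a JSON body to a target with the MaaS API key.
# Usage: maas_post TARGET JSON_BODY
maas_post() {
  local url
  url="$(maas_endpoint "$1")" || return 1
  # -k only matters for the HTTPS (local) URL; the Host header is needed
  # for the external routes, which are addressed by gateway address.
  curl -sk "$url" \
    -H "Host: ${MAAS_HOST}" \
    -H "Authorization: Bearer ${API_KEY}" \
    -H "Content-Type: application/json" \
    -d "$2"
}
```

For example, `maas_post openai '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Hello"}],"max_tokens":5}'` would send the same request as the OpenAI check, with the same `sk-oai-*` key used for all three targets.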
export MAAS_HOST=$(kubectl get gateway maas-default-gateway -n openshift-ingress -o jsonpath='{.spec.listeners[0].hostname}')
export MAAS_URL=$(kubectl get gateway maas-default-gateway -n openshift-ingress -o jsonpath='{.status.addresses[0].value}')
export API_KEY=$(curl -sSk -X POST "https://${MAAS_HOST}/maas-api/v1/api-keys" -H "Authorization: Bearer $(oc whoami -t)" -H "Content-Type: application/json" -d '{"name":"test"}' | jq -r '.key')
# 1. Local model (facebook/opt-125m)
curl -sk "https://${MAAS_HOST}/llm/facebook-opt-125m-simulated/v1/chat/completions" -H "Authorization: Bearer ${API_KEY}" -H "Content-Type: application/json" -d '{"model":"facebook/opt-125m","messages":[{"role":"user","content":"Hello"}],"max_tokens":10}'
{"id":"chatcmpl-5f547bf0-9f2a-5a75-a41b-2e254521f8b3","created":1773471237,"model":"facebook/opt-125m","usage":{"prompt_tokens":1,"completion_tokens":8,"total_tokens":9},"object":"chat.completion","kv_transfer_params":null,"choices":[{"index":0,"finish_reason":"stop","message":{"role":"assistant","content":"I am your AI assistant, how can "}}]}
# 2. OpenAI (gpt-4o-mini)
curl -s "http://${MAAS_URL}/external/openai/v1/chat/completions" -H "Host: ${MAAS_HOST}" -H "Authorization: Bearer ${API_KEY}" -H "Content-Type: application/json" -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Hello"}],"max_tokens":5}'
{
"id": "chatcmpl-DJDApznNYlh1O0GtTAqHVnJc4YKJ4",
"object": "chat.completion",
"created": 1773471243,
"model": "gpt-4o-mini-2024-07-18",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! How can I",
"refusal": null,
"annotations": []
},
"logprobs": null,
"finish_reason": "length"
}
],
"usage": {
"prompt_tokens": 8,
"completion_tokens": 5,
"total_tokens": 13,
"prompt_tokens_details": {
"cached_tokens": 0,
"audio_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 0,
"audio_tokens": 0,
"accepted_prediction_tokens": 0,
"rejected_prediction_tokens": 0
}
},
"service_tier": "default",
"system_fingerprint": "fp_a1681c17ec"
}
# 3. Anthropic (claude-sonnet-4-20250514)
curl -s "http://${MAAS_URL}/external/anthropic/v1/messages" -H "Host: ${MAAS_HOST}" -H "Authorization: Bearer ${API_KEY}" -H "Content-Type: application/json" -d '{"model":"claude-sonnet-4-20250514","messages":[{"role":"user","content":"Hello"}],"max_tokens":5}'
{"model":"claude-sonnet-4-20250514","id":"msg_014kmQoaKKKLhKTM9D3jD3Dx","type":"message","role":"assistant","content":[{"type":"text","text":"Hello! How are you"}],"stop_reason":"max_tokens","stop_sequence":null,"usage":{"input_tokens":8,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"cache_creation":{"ephemeral_5m_input_tokens":0,"ephemeral_1h_input_tokens":0},"output_tokens":5,"service_tier":"standard","inference_geo":"not_available"}}