This tutorial will guide you through setting up an AI and MCP Gateway using KubeLB with KGateway to securely manage Large Language Model (LLM) requests and MCP tool servers.
KubeLB leverages KGateway, a CNCF Sandbox project (accepted March 2025), to provide advanced AI Gateway capabilities. KGateway is built on Envoy and implements the Kubernetes Gateway API specification.

KGateway supports the Gateway API Inference Extension, which introduces:

- InferenceModel CRD: Define LLM models and their endpoints
- InferencePool CRD: Group models for load balancing and failover

Update `values.yaml` for the KubeLB manager chart to enable KGateway with AI capabilities:
```yaml
kubelb:
  enableGatewayAPI: true

kubelb-addons:
  enabled: true
  kgateway:
    enabled: true
    gateway:
      aiExtension:
        enabled: true
```
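With these values in place, apply them to your KubeLB manager release. The release and chart names below are illustrative; substitute the ones used by your installation:

```shell
# Hypothetical release/chart names; adjust to your environment.
helm upgrade --install kubelb-manager kubelb/kubelb-manager \
  --namespace kubelb --create-namespace \
  -f values.yaml
```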
Create a Gateway that uses the `kgateway` GatewayClass:

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: ai-gateway
  namespace: kubelb
  labels:
    app: ai-gateway
spec:
  gatewayClassName: kgateway
  infrastructure:
    parametersRef:
      name: ai-gateway
      group: gateway.kgateway.dev
      kind: GatewayParameters
  listeners:
  - protocol: HTTP
    port: 8080
    name: http
    allowedRoutes:
      namespaces:
        from: All
```
Configure the AI extension through the referenced GatewayParameters:

```yaml
apiVersion: gateway.kgateway.dev/v1alpha1
kind: GatewayParameters
metadata:
  name: ai-gateway
  namespace: kubelb
  labels:
    app: ai-gateway
spec:
  kube:
    aiExtension:
      enabled: true
      ports:
      - name: ai-monitoring
        containerPort: 9092
      image:
        registry: cr.kgateway.dev/kgateway-dev
        repository: kgateway-ai-extension
        tag: v2.1.0-main
    service:
      type: LoadBalancer
```
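Once both manifests are ready (saved together as `ai-gateway.yaml` here, a filename assumed for illustration), apply them and wait for the Gateway to be programmed:

```shell
# "ai-gateway.yaml" is a hypothetical filename containing the two manifests above.
kubectl apply -f ai-gateway.yaml
# The Gateway API sets the Programmed condition once the data plane is ready.
kubectl wait --for=condition=Programmed gateway/ai-gateway -n kubelb --timeout=120s
```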
This example shows how to set up secure access to OpenAI through the AI Gateway.
Create a Kubernetes secret with your OpenAI API key:
```shell
export OPENAI_API_KEY="sk-..."
kubectl create secret generic openai-secret \
  --from-literal=Authorization="Bearer ${OPENAI_API_KEY}" \
  --namespace kubelb
```
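Note that the secret value includes the `Bearer ` prefix, since the gateway forwards it verbatim as the `Authorization` header. You can preview locally what Kubernetes will store (the base64 encoding of the literal); the key below is a placeholder, not a real credential:

```shell
# Placeholder key for illustration only.
OPENAI_API_KEY="sk-example"
# Kubernetes stores secret values base64-encoded; this shows the stored form.
printf 'Bearer %s' "${OPENAI_API_KEY}" | base64
```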
Define an AI Backend that uses the secret for authentication:
```yaml
apiVersion: gateway.kgateway.dev/v1alpha1
kind: Backend
metadata:
  name: openai
  namespace: kubelb
spec:
  type: AI
  ai:
    llm:
      provider:
        openai:
          authToken:
            kind: SecretRef
            secretRef:
              name: openai-secret
              namespace: kubelb
          model: "gpt-3.5-turbo"
```
Route traffic to the OpenAI backend:
```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: openai-route
  namespace: kubelb
spec:
  parentRefs:
  - name: ai-gateway
    namespace: kubelb
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /openai
    filters:
    - type: URLRewrite
      urlRewrite:
        path:
          type: ReplaceFullPath
          replaceFullPath: /v1/chat/completions
    backendRefs:
    - name: openai
      namespace: kubelb
      group: gateway.kgateway.dev
      kind: Backend
```
Get the Gateway’s external IP:
```shell
kubectl get gateway ai-gateway -n kubelb
export GATEWAY_IP=$(kubectl get svc -n kubelb ai-gateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
```
Send a test request:
```shell
curl -X POST "http://${GATEWAY_IP}/openai" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ]
  }'
```
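The gateway returns the standard OpenAI chat-completions response shape. As a small sketch, you can extract the assistant's reply without extra tooling like `jq`; the response body below is a canned example, not real model output:

```shell
# Canned example of an OpenAI-style response body (not real model output).
RESPONSE='{"choices":[{"message":{"role":"assistant","content":"Hi there!"}}]}'
# Pull out the assistant message using python3's standard json module.
echo "$RESPONSE" | python3 -c 'import json,sys; print(json.load(sys.stdin)["choices"][0]["message"]["content"])'
```

In practice you would pipe the `curl` output into the same `python3` one-liner.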
Add rate limiting to control costs and prevent abuse:
```yaml
apiVersion: gateway.kgateway.dev/v1alpha1
kind: RateLimitPolicy
metadata:
  name: openai-ratelimit
  namespace: kubelb
spec:
  targetRef:
    kind: HTTPRoute
    name: openai-route
    namespace: kubelb
  limits:
  - requests: 100
    unit: hour
```
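As a quick sanity check on the policy above: a limit of 100 requests per hour means a well-behaved client should average no more than one request every 36 seconds (3600 s / 100):

```shell
LIMIT=100   # requests allowed per window (from the policy above)
WINDOW=3600 # window length in seconds (one hour)
echo "average budget: one request every $((WINDOW / LIMIT))s"
```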
Similar to the AI Gateway, you can also use agentgateway to connect to one or more MCP servers in any environment.
Please follow this guide to set up the MCP Gateway: MCP Gateway
For advanced configurations and features, refer to the KGateway documentation.