Configuring Kubernetes Ingress Rules for TensorFlow Serving
In a TensorFlow Serving microservice architecture, the Ingress is the traffic entry point; it must be configured precisely to achieve load balancing and a highly available deployment.
Setting Up the Environment
First, create the TensorFlow Serving Deployment and Service:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tensorflow-serving
spec:
  replicas: 3
  selector:
    matchLabels:
      app: tensorflow-serving
  template:
    metadata:
      labels:
        app: tensorflow-serving
    spec:
      containers:
      - name: tensorflow-serving
        image: tensorflow/serving:latest
        ports:
        - containerPort: 8501  # REST API
        - containerPort: 8500  # gRPC
---
apiVersion: v1
kind: Service
metadata:
  name: tensorflow-service
spec:
  selector:
    app: tensorflow-serving
  ports:
  - port: 8501
    targetPort: 8501
```
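If gRPC clients also need access, the Service can expose both container ports under distinct names. This is a hedged variant of the manifest above, not a required change; the Ingress below only routes the REST port:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: tensorflow-service
spec:
  selector:
    app: tensorflow-serving
  ports:
  - name: rest
    port: 8501
    targetPort: 8501
  - name: grpc
    port: 8500
    targetPort: 8500
```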
Ingress Rule Configuration
Configure the Ingress resource to route requests and load-balance them across the replicas:
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: tensorflow-ingress
  annotations:
    nginx.ingress.kubernetes.io/upstream-hash-by: "$request_uri"
    nginx.ingress.kubernetes.io/proxy-body-size: "100m"
spec:
  ingressClassName: nginx
  rules:
  - host: model.example.com
    http:
      paths:
      # TensorFlow Serving's REST API lives under /v1/ (for example
      # /v1/models/model:predict), so pass that prefix through unmodified.
      # A rewrite-target of "/" would strip the path and break the API.
      - path: /v1
        pathType: Prefix
        backend:
          service:
            name: tensorflow-service
            port:
              number: 8501
```
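Once DNS for the Ingress host resolves, predictions can be requested through TF Serving's standard REST API. A minimal Python client sketch; the host model.example.com and model name model follow this article's manifests, and predict_request/predict are illustrative helpers, not part of any library:

```python
import json
import urllib.request

def predict_request(host, model_name, instances):
    """Build the URL and JSON body for a TF Serving REST :predict call."""
    url = f"http://{host}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return url, body

def predict(host, model_name, instances):
    """Send the request; needs a reachable cluster, so it is not run here."""
    url, body = predict_request(host, model_name, instances)
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["predictions"]
```

The session-affinity annotation above hashes on `$request_uri`, so repeated calls to the same path land on the same replica.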
Docker Containerization Notes
In the Dockerfile, make sure the ports are exposed and the startup command is correct:
```dockerfile
FROM tensorflow/serving:latest
COPY model /models/model
EXPOSE 8501 8500
ENTRYPOINT ["tensorflow_model_server"]
# Overriding ENTRYPOINT bypasses the base image's startup script, so the
# gRPC and REST ports must be set explicitly (REST is disabled by default).
CMD ["--port=8500", "--rest_api_port=8501", "--model_name=model", "--model_base_path=/models/model"]
```
Load Balancing Configuration
The Ingress load-balancing settings distribute requests across the TensorFlow Serving replicas:
- Use nginx.ingress.kubernetes.io/upstream-hash-by for session affinity
- Set proxy-body-size to accommodate large inference request bodies
- Use the /v1/models/model endpoint as a health check to verify service status
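The health-check endpoint can also be wired into the Deployment as a readiness probe, so traffic only reaches replicas whose model has finished loading. A sketch, assuming the model name model used throughout this article:

```yaml
# Add under the tensorflow-serving container in the Deployment spec:
readinessProbe:
  httpGet:
    path: /v1/models/model   # returns the model's version status once loaded
    port: 8501
  initialDelaySeconds: 10
  periodSeconds: 5
```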
Verify that the configuration took effect:

```shell
kubectl get ingress tensorflow-ingress
kubectl describe ingress tensorflow-ingress
```
