Optimizing TensorFlow Model Serving Deployment on Kubernetes

GentleBird · 2025-12-24T07:01:19 · TensorFlow · Kubernetes · Serving

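While putting a TensorFlow Serving microservice architecture into practice, we used Kubernetes to containerize the model service and scale it elastically. This article shares the key configuration and optimizations from the actual deployment.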

Containerized Deployment with Docker

First, create a Dockerfile to containerize the model service:

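FROM tensorflow/serving:latest
# The model directory must contain numeric version subdirectories (e.g. model/1/)
COPY model /models/model
# The base image's entrypoint script serves /models/${MODEL_NAME}
ENV MODEL_NAME=model
# REST API port (gRPC listens on 8500)
EXPOSE 8501
# No ENTRYPOINT override: a bare tensorflow_model_server would start without
# --model_name, --model_base_path, or --rest_api_port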

Build and push the image:

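# Build the image
sudo docker build -t my-tf-serving:latest .
# Tag it for the image registry, then push
sudo docker tag my-tf-serving:latest registry.example.com/my-tf-serving:latest
sudo docker push registry.example.com/my-tf-serving:latest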

Kubernetes Deployment Configuration

Create the Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: tf-serving-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: tf-serving
  template:
    metadata:
      labels:
        app: tf-serving
    spec:
      containers:
      - name: tensorflow-serving
        image: registry.example.com/my-tf-serving:latest
        ports:
        - containerPort: 8501
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
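
The Deployment above does not tell Kubernetes when a replica is ready to receive traffic. A minimal sketch of health probes follows, assuming the model is served under the name "model" (as set by MODEL_NAME) and that the REST model status endpoint is reachable on port 8501. The timing values are illustrative assumptions rather than tuned recommendations.

```yaml
# Fragment to place under the tensorflow-serving container entry,
# at the same indentation level as "ports" and "resources".
readinessProbe:
  httpGet:
    path: /v1/models/model     # model status endpoint; "model" matches MODEL_NAME
    port: 8501
  initialDelaySeconds: 10      # assumed: allow time for the model to load
  periodSeconds: 5
livenessProbe:
  tcpSocket:
    port: 8501                 # restart the container if the REST port stops accepting connections
  initialDelaySeconds: 30      # assumed value
  periodSeconds: 10
```

With a readiness probe in place, the Service only routes requests to pods whose model has finished loading, which matters most during rollouts and scale-up.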

Configure a Service for load balancing:

apiVersion: v1
kind: Service
metadata:
  name: tf-serving-service
spec:
  selector:
    app: tf-serving
  ports:
  - port: 80
    targetPort: 8501
  type: LoadBalancer

Load Balancing Optimization

Use an Ingress controller for HTTP routing:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: tf-serving-ingress
  annotations:
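    # Rewrite to TensorFlow Serving's REST predict endpoint (model name "model");
    # a target of "/" would not reach any TensorFlow Serving route
    nginx.ingress.kubernetes.io/rewrite-target: /v1/models/model:predict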
spec:
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /model/predict
        pathType: Prefix
        backend:
          service:
            name: tf-serving-service
            port:
              number: 80
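
When the Ingress is the external entry point, the Service behind it may not need a cloud load balancer of its own. A possible variant is sketched below, assuming an NGINX Ingress controller is already installed in the cluster and that no client needs to reach the Service directly from outside. Whether this fits depends on the environment.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: tf-serving-service
spec:
  type: ClusterIP        # internal only; external traffic arrives through the Ingress
  selector:
    app: tf-serving
  ports:
  - port: 80             # port the Ingress backend refers to
    targetPort: 8501     # TensorFlow Serving REST port in the container
```

The Ingress backend keeps pointing at tf-serving-service on port 80, so only the Service type changes.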

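With the configuration above, the TensorFlow model service runs as three replicas behind a load-balanced Service and an Ingress, which gives a highly available deployment and noticeably improves service stability. Automatic scaling additionally requires a HorizontalPodAutoscaler targeting the Deployment.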
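
Automatic scaling in Kubernetes is usually driven by a HorizontalPodAutoscaler. A minimal sketch follows, assuming a metrics source such as metrics-server is installed in the cluster. The object name, the replica bounds, and the 70% CPU target are illustrative assumptions, not values from the deployment described here.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: tf-serving-hpa              # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: tf-serving-deployment     # the Deployment defined earlier
  minReplicas: 3                    # matches the Deployment's replica count
  maxReplicas: 10                   # assumed upper bound
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70      # percent of the container's CPU request (250m)
```

Utilization is measured against the CPU request, so the request value in the Deployment directly affects when scaling kicks in.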

Discussion

Adam722 · 2026-01-08T10:24:58
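When deploying a TensorFlow model service on K8s, don't focus only on tuning the replica count; getting resource requests and limits right is what really matters. I originally set the CPU request to 0.5 cores and the pods were OOM-killed at peak load. After changing it to 1 core with limits in place, things became much more stable.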
Yvonne480 · 2026-01-08T10:24:58
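Don't leave the Service at the default ClusterIP type; expose it explicitly through a nodePort or an Ingress, especially in production. I have repeatedly run into model services that could not be reached, and the cause turned out to be a Service that did not expose the correct port.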