Inference Acceleration Compared: PyTorch vs TensorFlow Lite vs ONNX Runtime
Experimental Setup
- Python 3.9
- PyTorch 2.0
- TensorFlow Lite 2.13
- ONNX Runtime 1.15
- Test model: ResNet50 (224x224 input)
Model Conversion and Deployment
Native PyTorch Inference
import torch

# Load a pretrained ResNet50 and switch to inference mode
model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet50', pretrained=True)
model.eval()

# Trace the model with an example input and save the TorchScript module
example = torch.rand(1, 3, 224, 224)
traced_model = torch.jit.trace(model, example)
torch.jit.save(traced_model, "resnet50_traced.pt")
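Before benchmarking, it is worth verifying that the traced module reproduces the eager model's output. The sketch below uses a small hypothetical stand-in network instead of ResNet50 so the check runs without downloading weights; the same pattern applies to `traced_model` above.

```python
import torch
import torch.nn as nn

# Small stand-in model (hypothetical) so the check runs without
# downloading ResNet50 weights; the pattern is identical for ResNet50.
model = nn.Sequential(
    nn.Conv2d(3, 8, 3), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10),
)
model.eval()

example = torch.rand(1, 3, 224, 224)
traced = torch.jit.trace(model, example)

with torch.no_grad():
    eager_out = model(example)
    traced_out = traced(example)

# Tracing should preserve numerics for inputs shaped like the example.
print(torch.allclose(eager_out, traced_out, atol=1e-6))
```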
ONNX Export
import torch.onnx

# Export the model to ONNX using the same example input
torch.onnx.export(model, example, "resnet50.onnx",
                  export_params=True, opset_version=11)
TensorFlow Lite Conversion
import tensorflow as tf

# Assumes a TensorFlow SavedModel of ResNet50 already exists at
# 'resnet50_tf' (e.g. produced via an ONNX-to-TF conversion step)
converter = tf.lite.TFLiteConverter.from_saved_model('resnet50_tf')
tflite_model = converter.convert()
with open("resnet50.tflite", "wb") as f:
    f.write(tflite_model)
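Running the converted model uses `tf.lite.Interpreter`. The sketch below converts a tiny hypothetical Keras model in memory (so it is self-contained); loading `resnet50.tflite` from disk works the same way with `tf.lite.Interpreter(model_path=...)`.

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in Keras model (hypothetical); the real resnet50.tflite
# would be loaded with tf.lite.Interpreter(model_path="resnet50.tflite").
keras_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),
    tf.keras.layers.Conv2D(4, 3),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(5),
])
tflite_model = tf.lite.TFLiteConverter.from_keras_model(keras_model).convert()

# Allocate tensors, feed one input, and read back the output
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

x = np.random.rand(1, 224, 224, 3).astype(np.float32)
interpreter.set_tensor(inp["index"], x)
interpreter.invoke()
result = interpreter.get_tensor(out["index"])
print(result.shape)  # (1, 5)
```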
Performance Testing
Benchmarked with torchbench; average latency over 100 inference runs:
- PyTorch JIT Trace: 12.3ms
- ONNX Runtime (CPU): 15.7ms
- TensorFlow Lite: 18.2ms
- PyTorch Eager Mode: 25.1ms
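The numbers above can be reproduced with a simple timing harness: discard some warm-up iterations (to exclude JIT compilation and cache effects), then average over 100 timed runs. This is a generic sketch, not the torchbench setup itself.

```python
import time

def benchmark(fn, warmup=10, runs=100):
    """Average latency of fn() in ms over `runs` calls,
    after `warmup` untimed calls."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs * 1000.0

# Trivial workload for illustration; in practice you would pass
# e.g. lambda: traced_model(example) for each backend under test.
avg_ms = benchmark(lambda: sum(range(10000)))
print(f"{avg_ms:.3f} ms")
```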
Conclusion
In this test, PyTorch JIT Trace delivered the lowest latency, clearly outperforming both the other frameworks and PyTorch eager mode. ONNX Runtime applies its own graph optimizations, but showed higher latency here, apparently constrained by CPU scheduling overhead. TensorFlow Lite excels on mobile devices but lags somewhat in server-side inference scenarios.
Recommendation: prefer the PyTorch JIT Trace path in production; consider ONNX Runtime when cross-platform compatibility is required.

Discussion