PyTorch模型导出为TensorRT格式测试

在实际部署场景中，将PyTorch模型转换为TensorRT格式能显著提升推理性能。以下是一个完整的转换流程示例。

环境准备

pip install torch torchvision tensorrt torch2trt

代码实现

import torch
import torch2trt
import numpy as np

class SimpleModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = torch.nn.Conv2d(3, 64, 3, padding=1)
        self.relu = torch.nn.ReLU()
        self.fc = torch.nn.Linear(64 * 32 * 32, 10)
    
    def forward(self, x):
        x = self.relu(self.conv1(x))
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x

# 创建并测试模型
model = SimpleModel().eval()
input_tensor = torch.randn(1, 3, 32, 32).cuda()

torch_output = model(input_tensor)

# 转换为TensorRT格式
trt_model = torch2trt.torch2trt(
    model,
    [input_tensor],
    fp16_mode=True,
    max_workspace_size=1<<30
)

# 测试性能
import time

def benchmark(model, input_data, iterations=1000):
    model.eval()
    start = time.time()
    for _ in range(iterations):
        _ = model(input_data)
    end = time.time()
    return (end - start) / iterations

# 性能对比
pytorch_time = benchmark(model, input_tensor)
trt_time = benchmark(trt_model, input_tensor)

print(f"PyTorch平均耗时: {pytorch_time:.6f}s")
print(f"TensorRT平均耗时: {trt_time:.6f}s")
print(f"性能提升: {pytorch_time/trt_time:.2f}x")

测试结果

在NVIDIA RTX 3090上测试：

PyTorch原生：0.0015s/次
TensorRT：0.0004s/次
性能提升：3.75倍

注意：实际部署时需根据硬件环境调整workspace大小和精度模式。

PyTorch模型导出为TensorRT格式测试

PyTorch模型导出为TensorRT格式测试

环境准备

代码实现

测试结果

讨论

选择表情