Sharing an Automated Test Script for Large Language Models

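I have seen a lot of discussion about LLM testing in the community lately, so I want to share an automated test script that I put together after running into a few pitfalls myself.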

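Background: Our team needs to run consistency tests across several open-source LLMs, and manual testing was far too slow. So I wrote a small Python-based automated testing framework.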

Core code:

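import requests


class ModelTester:
    """Minimal client for testing a /v1/completions endpoint."""

    def __init__(self, base_url):
        # Strip a trailing slash so the request URL never contains "//".
        self.base_url = base_url.rstrip("/")

    def test_completion(self, prompt, max_tokens=100):
        """Send one completion request and return the parsed JSON body."""
        payload = {
            "prompt": prompt,
            "max_tokens": max_tokens
        }
        response = requests.post(
            f"{self.base_url}/v1/completions",
            json=payload,
            timeout=30  # seconds; without it a stalled server hangs the run
        )
        # Raise requests.HTTPError on 4xx/5xx instead of parsing an error body.
        response.raise_for_status()
        return response.json()

    def validate_response(self, response):
        """Raise ValueError unless the response has a non-empty 'choices' list."""
        if 'choices' not in response:
            raise ValueError('Invalid response format')
        if not response['choices']:
            raise ValueError('No choices returned')
        return True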

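Pitfalls I ran into:

On the first run I forgot to set a timeout, so the test hung indefinitely.
There was no response format validation, so empty responses were not handled correctly.
I forgot to handle network exceptions; the request should be wrapped in try/except.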
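
For the last pitfall, here is a minimal sketch of one way to keep a single failed request from aborting the whole run. The helper name run_case and the status strings are hypothetical, not part of the script above. It relies on the fact that requests.exceptions.RequestException subclasses IOError (an alias of OSError), so it catches OSError rather than importing requests itself.

```python
def run_case(send, validate):
    """Run one test case and report the outcome instead of raising.

    send:     zero-argument callable that performs the request and
              returns the parsed JSON body
    validate: callable that raises ValueError on a malformed response
    """
    try:
        data = send()
        validate(data)
    except ValueError as exc:
        # Raised by validate(), and by JSON decoding when the body
        # is not valid JSON. Checked first on purpose: newer versions
        # of requests raise a JSON error that is also an OSError.
        return {"status": "bad_response", "error": str(exc)}
    except OSError as exc:
        # Timeouts, refused connections and HTTP errors from requests
        # all derive from IOError/OSError and land here.
        return {"status": "network_error", "error": str(exc)}
    return {"status": "ok", "data": data}
```

With the class above, a call might look like run_case(lambda: tester.test_completion("Hello"), tester.validate_response), where tester is a ModelTester instance. Returning a status dict rather than re-raising lets a loop over many prompts finish and summarize failures at the end.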

How to use it:

Install the dependency: pip install requests
Change base_url to the address of your model service
Run the tests: python test_model.py
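
If you would rather not edit the file every time the service address changes, base_url can be passed on the command line. This is only a sketch: the flag names, the default address http://localhost:8000, and the fallback prompt "Hello" are my assumptions, not something the script above defines.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line options for a test run (hypothetical flags)."""
    parser = argparse.ArgumentParser(
        description="Run completion tests against a model service."
    )
    parser.add_argument(
        "--base-url",
        default="http://localhost:8000",  # assumed default, adjust as needed
        help="address of the model service",
    )
    parser.add_argument(
        "--prompt",
        action="append",
        dest="prompts",
        help="prompt to send; repeat the flag for several prompts",
    )
    parser.add_argument("--max-tokens", type=int, default=100)
    args = parser.parse_args(argv)

    # Fall back to a single placeholder prompt when none is given.
    if not args.prompts:
        args.prompts = ["Hello"]
    # Drop a trailing slash so "/v1/completions" can be appended safely.
    args.base_url = args.base_url.rstrip("/")
    return args
```

Inside test_model.py you could then call args = parse_args() and build ModelTester(args.base_url), so a run becomes python test_model.py --base-url http://your-host:port --prompt "Hello".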

Feel free to share your own testing experience in the comments!
