Model Output vs. Historical Mean Deviation Detection System

Core Monitoring Metrics

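Output deviation rate: the relative deviation of the current output from the historical mean, computed as |current_output - historical_mean| / |historical_mean|. It is undefined when the historical mean is 0.
Standard-deviation multiple (Z-score): the number of historical standard deviations by which the output departs from the historical mean, computed as |current_output - historical_mean| / historical_std.
Sliding-window statistics: moving averages over 30-minute, 1-hour, and 24-hour windows serve as the historical baseline.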
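The three metrics above can be sketched with the standard library alone. This is a minimal illustration, not the system's actual implementation: the helper names (`deviation_rate`, `std_multiple`, `window_baselines`), the `WINDOWS` table, and the sample data are all assumptions made for the example, and returning `None` for an undefined ratio is just one possible convention.

```python
from datetime import datetime, timedelta
from statistics import mean, stdev

# Window lengths taken from the metric list; the dict itself is illustrative.
WINDOWS = {
    "30m": timedelta(minutes=30),
    "1h": timedelta(hours=1),
    "24h": timedelta(hours=24),
}

def deviation_rate(current_output, historical_mean):
    """Relative deviation from the baseline mean; None if the mean is zero."""
    if historical_mean == 0:
        return None
    return abs(current_output - historical_mean) / abs(historical_mean)

def std_multiple(current_output, historical_mean, historical_std):
    """Distance from the baseline mean in standard deviations; None if std is zero."""
    if historical_std == 0:
        return None
    return abs(current_output - historical_mean) / historical_std

def window_baselines(samples, now):
    """Compute (mean, sample std) per window from (timestamp, value) pairs.

    A window with fewer than two samples maps to None, because the sample
    standard deviation is not defined for it.
    """
    baselines = {}
    for name, length in WINDOWS.items():
        values = [v for ts, v in samples if now - length < ts <= now]
        baselines[name] = (mean(values), stdev(values)) if len(values) >= 2 else None
    return baselines

# Hypothetical samples, fixed in time so the result is reproducible.
now = datetime(2024, 1, 1, 12, 0, 0)
samples = [
    (now - timedelta(minutes=10), 1.0),
    (now - timedelta(minutes=20), 3.0),
    (now - timedelta(minutes=45), 5.0),
    (now - timedelta(hours=5), 7.0),
]
baselines = window_baselines(samples, now)
hour_mean, hour_std = baselines["1h"]
print(deviation_rate(6.0, hour_mean), std_multiple(6.0, hour_mean, hour_std))
```

Each window yields its own baseline, so the same output can look normal against the 24-hour mean while deviating sharply from the 30-minute mean. Which window drives alerting is a choice the list above leaves open.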

Implementation

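import pandas as pd
from datetime import datetime, timedelta

class OutputDeviationDetector:
    def __init__(self, window_size=3600):
        self.window_size = window_size  # window length in seconds (default: 1 hour)
        # Timestamp-indexed Series of outputs; an empty DataFrame has no column to write to
        self.history_data = pd.Series(dtype="float64", index=pd.DatetimeIndex([]))

    def update_history(self, timestamp, output_value):
        self.history_data.loc[timestamp] = output_value
        # Sliding-window filter: keep only samples newer than the cutoff
        cutoff_time = datetime.now() - timedelta(seconds=self.window_size)
        self.history_data = self.history_data[self.history_data.index > cutoff_time]

    def detect_anomaly(self, current_output, threshold=3.0):
        if len(self.history_data) < 10:  # not enough data for a stable baseline
            return False, 0.0

        historical_mean = self.history_data.mean()
        historical_std = self.history_data.std()  # sample standard deviation (ddof=1)

        if historical_std == 0:
            # Constant history: the Z-score is undefined, so flag any change at all
            changed = current_output != historical_mean
            return bool(changed), float("inf") if changed else 0.0

        # Number of standard deviations from the historical mean
        z_score = abs(current_output - historical_mean) / historical_std

        return bool(z_score > threshold), float(z_score)

# Usage example
detector = OutputDeviationDetector(window_size=3600)
# Update history: at least 10 samples inside the window are needed before detection runs
now = datetime.now()
recent_outputs = [0.85, 0.92, 0.88, 0.90, 0.87, 0.91, 0.86, 0.89, 0.93, 0.88]
for i, value in enumerate(recent_outputs):
    detector.update_history(now - timedelta(minutes=5 * i), value)
# Detect an anomaly
is_anomaly, score = detector.detect_anomaly(1.5, threshold=2.5)
print(f"Anomaly detected: {is_anomaly}, Z-score: {score:.2f}")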

Alerting Configuration

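Threshold: the default alert threshold is 3σ (a standard-deviation multiple of 3.0).
Alert levels:
- Minor deviation: 2.0 ≤ Z-score < 2.5
- Moderate deviation: 2.5 ≤ Z-score ≤ 3.5
- Severe deviation: Z-score > 3.5
Notification channels: alerts are delivered through Slack, DingTalk, and WeCom integrations.
Retry mechanism: a formal alert fires only after 3 consecutive checks detect an anomaly.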
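The alert levels and the three-consecutive-checks rule can be sketched as follows. The function and class names (`classify_severity`, `ConsecutiveAlertGate`) are hypothetical, and how the exact boundary values 2.5 and 3.5 are assigned is an assumption, since the ranges above share their endpoints.

```python
from typing import Optional

def classify_severity(z_score: float) -> Optional[str]:
    """Map a Z-score to an alert level; None means below the lowest level."""
    if z_score > 3.5:
        return "severe"
    if z_score >= 2.5:
        return "moderate"
    if z_score >= 2.0:
        return "minor"
    return None

class ConsecutiveAlertGate:
    """Fire only once `required` anomalous checks have occurred in a row."""

    def __init__(self, required: int = 3):
        self.required = required
        self.streak = 0  # number of consecutive anomalous checks so far

    def observe(self, is_anomaly: bool) -> bool:
        # Any normal check resets the streak to zero.
        self.streak = self.streak + 1 if is_anomaly else 0
        return self.streak >= self.required

# A normal check in the middle resets the count, so only the last check fires.
gate = ConsecutiveAlertGate(required=3)
checks = [True, True, False, True, True, True]
fired = [gate.observe(flag) for flag in checks]
print(fired)
print(classify_severity(2.7))
```

As written, the gate keeps returning `True` on every anomalous check after the third. A real notifier would probably suppress repeats or add a cool-down, which is outside what the configuration above specifies.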

Dashboard Configuration

Add the following metrics in Prometheus + Grafana:

# Model output value
model_output_value{model="xgboost"}

# Historical mean
model_history_mean{model="xgboost"}

# Anomaly detection score
model_deviation_score{model="xgboost"}
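To make the three metrics scrapeable, the service has to expose them in the Prometheus text exposition format. The sketch below renders that format by hand with the standard library, only to show what a scrape response would contain. The `render_metrics` helper, the help strings, and the choice of the gauge type are assumptions. A production service would more likely rely on a Prometheus client library than on hand-built strings.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, double quote, and newline as the text format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

def render_metrics(model, output_value, history_mean, deviation_score):
    """Render the three gauges as Prometheus text-format lines."""
    label = f'model="{escape_label_value(model)}"'
    metrics = [
        ("model_output_value", "Latest model output value", output_value),
        ("model_history_mean", "Mean of outputs in the history window", history_mean),
        ("model_deviation_score", "Z-score of the latest output", deviation_score),
    ]
    lines = []
    for name, help_text, value in metrics:
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name}{{{label}}} {float(value)}")
    return "\n".join(lines) + "\n"

# Hypothetical values, for illustration only.
print(render_metrics("xgboost", 0.85, 0.89, 1.2))
```

In Grafana, each metric can then be plotted with the selectors listed above, for example one panel overlaying `model_output_value` and `model_history_mean`, and a second panel showing `model_deviation_score` against the alert threshold.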
