Input Data Quality Monitoring System for Machine Learning Models
Core Monitoring Metrics Configuration
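1. Data distribution stability
- Baseline distribution: use the feature distributions of the training set as the reference
- KS statistic: ks_2samp(current_data, baseline_data)
- Alert threshold: 0.95 on the KS statistic; an alert fires when the statistic exceeds it. The KS statistic is a distance in [0, 1] (0 means the two empirical distributions coincide), not a similarity score, so 0.95 fires only on near-total separation and a lower threshold is usually needed in practice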
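As a minimal sketch of the check described above, the following compares each shared numeric feature against the baseline with scipy.stats.ks_2samp and reports the statistic and p-value per feature. The helper name ks_report, the synthetic data, and the 0.1 cutoff used in the example are illustrative assumptions, not part of the original configuration.

```python
import numpy as np
import pandas as pd
from scipy import stats


def ks_report(baseline_df, current_df):
    """Per-feature two-sample KS test (hypothetical helper).

    Returns a DataFrame indexed by feature with the KS statistic
    (0 = identical empirical distributions, 1 = fully separated)
    and the p-value of the test.
    """
    records = []
    for col in baseline_df.select_dtypes(include="number").columns:
        if col not in current_df.columns:
            continue
        base = baseline_df[col].dropna()
        cur = current_df[col].dropna()
        if base.empty or cur.empty:
            continue  # ks_2samp cannot compare an empty sample
        result = stats.ks_2samp(base, cur)
        records.append(
            {"feature": col, "ks_statistic": result.statistic, "p_value": result.pvalue}
        )
    columns = ["feature", "ks_statistic", "p_value"]
    return pd.DataFrame(records, columns=columns).set_index("feature")


# Synthetic example: feature "x" is unchanged, feature "y" is shifted by 3 sigma
rng = np.random.default_rng(0)
baseline = pd.DataFrame({"x": rng.normal(0, 1, 2000), "y": rng.normal(0, 1, 2000)})
current = pd.DataFrame({"x": rng.normal(0, 1, 2000), "y": rng.normal(3, 1, 2000)})

report = ks_report(baseline, current)
# 0.1 is an assumed cutoff chosen only for this illustration
drifted = report.index[report["ks_statistic"] > 0.1].tolist()
```

With these synthetic inputs only y should be reported as drifted. A shift of three standard deviations between two unit-variance normal distributions corresponds to a theoretical KS statistic of about 0.87, which is still under the 0.95 threshold configured in this document, so a per-feature threshold well below 0.95 (or a p-value cutoff) may be worth considering.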
2. Data completeness monitoring
- Missing value ratio: missing_ratio = missing_count / total_count
- Allowed threshold: alert when a feature's missing rate exceeds 30%
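3. Outlier detection
- Z-score method: flag data points with |z_score| > 3
- IQR method: values inside Q1 - 1.5*IQR < value < Q3 + 1.5*IQR are treated as normal; values outside this range are flagged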
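The two outlier rules above can be sketched as follows. This is only an illustration under assumptions: the function names zscore_outliers and iqr_outliers and the synthetic sample are hypothetical, the z-score uses the population standard deviation, and points lying exactly on an IQR fence are treated as normal.

```python
import numpy as np
import pandas as pd


def zscore_outliers(values, threshold=3.0):
    """Boolean mask of points whose |z-score| exceeds the threshold."""
    values = pd.Series(values, dtype="float64")
    std = values.std(ddof=0)
    if std == 0 or np.isnan(std):
        # Constant or empty series: no point can be flagged
        return pd.Series(False, index=values.index)
    z = (values - values.mean()) / std
    return z.abs() > threshold


def iqr_outliers(values, k=1.5):
    """Boolean mask of points outside the range Q1 - k*IQR .. Q3 + k*IQR."""
    values = pd.Series(values, dtype="float64")
    q1, q3 = values.quantile(0.25), values.quantile(0.75)
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)


# Synthetic example: 200 well-behaved points plus one extreme value at index 200
rng = np.random.default_rng(42)
sample = pd.Series(np.append(rng.normal(10.0, 1.0, 200), 50.0))

z_mask = zscore_outliers(sample)
iqr_mask = iqr_outliers(sample)
flagged_z = z_mask[z_mask].index.tolist()
```

Both masks flag the injected value at index 200. The IQR rule may also flag a few ordinary tail points, since its fences sit at roughly 2.7 standard deviations for normally distributed data, while the z-score rule is itself distorted by large outliers because they inflate the mean and standard deviation.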
Implementation steps
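import numpy as np
import pandas as pd
from scipy import stats


# Configure monitoring parameters
class DataQualityMonitor:
    def __init__(self, baseline_df):
        self.baseline = baseline_df
        self.thresholds = {
            'ks_threshold': 0.95,
            'missing_threshold': 0.30,
            'zscore_threshold': 3.0
        }

    def check_distribution_stability(self, current_df):
        """Return True when the mean KS statistic is below the threshold (stable)."""
        ks_scores = []
        # ks_2samp needs numeric samples, so non-numeric columns are skipped
        for col in self.baseline.select_dtypes(include='number').columns:
            if col in current_df.columns:
                ks = stats.ks_2samp(
                    self.baseline[col].dropna(),
                    current_df[col].dropna()
                ).statistic
                ks_scores.append(ks)
        if not ks_scores:
            # No shared numeric columns to compare: report as not stable
            return False
        return np.mean(ks_scores) < self.thresholds['ks_threshold']

    def check_missing_values(self, current_df):
        """Return True when any column's missing ratio exceeds the threshold (alert)."""
        missing_ratio = current_df.isnull().sum() / len(current_df)
        return (missing_ratio > self.thresholds['missing_threshold']).any()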
# Alert configuration
ALERT_CONFIG = {
    'email': 'devops@company.com',
    'slack_channel': '#ml-monitoring',
    'severity': 'high'
}
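Steps to reproduce
- Initialize the monitor: monitor = DataQualityMonitor(training_data)
- Run the checks every hour: monitor.check_distribution_stability(new_batch)
- Configure alerting: send an email notification when any metric is abnormal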
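One way the steps above might be wired together is sketched below. The function run_quality_checks, the notify callable, and the stand-in monitor class are all hypothetical; the function only assumes an object exposing the two check methods of DataQualityMonitor, where check_distribution_stability returns True for a stable batch and check_missing_values returns True when a threshold is exceeded. Scheduling the hourly run and actually sending email are left to whatever scheduler and mail client are in use.

```python
def run_quality_checks(monitor, batch, notify):
    """Run both checks on a batch and notify once if any metric is abnormal.

    `notify` is a hypothetical callable (for example an email or Slack
    sender) that receives the list of failed check names.
    """
    failed = []
    # check_distribution_stability returns True when the batch is stable
    if not monitor.check_distribution_stability(batch):
        failed.append("distribution_stability")
    # check_missing_values returns True when a threshold is exceeded
    if monitor.check_missing_values(batch):
        failed.append("missing_values")
    if failed:
        notify(failed)
    return failed


class _DriftedMonitor:
    """Stand-in used only to exercise the function in this sketch."""

    def check_distribution_stability(self, batch):
        return False  # pretend the batch has drifted

    def check_missing_values(self, batch):
        return False  # pretend missing rates are within limits


sent = []  # collects notifications instead of sending real email
failed = run_quality_checks(_DriftedMonitor(), batch=None, notify=sent.append)
```

Because the two methods use opposite conventions for True, keeping the interpretation in one place like this helps avoid inverting an alert by accident.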

Discussion