CANN/AMCT: Accuracy-Based Auto-Calibration

张建站

2026/6/5 5:40:24

10 min read

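accuracy_based_auto_calibration

[Free download link] amct: AMCT is the model compression toolkit repository provided by CANN, designed for Ascend AI processors. Project address: https://gitcode.com/cann/amct

Product support

| Product | Supported |
|---|---|
| Ascend 950PR/Ascend 950DT | √ |
| Atlas A3 training series products/Atlas A3 inference series products | √ |
| Atlas A2 training series products/Atlas A2 inference series products | √ |

Function description

Runs an automatic calibration process on the model and configuration file supplied by the user, and searches for a quantization configuration that meets the target accuracy. It outputs a fake_quant model that can be used for accuracy simulation in an ONNX Runtime environment, and a deploy model that can be used for inference on the AI processor.

Function prototype

    accuracy_based_auto_calibration(model, model_evaluator, config_file, record_file, save_dir, input_data, input_names, output_names, dynamic_axes, strategy='BinarySearch', sensitivity='CosineSimilarity')

Parameter description

| Parameter | Input/Output | Description |
|---|---|---|
| model | Input | Meaning: the user's torch model. Data type: torch.nn.Module |
| model_evaluator | Input | Meaning: the Python instance that auto quantization uses to run calibration and evaluate accuracy. Data type: Python instance |
| config_file | Input | Meaning: the quantization configuration file generated by the user. Data type: string |
| record_file | Input | Meaning: the path where the quantization factors are stored. If a file already exists at this path, it is overwritten. Data type: string |
| save_dir | Input | Meaning: the path where the models are saved. The path must include the model name prefix, for example ./quantized_model/*model. Data type: string |
| input_data | Input | Meaning: the model's input data. A single torch.tensor is treated as tuple(torch.tensor). Data type: tuple |
| input_names | Input | Meaning: the names of the model's inputs, used for display in modfied_onnx_file. Default: None. Data type: list(string) |
| output_names | Input | Meaning: the names of the model's outputs, used for display in modfied_onnx_file. Default: None. Data type: list(string) |
| dynamic_axes | Input | Meaning: specifies the dynamic axes of the model's inputs and outputs. For example, for an input named inputs (NCHW) where N, H, and W have undetermined sizes, and an output named outputs (NL) where N has an undetermined size, pass {"inputs": [0,2,3], "outputs": [0]}. Default: None. Data type: dict<string, dict<python:int, string>> or dict<string, list(int)> |
| strategy | Input | Meaning: the strategy used to search for a quantization configuration that meets the accuracy requirement. The default is binary search. Data type: string or Python instance. Default: 'BinarySearch' |
| sensitivity | Input | Meaning: the metric used to evaluate how sensitive each quantizable layer is to quantization. The default is cosine similarity. Data type: string or Python instance. Default: 'CosineSimilarity' |

Return value

None

Call example

    import os

    import torch

    import amct_pytorch as amct
    from amct_pytorch.common.auto_calibration import AutoCalibrationEvaluatorBase

    # model_forward, model, input_data, TMP and PATH are defined by the user
    # in the part of the script that is elided ("...") below.

    # You need to implement the AutoCalibrationEvaluator's calibration(),
    # evaluate() and metric_eval() funcs
    class AutoCalibrationEvaluator(AutoCalibrationEvaluatorBase):
        """Subclass of AutoCalibrationEvaluatorBase."""

        def __init__(self, target_loss, batch_num):
            super(AutoCalibrationEvaluator, self).__init__()
            self.target_loss = target_loss
            self.batch_num = batch_num

        def calibration(self, model):
            """Implement the calibration function of AutoCalibrationEvaluatorBase.

            calibration() needs to finish the calibration inference procedure,
            so the number of inference batches must be >= the batch_num passed
            to create_quant_config.
            """
            model_forward(model=model, batch_size=32, iterations=self.batch_num)

        def evaluate(self, model):
            """Implement the evaluate function of AutoCalibrationEvaluatorBase.

            params: model in torch.nn.Module
            return: the accuracy of the input model on the eval dataset, or
                another metric that describes the accuracy of the model
            """
            top1, _ = model_forward(model=model, batch_size=32, iterations=5)
            if torch.cuda.is_available():
                torch.cuda.empty_cache()
            return top1

        def metric_eval(self, original_metric, new_metric):
            """Implement the metric_eval function of AutoCalibrationEvaluatorBase.

            params: original_metric: accuracy returned by evaluate() on the
                        non-quantized model
                    new_metric: accuracy returned by evaluate() on the fake
                        quant model
            return: [0]: whether the accuracy loss between the non-quantized
                        model and the fake quant model meets the requirement
                    [1]: the accuracy loss between the two models
            """
            loss = original_metric - new_metric
            if loss * 100 < self.target_loss:
                return True, loss
            return False, loss

    ...

    # 1. step1 create quant config json file
    config_json_file = os.path.join(TMP, 'config.json')
    skip_layers = []
    batch_num = 2
    amct.create_quant_config(
        config_json_file, model, input_data, skip_layers, batch_num
    )

    # 2. step2 construct the instance of AutoCalibrationEvaluator
    evaluator = AutoCalibrationEvaluator(target_loss=0.5, batch_num=batch_num)

    # 3. step3 use accuracy_based_auto_calibration to quantize the model
    record_file = os.path.join(TMP, 'scale_offset_record.txt')
    result_path = os.path.join(PATH, 'result/mobilenet_v2')
    amct.accuracy_based_auto_calibration(
        model=model,
        model_evaluator=evaluator,
        config_file=config_json_file,
        record_file=record_file,
        save_dir=result_path,
        input_data=input_data,
        input_names=['input'],
        output_names=['output'],
        dynamic_axes={
            'input': {0: 'batch_size'},
            'output': {0: 'batch_size'}
        },
        strategy='BinarySearch',
        sensitivity='CosineSimilarity'
    )

Output files

- Accuracy simulation model file: a model file in ONNX format whose name contains fake_quant. It can be used for accuracy simulation in an ONNX Runtime environment.
- Deployment model file: a model file in ONNX format whose name contains deploy. After conversion with the ATC tool, it can be deployed to the AI processor.
- Quantization factor record file: the quantization factors are written to the record_file passed to the interface.
- Sensitivity information file: records how sensitive each quantizable layer is to quantization. The layers to roll back from quantization are selected based on this information.
- Auto-quantization rollback history file: records information about the layers that were rolled back.

Author's note: parts of this article were generated with AI assistance (AIGC) and are for reference only.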
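The article names a binary search strategy and a cosine similarity sensitivity metric but does not describe how either works internally. The sketch below is one plausible reading, not AMCT's implementation: it ranks layers by the cosine similarity between their float and quantized outputs, then binary-searches for the smallest number of most-sensitive layers to roll back so that an accuracy check passes. The function names, the toy layer outputs, and the per-layer loss figures are all hypothetical, and the search assumes that rolling back more layers never makes accuracy worse.

```python
import math
from typing import Callable, Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two flattened layer outputs."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Degenerate case: treat two all-zero vectors as identical.
        return 1.0 if norm_a == norm_b else 0.0
    return dot / (norm_a * norm_b)


def rank_by_sensitivity(
    float_outputs: Dict[str, Sequence[float]],
    quant_outputs: Dict[str, Sequence[float]],
) -> List[str]:
    """Return layer names, most sensitive (lowest similarity) first."""
    sims = {
        name: cosine_similarity(float_outputs[name], quant_outputs[name])
        for name in float_outputs
    }
    return sorted(sims, key=lambda name: sims[name])


def search_rollback_count(
    ranked_layers: List[str],
    meets_target: Callable[[List[str]], bool],
) -> int:
    """Smallest k such that rolling back the k most sensitive layers passes.

    Assumes meets_target is monotonic in k and that rolling back every
    layer (the unquantized model) always passes.
    """
    lo, hi = 0, len(ranked_layers)
    while lo < hi:
        mid = (lo + hi) // 2
        if meets_target(ranked_layers[:mid]):
            hi = mid  # passes: try rolling back fewer layers
        else:
            lo = mid + 1  # fails: more layers must be rolled back
    return lo


# Toy data (hypothetical): outputs of three layers before and after quantization.
float_outputs = {"conv1": [1.0, 2.0, 3.0], "conv2": [1.0, 0.0, 0.0], "fc": [0.5, 0.5, 0.5]}
quant_outputs = {"conv1": [1.0, 2.0, 3.1], "conv2": [0.0, 1.0, 0.0], "fc": [0.5, 0.5, 0.4]}

# Hypothetical accuracy loss contributed by each layer while it stays quantized.
layer_loss = {"conv1": 0.1, "conv2": 0.6, "fc": 0.3}
target_loss = 0.5


def meets_target(rolled_back: List[str]) -> bool:
    remaining = sum(v for k, v in layer_loss.items() if k not in rolled_back)
    return remaining <= target_loss


ranked = rank_by_sensitivity(float_outputs, quant_outputs)
k = search_rollback_count(ranked, meets_target)
print("rollback order:", ranked)
print("layers rolled back:", ranked[:k])
```

In a real run, the stand-in `meets_target` would correspond to evaluating the fake quant model and applying the evaluator's `metric_eval()`, which is why a search that needs only about log2(N) evaluations is attractive when each evaluation is expensive.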
