TF Lite's Toco converter args description for quantization-aware training

Question

python python-3.x tensorflow tensorflow-lite

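Lately I have been trying to track down errors in deploying a TF model with TPU support. I can run the model without TPU support, but as soon as I enable quantization I get lost.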

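I am in the following situation:

Created and trained a model
Created an evaluation graph of the model
Froze the model and saved the result as a protocol buffer
Successfully converted and deployed the model without TPU support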

For the last point, I used the TFLiteConverter Python API. The script that produces a functional tflite model is:

import tensorflow as tf

graph_def_file = 'frozen_model.pb'
inputs = ['dense_input']
outputs = ['dense/BiasAdd']

converter = tf.lite.TFLiteConverter.from_frozen_graph(graph_def_file, inputs, outputs)
converter.inference_type = tf.lite.constants.FLOAT
input_arrays = converter.get_input_arrays()

converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]

tflite_model = converter.convert()

open('model.tflite', 'wb').write(tflite_model)

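This tells me that my approach seems fine so far. Now, if I want to use the Coral TPU stick, I have to quantize my model (I took that into account during training). All I have to do is modify my converter script. I figured out that I have to change it to: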

import tensorflow as tf

graph_def_file = 'frozen_model.pb'
inputs = ['dense_input']
outputs = ['dense/BiasAdd']

converter = tf.lite.TFLiteConverter.from_frozen_graph(graph_def_file, inputs, outputs)
converter.inference_type = tf.lite.constants.QUANTIZED_UINT8      ## Indicates TPU compatibility
input_arrays = converter.get_input_arrays()

converter.quantized_input_stats = {input_arrays[0]: (0., 1.)}     ## mean, std_dev
converter.default_ranges_stats = (-128, 127)                      ## min, max values for quantization (?)
converter.allow_custom_ops = True                                 ## not sure if this is needed

## REMOVED THE OPTIMIZATIONS ALTOGETHER TO MAKE IT WORK

tflite_model = converter.convert()

open('model.tflite', 'wb').write(tflite_model)

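This tflite model produces results when loaded with the Python interpreter API, but I cannot make sense of them. Also, there is no documentation (or if there is, it is well hidden) on how to choose the mean, std_dev, and min/max ranges. Furthermore, after compiling it with edgetpu_compiler and deploying it (loading it with the C++ API), I get an error: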

INFO: Initialized TensorFlow Lite runtime.
ERROR: Failed to prepare for TPU. generic::failed_precondition: Custom op already assigned to a different TPU.
ERROR: Node number 0 (edgetpu-custom-op) failed to prepare.

Segmentation fault

I suppose I missed a flag during the conversion process. But since the documentation is lacking here as well, I cannot say for sure.

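In short:

What do the parameters mean, std_dev, and min/max do, and how do they interact?
What am I doing wrong during the conversion?

Any help or guidance is greatly appreciated!

EDIT: I have posted the full test code in a GitHub issue. Feel free to play around with it.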

- DocDriven

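Maybe I will explain them later, but in my experience post-training quantization is not very good and is only useful for seeing how the model performs after quantization. To get the most out of quantization, you need to do quantization-aware training. - Chan Kha Vu

@FalconUA: I thought I had done quantization-aware training (see the GitHub link). If you decide to write an answer, maybe you could explain the main differences between post-training quantization and quantization-aware training, as I am new to this topic. That would be great! - DocDriven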

See: https://medium.com/tensorflow/tensorflow-model-optimization-toolkit-post-training-integer-quantization-b4964a1ea9ba - Alex Cohn

This example may help: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/tutorials/post_training_integer_quant.ipynb - mdaoust

See https://dev59.com/zFQJ5IYBdhLWcg3wAAxS#58096430 for an explanation of mean and stddev. - MohamedEzz

1 Answer

- mdaoust · Accepted Answer

You should never need to set the quantization stats manually.
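For reference, the hand-set stats in the question do have a concrete meaning. As documented for `quantized_input_stats`, a uint8 input is read as real_value = (quantized_value - mean) / std_dev, so `mean` plays the role of a zero point and `std_dev` is the inverse of the scale. The sketch below only illustrates that arithmetic; the helper names `dequantize_input` and `stats_for_range` are made up for this example and are not part of TensorFlow.

```python
def dequantize_input(q, mean, std_dev):
    """Real value that a uint8 input q stands for, given (mean, std_dev)."""
    return (q - mean) / std_dev


def stats_for_range(real_min, real_max, qmin=0, qmax=255):
    """(mean, std_dev) such that qmin maps to real_min and qmax to real_max."""
    std_dev = (qmax - qmin) / (real_max - real_min)
    mean = qmin - real_min * std_dev
    return mean, std_dev


# The question's setting (0., 1.) reads a uint8 value q as the real value q,
# which only fits a model trained on inputs in [0, 255].
print(dequantize_input(255, 0.0, 1.0))

# A model trained on inputs normalized to [-1, 1] would need different stats.
mean, std_dev = stats_for_range(-1.0, 1.0)
print(mean, std_dev)
print(dequantize_input(0, mean, std_dev), dequantize_input(255, mean, std_dev))
```

The `default_ranges_stats` pair in the question is a different thing: it is a fallback (min, max) for tensors that carry no recorded range, which is why it is mostly useful for experiments rather than for a model you intend to deploy.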

Have you tried the post-training quantization tutorials?

https://www.tensorflow.org/lite/performance/post_training_integer_quant

Basically, they set the quantization options:

converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

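Then they pass a "representative dataset" to the converter, so that the converter can run the model for a few batches to gather the necessary statistics: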

def representative_data_gen():
  for input_value in mnist_ds.take(100):
    yield [input_value]

converter.representative_dataset = representative_data_gen
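To see why a handful of samples can be enough, here is a rough, framework-free sketch of what calibration amounts to: track the min and max seen across the representative samples, then derive a scale and zero point from that range. This is a simplified illustration and not the converter's actual algorithm; `fake_dataset`, `calibrate`, and the plain min/max rule are assumptions made for the example.

```python
from itertools import islice


def representative_data_gen(dataset, num_samples=100):
    """Yield up to num_samples inputs, each wrapped in a list (one entry per model input)."""
    for input_value in islice(dataset, num_samples):
        yield [input_value]


def calibrate(samples, qmin=0, qmax=255):
    """Derive (scale, zero_point) from the min/max observed over all samples."""
    lo, hi = 0.0, 0.0  # keep 0.0 inside the range so it stays representable
    for (values,) in samples:
        lo = min(lo, min(values))
        hi = max(hi, max(values))
    if hi == lo:
        raise ValueError("all observed values are zero; cannot derive a scale")
    scale = (hi - lo) / (qmax - qmin)
    zero_point = int(round(qmin - lo / scale))
    return scale, zero_point


# Hypothetical stand-in for mnist_ds: flat lists of floats in [-1, 1].
fake_dataset = [[-1.0, 0.25, 0.5], [0.0, 1.0, -0.5]]
scale, zero_point = calibrate(representative_data_gen(fake_dataset))
print(scale, zero_point)
```

The point of the sketch is that the ranges come from data the model actually sees, which is what the hand-written `default_ranges_stats` in the question cannot provide.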

There is an option for quantization-aware training, but post-training quantization is usually easier.
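On the question's point that the quantized model's outputs are hard to interpret: uint8 outputs are stored integers, not real values. In the TF Lite Python interpreter, each entry of `get_output_details()` carries a `quantization` pair (scale, zero_point), and the real value is scale * (q - zero_point). The sketch below shows only that arithmetic; the scale, zero point, and raw outputs are made-up numbers, and the helper names are illustrative.

```python
def dequantize(q, scale, zero_point):
    """Convert a stored quantized integer back to the real value it represents."""
    return scale * (q - zero_point)


def quantize(x, scale, zero_point, qmin=0, qmax=255):
    """Convert a real value to the nearest representable uint8 value, clamped to range."""
    q = int(round(x / scale)) + zero_point
    return max(qmin, min(qmax, q))


# Hypothetical output quantization parameters; in practice they would be read
# from interpreter.get_output_details()[0]['quantization'].
scale, zero_point = 0.5, 128

raw_outputs = [128, 130, 120]
print([dequantize(q, scale, zero_point) for q in raw_outputs])  # [0.0, 1.0, -4.0]
```

The same relation applies in reverse when feeding a uint8 input: the real-valued sample has to be quantized with the input tensor's own scale and zero point before it is passed to the interpreter.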