Export an ONNX model to LiteRT - Qualcomm Dragonwing Documentation

You can convert ONNX models to LiteRT format and optimize them for on-device inference. The conversion is a two-step process: ONNX → TensorFlow (SavedModel) → LiteRT.

Convert an ONNX model to TensorFlow

Use the onnx-tf package to convert an ONNX model to a TensorFlow SavedModel. This is a commonly used and stable approach.

Install the required dependencies:
```
pip install onnx onnx-tf tensorflow
```

Convert the ONNX model to TensorFlow SavedModel format:

```
onnx_model_path=my_model.onnx
tf_model_path=tf_model
onnx-tf convert -i "${onnx_model_path}" -o "${tf_model_path}"
```
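
The same conversion can also be driven from Python, which may be convenient inside a larger script. The following is a minimal sketch, assuming the `onnx_tf.backend.prepare` API that ships with onnx-tf; the function name `convert_onnx_to_saved_model` and the file paths are hypothetical and chosen only for illustration.

```python
from pathlib import Path


def convert_onnx_to_saved_model(onnx_path: str, saved_model_dir: str) -> str:
    """Convert an ONNX file to a TensorFlow SavedModel directory.

    Hypothetical helper for illustration; not part of onnx-tf itself.
    """
    src = Path(onnx_path)
    if not src.is_file():
        raise FileNotFoundError(f"ONNX model not found: {src}")

    # Imported lazily so the path check above runs before the heavy imports.
    import onnx
    from onnx_tf.backend import prepare

    model = onnx.load(str(src))
    onnx.checker.check_model(model)       # raises if the graph is malformed
    tf_rep = prepare(model)               # build the TensorFlow representation
    tf_rep.export_graph(saved_model_dir)  # write the SavedModel to disk
    return saved_model_dir


# Example call (assumes my_model.onnx exists in the working directory):
# convert_onnx_to_saved_model("my_model.onnx", "tf_model")
```

Running the ONNX checker first tends to surface a malformed graph earlier, with a clearer error than a failure partway through the conversion.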

Convert TensorFlow to LiteRT

Convert the TensorFlow SavedModel to LiteRT format:

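```
import tensorflow as tf

# Load the SavedModel directory produced in the previous step.
converter = tf.lite.TFLiteConverter.from_saved_model("tf_model")

# Run the conversion; the result is the serialized model as bytes.
tflite_model = converter.convert()

# Write the LiteRT model to disk.
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```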
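
After conversion, it can be useful to confirm that the file loads and runs before deploying it. The sketch below assumes a TensorFlow version that still provides `tf.lite.Interpreter`; the helper name `smoke_test_tflite` is hypothetical. It feeds zero-filled inputs, so it checks only that inference executes, not that the results are accurate.

```python
from pathlib import Path


def smoke_test_tflite(tflite_path: str) -> list:
    """Run one inference on zero-filled inputs and return the output shapes.

    Hypothetical helper for illustration only.
    """
    src = Path(tflite_path)
    if not src.is_file():
        raise FileNotFoundError(f"LiteRT model not found: {src}")

    # Imported lazily so the path check above runs before the heavy imports.
    import numpy as np
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=str(src))
    interpreter.allocate_tensors()

    # Feed zeros that match each input's declared shape and dtype.
    for detail in interpreter.get_input_details():
        zeros = np.zeros(detail["shape"], dtype=detail["dtype"])
        interpreter.set_tensor(detail["index"], zeros)

    interpreter.invoke()

    # Collect the shape of every output tensor after the run.
    return [
        tuple(interpreter.get_tensor(detail["index"]).shape)
        for detail in interpreter.get_output_details()
    ]


# Example call (assumes model.tflite was written by the conversion step):
# print(smoke_test_tflite("model.tflite"))
```

For a stronger check, run the same real input through the original ONNX model and the converted model and compare the outputs.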

Quantize the model

To quantize the converted LiteRT model for improved performance on the NPU, see Quantize models using full integer quantization.
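
As a rough illustration of what full integer quantization involves, the sketch below applies the TensorFlow Lite converter's post-training quantization options to the SavedModel. It is an assumption-laden outline, not the procedure from the linked guide: the helper name `quantize_saved_model` and the output filename are hypothetical, the model is assumed to have a single input, and `samples` must be an iterable of real calibration arrays (float32, shaped like the model input, including the batch dimension) that you supply.

```python
from pathlib import Path


def quantize_saved_model(saved_model_dir, samples, output_path="model_int8.tflite"):
    """Convert a SavedModel to a fully int8-quantized LiteRT model.

    Hypothetical helper; `samples` is caller-provided calibration data.
    """
    src = Path(saved_model_dir)
    if not src.is_dir():
        raise NotADirectoryError(f"SavedModel directory not found: {src}")

    # Imported lazily so the path check above runs before the heavy import.
    import tensorflow as tf

    def representative_dataset():
        # The converter expects one list of input arrays per calibration step.
        for sample in samples:
            yield [sample]

    converter = tf.lite.TFLiteConverter.from_saved_model(str(src))
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    # Restrict to int8 kernels so conversion fails on ops that cannot be quantized.
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8

    Path(output_path).write_bytes(converter.convert())
    return output_path


# Example call (calibration_samples is data you provide):
# quantize_saved_model("tf_model", calibration_samples)
```

The calibration samples should resemble the inputs seen in production, since the converter uses them to estimate activation ranges.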
