You can convert ONNX models to LiteRT format and optimize them for on-device inference. The conversion is a two-step process: ONNX → TensorFlow (SavedModel) → LiteRT.

Convert an ONNX model to TensorFlow

Use the onnx-tf package to convert an ONNX model to a TensorFlow SavedModel. This is a commonly used and stable approach.
  1. Install the required dependencies:
    pip install onnx onnx-tf tensorflow
    
  2. Convert the ONNX model to TensorFlow SavedModel format:
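    # Input ONNX file and output SavedModel directory
    onnx_model_path=my_model.onnx
    tf_model_path=tf_model
    # Quote the variables so paths containing spaces are passed intact
    onnx-tf convert -i "${onnx_model_path}" -o "${tf_model_path}"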
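
The same conversion can also be scripted from Python through the onnx-tf backend API instead of the command line. The following is a minimal sketch, assuming onnx and onnx-tf are installed as in step 1; the helper name `convert_onnx_to_saved_model` and its arguments are illustrative and not part of onnx-tf.

```python
from pathlib import Path


def convert_onnx_to_saved_model(onnx_path, saved_model_dir):
    """Convert an ONNX file to a TensorFlow SavedModel directory (illustrative helper)."""
    onnx_path = Path(onnx_path)
    if not onnx_path.is_file():
        raise FileNotFoundError(f"ONNX model not found: {onnx_path}")

    # Imported lazily so a missing input file is reported before the heavy imports run.
    import onnx
    from onnx_tf.backend import prepare

    onnx_model = onnx.load(str(onnx_path))     # parse the ONNX protobuf
    tf_rep = prepare(onnx_model)               # build the TensorFlow representation
    tf_rep.export_graph(str(saved_model_dir))  # write the SavedModel directory
    return Path(saved_model_dir)
```

Calling `convert_onnx_to_saved_model("my_model.onnx", "tf_model")` should produce the same `tf_model` directory as the command above.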
    

Convert TensorFlow to LiteRT

Convert the TensorFlow SavedModel to LiteRT format:
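import tensorflow as tf

# Load the SavedModel directory produced by onnx-tf in the previous step
converter = tf.lite.TFLiteConverter.from_saved_model("tf_model")
tflite_model = converter.convert()

# Write the serialized LiteRT model to disk
with open("model.tflite", "wb") as f:
    f.write(tflite_model)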
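
Before moving on, it can help to confirm that the converted file loads and runs. The sketch below is one possible smoke test, assuming the installed TensorFlow release still ships `tf.lite.Interpreter` and that NumPy is available; the helper name `run_litert_smoke_test` is hypothetical, and zero-filled inputs only check that inference executes, not that the outputs are correct.

```python
from pathlib import Path


def run_litert_smoke_test(model_path):
    """Run one inference on zero-filled inputs and return the output shapes (illustrative helper)."""
    model_path = Path(model_path)
    if not model_path.is_file():
        raise FileNotFoundError(f"LiteRT model not found: {model_path}")

    # Imported lazily so a missing model file is reported before the heavy imports run.
    import numpy as np
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=str(model_path))
    interpreter.allocate_tensors()

    # Feed every input a zero tensor of the shape and dtype the model declares.
    for detail in interpreter.get_input_details():
        dummy = np.zeros(detail["shape"], dtype=detail["dtype"])
        interpreter.set_tensor(detail["index"], dummy)

    interpreter.invoke()

    # Collect the shape of each output tensor as a quick structural check.
    return [
        tuple(interpreter.get_tensor(detail["index"]).shape)
        for detail in interpreter.get_output_details()
    ]
```

For the file written above, `run_litert_smoke_test("model.tflite")` should return one shape tuple per model output; comparing those shapes with the original ONNX model's outputs is a cheap way to catch a conversion that went wrong.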

Quantize the model

To quantize the converted LiteRT model for improved performance on the NPU, see Quantize models using full integer quantization.
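
For orientation, full integer quantization with the TensorFlow converter generally looks like the sketch below. It is not taken from the linked guide, which may use different settings: the helper name, the `input_shape` parameter, and the random calibration data are all assumptions, and a real representative dataset should yield samples drawn from the model's actual input distribution.

```python
from pathlib import Path


def quantize_saved_model(saved_model_dir, out_path, input_shape, num_samples=100):
    """Convert a SavedModel to an int8 LiteRT model (illustrative helper)."""
    saved_model_dir = Path(saved_model_dir)
    if not saved_model_dir.is_dir():
        raise FileNotFoundError(f"SavedModel directory not found: {saved_model_dir}")

    # Imported lazily so a missing directory is reported before the heavy imports run.
    import numpy as np
    import tensorflow as tf

    def representative_dataset():
        # Placeholder calibration data (assumption): replace with real input samples.
        for _ in range(num_samples):
            yield [np.random.rand(*input_shape).astype(np.float32)]

    converter = tf.lite.TFLiteConverter.from_saved_model(str(saved_model_dir))
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    # Restrict the model to int8 builtin ops and use int8 input and output tensors.
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8

    out_path = Path(out_path)
    out_path.write_bytes(converter.convert())
    return out_path
```

As an example, `quantize_saved_model("tf_model", "model_int8.tflite", input_shape=(1, 224, 224, 3))` would apply if the model takes a single 1×224×224×3 float input; that shape is purely illustrative.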