Supported LiteRT runtimes - Qualcomm Dragonwing Documentation

LiteRT supports hardware acceleration through delegates, which offload graph execution to specialized hardware. The following delegates are available on Qualcomm Dragonwing IoT platforms.

CPU delegate

The XNNPACK delegate uses the XNNPACK library to accelerate LiteRT models on CPUs. XNNPACK is an open-source library from Google that:

Provides an optimized implementation of neural network operators for Arm CPUs.
Uses low-level CPU instructions, such as the Arm® Neon™ instruction set, to optimize operators for efficient execution.

The XNNPACK delegate supports models in both 32-bit floating-point and int8 formats. For more information, see XNNPACK back-end for TensorFlow Lite. To run a LiteRT model using the XNNPACK delegate, see Deploy a LiteRT model.
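The following is a minimal Python sketch of CPU execution, not the procedure from Deploy a LiteRT model. It assumes that the `ai_edge_litert` package (or the older `tflite_runtime` package) is installed on the target, and that the installed build applies XNNPACK as its default CPU delegate, which is common but depends on how the runtime was built. The helper names and the model path passed by the caller are hypothetical.

```python
import importlib
import os


def find_interpreter_module():
    """Return the first LiteRT interpreter module that can be imported, else None."""
    for name in ("ai_edge_litert.interpreter", "tflite_runtime.interpreter"):
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


def choose_num_threads(requested=None):
    """Clamp a requested thread count to the range [1, number of CPU cores]."""
    cores = os.cpu_count() or 1
    if requested is None:
        return cores
    return max(1, min(requested, cores))


def make_cpu_interpreter(model_path, num_threads=None):
    """Create a CPU interpreter; no delegate is passed explicitly.

    Assumption: the runtime build enables XNNPACK by default for supported
    operators, so CPU acceleration needs no extra configuration here.
    """
    module = find_interpreter_module()
    if module is None:
        raise RuntimeError("No LiteRT Python package is installed")
    interpreter = module.Interpreter(
        model_path=model_path,
        num_threads=choose_num_threads(num_threads),
    )
    interpreter.allocate_tensors()
    return interpreter
```

The thread count is worth tuning on the target, because using every core is not always the fastest or most power-efficient choice for a given model.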

GPU delegate

The open-source GPU delegate accelerates LiteRT models on vendor-specific GPUs, including the Qualcomm Adreno GPU. It uses OpenCL kernels to run the neural network operations in a LiteRT model execution graph on the GPU, which improves performance for highly parallel workloads. The GPU delegate supports the following model precisions on the Adreno GPU:

16-bit floating-point
32-bit floating-point

For more information, see GPU delegates for LiteRT. To run a LiteRT model using the GPU delegate, see Deploy a LiteRT model.

HTP delegate

The Qualcomm AI Runtime delegate is a proprietary delegate for hardware acceleration on Qualcomm platforms. It is built on the LiteRT external delegate interface and can offload part or all of a LiteRT model to specialized Qualcomm hardware, including the Adreno GPU and the NPU. Offloading reduces the CPU workload, which improves model execution performance and power efficiency. The delegate uses the existing Qualcomm AI Runtime APIs and available backends to accelerate models. For more information, see Qualcomm AI Runtime (QAIRT) SDK.

The Qualcomm AI Runtime delegate supports both 32-bit floating-point and int8 precision on available hardware. You can build applications using the following interfaces:

Qualcomm AI Runtime delegate interface
LiteRT external delegate interface

Both interfaces are available in standalone LiteRT applications. If you deploy LiteRT models using the Qualcomm IM SDK, the qtimltflite GStreamer plugin uses the QNN delegate. For more information, see Leverage external delegate.
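As a rough illustration of the LiteRT external delegate interface in a standalone Python application, the sketch below tries to load a delegate library and falls back to CPU execution if loading fails. It assumes the `ai_edge_litert` or `tflite_runtime` package is installed. The library name `libQnnTFLiteDelegate.so` and the `backend_type` option are assumptions that should be checked against the QAIRT SDK documentation for your release; the helper names are hypothetical.

```python
import importlib

# Assumptions: verify both values against the QAIRT SDK documentation.
QNN_DELEGATE_LIBRARY = "libQnnTFLiteDelegate.so"
QNN_DELEGATE_OPTIONS = {"backend_type": "htp"}


def find_interpreter_module():
    """Return the first LiteRT interpreter module that can be imported, else None."""
    for name in ("ai_edge_litert.interpreter", "tflite_runtime.interpreter"):
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


def load_optional_delegate(library_path, options=None):
    """Try to load an external delegate; return None if it is unavailable."""
    module = find_interpreter_module()
    if module is None:
        return None
    try:
        return module.load_delegate(library_path, options or {})
    except (ValueError, OSError, RuntimeError):
        # Library missing, wrong architecture, or rejected options.
        return None


def make_interpreter(model_path):
    """Create an interpreter that uses the external delegate when it loads."""
    module = find_interpreter_module()
    if module is None:
        raise RuntimeError("No LiteRT Python package is installed")
    delegate = load_optional_delegate(QNN_DELEGATE_LIBRARY, QNN_DELEGATE_OPTIONS)
    delegates = [delegate] if delegate is not None else None
    interpreter = module.Interpreter(
        model_path=model_path,
        experimental_delegates=delegates,
    )
    interpreter.allocate_tensors()
    return interpreter
```

Falling back silently keeps the application running on boards where the delegate library is absent, but it can hide a misconfiguration, so logging which path was taken is advisable in practice.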
