QRB ROS NN Inference - Qualcomm Dragonwing Documentation

qrb_ros_nn_inference is a generic ROS 2 node that loads a neural-network model and runs inference on the Hexagon HTP NPU via the Qualcomm AI Engine Direct (QNN) SDK. You point it at a model file, it subscribes to an input topic, and publishes raw inference results on an output topic — no model-specific wiring required.

This is the “can I run my model on the NPU from ROS?” answer in one node. Drop in a .tflite / .so / .bin exported from Qualcomm AI Hub (or your own QNN export), point the node at it, and publish / subscribe.

What it is

Underneath, the node wraps qrb_inference_manager — a small C++ library that calls the QNN APIs and the QNN delegate for TensorFlow Lite. The ROS layer just adds a subscription, a publication, and parameter-driven backend selection.

Supported model formats

Format	When to use
`.tflite`	TFLite models exported from AI Hub or trained locally.
`.so`	QNN model libraries: models pre-compiled into a shared object (best HTP performance, locked to a target).
`.bin`	QNN context binaries.

Per upstream README: .tflite inference is not supported on qrb_ros_nn_inference 1.1.0-jazzy. If you’re on that release and need TFLite, build from source on main or use the hand-rolled approach in npu-workflows.mdx.
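
As a pre-flight check, the extension table above can be encoded in a few lines of shell. This is a minimal sketch: `model_format` is a hypothetical helper name, not part of qrb_ros_nn_inference, and it only inspects the file name, not the file contents.

```shell
# Hypothetical helper: classify a model file by its extension,
# following the supported-formats table.
# Prints the format and returns 0 if supported, 1 otherwise.
model_format() {
  case "$1" in
    *.tflite) echo "tflite" ;;
    *.so)     echo "so" ;;
    *.bin)    echo "bin" ;;
    *)        echo "unsupported"; return 1 ;;
  esac
}
```

On 1.1.0-jazzy, a `tflite` result should be treated as unsupported as well, per the README note above.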

Quick start

Install on Qualcomm Ubuntu

sudo add-apt-repository ppa:ubuntu-qcom-iot/qcom-ppa
sudo add-apt-repository ppa:ubuntu-qcom-iot/qirp
sudo apt update
sudo apt install ros-jazzy-qrb-ros-nn-inference

Run with your model

ros2 run qrb_ros_nn_inference qrb_ros_nn_inference \
  --ros-args \
    -p model_path:=/path/to/your_model.so \
    -p backend_option:=htp

Then publish your input on the configured input topic and subscribe to the output topic. See the upstream API reference for the full parameter list.
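
When launching from scripts, it can help to assemble the command in one place. The sketch below prints the quick-start command for a given model path. `nn_inference_cmd` is a hypothetical helper name, and defaulting `backend_option` to `htp` is an assumption taken from the example above rather than a documented default.

```shell
# Hypothetical helper: print the quick-start command for a model path.
# $1 = model path, $2 = backend_option (assumed to default to htp here).
nn_inference_cmd() {
  model_path="$1"
  backend="${2:-htp}"
  printf 'ros2 run qrb_ros_nn_inference qrb_ros_nn_inference --ros-args -p model_path:=%s -p backend_option:=%s\n' \
    "$model_path" "$backend"
}

# Print the command for inspection; nothing is executed.
nn_inference_cmd /path/to/your_model.so
```

Paths containing spaces would need quoting before the printed command is pasted into a shell. Once the node is up, `ros2 topic list -t` lists the active topics with their message types, which is a quick way to find the input and output topics.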

Why this helps

Alternative	Short take
Hand-rolled TFLite node — see `npu-workflows.mdx`	Maximum control; you own preprocessing, delegate loading, topic wiring. Useful for learning.
CPU-only TFLite / ONNX node	Works everywhere, but no NPU — defeats the point of Qualcomm hardware.
`qrb_ros_samples` packaged pipelines	Model-specific wrappers (object detection, segmentation, etc.); less flexible than a generic loader.

In short, qrb_ros_nn_inference is a generic ROS 2 node that loads a .tflite / .so / .bin model and runs it on the Hexagon HTP NPU via QNN. For evaluators: drop in your own model, configure the node, and you’re publishing inference results on a ROS topic. Pre/post-processing is up to you (or use qrb_ros_tensor_process for YOLO-shaped tensors).

Related

npu-workflows.mdx: hand-rolled depth-estimation pipeline using the QNN TFLite delegate directly. Useful if you want to understand what qrb_ros_nn_inference automates.
qrb-ros-samples.mdx: model-specific reference pipelines built around this node.
Upstream: qualcomm-qrb-ros/qrb_ros_nn_inference.
