# Run a LiteRT model on the NPU

> Run an object detection application on the NPU of a Qualcomm Dragonwing device using the QNN delegate with Python.

## Summary

Use the [LiteRT](https://ai.google.dev/edge/litert) runtime with the QNN delegate to accelerate AI inference on the NPU of Qualcomm® Dragonwing™ devices. This guide demonstrates the end-to-end workflow by deploying a quantized object detection model (YOLOX) that processes video input and outputs annotated frames with bounding boxes, either saved to a file or streamed to a display.

**What you'll learn:**

* Configure a Dragonwing device and deploy the QIM SDK Docker environment
* Run a pre-built object detection application accelerated on the NPU
* Understand the application code to adapt it for your own models and use cases

***

## Prerequisites

Ensure you have the following before proceeding:

| Requirement      | Details                                             |
| ---------------- | --------------------------------------------------- |
| **Hardware**     | Qualcomm® Dragonwing™ device with NPU support       |
| **Host machine** | Linux or macOS with SSH client and Docker support   |
| **Network**      | Wi-Fi or Ethernet connectivity on the target device |
| **Software**     | Docker installed on the target device               |

***

## Step 1: Configure the device

### Enable Wi-Fi and SSH

The device requires an internet connection to download artifacts needed for the sample application. If SSH and Wi-Fi are already configured, skip this step.

Follow [Set up an SSH connection](https://dragonwingdocs.qualcomm.com/Technologies/Ethernet/get-started-with-ethernet#set-up-an-ssh-connection) to enable Wi-Fi and SSH on the device.

### Enable camera support (CamX)

If you plan to use camera input, enable CamX on the platform:

```shell
echo -n "camx" > /var/data
efivar -n 882f8c2b-9646-435f-8de5-f208ff80c1bd-VendorDtbOverlays -w -f /var/data
efivar -n 882f8c2b-9646-435f-8de5-f208ff80c1bd-VendorDtbOverlays -p
sync
reboot
```

The device will reboot after this step. Wait for it to come back online before continuing.

***

## Step 2: Set up the Docker environment

### Pull the QIM SDK container image

On the target device, pull the latest QIM SDK Docker image:

```shell
cd $HOME
```

```shell
docker pull artifacts.codelinaro.org/iot-solutions-microservices/qimsdk:latest
```

### Create required directories

Create directories for storing artifacts, configuration files, models, and media:

```shell
mkdir -p /etc/cdi /etc/docker/env /etc/models /etc/labels /etc/media /root/media /root/models /root/labels /root/configs
```

### Clone the SDK tools repository

On your **host machine**, clone the QIM SDK Debian repository:

```shell
git clone https://git.codelinaro.org/clo/le/sdk-tools.git -b imsdk-tools.lnx.1.0
cd sdk-tools/qimsdk-debian/
```

### Copy configuration files to the device

Transfer the CDI and environment files from your host machine to the target device:

```shell
scp -r cdi/_qli_2x_qimsdk.json root@:/etc/cdi/qimsdk.json
```

```shell
scp -r env/_qli_2x_qimsdk.env root@:/etc/docker/env/qimsdk.env
```

Replace `` with the appropriate identifier for your target device (check the repository for available options) and `` with your device's IP address.

For instance, if the target device is Qualcomm Dragonwing™ RB3 Gen 2, then replace `` with `qcs6490`.

### Start the container

Launch the QIM SDK container on the target device:

```shell
docker run -it -d \
  --net host \
  --env-file /etc/docker/env/qimsdk.env \
  --device qualcomm.com/device=qimsdk \
  -h qimsdk \
  --name qimsdk \
  artifacts.codelinaro.org/iot-solutions-microservices/qimsdk:latest
```

### Access the container as root

```shell
export DOCKER_ID=$(docker ps -aq)
docker exec -it ${DOCKER_ID} sh
```

To verify you are logged in as root, run `whoami` inside the container. The output should be `root`.

***

## Step 3: Install dependencies

Inside the container (as root), install the LiteRT runtime and required packages.

### Install Python tooling

```shell
apt update
apt install python3-pip python3-venv
```

### Create a virtual environment and install Python packages

```shell
python3 -m venv venv-litert-demo --system-site-packages
```

```shell
. venv-litert-demo/bin/activate
```

```shell
pip3 install ai-edge-litert Pillow opencv-python
```

### Install GStreamer and GTK dependencies

These packages are required for video display output via Wayland:

```shell
apt install -y libgstreamer1.0-dev gstreamer1.0-plugins-ugly gstreamer1.0-libav \
  gstreamer1.0-alsa gstreamer1.0-gtk3 python3-gi python3-gi-cairo \
  gir1.2-gtk-3.0 python3-full pkg-config cmake libcairo2-dev \
  libgirepository1.0-dev gir1.2-glib-2.0 build-essential python3-dev \
  pkg-config meson
```

***

## Step 4: Download the application and model artifacts

Still inside the container, set up the object detection application:

### Create the application directory

```shell
mkdir -p /etc/apps/ && cd /etc/apps/
```

### Download the application script

```shell
curl -L https://raw.githubusercontent.com/qualcomm/sample-apps-for-qualcomm-linux/refs/heads/main/qualcomm-linux/applications/LiteRT/object_detection.py -o /etc/apps/object_detection.py
```

### Download the model, labels, and sample video

```shell
curl -L https://raw.githubusercontent.com/qualcomm/sample-apps-for-qualcomm-linux/refs/heads/main/qualcomm-linux/artifacts/labels/coco_labels.txt -o /etc/labels/coco_labels.txt
```

```shell
curl -L https://raw.githubusercontent.com/qualcomm/sample-apps-for-qualcomm-linux/refs/heads/main/qualcomm-linux/artifacts/videos/video.mp4 -o /etc/media/video.mp4
```

```shell
curl -L https://huggingface.co/qualcomm/Yolo-X/resolve/v0.30.5/Yolo-X_w8a8.tflite -o /etc/models/yolox_quantized.tflite
```

### Exit the root shell

You need to exit and re-enter the container as the `qimsdk` user to run the application:

```shell
exit
```

***

## Step 5: Run the object detection application

### Enter the container as the standard user

```shell
docker exec -it ${DOCKER_ID} bash
```

### Activate the Python environment

```shell
. venv-litert-demo/bin/activate
```

### Run the application

```shell
cd /etc/apps
```

Run the application and save the output as a video file:

```shell
python3 object_detection.py --output file
```

Once processing is complete, retrieve the output video:

```shell
exit
```

```shell
docker cp ${DOCKER_ID}:/etc/apps/output_object_detection.mp4 /etc/media/output_object_detection.mp4
```

To copy the file to your host machine:

```shell
scp root@:/etc/media/output_object_detection.mp4 .
```

To stream the output directly to a connected display via Wayland:

```shell
python3 object_detection.py --output wayland
```

Ensure a display is connected to the device and Wayland is running before using this mode.

***
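If the application in Step 5 exits immediately, one thing worth checking is whether the artifacts downloaded in Step 4 are actually in place, since a failed `curl` can leave a missing or empty file. The following is a minimal sketch that uses only the Python standard library. The helper name `missing_artifacts` is hypothetical and not part of the sample application; the paths are the ones used in this guide.

```python
from pathlib import Path

# Paths used in Step 4 of this guide.
ARTIFACTS = [
    "/etc/apps/object_detection.py",
    "/etc/labels/coco_labels.txt",
    "/etc/media/video.mp4",
    "/etc/models/yolox_quantized.tflite",
]


def missing_artifacts(paths):
    """Return the paths that are not regular files or are empty (hypothetical helper)."""
    bad = []
    for p in map(Path, paths):
        if not p.is_file() or p.stat().st_size == 0:
            bad.append(str(p))
    return bad


if __name__ == "__main__":
    problems = missing_artifacts(ARTIFACTS)
    if problems:
        print("Missing or empty:", ", ".join(problems))
    else:
        print("All artifacts found.")
```

Run it inside the container with `python3`; any path it reports can be downloaded again with the matching `curl` command from Step 4.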

## Code walkthrough: Object detection with OpenCV and LiteRT

This section explains the `object_detection.py` application. Use this as a reference to build custom inference applications with LiteRT on Qualcomm Dragonwing devices.

The postprocessing in the following code is designed for object detection models from [Qualcomm AI Hub](https://aihub.qualcomm.com/). For custom models, update the postprocessing logic to match the model's output format and requirements.

### Import packages

```python
#!/usr/bin/env python3
import cv2
import numpy as np
import argparse
import ai_edge_litert.interpreter as tflite
import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst
```

### Parse output arguments

```python
parser = argparse.ArgumentParser(description="Run object detection and output to file or Wayland.")
parser.add_argument("--output", choices=["file", "wayland"], default="file",
                    help="Choose output mode: 'file' (default) or 'wayland'")
args = parser.parse_args()
```

### Configure model parameters

```python
MODEL_PATH = "/etc/models/yolox_quantized.tflite"  # YOLOX quantized model
LABEL_PATH = "/etc/labels/coco_labels.txt"
VIDEO_IN = "/etc/media/video.mp4"
VIDEO_OUT = "output_object_detection.mp4"
DELEGATE_PATH = "libQnnTFLiteDelegate.so"

FRAME_W, FRAME_H = 1600, 900
FPS_OUT = 30
CONF_THRES = 0.25
NMS_IOU_THRES = 0.50
BOX_SCALE = 3.2108588218688965
BOX_ZP = 31.0
SCORE_SCALE = 0.0038042240776121616
```

### Load the model with the QNN delegate

The QNN delegate enables inference on the NPU:

```python
delegate_options = {'backend_type': 'htp'}
delegate = tflite.load_delegate(DELEGATE_PATH, delegate_options)
interpreter = tflite.Interpreter(model_path=MODEL_PATH, experimental_delegates=[delegate])
interpreter.allocate_tensors()

in_det = interpreter.get_input_details()
out_det = interpreter.get_output_details()
in_h, in_w = in_det[0]["shape"][1:3]

labels = [l.strip() for l in open(LABEL_PATH)]
```

### Set up video capture and preprocessing

```python
cap = cv2.VideoCapture(VIDEO_IN)
sx, sy = FRAME_W / in_w, FRAME_H / in_h
frame_rs = np.empty((FRAME_H, FRAME_W, 3), np.uint8)
input_tensor = np.empty((1, in_h, in_w, 3), np.uint8)
```

### Configure the output pipeline

```python
if args.output == "file":
    fourcc = cv2.VideoWriter_fourcc(*"mp4v")
    out_writer = cv2.VideoWriter(VIDEO_OUT, fourcc, FPS_OUT, (FRAME_W, FRAME_H))
else:
    Gst.init(None)
    pipeline = Gst.parse_launch(
        'appsrc name=src is-live=true block=true format=time caps=video/x-raw,format=BGR,width=1600,height=900,framerate=30/1 ! videoconvert ! waylandsink'
    )
    appsrc = pipeline.get_by_name('src')
    pipeline.set_state(Gst.State.PLAYING)

frame_cnt = 0
```

### Run inference in the main loop

Read each video frame, run inference, apply NMS, and draw bounding boxes:

```python
while True:
    ok, frame = cap.read()
    if not ok:
        break
    frame_cnt += 1

    cv2.resize(frame, (FRAME_W, FRAME_H), dst=frame_rs)
    cv2.resize(frame_rs, (in_w, in_h), dst=input_tensor[0])

    interpreter.set_tensor(in_det[0]['index'], input_tensor)
    interpreter.invoke()

    boxes_q = interpreter.get_tensor(out_det[0]['index'])[0]
    scores_q = interpreter.get_tensor(out_det[1]['index'])[0]
    classes_q = interpreter.get_tensor(out_det[2]['index'])[0]

    boxes = BOX_SCALE * (boxes_q.astype(np.float32) - BOX_ZP)
    scores = SCORE_SCALE * scores_q.astype(np.float32)
    classes = classes_q.astype(np.int32)

    mask = scores >= CONF_THRES
    if np.any(mask):
        boxes_f = boxes[mask]
        scores_f = scores[mask]
        classes_f = classes[mask]

        x1, y1, x2, y2 = boxes_f.T
        boxes_cv2 = np.column_stack((x1, y1, x2 - x1, y2 - y1))
        idx_cv2 = cv2.dnn.NMSBoxes(
            bboxes=boxes_cv2.tolist(),
            scores=scores_f.tolist(),
            score_threshold=CONF_THRES,
            nms_threshold=NMS_IOU_THRES
        )

        if len(idx_cv2):
            idx = idx_cv2.flatten()
            sel_boxes = boxes_f[idx]
            sel_scores = scores_f[idx]
            sel_classes = classes_f[idx]

            sel_boxes[:, [0, 2]] *= sx
            sel_boxes[:, [1, 3]] *= sy
            sel_boxes = sel_boxes.astype(np.int32)
            sel_boxes[:, [0, 2]] = np.clip(sel_boxes[:, [0, 2]], 0, FRAME_W - 1)
            sel_boxes[:, [1, 3]] = np.clip(sel_boxes[:, [1, 3]], 0, FRAME_H - 1)

            for (x1i, y1i, x2i, y2i), sc, cl in zip(sel_boxes, sel_scores, sel_classes):
                cv2.rectangle(frame_rs, (x1i, y1i), (x2i, y2i), (0, 255, 0), 2)
                lab = labels[cl] if cl < len(labels) else str(cl)
                cv2.putText(frame_rs, f"{lab} {sc:.2f}", (x1i, max(10, y1i - 5)),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    if args.output == "file":
        out_writer.write(frame_rs)
    else:
        data = frame_rs.tobytes()
        buf = Gst.Buffer.new_allocate(None, len(data), None)
        buf.fill(0, data)
        buf.duration = Gst.util_uint64_scale_int(1, Gst.SECOND, FPS_OUT)
        timestamp = cap.get(cv2.CAP_PROP_POS_MSEC) * Gst.MSECOND
        buf.pts = buf.dts = int(timestamp)
        appsrc.emit('push-buffer', buf)
```

### Clean up resources

```python
cap.release()
if args.output == "file":
    out_writer.release()
    print(f"Done - processed video saved to {VIDEO_OUT}")
else:
    appsrc.emit('end-of-stream')
    pipeline.set_state(Gst.State.NULL)
    print("Done - video streamed to Wayland sink")
```

***

## Troubleshooting

| Issue                      | Solution                                                                                                                               |
| -------------------------- | -------------------------------------------------------------------------------------------------------------------------------------- |
| `docker pull` fails        | Verify the device has internet access. Check DNS settings and proxy configuration.                                                     |
| QNN delegate fails to load | Ensure the CDI and environment files match your hardware. Verify the container was started with `--device qualcomm.com/device=qimsdk`. |
| No video output on display | Confirm Wayland is running and a display is connected. Try the `file` output mode first to verify inference works.                     |
| Model download fails       | Check network connectivity. The model is hosted on Hugging Face and may require proxy settings in some environments.                   |
| Low FPS or slow inference  | Verify the model is running on the NPU (HTP backend). Check that `backend_type` is set to `'htp'` in the delegate options.             |

***

## Next steps

* **Try different models**: Replace the YOLOX model with other quantized models from [Qualcomm AI Hub](https://aihub.qualcomm.com/) for tasks like image classification, pose estimation, or segmentation.
* **Use live camera input**: Modify the application to use a camera feed instead of a pre-recorded video by changing the `VIDEO_IN` path to a device capture source.
* **Tune detection parameters**: Adjust `CONF_THRES` and `NMS_IOU_THRES` to optimize detection accuracy for your use case.
* **Build custom applications**: Use the code walkthrough as a template to create your own inference pipelines targeting the Dragonwing NPU.
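
To build intuition for how `CONF_THRES` and `NMS_IOU_THRES` interact when tuning, the following is a minimal pure-Python sketch of greedy non-maximum suppression over `(x1, y1, x2, y2)` boxes. It is an illustration only and is not the implementation behind `cv2.dnn.NMSBoxes`; the function names `iou` and `greedy_nms` and the sample boxes and scores are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, conf_thres=0.25, iou_thres=0.50):
    """Return indices of kept boxes, highest score first (illustrative sketch)."""
    # Drop low-confidence candidates, then visit the rest by descending score.
    order = sorted((i for i, s in enumerate(scores) if s >= conf_thres),
                   key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap an already kept box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_thres for k in keep):
            keep.append(i)
    return keep


# Hypothetical detections: two heavily overlapping boxes and one separate box.
boxes = [(100, 100, 200, 200), (110, 110, 210, 210), (400, 300, 500, 400)]
scores = [0.90, 0.80, 0.60]

kept_default = greedy_nms(boxes, scores)               # IoU threshold 0.50
kept_loose = greedy_nms(boxes, scores, iou_thres=0.70)  # more permissive
print(kept_default, kept_loose)
```

In this sketch the first two boxes overlap with an IoU of about 0.68, so the lower-scoring one is suppressed at a threshold of 0.50 but survives at 0.70. Raising `NMS_IOU_THRES` therefore tends to keep more overlapping detections, while lowering it removes them more aggressively.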