
# Run a LiteRT model on the NPU

> Run an object detection application on the NPU of a Qualcomm Dragonwing device using the QNN delegate with Python.

## Summary

Use the [LiteRT](https://ai.google.dev/edge/litert) runtime with the QNN delegate to accelerate AI inference on the NPU of Qualcomm® Dragonwing™ devices. This guide demonstrates the end-to-end workflow by deploying a quantized object detection model (YOLOX) that processes video input and outputs annotated frames with bounding boxes, either saved to a file or streamed to a display.

**What you'll learn:**

* Configure a Dragonwing device and deploy the QIM SDK Docker environment
* Run a pre-built object detection application accelerated on the NPU
* Understand the application code to adapt it for your own models and use cases

***

## Prerequisites

Ensure you have the following before proceeding:

| Requirement      | Details                                             |
| ---------------- | --------------------------------------------------- |
| **Hardware**     | Qualcomm® Dragonwing™ device with NPU support       |
| **Host machine** | Linux or macOS with SSH client and Docker support   |
| **Network**      | Wi-Fi or Ethernet connectivity on the target device |
| **Software**     | Docker installed on the target device               |

***

## Step 1: Configure the device

### Enable Wi-Fi and SSH

The device requires an internet connection to download artifacts needed for the sample application. If SSH and Wi-Fi are already configured, skip this step.

Follow [Set up an SSH connection](https://dragonwingdocs.qualcomm.com/Technologies/Ethernet/get-started-with-ethernet#set-up-an-ssh-connection) to enable Wi-Fi and SSH on the device.

### Enable camera support (CamX)

If you plan to use camera input, enable CamX on the platform:

```shell theme={null}
echo -n "camx" > /var/data
efivar -n 882f8c2b-9646-435f-8de5-f208ff80c1bd-VendorDtbOverlays -w -f /var/data
efivar -n 882f8c2b-9646-435f-8de5-f208ff80c1bd-VendorDtbOverlays -p
sync
reboot
```

<Note>
  The device will reboot after this step. Wait for it to come back online before continuing.
</Note>

***

## Step 2: Set up the Docker environment

### Pull the QIM SDK container image

On the target device, pull the latest QIM SDK Docker image:

```shell theme={null}
cd $HOME
```

```shell theme={null}
docker pull artifacts.codelinaro.org/iot-solutions-microservices/qimsdk:latest
```

### Create required directories

Create directories for storing artifacts, configuration files, models, and media:

```shell theme={null}
mkdir -p /etc/cdi /etc/docker/env /etc/models /etc/labels /etc/media /root/media /root/models /root/labels /root/configs
```

### Clone the SDK tools repository

On your **host machine**, clone the QIM SDK Debian repository:

```shell theme={null}
git clone https://git.codelinaro.org/clo/le/sdk-tools.git -b imsdk-tools.lnx.1.0
cd sdk-tools/qimsdk-debian/
```

### Copy configuration files to the device

Transfer the CDI and environment files from your host machine to the target device:

```shell theme={null}
scp -r cdi/<hardware>_qli_2x_qimsdk.json root@<IP_ADDRESS>:/etc/cdi/qimsdk.json
```

```shell theme={null}
scp -r env/<hardware>_qli_2x_qimsdk.env root@<IP_ADDRESS>:/etc/docker/env/qimsdk.env
```

<Note>
  Replace `<hardware>` with the appropriate identifier for your target device (check the repository for available options) and `<IP_ADDRESS>` with your device's IP address.<br /><br />
  For instance, if the target device is the Qualcomm Dragonwing™ RB3 Gen 2, replace `<hardware>` with `qcs6490`.
</Note>

### Start the container

Launch the QIM SDK container on the target device:

```shell theme={null}
docker run -it -d \
   --net host \
   --env-file /etc/docker/env/qimsdk.env \
   --device qualcomm.com/device=qimsdk \
   -h qimsdk \
   --name qimsdk \
   artifacts.codelinaro.org/iot-solutions-microservices/qimsdk:latest
```

### Access the container as root

```shell theme={null}
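# Select the container by name so that other containers on the device are ignored.
export DOCKER_ID=$(docker ps -aq --filter name=qimsdk)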
docker exec -it ${DOCKER_ID} sh
```

<Note>
  To verify you are logged in as root, run `whoami` inside the container. The output should be `root`.
</Note>

***

## Step 3: Install dependencies

Inside the container (as root), install the LiteRT runtime and required packages.

### Install Python tooling

```shell theme={null}
apt update
apt install python3-pip python3-venv
```

### Create a virtual environment and install Python packages

```shell theme={null}
python3 -m venv venv-litert-demo --system-site-packages
```

```shell theme={null}
. venv-litert-demo/bin/activate
```

```shell theme={null}
pip3 install ai-edge-litert Pillow opencv-python
```

### Install GStreamer and GTK dependencies

These packages are required for video display output via Wayland:

```shell theme={null}
apt install -y libgstreamer1.0-dev gstreamer1.0-plugins-ugly gstreamer1.0-libav \
              gstreamer1.0-alsa gstreamer1.0-gtk3 python3-gi python3-gi-cairo \
              gir1.2-gtk-3.0 python3-full pkg-config cmake libcairo2-dev \
              libgirepository1.0-dev gir1.2-glib-2.0 build-essential python3-dev \
              meson
```

***

## Step 4: Download the application and model artifacts

Still inside the container, set up the object detection application:

### Create the application directory

```shell theme={null}
mkdir -p /etc/apps/ && cd /etc/apps/
```

### Download the application script

```shell theme={null}
curl -L https://raw.githubusercontent.com/qualcomm/sample-apps-for-qualcomm-linux/refs/heads/main/qualcomm-linux/applications/LiteRT/object_detection.py -o /etc/apps/object_detection.py
```

### Download the model, labels, and sample video

```shell theme={null}
curl -L https://raw.githubusercontent.com/qualcomm/sample-apps-for-qualcomm-linux/refs/heads/main/qualcomm-linux/artifacts/labels/coco_labels.txt -o /etc/labels/coco_labels.txt
```

```shell theme={null}
curl -L https://raw.githubusercontent.com/qualcomm/sample-apps-for-qualcomm-linux/refs/heads/main/qualcomm-linux/artifacts/videos/video.mp4 -o /etc/media/video.mp4
```

```shell theme={null}
curl -L https://huggingface.co/qualcomm/Yolo-X/resolve/v0.30.5/Yolo-X_w8a8.tflite -o /etc/models/yolox_quantized.tflite
```
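
Before moving on, it can help to confirm that all four downloads landed where the application expects them. The following is a minimal sketch using only the Python standard library; the helper name `find_missing` is an illustrative assumption, and the check only detects absent or empty files, not a download that saved an error page instead of the real artifact.

```python
import os

# Paths used by the download commands in this step.
REQUIRED_FILES = [
    "/etc/apps/object_detection.py",
    "/etc/labels/coco_labels.txt",
    "/etc/media/video.mp4",
    "/etc/models/yolox_quantized.tflite",
]


def find_missing(paths):
    """Return the paths that are absent or empty."""
    return [p for p in paths
            if not os.path.isfile(p) or os.path.getsize(p) == 0]


missing = find_missing(REQUIRED_FILES)
if missing:
    print("Missing or empty:", ", ".join(missing))
else:
    print("All artifacts are in place.")
```

If any path is reported, rerun the corresponding `curl` command before continuing.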

### Exit the root shell

You need to exit and re-enter the container as the `qimsdk` user to run the application:

```shell theme={null}
exit
```

***

## Step 5: Run the object detection application

### Enter the container as the standard user

```shell theme={null}
docker exec -it ${DOCKER_ID} bash
```

### Activate the Python environment

```shell theme={null}
. venv-litert-demo/bin/activate
```

### Run the application

```shell theme={null}
cd /etc/apps
```

<Tabs>
  <Tab title="Output to file">
    Run the application and save the output as a video file:

    ```shell theme={null}
    python3 object_detection.py --output file
    ```

    Once processing is complete, exit the container and copy the output video to the device's file system:

    ```shell theme={null}
    exit
    ```

    ```shell theme={null}
    docker cp ${DOCKER_ID}:/etc/apps/output_object_detection.mp4 /etc/media/output_object_detection.mp4
    ```

    To copy the file to your host machine, run the following on the host:

    ```shell theme={null}
    scp root@<IP_ADDRESS>:/etc/media/output_object_detection.mp4 .
    ```
  </Tab>

  <Tab title="Output to display">
    To stream the output directly to a connected display via Wayland:

    ```shell theme={null}
    python3 object_detection.py --output wayland
    ```

    <Note>
      Ensure a display is connected to the device and Wayland is running before using this mode.
    </Note>
  </Tab>
</Tabs>

***

<h2 id="create-object_detection-py">
  Code walkthrough: Object detection with OpenCV and LiteRT
</h2>

This section explains the `object_detection.py` application. Use this as a reference to build custom inference applications with LiteRT on Qualcomm Dragonwing devices.

<Note>
  The postprocessing in the following code is designed for object detection models from [Qualcomm AI Hub](https://aihub.qualcomm.com/).
  For custom models, update the postprocessing logic to match the model's output format and requirements.
</Note>

### Import packages

```python theme={null}
#!/usr/bin/env python3
import cv2
import numpy as np
import argparse
import ai_edge_litert.interpreter as tflite
import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst
```

### Parse output arguments

```python theme={null}
parser = argparse.ArgumentParser(description="Run object detection and output to file or Wayland.")
parser.add_argument("--output", choices=["file", "wayland"], default="file",
                    help="Choose output mode: 'file' (default) or 'wayland'")
args = parser.parse_args()
```

### Configure model parameters

```python theme={null}
MODEL_PATH = "/etc/models/yolox_quantized.tflite"  # YOLOX quantized model
LABEL_PATH = "/etc/labels/coco_labels.txt"
VIDEO_IN = "/etc/media/video.mp4"
VIDEO_OUT = "output_object_detection.mp4"
DELEGATE_PATH = "libQnnTFLiteDelegate.so"

FRAME_W, FRAME_H = 1600, 900
FPS_OUT = 30
CONF_THRES = 0.25
NMS_IOU_THRES = 0.50
# Dequantization parameters for this model's output tensors (scale and zero point).
BOX_SCALE = 3.2108588218688965
BOX_ZP = 31.0
SCORE_SCALE = 0.0038042240776121616
```
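
The `BOX_SCALE`, `BOX_ZP`, and `SCORE_SCALE` constants are used in the main loop as affine dequantization parameters: raw integer outputs become real values through `scale * (q - zero_point)`. The sketch below works through that arithmetic in plain Python. The raw values (131 and 66) and the helper names are illustrative assumptions. The `"quantization"` key is assumed to follow the TensorFlow Lite-style convention of a `(scale, zero_point)` tuple in each entry returned by `get_output_details()`.

```python
# Constants copied from the configuration block above.
BOX_SCALE = 3.2108588218688965
BOX_ZP = 31.0
SCORE_SCALE = 0.0038042240776121616
CONF_THRES = 0.25


def dequantize(q, scale, zero_point=0.0):
    """Affine dequantization: real = scale * (q - zero_point)."""
    return scale * (float(q) - zero_point)


def quant_params(detail):
    """Return (scale, zero_point) from a tensor-detail dict.

    Assumes the TFLite-style layout where detail["quantization"]
    is a (scale, zero_point) tuple.
    """
    scale, zero_point = detail["quantization"]
    return float(scale), float(zero_point)


# Hypothetical raw outputs, for illustration only.
raw_box_coord = 131
raw_score = 66

# 131 is 100 quantization steps above the zero point of 31.
box_coord = dequantize(raw_box_coord, BOX_SCALE, BOX_ZP)
score = dequantize(raw_score, SCORE_SCALE)

print(f"box coordinate: {box_coord:.2f}")
print(f"score: {score:.4f}, kept: {score >= CONF_THRES}")

# A stand-in for one entry of interpreter.get_output_details().
fake_detail = {"quantization": (BOX_SCALE, 31)}
print(quant_params(fake_detail))
```

Reading the parameters from the model's tensor details, rather than hard-coding them, is one way to avoid stale constants when you swap in a different quantized model.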

### Load the model with the QNN delegate

The QNN delegate enables inference on the NPU:

```python theme={null}
delegate_options = {'backend_type': 'htp'}
delegate = tflite.load_delegate(DELEGATE_PATH, delegate_options)
interpreter = tflite.Interpreter(model_path=MODEL_PATH, experimental_delegates=[delegate])
interpreter.allocate_tensors()

in_det = interpreter.get_input_details()
out_det = interpreter.get_output_details()
in_h, in_w = in_det[0]["shape"][1:3]

# Read one label per line; the context manager closes the file afterwards.
with open(LABEL_PATH) as f:
    labels = [line.strip() for line in f]
```
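
If the delegate library cannot be loaded, the script above stops with an exception. One possible variation, sketched here, is to fall back to CPU execution so that the rest of the pipeline can still be tested. The helper names are hypothetical, the loader is passed in as a parameter so the example runs without the LiteRT package, and the choice of `ValueError` and `OSError` is an assumption about how load failures surface.

```python
def load_delegate_or_none(load_fn, path, options):
    """Try to load a delegate; return None if loading fails.

    load_fn is the loader to call, for example tflite.load_delegate.
    Catching ValueError and OSError is an assumption about how load
    failures surface; adjust it to what you observe on the device.
    """
    try:
        return load_fn(path, options)
    except (ValueError, OSError) as exc:
        print(f"Could not load delegate {path!r}: {exc}")
        return None


def delegate_list(delegate):
    """Build the experimental_delegates argument (an empty list means CPU)."""
    return [delegate] if delegate is not None else []


# Stand-in loader that always fails, to demonstrate the fallback path.
def failing_loader(path, options):
    raise ValueError("delegate library not found")


delegate = load_delegate_or_none(failing_loader, "libQnnTFLiteDelegate.so",
                                 {"backend_type": "htp"})
print("running on:", "NPU (QNN delegate)" if delegate else "CPU fallback")
```

In the application, you would pass `tflite.load_delegate` as `load_fn` and `delegate_list(delegate)` as the `experimental_delegates` argument. Expect CPU inference to be considerably slower than the NPU path.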

### Set up video capture and preprocessing

```python theme={null}
cap = cv2.VideoCapture(VIDEO_IN)
sx, sy = FRAME_W / in_w, FRAME_H / in_h
frame_rs = np.empty((FRAME_H, FRAME_W, 3), np.uint8)
input_tensor = np.empty((1, in_h, in_w, 3), np.uint8)
```
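
The `sx` and `sy` factors map box coordinates from the model's input resolution back to the output frame. The sketch below illustrates that mapping, including the integer truncation and clamping applied later in the main loop. The 640x640 input size and the `scale_box` helper are assumptions for illustration; the real size comes from `in_det[0]["shape"]`.

```python
FRAME_W, FRAME_H = 1600, 900   # output frame size from the configuration block
IN_W, IN_H = 640, 640          # hypothetical model input size


def scale_box(box, sx, sy, frame_w, frame_h):
    """Map an (x1, y1, x2, y2) box from model-input to frame coordinates."""
    x1, y1, x2, y2 = box
    scaled = (x1 * sx, y1 * sy, x2 * sx, y2 * sy)
    # Truncate to integers, then clamp to the frame bounds.
    xs = [min(max(int(v), 0), frame_w - 1) for v in scaled[0::2]]
    ys = [min(max(int(v), 0), frame_h - 1) for v in scaled[1::2]]
    return (xs[0], ys[0], xs[1], ys[1])


sx, sy = FRAME_W / IN_W, FRAME_H / IN_H
print(sx, sy)
print(scale_box((100.0, 200.0, 300.0, 400.0), sx, sy, FRAME_W, FRAME_H))
print(scale_box((-20.0, 10.0, 700.0, 650.0), sx, sy, FRAME_W, FRAME_H))
```

When the two factors differ, the frame is stretched to fit the model input rather than letterboxed, so the same factors undo the stretch on the way back.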

### Configure the output pipeline

```python theme={null}
if args.output == "file":
    fourcc = cv2.VideoWriter_fourcc(*"mp4v")
    out_writer = cv2.VideoWriter(VIDEO_OUT, fourcc, FPS_OUT, (FRAME_W, FRAME_H))
else:
    Gst.init(None)
    pipeline = Gst.parse_launch(
        'appsrc name=src is-live=true block=true format=time caps=video/x-raw,format=BGR,width=1600,height=900,framerate=30/1 ! videoconvert ! waylandsink'
    )
    appsrc = pipeline.get_by_name('src')
    pipeline.set_state(Gst.State.PLAYING)

frame_cnt = 0
```
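
The pipeline string above repeats the frame size and frame rate as literals. If you change `FRAME_W`, `FRAME_H`, or `FPS_OUT`, the caps must change with them. A small helper such as the following (the function name is an illustrative assumption) can derive the description from the constants:

```python
FRAME_W, FRAME_H = 1600, 900
FPS_OUT = 30


def build_pipeline_description(width, height, fps):
    """Return a gst-launch style description for a BGR appsrc feeding waylandsink."""
    caps = f"video/x-raw,format=BGR,width={width},height={height},framerate={fps}/1"
    return (
        "appsrc name=src is-live=true block=true format=time "
        f"caps={caps} ! videoconvert ! waylandsink"
    )


description = build_pipeline_description(FRAME_W, FRAME_H, FPS_OUT)
print(description)
```

The resulting string can be passed to `Gst.parse_launch` in place of the literal.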

### Run inference in the main loop

Read each video frame, run inference, apply NMS, and draw bounding boxes:

```python theme={null}
while True:
    ok, frame = cap.read()
    if not ok:
        break
    frame_cnt += 1

    # Resize to the output resolution, then to the model's input size.
    cv2.resize(frame, (FRAME_W, FRAME_H), dst=frame_rs)
    cv2.resize(frame_rs, (in_w, in_h), dst=input_tensor[0])

    interpreter.set_tensor(in_det[0]['index'], input_tensor)
    interpreter.invoke()

    boxes_q = interpreter.get_tensor(out_det[0]['index'])[0]
    scores_q = interpreter.get_tensor(out_det[1]['index'])[0]
    classes_q = interpreter.get_tensor(out_det[2]['index'])[0]

    # Dequantize the raw outputs: real = scale * (q - zero_point).
    boxes = BOX_SCALE * (boxes_q.astype(np.float32) - BOX_ZP)
    scores = SCORE_SCALE * scores_q.astype(np.float32)
    classes = classes_q.astype(np.int32)

    mask = scores >= CONF_THRES
    if np.any(mask):
        boxes_f = boxes[mask]
        scores_f = scores[mask]
        classes_f = classes[mask]

        # Convert (x1, y1, x2, y2) corners to the (x, y, width, height) layout NMSBoxes expects.
        x1, y1, x2, y2 = boxes_f.T
        boxes_cv2 = np.column_stack((x1, y1, x2 - x1, y2 - y1))

        idx_cv2 = cv2.dnn.NMSBoxes(
            bboxes=boxes_cv2.tolist(),
            scores=scores_f.tolist(),
            score_threshold=CONF_THRES,
            nms_threshold=NMS_IOU_THRES
        )

        if len(idx_cv2):
            idx = idx_cv2.flatten()
            sel_boxes = boxes_f[idx]
            sel_scores = scores_f[idx]
            sel_classes = classes_f[idx]

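            # Map boxes from model-input coordinates to the output frame.
            sel_boxes[:, [0, 2]] *= sx
            sel_boxes[:, [1, 3]] *= sy
            sel_boxes = sel_boxes.astype(np.int32)

            # Clamp the corners so that drawing stays inside the frame.
            sel_boxes[:, [0, 2]] = np.clip(sel_boxes[:, [0, 2]], 0, FRAME_W - 1)
            sel_boxes[:, [1, 3]] = np.clip(sel_boxes[:, [1, 3]], 0, FRAME_H - 1)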

            for (x1i, y1i, x2i, y2i), sc, cl in zip(sel_boxes, sel_scores, sel_classes):
                cv2.rectangle(frame_rs, (x1i, y1i), (x2i, y2i), (0, 255, 0), 2)
                lab = labels[cl] if cl < len(labels) else str(cl)
                cv2.putText(frame_rs, f"{lab} {sc:.2f}", (x1i, max(10, y1i - 5)),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    if args.output == "file":
        out_writer.write(frame_rs)
    else:
        data = frame_rs.tobytes()
        buf = Gst.Buffer.new_allocate(None, len(data), None)
        buf.fill(0, data)
        buf.duration = Gst.util_uint64_scale_int(1, Gst.SECOND, FPS_OUT)
        timestamp = cap.get(cv2.CAP_PROP_POS_MSEC) * Gst.MSECOND
        buf.pts = buf.dts = int(timestamp)
        appsrc.emit('push-buffer', buf)
```
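
The loop delegates non-maximum suppression (NMS) to `cv2.dnn.NMSBoxes`. To make the idea concrete, here is a simplified pure-Python sketch of greedy NMS. It is not OpenCV's implementation, it works on `(x1, y1, x2, y2)` corners rather than the `(x, y, width, height)` layout passed to OpenCV, and the sample boxes and scores are made up for illustration.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, score_thres, iou_thres):
    """Return indices of kept boxes, highest score first."""
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_thres),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap a better one too much.
        if all(iou(boxes[i], boxes[k]) <= iou_thres for k in keep):
            keep.append(i)
    return keep


boxes = [
    (100, 100, 200, 200),   # strong detection
    (110, 110, 210, 210),   # near-duplicate of the first
    (400, 300, 500, 400),   # separate object
    (0, 0, 50, 50),         # below the confidence threshold
]
scores = [0.90, 0.80, 0.70, 0.10]
print(greedy_nms(boxes, scores, score_thres=0.25, iou_thres=0.50))
```

Like the call in the main loop, this sketch ignores class labels, so overlapping boxes of different classes can suppress each other. Lowering `iou_thres` removes more overlapping boxes; raising it keeps more.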

### Clean up resources

```python theme={null}
cap.release()
if args.output == "file":
    out_writer.release()
    print(f"Done - processed video saved to {VIDEO_OUT}")
else:
    appsrc.emit('end-of-stream')
    pipeline.set_state(Gst.State.NULL)
    print("Done - video streamed to Wayland sink")
```

***

## Troubleshooting

| Issue                      | Solution                                                                                                                               |
| -------------------------- | -------------------------------------------------------------------------------------------------------------------------------------- |
| `docker pull` fails        | Verify the device has internet access. Check DNS settings and proxy configuration.                                                     |
| QNN delegate fails to load | Ensure the CDI and environment files match your hardware. Verify the container was started with `--device qualcomm.com/device=qimsdk`. |
| No video output on display | Confirm Wayland is running and a display is connected. Try the `file` output mode first to verify inference works.                     |
| Model download fails       | Check network connectivity. The model is hosted on Hugging Face and may require proxy settings in some environments.                   |
| Low FPS or slow inference  | Verify the model is running on the NPU (HTP backend). Check that `backend_type` is set to `'htp'` in the delegate options.             |

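
When investigating slow inference, a quick latency measurement helps tell NPU and CPU execution apart. The following is a minimal sketch using only the standard library; `measure_latency_ms` is a hypothetical helper, and the dummy workload stands in for `interpreter.invoke`.

```python
import time


def measure_latency_ms(fn, warmup=3, iters=20):
    """Return the mean wall-clock time of fn() in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up runs are excluded from the average
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) * 1000.0 / iters


# Stand-in workload; in the application this would be interpreter.invoke.
def dummy_inference():
    sum(i * i for i in range(10_000))


print(f"mean latency: {measure_latency_ms(dummy_inference):.2f} ms")
```

Comparing the result with and without the delegate in `experimental_delegates` gives a rough indication of whether the NPU is actually being used.
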
***

## Next steps

* **Try different models**: Replace the YOLOX model with other quantized models from [Qualcomm AI Hub](https://aihub.qualcomm.com/) for tasks like image classification, pose estimation, or segmentation.
* **Use live camera input**: Modify the application to use a camera feed instead of a pre-recorded video by changing the `VIDEO_IN` path to a device capture source.
* **Tune detection parameters**: Adjust `CONF_THRES` and `NMS_IOU_THRES` to optimize detection accuracy for your use case.
* **Build custom applications**: Use the code walkthrough as a template to create your own inference pipelines targeting the Dragonwing NPU.

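
As a starting point for the live camera suggestion above: `cv2.VideoCapture` accepts either a file path or an integer device index, so one option is to normalize a user-supplied source string before opening it. The `parse_source` helper below is a hypothetical sketch. Whether a plain device index works with CamX on a given device is not covered here and may require a different capture source.

```python
def parse_source(value):
    """Return an int camera index for digit-only strings, else the string unchanged."""
    text = value.strip()
    return int(text) if text.isdigit() else text


print(parse_source("0"))                      # camera index
print(parse_source("/etc/media/video.mp4"))   # file path
```

The result can then be passed to `cv2.VideoCapture` in place of `VIDEO_IN`.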