# qtimlsnpe

> Executes neural network models using Qualcomm’s SNPE

<Note>
  qtimlsnpe is available only in `qcom-multimedia-proprietary-image`.<br />
  For more information on QLI images, refer to [Qualcomm Linux release](https://qualcomm-staging.mintlify.app/Key-Documents/Yocto-Guide/qualcomm-linux-yocto-overview#qualcomm-linux-release).
</Note>
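
Because the element ships only in that image, a quick check on the device can confirm that it is installed before you build a pipeline. The sketch below assumes `gst-inspect-1.0` is on the `PATH`; `element_available` is a hypothetical helper name used only for illustration.

```bash
# Return 0 if the named GStreamer element is installed, non-zero otherwise.
# element_available is an illustrative helper, not part of any SDK.
element_available() {
  local element="$1"
  # Without gst-inspect-1.0 the registry cannot be queried at all.
  command -v gst-inspect-1.0 >/dev/null 2>&1 || return 2
  # --exists reports the result through the exit status.
  gst-inspect-1.0 --exists "$element" >/dev/null 2>&1
}

if element_available qtimlsnpe; then
  echo "qtimlsnpe is available"
else
  echo "qtimlsnpe not found; check that the image includes it"
fi
```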

## Overview

qtimlsnpe executes neural network models using Qualcomm’s Snapdragon Neural Processing Engine (SNPE). Models are packaged as DLC files and, once a model is loaded, the runtime exposes its input and output signature (tensor count, shapes, and element types). SNPE provides multiple execution targets, including CPU, DSP (Hexagon), GPU (Adreno), and AIP, so the same model can be deployed across hardware with different performance, latency, and power characteristics.

qtimlsnpe surfaces SNPE’s key runtime controls as simple tunables: a delegate selector (CPU/DSP/GPU/AIP) to choose the preferred target, performance profiles to trade power for throughput or latency, optional profiling levels for runtime diagnostics, and an execution priority hint. These settings do not change model accuracy by themselves; they help match runtime behavior to your deployment goals.

Inputs and outputs flow as `neural-network/tensors`. The element derives exact tensor shapes and types from the DLC at runtime, so downstream components receive tensors that reflect the model’s declared outputs. When downstream algorithms need it, outputs can be requested as FLOAT32 even if the model is quantized, which enables dequantization without changing the model artifact.

## Key Responsibilities

qtimlsnpe is responsible for:

* loading and executing an SNPE DLC model on CPU, DSP (Hexagon), GPU (Adreno), or AIP

* accepting preformatted input tensors from upstream elements

* producing output tensors that match the model output signature

* negotiating tensor data types and dimensions with adjacent pipeline elements

* propagating tensor metadata required by downstream elements

* managing buffers through SNPE user buffer mode to reduce unnecessary memory copies

* exposing runtime controls including delegate, performance-profile, profiling-level, and priority

* supporting explicit output filtering by named layers or tensors (order-preserving)

In practice, qtimlsnpe serves as the inference stage in the pipeline, while tensor preparation and result interpretation are handled externally.

## Example Pipeline

<Steps>
  <Step title="Download Required Files">
    | File        | Download                                                                                                                  | Save as                     |
    | ----------- | ------------------------------------------------------------------------------------------------------------------------- | --------------------------- |
    | YOLOX model | [Qualcomm AI hub model](https://aihub.qualcomm.com/iot/models/yolox)                                                      | `yolox_w8a8.dlc`            |
    | YOLO labels | [Yolov8 Labels](https://qimsdk.mintlify.io/labels/yolov8.json)                                                            | `yolov8.json`               |
    | Input video | [Input Video](https://github.com/qualcomm/sample-apps-for-qualcomm-linux/raw/refs/heads/main/artifacts/videos/video.mp4)  | `Draw_1080p_180s_30FPS.mp4` |
  </Step>

  <Step title="Copy files to device">
    <CodeGroup>
      ```bash SCP (SSH) theme={null}
      # Replace $HOME with the appropriate device path before running the commands.
      # For QLI:    /root
      # For Ubuntu: /home/ubuntu
      # Modify this based on your platform and ensure files are copied to the correct location on the device.
      # Run from your host machine — replace <user> and <device-ip>
      ssh <user>@<device-ip> "mkdir -p $HOME/{models,labels,media}"
      scp yolox_w8a8.dlc <user>@<device-ip>:$HOME/models/
      scp yolov8.json <user>@<device-ip>:$HOME/labels/
      scp Draw_1080p_180s_30FPS.mp4 <user>@<device-ip>:$HOME/media/
      ```
    </CodeGroup>
  </Step>

  <Step title="Connect to device">
    <CodeGroup>
      ```bash theme={null}
      # Run from your host machine — replace <user> and <device-ip>
      ssh <user>@<device-ip>
      ```
    </CodeGroup>
  </Step>

  <Step title="Set environment variables">
    Run the following commands on your device:

    ```bash theme={null}
    export MODEL_NAME=yolox_w8a8.dlc
    export LABELS_NAME=yolov8.json
    export SRC_VIDEO_NAME=Draw_1080p_180s_30FPS.mp4
    ```
  </Step>

  <Step title="Run the pipeline">
    ```bash theme={null}
    gst-launch-1.0 -e --gst-debug=2 \
    filesrc location=$HOME/media/$SRC_VIDEO_NAME ! qtdemux ! queue ! h264parse ! v4l2h264dec capture-io-mode=4 output-io-mode=4 ! queue ! tee name=split \
    split. ! queue ! qtivcomposer name=mixer sink_1::dimensions="<1920,1080>" ! queue ! waylandsink fullscreen=true sync=false \
    split. ! queue ! video/x-raw,format=NV12 ! qtimlvconverter ! queue ! qtimlsnpe delegate=dsp tensors="<boxes,scores,class_idx>" model=$HOME/models/$MODEL_NAME ! queue ! qtimlpostprocess results=10 module=yolov8 labels=$HOME/labels/$LABELS_NAME settings="{\"confidence\": 70.0}" ! video/x-raw,format=BGRA,width=640,height=640 ! queue ! mixer.
    ```
  </Step>
</Steps>
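
A missing model, label, or video file is one possible reason for the pipeline failing to start. The following is a minimal sketch of a preflight check, assuming the directory layout used in the steps above (`models`, `labels`, and `media` under one base directory); `check_inputs` is a hypothetical helper name.

```bash
# Report any of the three input files that is missing under a base directory.
# Usage: check_inputs BASE_DIR MODEL LABELS VIDEO
# Returns 0 when all files exist, 1 otherwise.
check_inputs() {
  local base="$1" status=0 path
  for path in "$base/models/$2" "$base/labels/$3" "$base/media/$4"; do
    if [ ! -f "$path" ]; then
      echo "missing: $path" >&2
      status=1
    fi
  done
  return "$status"
}

# Example call using the variables exported in the earlier step.
check_inputs "$HOME" "${MODEL_NAME:-}" "${LABELS_NAME:-}" "${SRC_VIDEO_NAME:-}" \
  && echo "all input files found" \
  || echo "fix the paths listed above before running gst-launch-1.0"
```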

## Plugin Hierarchy

GObject -> GstObject -> GstElement -> GstBaseTransform -> GstMLSnpe

***

## Pad Templates

### sink

| Capabilities             |                                                                                        |
| ------------------------ | -------------------------------------------------------------------------------------- |
| `neural-network/tensors` | `type: { INT8, UINT8, INT16, UINT16, INT32, UINT32, INT64, UINT64, FLOAT16, FLOAT32 }` |

Availability: *Always*\
Direction: *sink*

***

### src

| Capabilities             |                                                                                        |
| ------------------------ | -------------------------------------------------------------------------------------- |
| `neural-network/tensors` | `type: { INT8, UINT8, INT16, UINT16, INT32, UINT32, INT64, UINT64, FLOAT16, FLOAT32 }` |

Availability: *Always*\
Direction: *source*

***

## Element Properties

| Property              | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| --------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `model`               | Path to the SNPE DLC model file.<br /><br />Type: String<br />Default: NULL<br />Flags: readable/writable, construct                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| `delegate`            | Delegate the graph execution to a runtime backend.<br /><br />Type: Enum<br />Default: DEFAULT\_PROP\_DELEGATE<br />Range:<br />    (0): none - CPU execution (fallback always available)<br />    (1): dsp - DSP execution<br />    (2): gpu - GPU execution<br />    (3): aip - AIP execution<br />Flags: readable/writable                                                                                                                                                                                                                                                                                                                                                                         |
| `performance-profile` | Request a performance profile.<br /><br />Type: Enum<br />Default: DEFAULT\_PROP\_PERF\_PROFILE<br />Range:<br />    (0): default - Default performance<br />    (1): balanced - Balanced performance and power<br />    (2): high-performance - Maximum performance<br />    (3): power-saver - Lower power usage<br />    (4): system-settings - System defined behavior<br />    (5): sustained-high-performance - Sustained performance mode<br />    (6): burst - Short bursts of high performance<br />    (7): low-power-saver - Aggressive power saving<br />    (8): high-power-saver - Moderate power saving<br />    (9): low-balanced - Lower balanced mode<br />Flags: readable/writable |
| `profiling-level`     | Set profiling level for runtime statistics.<br /><br />Type: Enum<br />Default: DEFAULT\_PROP\_PROFILING\_LEVEL<br />Range:<br />    (0): off - No profiling<br />    (1): basic - Minimal profiling<br />    (2): moderate - Medium level profiling<br />    (3): detailed - Full profiling<br />Flags: readable/writable                                                                                                                                                                                                                                                                                                                                                                            |
| `priority`            | Execution priority hint for SNPE runtime.<br /><br />Type: Enum<br />Default: DEFAULT\_PROP\_EXEC\_PRIORITY<br />Range:<br />    (0): normal - Default priority<br />    (1): high - Higher priority execution<br />    (2): low - Lower priority execution<br />Flags: readable/writable                                                                                                                                                                                                                                                                                                                                                                                                             |
| `layers`              | List of output layer names.<br /><br />Type: Array of String<br />Default: \[]<br />Note: Mutually exclusive with `tensors`<br />Flags: readable/writable                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
| `tensors`             | List of output tensor names. Outputs are emitted in the order listed.<br /><br />Type: Array of String<br />Default: \[]<br />Note: Mutually exclusive with `layers`<br />Flags: readable/writable |

<Note>Layers vs tensors: Set only one of these. If your model exposes named output tensors, prefer tensors for precise ordering. If both are set sequentially, the last one written takes effect (the other is cleared).</Note>
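
Because `layers` and `tensors` are mutually exclusive, a launch script can accept only one of them when it assembles the element description. The sketch below is illustrative: `snpe_element` is a hypothetical helper, and the `<a,b>` array syntax is taken from the example pipeline on this page.

```bash
# Build a qtimlsnpe element description for use in a gst-launch-1.0 pipeline.
# Usage: snpe_element MODEL_PATH DELEGATE [tensors|layers] [NAMES]
# NAMES is a comma-separated list; it is wrapped in <...> as in the example pipeline.
snpe_element() {
  local model="$1" delegate="$2" kind="${3:-}" names="${4:-}"
  local desc="qtimlsnpe delegate=$delegate model=$model"
  case "$kind" in
    "") ;;                                            # no filter: leave both unset
    tensors|layers) desc="$desc $kind=\"<$names>\"" ;; # exactly one filter property
    *) echo "expected 'tensors' or 'layers', got '$kind'" >&2; return 1 ;;
  esac
  printf '%s\n' "$desc"
}

snpe_element /root/models/yolox_w8a8.dlc dsp tensors boxes,scores,class_idx
```

The helper takes a single filter kind per call, so a script built on it cannot set both properties by accident.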

## Input and Output Behavior

### Input Tensors

qtimlsnpe exposes a single sink pad, but it supports both single-input and multi-input models. For multi-input models, all required tensors are delivered through the same sink pad as a tensor set in a single `GstBuffer`.

Input tensors must be fully prepared before they reach qtimlsnpe. Expected tensor layout, shape, data type, and batch size are determined by:

* the SNPE DLC model input signature
* caps negotiation with upstream elements

Typical upstream elements include:

* qtimlvconverter for scaling, color conversion, normalization, and quantization (if required).

qtimlsnpe does not modify, reshape, batch, or reinterpret incoming tensors. It maps input tensor blocks into SNPE user buffers and passes them to the SNPE runtime as received.

### Output Tensors

qtimlsnpe exposes a single source pad and produces output tensors that follow the model's declared output signature. This single-pad design does not limit the element to a single output. Models with multiple output tensors are fully supported, and all outputs are emitted together on the source pad.

Supported output behavior includes:

* single-tensor and multi-tensor outputs
* arbitrary tensor shapes and ranks, including batch and depth dimensions
* both quantized and floating-point tensor types
* selective emission of output tensors using the layers or tensors property
* FLOAT32 dequantization: if the model's native output type is not FLOAT32, output caps will include a type list \[FLOAT32, native] to enable downstream negotiation for dequantization without changing the model artifact

The generated output tensors are intended for downstream post-processing stages, which are responsible for decoding model-specific results such as classification outputs, detection results, segmentation masks, landmark data, and other structured inference outputs.
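
One way to request dequantized output is a capsfilter placed directly after qtimlsnpe. The fragment below is only a sketch: it assumes the caps field is named `type`, as the pad template tables suggest, and that the downstream element accepts FLOAT32. Check the negotiated caps with `gst-launch-1.0 -v` on your device before relying on it.

```bash
# Pipeline fragment that asks qtimlsnpe for FLOAT32 output through a capsfilter.
# Assumption: the tensor caps expose the element type in a field named "type".
MODEL_PATH="${MODEL_PATH:-$HOME/models/yolox_w8a8.dlc}"
SNPE_FLOAT32="qtimlsnpe delegate=dsp model=$MODEL_PATH ! neural-network/tensors,type=FLOAT32"

# Print the fragment so it can be spliced into a full gst-launch-1.0 command.
printf '%s\n' "$SNPE_FLOAT32"
```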

## Delegates

An SNPE **delegate** defines the execution hardware used to run a model. Delegates allow `qtimlsnpe` to offload inference from the default CPU runtime to a hardware accelerator such as the DSP (NPU), GPU, or AIP.

`qtimlsnpe` supports multiple delegates. The delegate is selected through the `delegate` property, which takes one of the enum values `none` (CPU), `dsp`, `gpu`, or `aip`.

### DSP

Runs the model on the AI accelerator (NPU).

* **Use case:** Preferred backend where available. Best performance and power efficiency for quantized models.

### GPU

Runs supported operations through the SNPE GPU backend.

* **Use case**: Floating-point models and workloads that benefit from GPU parallelism.

### AIP

Runs supported operations through the SNPE AIP backend.

**Use cases:**

* Hybrid acceleration combining the DSP, CPU, and other hardware blocks
* Complex models with mixed operator support, where a pure DSP run may fall back frequently
* Workloads that target maximum throughput with balanced power efficiency
* Production pipelines where partitioning the model across accelerators is beneficial

### CPU

Runs the model on the default SNPE CPU backend.

**Use cases:**

* Fallback when other accelerators (DSP/GPU/AIP) are unavailable or unsupported
* Debugging, validation, and functional correctness testing
* Small models or low-throughput workloads
* Any model type (quantized or floating point), without operator support limitations
* Deployments where deterministic performance and ease of deployment matter more than efficiency
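
The guidance above can be reduced to a small selection rule. The sketch below assumes the model file name carries a `w8a8` tag for quantized models and a `float` tag for floating-point models, as the files on this page do; `pick_delegate` is a hypothetical helper, and real deployments may need a more reliable signal than the file name.

```bash
# Suggest a delegate value from a DLC file name.
# Assumption: "w8a8" marks a quantized model and "float" a floating-point one.
pick_delegate() {
  case "$1" in
    *w8a8*)  echo dsp ;;   # quantized model: DSP is the preferred target
    *float*) echo gpu ;;   # floating-point model: GPU
    *)       echo none ;;  # unknown: CPU fallback
  esac
}

pick_delegate yolox_w8a8.dlc          # prints: dsp
pick_delegate inception_v3_float.dlc  # prints: gpu
```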

## Profiling Level

Enables SNPE diagnostics collection. Available levels, from least to most detailed: `off`, `basic`, `moderate`, `detailed`.
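
A mistyped enum value is easier to catch before the pipeline is launched. The sketch below validates a profiling level and a performance profile against the nicknames listed in the Element Properties table and prints the matching property assignments; `snpe_tuning` is a hypothetical helper name.

```bash
# Print qtimlsnpe tuning properties after validating both enum nicknames
# against the values listed in the Element Properties table.
# Usage: snpe_tuning PROFILING_LEVEL [PERFORMANCE_PROFILE]
snpe_tuning() {
  local level="$1" perf="${2:-default}"
  case "$level" in
    off|basic|moderate|detailed) ;;
    *) echo "unknown profiling-level: $level" >&2; return 1 ;;
  esac
  case "$perf" in
    default|balanced|high-performance|power-saver|system-settings) ;;
    sustained-high-performance|burst|low-power-saver|high-power-saver|low-balanced) ;;
    *) echo "unknown performance-profile: $perf" >&2; return 1 ;;
  esac
  printf 'profiling-level=%s performance-profile=%s\n' "$level" "$perf"
}

snpe_tuning detailed burst   # prints: profiling-level=detailed performance-profile=burst
```

The printed string can be appended to the `qtimlsnpe` element in a `gst-launch-1.0` command.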

## Runtime Memory Behavior and GAP Handling

`qtimlsnpe` operates within the memory model of the SNPE runtime. The element uses DMA buffers via `GstMLBufferPool` to minimize memory copies and maintain zero-copy transport where possible.

### SNPE Memory Model

SNPE uses runtime-managed memory to allocate:

* input tensors
* intermediate activation tensors
* output tensors

The element discovers input/output tensor metadata (count, shape, type) at model load time and configures buffer pools accordingly.

### GAP Buffer Handling

`qtimlsnpe` is GAP-aware and correctly handles input buffers marked with `GST_BUFFER_FLAG_GAP`.

When a GAP buffer is received, the element skips inference and forwards the buffer downstream. This preserves timing and synchronization while explicitly indicating that no valid inference input is available for that timestamp.

GAP buffers commonly appear in conditional AI pipelines, such as cascaded workflows where later inference stages run only when earlier stages produce valid regions of interest.

## Use cases

### Single-Stage AI Inference on a Video File (HTP)

<Steps>
  <Step title="Download Required Files">
    | File        | Download                                                                                                                  | Save as                     |
    | ----------- | ------------------------------------------------------------------------------------------------------------------------- | --------------------------- |
    | YOLOX model | [Qualcomm AI hub model](https://aihub.qualcomm.com/iot/models/yolox)                                                      | `yolox_w8a8.dlc`            |
    | YOLO labels | [Yolov8 Labels](https://qimsdk.mintlify.io/labels/yolov8.json)                                                            | `yolov8.json`               |
    | Input video | [Input Video](https://github.com/qualcomm/sample-apps-for-qualcomm-linux/raw/refs/heads/main/artifacts/videos/video.mp4)  | `Draw_1080p_180s_30FPS.mp4` |
  </Step>

  <Step title="Copy files to device">
    <CodeGroup>
      ```bash SCP (SSH) theme={null}
      # Replace $HOME with the appropriate device path before running the commands.
      # For QLI:    /root
      # For Ubuntu: /home/ubuntu
      # Modify this based on your platform and ensure files are copied to the correct location on the device.
      # Run from your host machine — replace <user> and <device-ip>
      ssh <user>@<device-ip> "mkdir -p $HOME/{models,labels,media}"
      scp yolox_w8a8.dlc <user>@<device-ip>:$HOME/models/
      scp yolov8.json <user>@<device-ip>:$HOME/labels/
      scp Draw_1080p_180s_30FPS.mp4 <user>@<device-ip>:$HOME/media/
      ```
    </CodeGroup>
  </Step>

  <Step title="Connect to device">
    <CodeGroup>
      ```bash theme={null}
      # Run from your host machine — replace <user> and <device-ip>
      ssh <user>@<device-ip>
      ```
    </CodeGroup>
  </Step>

  <Step title="Set environment variables">
    Run the following commands on your device:

    ```bash theme={null}
    export MODEL_NAME=yolox_w8a8.dlc
    export LABELS_NAME=yolov8.json
    export SRC_VIDEO_NAME=Draw_1080p_180s_30FPS.mp4
    ```
  </Step>

  <Step title="Run the pipeline">
    ```bash theme={null}
    gst-launch-1.0 -e --gst-debug=2 \
    filesrc location=$HOME/media/$SRC_VIDEO_NAME ! qtdemux ! queue ! h264parse ! v4l2h264dec capture-io-mode=4 output-io-mode=4 ! queue ! tee name=split \
    split. ! queue ! qtivcomposer name=mixer sink_1::dimensions="<1920,1080>" ! queue ! waylandsink fullscreen=true sync=false \
    split. ! queue ! video/x-raw,format=NV12 ! qtimlvconverter ! queue ! qtimlsnpe delegate=dsp tensors="<boxes,scores,class_idx>" model=$HOME/models/$MODEL_NAME ! queue ! qtimlpostprocess results=10 module=yolov8 labels=$HOME/labels/$LABELS_NAME settings="{\"confidence\": 70.0}" ! video/x-raw,format=BGRA,width=640,height=640 ! queue ! mixer.
    ```
  </Step>
</Steps>

### Single-Stage AI Inference on a Video File (GPU)

<Steps>
  <Step title="Download Required Files">
    | File             | Download                                                                                                                 | Save as                            |
    | ---------------- | ------------------------------------------------------------------------------------------------------------------------ | ---------------------------------- |
    | Inception model  | [Qualcomm AI Hub model](https://aihub.qualcomm.com/iot/models/inception_v3)                                              | `inception_v3_float.dlc`           |
    | MobileNet labels | [Mobilenet Labels](https://qimsdk.mintlify.io/labels/mobilenet.json)                                                     | `mobilenet.json`                   |
    | Input video      | [Input Video](https://github.com/qualcomm/sample-apps-for-qualcomm-linux/raw/refs/heads/main/artifacts/videos/video.mp4) | `Animals_000_1080p_180s_30FPS.mp4` |
  </Step>

  <Step title="Copy files to device">
    <CodeGroup>
      ```bash SCP (SSH) theme={null}
      # Replace $HOME with the appropriate device path before running the commands.
      # For QLI:    /root
      # For Ubuntu: /home/ubuntu
      # Modify this based on your platform and ensure files are copied to the correct location on the device.
      # Run from your host machine — replace <user> and <device-ip>
      ssh <user>@<device-ip> "mkdir -p $HOME/{models,labels,media}"
      scp inception_v3_float.dlc <user>@<device-ip>:$HOME/models/
      scp mobilenet.json <user>@<device-ip>:$HOME/labels/
      scp Animals_000_1080p_180s_30FPS.mp4 <user>@<device-ip>:$HOME/media/
      ```
    </CodeGroup>
  </Step>

  <Step title="Connect to device">
    <CodeGroup>
      ```bash theme={null}
      # Run from your host machine — replace <user> and <device-ip>
      ssh <user>@<device-ip>
      ```
    </CodeGroup>
  </Step>

  <Step title="Set environment variables">
    Run the following commands on your device:

    ```bash theme={null}
    export MODEL_NAME=inception_v3_float.dlc
    export LABELS_NAME=mobilenet.json
    export SRC_VIDEO_NAME=Animals_000_1080p_180s_30FPS.mp4
    ```
  </Step>

  <Step title="Run the pipeline">
    ```bash theme={null}
    gst-launch-1.0 -e --gst-debug=2 \
    filesrc location=$HOME/media/$SRC_VIDEO_NAME ! qtdemux ! queue ! h264parse ! v4l2h264dec capture-io-mode=4 output-io-mode=4 ! queue ! tee name=split \
    split. ! queue ! qtivcomposer name=mixer sink_1::dimensions="<1920,1080>" ! queue ! waylandsink fullscreen=true sync=false \
    split. ! queue ! video/x-raw,format=NV12 ! qtimlvconverter ! queue ! qtimlsnpe delegate=gpu tensors="<class_logits>" model=$HOME/models/$MODEL_NAME ! queue ! qtimlpostprocess results=1 module=mobilenet labels=$HOME/labels/$LABELS_NAME settings="{\"confidence\": 51.0}" ! video/x-raw,format=BGRA,width=640,height=640 ! queue ! mixer.
    ```
  </Step>
</Steps>