# Overview

> Build AI/ML applications with GStreamer and the Qualcomm Intelligent Multimedia SDK (IM SDK). This page explains how zero-copy pipelines and GPU pre/post-processing work, and where to find the full reference.

The Qualcomm® Intelligent Multimedia SDK (IM SDK) is a unified, hardware-accelerated framework for building multimedia and AI/ML applications on Qualcomm Dragonwing IoT platforms. It is built on [GStreamer](https://gstreamer.freedesktop.org/) and provides 40+ Qualcomm-optimized plugins that compose into single- or multi-stream **camera → AI → display/stream** pipelines, without requiring deep knowledge of the underlying hardware.

Use the IM SDK when your application is a **real-time camera, video, or vision pipeline**: it combines capture, preprocessing, inference, and rendering into one pipeline and runs each stage on the most efficient engine.

<Note>
  The IM SDK ships as part of the Qualcomm Linux image, so no separate install is required for the AI Developer Workflow. For the complete SDK reference (installation alternatives, the full plugin catalog, pipeline-building guides, and sample applications), see **[Discover SDKs → IM SDKs](https://imsdkdocs.qualcomm.com)**.
</Note>

## Why use the IM SDK

<CardGroup cols={2}>
  <Card title="One unified multimedia + AI framework" icon="arrows-to-circle">
    Capture, preprocessing, inference, post-processing, and rendering all live in a single GStreamer pipeline rather than in separate apps glued together.
  </Card>

  <Card title="Hardware-accelerated by default" icon="rabbit-running">
    Plugins natively use the Qualcomm CPU, GPU, NPU (HTP), camera ISP, video engine (VPU), and DSP, so each stage runs on the engine best suited to it.
  </Card>

  <Card title="Zero-copy (zero-memcpy) buffers" icon="bolt">
    Frames move through the pipeline as DMA-buf handles, so the same buffer is shared across ISP, GPU, and NPU with no CPU copies.
  </Card>

  <Card title="GPU pre- and post-processing" icon="microchip">
    Resize, color conversion, tensor layout, and overlay rendering run on the Adreno GPU, keeping the CPU free and inference fed.
  </Card>

  <Card title="Flexible AI runtime support" icon="microchip-ai">
    Run models through LiteRT/TFLite, ONNX, Qualcomm AI Engine Direct (QNN), or SNPE. Choose per model without rewriting the pipeline.
  </Card>

  <Card title="Qualcomm AI Hub integration" icon="cloud-arrow-down">
    Download pre-trained, quantized models from [Qualcomm AI Hub](https://aihub.qualcomm.com/) and run them as-is in a reference pipeline.
  </Card>
</CardGroup>

## How an IM SDK pipeline works

A typical vision-AI pipeline has five stages. The IM SDK accelerates the highlighted (⚡) stages on dedicated hardware and passes frames between them as zero-copy DMA-buf handles.

```mermaid
flowchart LR
    classDef gpu fill:#31017D,color:#fff,stroke:#31017D
    classDef npu fill:#3253DC,color:#fff,stroke:#3253DC
    classDef io  fill:#5B6770,color:#fff,stroke:#5B6770

    Cam["Data source<br/>qticamsrc · v4l2 decode · RTSP/file"]:::io
    Pre["⚡ Preprocessing: GPU<br/>qtimlvconverter<br/>resize · color convert · tensor layout"]:::gpu
    Inf["⚡ Inference: NPU/HTP<br/>qtimltflite · qtimlqnn<br/>qtimlonnx · qtimlsnpe"]:::npu
    Post["⚡ Post-processing: GPU<br/>qtimlpostprocess + qtivoverlay"]:::gpu
    Out["Use AI metadata<br/>display · RTSP/WebRTC · MQTT/Kafka · actions"]:::io

    Cam -->|"DMA-buf (zero-copy)"| Pre -->|"DMA-buf"| Inf -->|"tensors"| Post -->|"DMA-buf"| Out
```

| Stage                      | IM SDK plugin(s)                                                                                    | Hardware engine             |
| -------------------------- | --------------------------------------------------------------------------------------------------- | --------------------------- |
| Capture / decode           | `qticamsrc`, `v4l2h264dec` / `v4l2h265dec`                                                          | Camera ISP, Video (VPU)     |
| Preprocessing              | `qtimlvconverter`                                                                                   | GPU (Adreno)                |
| Inference                  | `qtimltflite` (LiteRT), `qtimlqnn` (AI Engine Direct / QNN), `qtimlonnx` (ONNX), `qtimlsnpe` (SNPE) | NPU/HTP (Hexagon), CPU, GPU |
| Post-processing            | `qtimlpostprocess`                                                                                  | CPU / GPU                   |
| Overlay / compose / encode | `qtivoverlay`, `qtivcomposer`, `v4l2h264enc`                                                        | GPU, Video (VPU)            |

For a stage-by-stage guide to building these pipelines, see [Building AI pipelines](https://imsdkdocs.qualcomm.com/qimsdk-overview/sdkoverview) and the [plugin reference](https://imsdkdocs.qualcomm.com/plugin-reference/introduction) in Discover SDKs.
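
To make the table concrete, here is a minimal Python sketch that encodes its rows as data and reports which stage and engines a given element belongs to. The `Stage` class and the `stage_for` and `describe` helpers are hypothetical names introduced for illustration; they are not part of the IM SDK, and the engine assignments simply restate the table rather than query the hardware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    plugins: tuple
    engines: tuple


# Rows copied from the stage table above (illustrative data, not an SDK API).
STAGES = (
    Stage("Capture / decode",
          ("qticamsrc", "v4l2h264dec", "v4l2h265dec"),
          ("Camera ISP", "Video (VPU)")),
    Stage("Preprocessing", ("qtimlvconverter",), ("GPU (Adreno)",)),
    Stage("Inference",
          ("qtimltflite", "qtimlqnn", "qtimlonnx", "qtimlsnpe"),
          ("NPU/HTP (Hexagon)", "CPU", "GPU")),
    Stage("Post-processing", ("qtimlpostprocess",), ("CPU", "GPU")),
    Stage("Overlay / compose / encode",
          ("qtivoverlay", "qtivcomposer", "v4l2h264enc"),
          ("GPU", "Video (VPU)")),
)


def stage_for(plugin):
    """Return the table row that lists the given plugin name."""
    for stage in STAGES:
        if plugin in stage.plugins:
            return stage
    raise KeyError(f"{plugin!r} is not in the stage table")


def describe(elements):
    """Summarize, per element, its stage and the engines the table lists."""
    lines = []
    for element in elements:
        stage = stage_for(element)
        lines.append(f"{element}: {stage.name} on {' / '.join(stage.engines)}")
    return lines


for line in describe(["qticamsrc", "qtimlvconverter", "qtimlqnn",
                      "qtimlpostprocess", "qtivoverlay"]):
    print(line)
```

A lookup like this can serve as a quick sanity check when sketching an element chain on paper, before consulting the plugin reference for actual properties and caps.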

### Why zero-copy (zero-memcpy) matters

In a naive pipeline, every stage copies the frame into its own buffer, burning memory bandwidth and CPU cycles, and adding latency. The IM SDK instead keeps **one frame buffer moving** across decode → preprocess → inference → overlay → encode as a shared **DMA-buf** handle:

* The camera ISP writes a frame once; the GPU and NPU read the **same physical buffer**, with no `memcpy` between engines.
* Plugins negotiate DMA-buf allocation automatically through GStreamer's allocation-query mechanism. Zero-copy takes effect only when every stage in the pipeline supports it; decode/encode elements, for example, are configured with `capture-io-mode=dmabuf` / `output-io-mode=dmabuf-import`.
* The result is **lower memory traffic, lower CPU load, and lower end-to-end latency**, which matters most for high-resolution and multi-stream workloads.
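
A rough back-of-the-envelope calculation shows the scale of the traffic involved. The sketch below assumes a 1080p NV12 stream at 30 fps and a naive pipeline that performs one full-frame `memcpy` at each of the four hand-offs in decode → preprocess → inference → overlay → encode. The resolution, pixel format, frame rate, and the one-copy-per-hand-off model are illustrative assumptions, not IM SDK measurements.

```python
def nv12_frame_bytes(width, height):
    """NV12 holds a full-resolution Y plane plus a half-size interleaved
    UV plane, which works out to 1.5 bytes per pixel."""
    return width * height * 3 // 2


def copy_traffic_bytes_per_s(width, height, fps, handoffs, streams=1):
    """Memory traffic caused purely by copying frames between stages.

    Each memcpy reads the frame once and writes it once, hence the factor 2.
    """
    return nv12_frame_bytes(width, height) * 2 * fps * handoffs * streams


# Assumption: one copy per hand-off in
# decode -> preprocess -> inference -> overlay -> encode.
HANDOFFS = 4

frame = nv12_frame_bytes(1920, 1080)
one_stream = copy_traffic_bytes_per_s(1920, 1080, 30, HANDOFFS)
four_streams = copy_traffic_bytes_per_s(1920, 1080, 30, HANDOFFS, streams=4)

print(frame)                                # 3110400
print(f"{one_stream / 1e6:.1f} MB/s")       # 746.5 MB/s
print(f"{four_streams / 1e6:.1f} MB/s")     # 2986.0 MB/s
```

Under these assumptions the copies alone cost roughly 0.75 GB/s for one stream and about 3 GB/s for four, which is the traffic a shared DMA-buf handle avoids.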

### Pre- and post-processing on the GPU

Inference on the NPU is only fast if the NPU is never waiting on the CPU. The IM SDK runs the data-shaping work on the Adreno GPU:

* **Preprocessing**: `qtimlvconverter` performs resizing, color-space conversion, and tensor-layout transforms directly on the GPU, producing the exact tensor the model expects without a CPU bottleneck before inference.
* **Post-processing**: `qtimlpostprocess` decodes the model's output tensors into structured metadata (bounding boxes, labels, masks), and `qtivoverlay` draws that metadata back onto the frame in-place on the GPU.
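
The two bullets above can be illustrated in plain Python, purely to show the shape of the data on each side of inference. This is a CPU-side sketch under stated assumptions, not what `qtimlvconverter` or `qtimlpostprocess` actually execute: the function names, the CHW layout with 0 to 1 scaling, and the `(x1, y1, x2, y2, score, class_id)` output row layout are all hypothetical, and real models differ.

```python
from dataclasses import dataclass


def to_chw_tensor(pixels_hwc, divisor=255.0):
    """Reorder an H x W x C image (nested lists of 0-255 ints) into a
    C x H x W float tensor scaled to the range 0.0-1.0."""
    height, width = len(pixels_hwc), len(pixels_hwc[0])
    channels = len(pixels_hwc[0][0])
    return [
        [[pixels_hwc[y][x][c] / divisor for x in range(width)]
         for y in range(height)]
        for c in range(channels)
    ]


@dataclass
class Detection:
    label: str
    score: float
    box: tuple  # (x, y, width, height) in frame pixels


def decode_detections(rows, labels, frame_w, frame_h, threshold=0.5):
    """Turn raw output rows into structured metadata.

    Assumed (hypothetical) row layout: normalized corner coordinates,
    a confidence score, and a class index.
    """
    detections = []
    for x1, y1, x2, y2, score, class_id in rows:
        if score < threshold:
            continue  # drop low-confidence rows
        box = (round(x1 * frame_w), round(y1 * frame_h),
               round((x2 - x1) * frame_w), round((y2 - y1) * frame_h))
        detections.append(Detection(labels[int(class_id)], score, box))
    # Highest-confidence detections first.
    return sorted(detections, key=lambda d: d.score, reverse=True)


# Preprocessing side: a 1 x 2 RGB image (one red pixel, one green pixel).
tensor = to_chw_tensor([[[255, 0, 0], [0, 255, 0]]])

# Post-processing side: three made-up output rows for a 1920 x 1080 frame.
rows = [
    (0.10, 0.20, 0.50, 0.80, 0.91, 0),
    (0.60, 0.10, 0.90, 0.40, 0.30, 1),  # below threshold, discarded
    (0.25, 0.25, 0.75, 0.75, 0.77, 1),
]
detections = decode_detections(rows, ["person", "car"], 1920, 1080)
for d in detections:
    print(d.label, d.score, d.box)
```

In the IM SDK, this kind of work runs inside the plugins rather than in application code; the sketch only shows why the tensor going in and the metadata coming out look the way they do.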

## Next steps

<CardGroup cols={2}>
  <Card title="Download source code for development" icon="download" href="../topic/download-source-code">
    Set up the Qualcomm IM SDK extensible SDK (eSDK) and download the source code to develop AI/ML application and plugin code.
  </Card>

  <Card title="Integrate a custom AI model in an application" icon="puzzle-piece" href="../topic/integrate-custom-model">
    Choose the right Qualcomm SDK path to deploy a custom AI model in an application using the Qualcomm IM SDK or the Qualcomm AI Runtime SDK.
  </Card>

  <Card title="Add postprocessing support for a custom model" icon="sliders" href="../topic/add-postprocessing-support-custom-model">
    Add custom model postprocessing support to a Qualcomm IM SDK pipeline using the `qtimlpostprocess` plugin.
  </Card>
</CardGroup>
