# Overview

> Build AI/ML applications using GStreamer and the Qualcomm Intelligent Multimedia SDK (IM SDK): how zero-copy pipelines and GPU pre/post-processing work, and where the full reference lives.

The Qualcomm® Intelligent Multimedia SDK (IM SDK) is a unified, hardware-accelerated framework for building multimedia and AI/ML applications on Qualcomm Dragonwing IoT platforms. It is built on [GStreamer](https://gstreamer.freedesktop.org/) and provides 40+ Qualcomm-optimized plugins that compose into single- or multi-stream **camera → AI → display/stream** pipelines, without requiring deep knowledge of the underlying hardware.

Use the IM SDK when your application is a **real-time camera, video, or vision pipeline**: it combines capture, preprocessing, inference, and rendering into one pipeline and runs each stage on the most efficient engine.

The IM SDK ships as part of the Qualcomm Linux image, so no separate install is required for the AI Developer Workflow.

For the complete SDK reference (installation alternatives, the full plugin catalog, pipeline-building guides, and sample applications), see **[Discover SDKs → IM SDKs](https://imsdkdocs.qualcomm.com)**.

## Why use the IM SDK

Capture, preprocess, inference, post-process, and render all live in a single GStreamer pipeline, not separate apps glued together.

Plugins natively use the Qualcomm CPU, GPU, NPU (HTP), camera ISP, video (VPU), and DSP, so each stage runs on the right engine.

Frames move through the pipeline as DMA-buf handles, so the same buffer is shared across ISP, GPU, and NPU with no CPU copies.

Resize, color conversion, tensor layout, and overlay rendering run on the Adreno GPU, keeping the CPU free and inference fed.

Run models through LiteRT/TFLite, ONNX, Qualcomm AI Engine Direct (QNN), or SNPE.
Choose per model without rewriting the pipeline.

Download pre-trained, quantized models from [Qualcomm AI Hub](https://aihub.qualcomm.com/) and run them as-is in a reference pipeline.

## How an IM SDK pipeline works

A typical vision-AI pipeline has five stages. The IM SDK accelerates the highlighted (⚡) stages on dedicated hardware and passes frames between them as zero-copy DMA-buf handles.

```mermaid theme={null}
flowchart LR
    classDef gpu fill:#31017D,color:#fff,stroke:#31017D
    classDef npu fill:#3253DC,color:#fff,stroke:#3253DC
    classDef io fill:#5B6770,color:#fff,stroke:#5B6770
    Cam["Data source<br/>qticamsrc · v4l2 decode · RTSP/file"]:::io
    Pre["⚡ Preprocessing: GPU<br/>qtimlvconverter<br/>resize · color convert · tensor layout"]:::gpu
    Inf["⚡ Inference: NPU/HTP<br/>qtimltflite · qtimlqnn<br/>qtimlonnx · qtimlsnpe"]:::npu
    Post["⚡ Post-processing: GPU<br/>qtimlpostprocess + qtivoverlay"]:::gpu
    Out["Use AI metadata<br/>display · RTSP/WebRTC · MQTT/Kafka · actions"]:::io
    Cam -->|"DMA-buf (zero-copy)"| Pre -->|"DMA-buf"| Inf -->|"tensors"| Post -->|"DMA-buf"| Out
```

| Stage | IM SDK plugin(s) | Hardware engine |
| -------------------------- | ---------------- | --------------------------- |
| Capture / decode | `qticamsrc`, `v4l2h264dec` / `v4l2h265dec` | Camera ISP, Video (VPU) |
| Preprocessing | `qtimlvconverter` | GPU (Adreno) |
| Inference | `qtimltflite` (LiteRT), `qtimlqnn` (AI Engine Direct / QNN), `qtimlonnx` (ONNX), `qtimlsnpe` (SNPE) | NPU/HTP (Hexagon), CPU, GPU |
| Post-processing | `qtimlpostprocess` | CPU / GPU |
| Overlay / compose / encode | `qtivoverlay`, `qtivcomposer`, `v4l2h264enc` | GPU, Video (VPU) |

For a stage-by-stage guide to building these pipelines, see [Building AI pipelines](https://imsdkdocs.qualcomm.com/qimsdk-overview/sdkoverview) and the [plugin reference](https://imsdkdocs.qualcomm.com/plugin-reference/introduction) in Discover SDKs.

### Why zero-copy (zero-memcpy) matters

In a naive pipeline, every stage copies the frame into its own buffer, burning memory bandwidth and CPU cycles, and adding latency. The IM SDK instead keeps **one frame buffer moving** across decode → preprocess → inference → overlay → encode as a shared **DMA-buf** handle:

* The camera ISP writes a frame once; the GPU and NPU read the **same physical buffer**, with no `memcpy` between engines.
* Plugins negotiate DMA-buf allocation automatically through GStreamer's allocation-query mechanism; zero-copy is enabled when every stage in the pipeline supports it (for example, `capture-io-mode=dmabuf` / `output-io-mode=dmabuf-import` on decode/encode elements).
* The result is **lower memory traffic, lower CPU load, and lower end-to-end latency**, which matters most for high-resolution and multi-stream workloads.
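As a rough illustration of the I/O-mode settings mentioned above, the following Python sketch assembles a gst-launch-style description for a decode-to-preprocess segment. It makes several assumptions: the helper names (`render_element`, `build_description`), the input path, and the `filesrc` / `qtdemux` / `h264parse` front end are hypothetical choices for this example, and only `v4l2h264dec`, `qtimlvconverter`, and the two I/O-mode values are taken from the text. The script only builds a string; it does not run GStreamer or confirm that zero-copy is actually negotiated on a given target.

```python
from typing import Dict, List, Tuple

# One pipeline element: (element name, {property: value}).
Element = Tuple[str, Dict[str, str]]


def render_element(name: str, props: Dict[str, str]) -> str:
    """Render one element in gst-launch syntax: name key=value ..."""
    parts = [name] + [f"{key}={value}" for key, value in props.items()]
    return " ".join(parts)


def build_description(elements: List[Element]) -> str:
    """Join rendered elements with the gst-launch link operator '!'."""
    return " ! ".join(render_element(name, props) for name, props in elements)


# Assumption: an H.264 file in an MP4 container at a placeholder path.
# The I/O-mode values are the ones quoted in the zero-copy section.
DECODE_TO_PREPROCESS: List[Element] = [
    ("filesrc", {"location": "/path/to/video.mp4"}),
    ("qtdemux", {}),
    ("h264parse", {}),
    ("v4l2h264dec", {
        "capture-io-mode": "dmabuf",
        "output-io-mode": "dmabuf-import",
    }),
    ("qtimlvconverter", {}),
]

description = build_description(DECODE_TO_PREPROCESS)
print(description)
```

On a device, a string like this could be passed to `gst-launch-1.0` or to `Gst.parse_launch`, after adding the inference, post-processing, and sink elements plus whatever caps and properties the plugin reference calls for. Keeping the stages in a list makes it simple to swap the decoder or change an I/O mode without editing a long command line by hand.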
### Pre- and post-processing on the GPU

Inference on the NPU is only fast if the NPU is never waiting on the CPU. The IM SDK runs the data-shaping work on the Adreno GPU:

* **Preprocessing**: `qtimlvconverter` performs color-space conversion and tensor-layout transforms directly on the GPU, producing the exact tensor the model expects without a CPU bottleneck before inference.
* **Post-processing**: `qtimlpostprocess` decodes the model's output tensors into structured metadata (bounding boxes, labels, masks), and `qtivoverlay` draws that metadata back onto the frame in-place on the GPU.

## Next steps

* Set up the Qualcomm IM SDK extensible SDK (eSDK) and download the source code to develop AI/ML application and plugin code.
* Choose the right Qualcomm SDK path to deploy a custom AI model in an application using the Qualcomm IM SDK or the Qualcomm AI Runtime SDK.
* Add custom model postprocessing support to a Qualcomm IM SDK pipeline using the qtimlpostprocess plugin.