# AI Workflow Overview

> Overview of supported AI workflows for training, converting, and deploying models on Qualcomm Dragonwing devices.

This section outlines a modular, hands-on approach to AI development using Qualcomm®-supported tools, runtimes, and frameworks, whether you're training models, deploying pre-trained networks, or building multimodal AI workflows.

The document covers:

* Model creation and training with **Edge Impulse and Qualcomm® AI Hub**
* Model conversion and inference using **TensorFlow, LiteRT / TensorFlow Lite**, and **ONNX Runtime** with NPU acceleration
* Running optimized AI models via **context binaries (.bin) and DLC (.dlc)** files using Qualcomm AI tools
* Local execution of large language and vision-language models using **Llama.cpp**
* Deployment of LLM/VLM workloads using a **containerized OpenAI‑compatible API service**
* Workflow orchestration and multimodal AI pipelines with **Qualcomm® Genie**
* Speech transcription, translation, and language identification using **Whisper** on NPU or CPU
* Sample applications and vision pipelines using **Qualcomm® IMSDK**
* Robotics and intelligent system development using **Qualcomm® QIRP SDK**

Each section is designed to be standalone, so you can jump directly into the tools and flows that match your project needs. The goal is to provide clear, reusable examples and practical insights for integrating AI into real-world edge applications.

## Choose your journey

Use the flowchart below to find the path that fits your application. It walks you through whether you already have a model, how to prepare it, and how to run inference, then links you to the right Ubuntu workflow page. Every highlighted box is clickable.

```mermaid
%%{init: {"flowchart": {"rankSpacing": 110, "nodeSpacing": 65}} }%%
flowchart TD
    Q1{"Do you already have<br/>a trained model?"}

    Q1 -->|"No, just show me what it can do"| RunPrebuilt["Run prebuilt AI models & apps<br/>Qdemo UI · prebuilt samples · IM SDK"]
    Q1 -->|"I want to fine-tune a model<br/>with my own data"| EI["Edge Impulse"]
    Q1 -->|"Yes / bring my own<br/>ONNX · PyTorch · TF · LiteRT"| GenQ{"Generative AI?<br/>LLM · VLM"}

    GenQ -->|"Yes"| GenAI["On-device GenAI workflows<br/>Genie · Llama.cpp · OpenAI-compatible container"]
    GenQ -->|"No, classic ML / vision"| Prep{"Prepare the model"}

    Prep -->|"Ready model, or BYOM in the cloud"| AIHub["Qualcomm AI Hub<br/>ready models · BYOM<br/>(optimize in the cloud)"]
    Prep -->|"Use optimized Qualcomm model files"| Context["Context binaries / DLC<br/>QAI AppBuilder"]
    Prep -->|"Recover quantized accuracy"| AIMET["AIMET (PTQ / QAT)"]

    AIHub --> Runtime{"Run inference"}
    Context --> Runtime
    AIMET --> Runtime

    Runtime -->|"Python / quick .tflite"| LiteRT["LiteRT / TFLite<br/>AI Engine Direct delegate"]
    Runtime -->|"ONNX models"| ONNXRun["ONNX Runtime<br/>AI Engine Direct"]
    Runtime -->|"Optimized .bin / .dlc files"| ContextRun["Context binaries + QAI AppBuilder"]
    Runtime -->|"Camera / video + AI pipeline"| IMSDK["Qualcomm IM SDK<br/>GStreamer · zero-copy · GPU pre/post"]

    click RunPrebuilt "/Ubuntu/sample-applications/overview"
    click GenAI "/Ubuntu/ai-workflows/genie"
    click AIHub "/Ubuntu/ai-workflows/ai-hub"
    click Context "/Ubuntu/ai-workflows/context-binaries"
    click AIMET "https://quic.github.io/aimet-pages/releases/latest/index.html"
    click EI "/Ubuntu/ai-workflows/edge-impulse"
    click LiteRT "/Ubuntu/ai-workflows/lite-rt"
    click ONNXRun "/Ubuntu/ai-workflows/onnxruntime"
    click ContextRun "/Ubuntu/ai-workflows/context-binaries"
    click IMSDK "/Ubuntu/ai-workflows/im-sdk"

    classDef prep fill:#31017D,stroke:#31017D,color:#fff;
    classDef run fill:#3253DC,stroke:#3253DC,color:#fff;
    class AIHub,Context,AIMET,EI,GenAI prep;
    class RunPrebuilt,LiteRT,ONNXRun,ContextRun,IMSDK run;
```
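
The decision logic in the flowchart can also be read as a small lookup. The sketch below is illustrative only: the function name `choose_workflow` and its answer keys (`explore`, `fine_tune`, `bring_own`, `tflite`, `onnx`, `context`, `pipeline`) are hypothetical labels invented for this example, while the page paths are copied from the flowchart's click targets. It skips the "Prepare the model" step, since every preparation option feeds into the same runtime choice.

```python
# Page paths copied from the flowchart's click targets.
PAGES = {
    "prebuilt": "/Ubuntu/sample-applications/overview",
    "edge_impulse": "/Ubuntu/ai-workflows/edge-impulse",
    "genai": "/Ubuntu/ai-workflows/genie",
    "tflite": "/Ubuntu/ai-workflows/lite-rt",
    "onnx": "/Ubuntu/ai-workflows/onnxruntime",
    "context": "/Ubuntu/ai-workflows/context-binaries",
    "pipeline": "/Ubuntu/ai-workflows/im-sdk",
}

# Hypothetical answer keys for the "Run inference" question.
RUNTIMES = ("tflite", "onnx", "context", "pipeline")


def choose_workflow(start: str, generative: bool = False, runtime: str = "") -> str:
    """Return the docs page that matches a set of flowchart answers.

    start:      "explore", "fine_tune", or "bring_own" (hypothetical keys)
    generative: True for LLM / VLM workloads
    runtime:    one of RUNTIMES, used only for classic ML / vision models
    """
    if start == "explore":
        return PAGES["prebuilt"]
    if start == "fine_tune":
        return PAGES["edge_impulse"]
    if start != "bring_own":
        raise ValueError(f"unknown start: {start!r}")
    if generative:
        return PAGES["genai"]
    if runtime not in RUNTIMES:
        raise ValueError(f"unknown runtime: {runtime!r}")
    return PAGES[runtime]
```

For example, `choose_workflow("bring_own", runtime="onnx")` returns the ONNX Runtime page path, and `choose_workflow("bring_own", generative=True)` returns the Genie page path.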

<Note>
  On-device generative AI availability depends on your Dragonwing development board and model. Start with [LLMs using Genie](/Ubuntu/ai-workflows/genie), and use [Llama.cpp](/Ubuntu/ai-workflows/llama-cpp) as a fallback where Genie model support is not available.
</Note>

## Application Development & Execution Flow Summary

| Flow                                           | Purpose                                                                                                                                                                                                                                                              |
| ---------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [**Edge Impulse**](./edge-impulse)             | Build and train AI models using audio, image and other sensor data - or bringing your own model in a variety of formats.                                                                                                                                             |
| [**Qualcomm® AI Hub**](./ai-hub)               | Qualcomm® AI Hub simplifies deploying AI models for vision, audio, and speech applications to edge devices. You can optimize, validate, and deploy your own AI models on hosted Qualcomm platform devices within minutes.                                            |
| [**Convert TensorFlow models**](./tensorflow)  | Quantize and convert TensorFlow/Keras models (.keras, .h5) to `.tflite` format for NPU deployment.                                                                                                                                                                   |
| [**Run LiteRT/TFLite models**](./lite-rt)      | Execute `.tflite` models on the NPU (Python or C++) using AI Engine Direct delegates. Works with models from TensorFlow, AI Hub, or Edge Impulse.                                                                                                                    |
| [**ONNX**](./onnxruntime)                      | ONNX enables cross-platform AI deployment by exporting models. On Dragonwing devices, ONNX Runtime with AI Engine Direct allows execution on the NPU for maximum performance.                                                                                        |
| [**Run Context Binaries**](./context-binaries) | Context binaries (.bin) and .dlc files are used by Qualcomm AI tools such as Genie, VoiceAI ASR, and QAI AppBuilder to run optimized AI models efficiently on target hardware.                                                                                       |
| [**Llama.cpp**](./llama-cpp)                   | Execute large language models locally using a C++ backend optimized for GPU and quantized formats.                                                                                                                                                                   |
| [**Qualcomm® Genie**](./genie)                 | Orchestrate AI microservices and multimodal workflows using Qualcomm’s generative AI runtime.                                                                                                                                                                        |
| [**Whisper**](./whisper)                       | Enables speech transcription, translation, and language identification on Dragonwing using NPU (VoiceAI ASR) or CPU (whisper.cpp).                                                                                                                                   |
| [**Qualcomm® IMSDK**](./im-sdk)                | Qualcomm IMSDK is a multimedia and AI SDK for building high-performance vision pipelines on Qualcomm Linux platforms. It includes GStreamer plugins, AI runtime integration, and messaging support to accelerate robotics, surveillance, and embedded AI development. |
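
If you already have a model file, its extension is often a quick hint as to which flow in the table applies. The following is a minimal sketch, assuming only the file types named in the table above; the helper `flow_for_model` and the `FLOW_BY_SUFFIX` mapping are hypothetical names, not part of any Qualcomm tool. Note that `.bin` is a generic extension, so a match suggests a context binary but does not confirm it.

```python
from pathlib import Path
from typing import Optional

# Extension-to-flow mapping taken from the summary table (relative page links).
FLOW_BY_SUFFIX = {
    ".keras": "./tensorflow",      # convert TensorFlow/Keras models to .tflite
    ".h5": "./tensorflow",
    ".tflite": "./lite-rt",        # run with LiteRT / TFLite
    ".onnx": "./onnxruntime",      # run with ONNX Runtime
    ".bin": "./context-binaries",  # context binary (assumed from the extension)
    ".dlc": "./context-binaries",
}


def flow_for_model(path: str) -> Optional[str]:
    """Return the table's flow link for a model file, or None if unrecognized."""
    suffix = Path(path).suffix.lower()  # case-insensitive match on the extension
    return FLOW_BY_SUFFIX.get(suffix)
```

For instance, `flow_for_model("classifier.tflite")` points to the LiteRT page, while an extension the table does not mention returns `None`.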