# Run a Generative AI (GenAI) model

> Run a prepared GenAI model on a Qualcomm Dragonwing IoT device using Genie, the Qualcomm AI Runtime SDK, or open-source runtimes.

Once a GenAI model is prepared and optimized for deployment, you can run the model on the target device. Qualcomm platforms offer multiple execution paths to meet different needs, ranging from high-level abstractions to low-level control.

The following table summarizes the GenAI model execution approaches.

| Execution method | Details | Key considerations |
| ---------------- | ------- | ------------------ |
| Generative AI Inference Extensions (Genie) | Qualcomm-provided framework designed for simplified execution of complex GenAI models such as LLMs and multimodal models. | Handles orchestration of multiple binaries, memory management, and hardware acceleration across CPU, GPU, and NPU. Ideal for developers who want plug-and-play deployment with minimal integration effort. Optimized for Qualcomm hardware, delivering low latency and power efficiency. Best for quick deployment and production-ready applications. |
| Qualcomm AI Runtime SDK (QAIRT) | Provides low-level APIs for executing QAIRT binaries directly. | Offers fine-grained control for developers who need custom execution flows or profiling. Suitable for advanced use cases where performance tuning is critical. Optimized for Qualcomm hardware, delivering low latency and power efficiency. Best for custom workflows and profiling. |
| Open-source runtimes | Models can also be executed using open-source runtimes. | Useful for developers who want to maintain compatibility with existing open-source ecosystems while leveraging Qualcomm optimizations. Open-source runtimes provide portability, but may require additional tuning for Qualcomm platforms. Best for research or hybrid environments. |
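
As a rough illustration of the Genie path in the table above, the following sketch wraps the `genie-t2t-run` command-line tool from Python. It assumes that the tool is on `PATH` and that it accepts a configuration file with `-c` and a prompt with `-p`; confirm both against the SDK version installed on your device. The helper names and the `genie_config.json` file name are hypothetical and used only for this example.

```python
import shutil
import subprocess
from typing import List, Optional

# Assumption: the Genie text-to-text sample binary is installed and on PATH.
GENIE_BINARY = "genie-t2t-run"


def build_genie_command(config_path: str, prompt: str,
                        binary: str = GENIE_BINARY) -> List[str]:
    """Build the argument list for one Genie text-to-text invocation.

    Assumption: -c selects the Genie config file and -p passes the prompt.
    """
    return [binary, "-c", config_path, "-p", prompt]


def run_genie(config_path: str, prompt: str) -> Optional[str]:
    """Run the command and return its stdout, or None if the binary is missing."""
    if shutil.which(GENIE_BINARY) is None:
        return None
    result = subprocess.run(
        build_genie_command(config_path, prompt),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


if __name__ == "__main__":
    # "genie_config.json" is a placeholder for a config prepared for your model.
    output = run_genie("genie_config.json", "What is an NPU?")
    print(output if output is not None else "genie-t2t-run not found on PATH")
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the prompt contains spaces or quotation marks.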
