-
How to debug the sample application using debug logs?
To troubleshoot execution failures when running AI sample applications or gst-launch-1.0, enable debug logs.
To enable debug logs in a GStreamer sample application, use the GST_DEBUG environment variable to set the debug level. The GST_DEBUG environment variable controls the verbosity of the debug output. You can set it to different levels, such as:
- 0: None (no debug information)
- 1: ERROR (logs all fatal errors)
- 2: WARNING (logs all warnings)
- 3: FIXME (logs incomplete code paths)
- 4: INFO (logs informational messages)
- 5: DEBUG (logs general debug messages)
- 6: LOG (logs all log messages)
- 7: TRACE (logs trace messages)
- 9: MEMDUMP (logs memory dumps)
If you want to filter debug logs for specific categories, you can specify them in the GST_DEBUG variable as comma-separated category:level pairs. For example, you can raise the log level for only the ML inference plugin and the FPS element while keeping every other category quiet.
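As a minimal sketch of category filtering, the snippet below sets a default level for all categories and a higher level for two specific ones. The category names `qtimltflite` and `fpsdisplaysink` are assumptions used for illustration; the categories actually registered on a device can be listed with `gst-launch-1.0 --gst-debug-help`. The log file path is also hypothetical.

```shell
# Default level 2 (WARNING) for every category, level 5 (DEBUG) for two of them.
# "qtimltflite" and "fpsdisplaysink" are assumed category names; list the real
# ones on the target with: gst-launch-1.0 --gst-debug-help
export GST_DEBUG="2,qtimltflite:5,fpsdisplaysink:5"

# Optional: send the log to a file (hypothetical path) and drop colour codes
# so the output is easy to search with grep or awk.
export GST_DEBUG_FILE="/tmp/gst-debug.log"
export GST_DEBUG_NO_COLOR=1
```

Any sample application or gst-launch-1.0 pipeline started from the same shell afterwards picks up these settings.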
-
What common issues prevent a quick out-of-the-box experience for AI sample applications?
The out-of-the-box experience for AI sample applications is designed to be quick. However, the following issues most commonly get in the way.
-
Failure to load model file
This issue arises when the model file is either missing or not in the correct format. Copy the model file to the correct path and ensure you are using the same SDK version for model conversion/quantization and on the target device.
-
Failure to deserialize labels
This occurs when the label file specified at the path isn’t present. Ensure that the label file is copied to the device and the specified path is correct.
-
Failure to set module options
This occurs when setting constants for LiteRT/Qualcomm AI Engine Direct use cases. See Discover SDKs → IM SDKs for steps to read and set model constants correctly.
-
How to profile AI sample apps?
-
Preprocessing time
Before feeding input to the AI inferencing plugin, the input must be preprocessed (including normalization, rescaling, and color inversion).
This task is managed by the preprocessing plugin, qtimlvconverter. You can measure the preprocessing time by executing the following command:
The preprocessing time is displayed in the logs.
-
Model inference time
The Qualcomm Intelligent Multimedia SDK supports three AI inferencing plugins that utilize the Qualcomm Neural Processing SDK, LiteRT, and Qualcomm AI Engine Direct frameworks, respectively.
- qtimlsnpe
- qtimltflite
- qtimlqnn
-
Postprocessing time
The outputs from AI inferencing plugins are handled by postprocessing plugins.
These plugins take the AI model’s results and produce elements that can be overlaid on the input stream or used for further computation.
For example, text boxes for classification use cases, segmentation masks for segmentation use cases, etc.
To measure the processing time for these plugins, use the following command.
The postprocessing time is shown in the logs. The following uses the qtimlvclassification plugin as an example.
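Once a debug log has been captured, the lines from a single plugin can be isolated so that timing messages are easier to find. The helper below is a rough sketch rather than an SDK tool: `filter_gst_category` is a hypothetical name, it assumes a colour-free log (GST_DEBUG_NO_COLOR=1) written to a file, and it assumes the plugin logs under a category named after the element, which may not hold for every plugin.

```shell
# Print only the lines of a GStreamer debug log that belong to one category.
# Assumes a colour-free log (GST_DEBUG_NO_COLOR=1), where whitespace-separated
# field 5 holds the category name (time, PID, thread, level, category, ...).
filter_gst_category() {
    # $1 = category name, $2 = path to the log file
    awk -v cat="$1" '$5 == cat' "$2"
}

# Hypothetical usage; the category name is assumed to match the element name.
# filter_gst_category qtimlvclassification /tmp/gst-debug.log
```

The same function works for the preprocessing and inference plugins by passing a different category name.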
-
How to measure end-to-end FPS of a use case?
To measure the frames per second (FPS) in a GStreamer pipeline, use the fpsdisplaysink element. This element can display the current and average framerate either as an overlay on the video or by printing it to the console. Sample apps use the fpsdisplaysink plugin to display the FPS of the pipeline directly on the HDMI monitor.
-
What’s the easiest way to replace an existing model with a custom model in a reference application?
Ensure you are using the correct model and label file for the sample application. Provide the model and label file paths in the respective sample app configuration file. The sample app will use the model and label files from the specified locations. See Discover SDKs → IM SDKs for more information.
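For the FPS measurement with fpsdisplaysink described earlier, console output can be tried on a hardware-independent test pipeline before touching a sample app. The sketch below is not taken from the sample applications: it uses videotestsrc as a stand-in source and assumes gst-launch-1.0 and the fpsdisplaysink element are installed.

```shell
# Test pipeline: 90 generated frames into fpsdisplaysink.
# video-sink=fakesink discards the frames and text-overlay=false turns off the
# on-screen overlay, so the FPS is only reported on the console.
PIPELINE="videotestsrc num-buffers=90 ! fpsdisplaysink video-sink=fakesink text-overlay=false"

if command -v gst-launch-1.0 >/dev/null 2>&1; then
    # -v prints property changes, including fpsdisplaysink's last-message,
    # which carries the current and average FPS.
    # Word splitting of $PIPELINE is intentional here.
    gst-launch-1.0 -v $PIPELINE || echo "pipeline failed to run"
else
    echo "gst-launch-1.0 not found; pipeline would be: $PIPELINE"
fi
```

In a real use case, the source and inference elements replace videotestsrc while the fpsdisplaysink settings stay the same.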
-
A user replaced the model in a reference application with another IM SDK-supported model. How to debug performance and accuracy issues?
To debug performance and accuracy issues with the model, use the AI SDK tools and follow these steps:
-
Performance measurement:
Use the SDK benchmarking tool. For example, if you are using the Qualcomm Neural Processing Engine SDK, use snpe-bench-py.
-
Accuracy debugger:
- AI Hub model:
- Ensure you are using the latest model from AI Hub.
- Correctly populate the constants for your selected model. See Discover SDKs → IM SDKs for more information.
- For further support on AI Hub model accuracy issues, report your issue on the Qualcomm AI Hub Slack.
- Custom model
- Model quantization is a common cause of accuracy drops. Ensure you are using the correct dataset for model quantization. For Post-Training Quantization (PTQ), use a portion of the dataset that approximates the actual deployment environment. PTQ also needs a reasonable amount of data to give good results, for example, approximately 25-30 RAW images.
- Experiment with different model precisions, such as W8A16 and W16A16, to see if there is an improvement in model accuracy.
- Use AIMET for Post-Training Quantization (PTQ) and Quantization-Aware Training (QAT) techniques for advanced quantization. See the AIMET documentation for more details.
-
What are the best practices for model quantization using AI SDK?
-
Prepare Calibration Data
Use representative calibration data that closely matches the data the model will encounter in production. This helps accurately determine the scaling factors and zero points for quantization.
-
Choose the Right Quantization Method
- Post-Training Quantization (PTQ): This method is simpler and faster. Suitable for models where slight accuracy loss is acceptable. It converts a pretrained floating-point model to a quantized model without retraining.
- Quantization-Aware Training (QAT): This method involves training the model with quantization in mind, which can help maintain higher accuracy. It’s more complex but beneficial for models where accuracy is critical.
-
AI SDK provides tools to convert LiteRT to DLC. Do users have to convert LiteRT models to DLC, or can LiteRT models accelerate directly on the NPU?
You don’t necessarily have to convert LiteRT models to DLC to accelerate them on the NPU. Qualcomm’s AI SDK supports running LiteRT models directly on the NPU using the LiteRT delegate, so you can use the NPU without converting your LiteRT models to DLC format.
-
AI SDK provides tools to convert PyTorch, ONNX, and TensorFlow models to DLC. Which of these is the quickest path for deployment?
Converting your models to ONNX first and then to DLC is often the quickest and most flexible approach for deployment.
-
How does a user know if the model is running on the NPU?
Use the following tools from Qualcomm:
- Qualcomm Profiler
-
Sysmon
If you have access to the Hexagon SDK, refer to the sysmon documentation at
<Hexagon_sdk_path>/<version>/tools/sysmon_app.html
-
How do users benefit from the heterogeneous AI engines of the Snapdragon platform? How can users run different AI models on different hardware cores (GPU, NPU, etc.)?
AI SDK tools and APIs provide options to select the runtime (CPU, GPU or DSP). You can select the appropriate runtime as a command line parameter or use specific C/C++ APIs.
See snpe-net-run for more details.
The AI sample applications use the DSP runtime by default. You can change the runtime in the sample application configuration.
For example, in the gst-ai-object-detection sample app, modify the runtime parameter in the config_detection.json file. Runtime: "cpu", "gpu", or "dsp".
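The runtime change can also be scripted, which helps when comparing runtimes back to back. The sketch below is an illustration under stated assumptions: `set_runtime` is a hypothetical helper, the config file is assumed to contain a flat `"runtime": "<value>"` entry (the real file layout is not confirmed here), and the file path is a placeholder.

```shell
# Set the "runtime" value in a sample-app JSON config to cpu, gpu, or dsp.
# Assumes the file has a flat  "runtime": "<value>"  entry (layout not confirmed).
set_runtime() {
    # $1 = config file, $2 = desired runtime
    case "$2" in
        cpu|gpu|dsp) ;;
        *) echo "unsupported runtime: $2" >&2; return 1 ;;
    esac
    # Keep a .bak copy of the original file, then rewrite only the value.
    sed -i.bak -E "s/(\"runtime\"[[:space:]]*:[[:space:]]*)\"[^\"]*\"/\1\"$2\"/" "$1"
}

# Hypothetical usage; the config path depends on your installation.
# set_runtime /path/to/config_detection.json gpu
```

Editing the file by hand is equivalent; the script only guards against typos in the runtime name and keeps a backup.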
-
How to proceed if model conversion fails with AI SDK?
Contact Qualcomm Support Forums for support.
-
A custom floating point model provides good accuracy on CPU, but quantized model accuracy is bad on both CPU and NPU. How to debug this?
There could be problems related to model quantization. See the FAQ above on replacing a model in a reference application for guidance on model quantization.
-
A custom quantized model is running as expected on CPU, but the same model isn’t running accurately on NPU. How to debug this?
The qnn-net-run --debug option dumps layerwise values, so you can compare the CPU and NPU inference results layer by layer.
-
When running a sample app, there is no output on the HDMI screen. How to debug this?
Follow display debugging to debug HDMI display issues.
-
Is it possible to use an HDMI TV instead of an HDMI monitor to run sample applications?
In most cases it will work. If it does not, use an HDMI monitor instead.
-
How to check hardware runtime for AI sample apps?
Many GStreamer sample applications support inference on various runtimes (CPU, GPU, and DSP). To determine which runtime the app will use for inference, observe the logs.
- CPU
- GPU
- DSP
-
What are devtool sanity check errors? How to debug them?
Occasionally, you may see devtool showing sanity check errors.
Ensure you have sudo access on the host computer. If the error persists, apply the following workaround:
- Update permissions.
- Disable BitBake sanity checking in the $ESDK_ROOT/layers/poky/meta/conf/sanity.conf file.
-
Steps to sideload the Qualcomm Neural Processing Engine SDK on the target device.
-
Go to the target device shell and run the following commands:
-
Go to the SNPE SDK root folder on the host computer and run the following commands:
The following commands are for QCS6490.
- For QCS8275: Replace hexagon-v68 with hexagon-v75
- For QCS9075: Replace hexagon-v68 with hexagon-v73
-
Validate the new SDK version:
-
Debug steps to take a preprocessed tensor from the QIM SDK and run inference using SNPE.
-
Generate preprocessed raw tensors from a video file.
- Create a folder to save the raw tensors to.
- Run the GStreamer pipeline to dump the raw tensors to the folder.
Adjust the rate (FPS of the input video) and the dimensions (input dimensions of the model) as needed for the video and model you are using.
-
Run inference with SNPE or QNN.
- Create an input_list.txt file containing the absolute paths to the raw files, as shown below. Inference is performed on each file listed in the input_list.txt file.
Example content in input_list.txt
- Run the SNPE DLC or QNN model on the HTP backend.
For more details, see deploy a model using Neural Processing Engine or AI Engine Direct.
- SNPE
- QNN
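The input_list.txt file from the step above can be generated rather than typed by hand. The helper below is a minimal sketch: `make_input_list` is a hypothetical name, the dumped tensors are assumed to carry a `.raw` extension, and the folder in the usage line is a placeholder.

```shell
# Write the absolute path of every .raw file in a folder to an input list,
# one path per line, sorted so the order is reproducible.
make_input_list() {
    # $1 = folder holding the raw tensors, $2 = output list file
    dir=$(cd "$1" && pwd) || return 1
    find "$dir" -maxdepth 1 -type f -name '*.raw' | sort > "$2"
}

# Hypothetical usage; the folder name is an assumption.
# make_input_list /tmp/raw_tensors input_list.txt
```

Resolving the folder with `pwd` first means the list holds absolute paths even when a relative folder is passed in.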
Frequently Asked Questions (FAQ)
Answers to common questions about running AI sample applications, debugging, profiling, model quantization, and hardware runtimes on Qualcomm Dragonwing IoT platforms.

