Frequently Asked Questions (FAQ) - Qualcomm Dragonwing Documentation

How to debug the sample application using debug logs? To troubleshoot execution failures when running AI sample applications or gst-launch-1.0, enable debug logs. In a GStreamer sample application, the GST_DEBUG environment variable sets the debug level and so controls the verbosity of the debug output. Each level also includes the output of all lower levels. You can set it to one of the following levels:
- 0: None (no debug information)
- 1: ERROR (logs all fatal errors)
- 2: WARNING (logs all warnings)
- 3: FIXME (logs incomplete code paths)
- 4: INFO (logs informational messages)
- 5: DEBUG (logs general debug messages)
- 6: LOG (logs all log messages)
- 7: TRACE (logs trace messages)
- 9: MEMDUMP (logs memory dumps)
For example, to set the debug level to WARNING, which logs errors and warnings, use the following command in your terminal:
```
export GST_DEBUG=2
```
To filter debug logs by category, list comma-separated category:level pairs in the GST_DEBUG variable. For example, to enable DEBUG-level logs for the ML inference plugins and FPS, use:
```
export GST_DEBUG="qtiml*:5,fps*:5"
```
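Debug output at higher levels can be long. The following is a minimal sketch, assuming a writable /tmp on the target, that uses GStreamer's standard GST_DEBUG_FILE and GST_DEBUG_NO_COLOR variables to write the log to a file and strip color codes so it is easier to search. The log path is only an example.

```shell
# Write the debug log to a file instead of stderr (example path, adjust as needed).
export GST_DEBUG_FILE=/tmp/gst-debug.log
# Drop ANSI color codes so the file is easy to search.
export GST_DEBUG_NO_COLOR=1

# Run the sample app or gst-launch-1.0 pipeline from this same shell, then
# search the log for errors and warnings, for example:
#   grep -E " (ERROR|WARN) " "$GST_DEBUG_FILE"
```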
What common issues prevent a quick out-of-the-box experience for AI sample applications? AI sample applications are designed to run out of the box with minimal setup. The following issues most commonly get in the way.
- Failure to load model file
  0:00:00.042355885 4275 0x5579b9f760 ERROR ml-tflite-engine ml-tflite-engine-c-api.cc:578:gst_ml_tflite_engine_new: Failed to load model file '/etc/models/googlenet_quantized.tflite'!
  This issue arises when the model file is either missing or not in the correct format. Copy the model file to the correct path and ensure you are using the same SDK version for model conversion/quantization and on the target device.
- Failure to deserialize labels
  0:00:00.543394063 4676 0x55aca865e0 ERROR mlmodule gstmlmodule.c:301:gst_ml_parse_labels: Failed to deserialize labels!
  This occurs when the label file specified at the path isn’t present. Ensure that the label file is copied to the device and the specified path is correct.
- Failure to set module options
  0:00:00.55129740 4958 0x5561019180 WARN qtimlvclassification mlvclassification.c:986:gst_ml_video_classification_set_caps:<mlvideoclassification0> error: Failed to set module options!
  This occurs when the model constants are set incorrectly for LiteRT or Qualcomm AI Engine Direct use cases. See Discover SDKs → IM SDKs for steps to read and set model constants correctly.
How to profile an AI sample app?
- Preprocessing time: Before input is fed to the AI inferencing plugin, it must be preprocessed (including normalization, rescaling, and color inversion). This task is handled by the preprocessing plugin, qtimlvconverter. To measure the preprocessing time, run the following command:
  export GST_DEBUG=qtimlvconverter:6
  The preprocessing time is displayed in the logs.
  LOG qtimlvconverter mlconverter.c:1830:gst_ml_video_converter_transform:<qtimlvconverter> Conversion took 2.743 ms
- Model inference time: The Qualcomm Intelligent Multimedia SDK supports three AI inferencing plugins that use the Qualcomm Neural Processing SDK, LiteRT, and Qualcomm AI Engine Direct frameworks, respectively.
  - qtimlsnpe
  - qtimltflite
  - qtimlqnn
  These plugins can run inference on various hardware, such as CPU, GPU, and HTP. To determine the AI inference time on the hardware, use the following command.
  export GST_DEBUG=qtiml*:6
  LOG qtimlvtflite mltflite.c:561:gst_ml_tflite_transform:<qtimltflite> Execute took 3.445 ms
  LOG qtimlvtflite mltflite.c:561:gst_ml_tflite_transform:<qtimltflite> Execute took 3.555 ms
- Postprocessing time: The outputs from AI inferencing plugins are handled by postprocessing plugins. These plugins take the AI model’s results and produce elements that can be overlaid on the input stream or used for further computation, such as text boxes for classification use cases or segmentation masks for segmentation use cases. To measure the processing time for these plugins, use the following command.
  export GST_DEBUG=qtimlv*:6
  The postprocessing time is shown in the logs. The following uses the qtimlvclassification plugin as an example.
  LOG qtimlvclassification mlvclassification.c:1068:gst_ml_video_classification_transform:<qtimlvclassification> Categorization took 1.962 ms
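Per-frame timings are easier to read as averages. The following is a rough sketch, not part of the SDK, that averages the "took N ms" values per element with awk. It assumes the log was captured without color codes and that each timing line ends in "ms", as in the samples above. The embedded sample lines stand in for a real log file.

```shell
# Sample log lines in the format shown above; on a device, point LOG at
# the captured debug log instead (for example, the file set in GST_DEBUG_FILE).
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
LOG qtimlvtflite mltflite.c:561:gst_ml_tflite_transform:<qtimltflite> Execute took 3.445 ms
LOG qtimlvtflite mltflite.c:561:gst_ml_tflite_transform:<qtimltflite> Execute took 3.555 ms
LOG qtimlvconverter mlconverter.c:1830:gst_ml_video_converter_transform:<qtimlvconverter> Conversion took 2.743 ms
EOF

# Average every "<element> ... took <N> ms" line, grouped by element name.
SUMMARY=$(awk '
  / took [0-9.]+ ms/ {
    # The element name is the text between "<" and ">".
    name = $0
    sub(/^.*</, "", name)
    sub(/>.*$/, "", name)
    # The duration is the field just before the trailing "ms".
    sum[name] += $(NF - 1)
    count[name]++
  }
  END {
    for (name in sum)
      printf "%s avg=%.3f ms (n=%d)\n", name, sum[name] / count[name], count[name]
  }
' "$LOG" | sort)

echo "$SUMMARY"
rm -f "$LOG"
```

For the three sample lines, this reports an average of 3.500 ms over two frames for qtimltflite and 2.743 ms over one frame for qtimlvconverter.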
How to measure end-to-end FPS of a use case? To measure the frames per second (FPS) in a GStreamer pipeline, use the fpsdisplaysink element. This element can display the current and average framerate either as an overlay on the video or by printing it to the console. Sample apps use the fpsdisplaysink plugin to display the FPS of the pipeline directly on the HDMI monitor.
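As an illustration, the sketch below prints FPS to the console for a test pipeline. videotestsrc and fakesink are stand-ins chosen for this example (assumptions, not the sample apps' actual source and sink). Replace them with the elements of your own use case.

```shell
# videotestsrc and fakesink stand in for the real source and display sink.
# text-overlay=false disables the on-video overlay; FPS is reported on the console.
PIPELINE='videotestsrc num-buffers=150 ! fpsdisplaysink text-overlay=false video-sink=fakesink'

if command -v gst-launch-1.0 >/dev/null 2>&1; then
  # -v prints property changes, including fpsdisplaysink's "last-message"
  # with the current and average FPS.
  # PIPELINE is intentionally unquoted so it splits into separate arguments.
  gst-launch-1.0 -v $PIPELINE
else
  echo "gst-launch-1.0 not found; run this on the target device"
fi
```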
What’s the easiest way to replace an existing model with a custom model in a reference application? Ensure you are using the correct model and label file for the sample application. Provide model and label filepaths in the respective sample app configuration file. The sample app will use the model and label files from the specified locations. See Discover SDKs → IM SDKs for more information.
A user replaced the model in a reference application with another IMSDK-supported model. How to debug performance and accuracy issues? To investigate performance and accuracy issues of the model, use the AI SDK tools and follow these steps:
- Performance measurement:
  Use the SDK benchmarking tool. For example, if you are using the Qualcomm Neural Processing Engine SDK, use snpe-bench-py.
- Accuracy debugger:
  - AI Hub model:
    1. Ensure you are using the latest model from AI Hub.
    2. Correctly populate the constants for your selected model. See Discover SDKs → IM SDKs for more information.
    3. For further support on AI Hub model accuracy issues, report your issue on the Qualcomm AI Hub Slack.
  - Custom model
    1. Model quantization is a common cause of accuracy drops. Ensure you are using the correct dataset for model quantization. For Post-Training Quantization (PTQ), use a portion of the dataset that approximates the actual deployment environment. For PTQ to give good results, feed a sufficient amount of data to quantize the model, for example, approximately 25-30 RAW images.
    2. Experiment with different model precisions, such as W8A16 and W16A16, to see if there is an improvement in model accuracy.
    3. Use AIMET for Post-Training Quantization (PTQ) and Quantization-Aware Training (QAT) techniques for advanced quantization. See the AIMET documentation for more details.
What are the best practices for model quantization using AI SDK?
- Prepare Calibration Data: Use representative calibration data that closely matches the data the model will encounter in production. This helps accurately determine the scaling factors and zero points for quantization.
- Choose the Right Quantization Method
  - Post-Training Quantization (PTQ): This method is simpler and faster, and is suitable for models where a slight accuracy loss is acceptable. It converts a pretrained floating-point model to a quantized model without retraining.
  - Quantization-Aware Training (QAT): This method involves training the model with quantization in mind, which can help maintain higher accuracy. It’s more complex but beneficial for models where accuracy is critical.
  See the AIMET documentation for detailed steps.
AI SDK provides tools to convert LiteRT to DLC. Do users have to convert LiteRT models to DLC, or can LiteRT models be accelerated directly on the NPU? You don’t necessarily have to convert LiteRT models to DLC to accelerate them on the NPU. Qualcomm’s AI SDK supports running LiteRT models directly on the NPU using the LiteRT delegate. This means you can leverage NPU capabilities without converting your LiteRT models to DLC format.
AI SDK provides tools to convert PyTorch, ONNX, and TensorFlow models to DLC. Which of these is the quickest path for deployment? Converting your models to ONNX first and then to DLC is often the quickest and most flexible approach for deployment.
How does a user know if the model is running on the NPU? Use the following tools from Qualcomm:
1. Qualcomm profiler
2. Sysmon
  If you have access to the Hexagon SDK, refer to the sysmon documentation at <Hexagon_sdk_path>/<version>/tools/sysmon_app.html
How do users get the benefit of the heterogeneous AI engines of the Snapdragon platform? How can users run different AI models on different hardware cores (GPU, NPU, and so on)? AI SDK tools and APIs provide options to select the runtime (CPU, GPU, or DSP). You can select the runtime with a command-line parameter or through the C/C++ APIs. See snpe-net-run for more details. The AI sample applications use the DSP runtime by default. You can change the runtime in the sample application configuration. For example, in the gst-ai-object-detection sample app, modify the runtime parameter in the config_detection.json file. Supported runtime values:
- "cpu"
- "gpu"
- "dsp"
How to proceed if model conversion fails with AI SDK? Contact Qualcomm Support Forums for support.
A custom floating-point model provides good accuracy on CPU, but the quantized model's accuracy is bad on both CPU and NPU. How to debug this? The problem is likely related to model quantization. See the FAQ above on debugging a replaced model for guidance on model quantization.
A custom quantized model is running as expected on CPU, but the same model isn’t running accurately on NPU. How to debug this? The qnn-net-run --debug option dumps layerwise values, so you can compare between CPU and NPU inference.
When running a sample app, there is no output on the HDMI screen. How to debug this? Follow display debugging to debug HDMI display issues.
Is it possible to use an HDMI TV instead of an HDMI monitor to run sample applications? In most cases it will work. If it does not, use an HDMI monitor instead.

How to check hardware runtime for AI sample apps? Many GStreamer sample applications support inference on various runtimes (CPU, GPU, and DSP). To determine which runtime the app will use for inference, observe the logs.

CPU

Running app with model: /usr/models/inception_v3_quantized.tflite and labels: /usr/labels/classification.labels
Using CPU Delegate
Adding all elements to the pipeline...

GPU

Running app with model: /usr/models/inception_v3_quantized.tflite and labels: /usr/labels/classification.labels
Using GPU Delegate
Adding all elements to the pipeline...

DSP

Running app with model: /usr/models/inception_v3_quantized.tflite and labels: /usr/labels/classification.labels
Using DSP Delegate
Adding all elements to the pipeline...
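When the application output is long, the runtime line can be picked out automatically. This is a small sketch, assuming the "Using <RUNTIME> Delegate" wording shown in the logs above. The embedded sample text stands in for output captured from a real run.

```shell
# Sample application output as shown above; on a device, capture the
# app's console output to a file and point APP_LOG at it instead.
APP_LOG=$(mktemp)
cat > "$APP_LOG" <<'EOF'
Running app with model: /usr/models/inception_v3_quantized.tflite and labels: /usr/labels/classification.labels
Using DSP Delegate
Adding all elements to the pipeline...
EOF

# Extract the runtime name from the first "Using <RUNTIME> Delegate" line.
RUNTIME=$(sed -n 's/^Using \([A-Z]*\) Delegate$/\1/p' "$APP_LOG" | head -n 1)
echo "Runtime in use: ${RUNTIME:-unknown}"
rm -f "$APP_LOG"
```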

What are devtool sanity check errors? How to debug them? Occasionally, devtool reports sanity check errors. Ensure you have sudo access on the host computer. If the error persists, apply the following workaround:
1. Update permissions.
  umask a+rx
2. Disable BitBake sanity checking in the $ESDK_ROOT/layers/poky/meta/conf/sanity.conf file.
  BB_MIN_VERSION = "1.53.1"
  SANITY_ABIFILE = "${TMPDIR}/abi_version"
  SANITY_VERSION ?= "1"
  LOCALCONF_VERSION ?= "2"
  LAYER_CONF_VERSION ?= "7"
  SITE_CONF_VERSION ?= "1"
  #INHERIT += "sanity"
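The edit in step 2 can also be scripted. The sketch below is one possible way to do it, assuming the unmodified file contains an uncommented INHERIT += "sanity" line. It works on a scratch copy so it can be tried safely. On the host, CONF would point at $ESDK_ROOT/layers/poky/meta/conf/sanity.conf.

```shell
# Scratch stand-in for sanity.conf, reduced to a few lines.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
BB_MIN_VERSION = "1.53.1"
SANITY_ABIFILE = "${TMPDIR}/abi_version"
INHERIT += "sanity"
EOF

# Keep a backup, then comment out the line that enables the sanity class.
cp "$CONF" "$CONF.bak"
sed 's/^INHERIT += "sanity"/#&/' "$CONF.bak" > "$CONF"

# Show the result; the INHERIT line should now start with "#".
grep -n 'INHERIT' "$CONF"
```

Keeping the .bak copy makes it easy to restore sanity checking once the underlying host issue is resolved.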
Steps to sideload the Qualcomm Neural Processing Engine SDK on the target device.
1. Go to the target device shell and run the following commands:
  ssh root@<ip-address of target device>
  mount -o rw,remount /
  exit
2. Go to the SNPE SDK root folder on the host computer and run the following commands. The commands shown are for QCS6490.
  - For QCS8275: Replace hexagon-v68 with hexagon-v75
  - For QCS9075: Replace hexagon-v68 with hexagon-v73
  scp lib/aarch64-oe-linux-gcc11.2/* root@<target-ip-address>:/usr/lib/
  scp lib/hexagon-v68/unsigned/* root@<target-ip-address>:/usr/lib/rfsa/adsp/
  scp bin/aarch64-oe-linux-gcc11.2/* root@<target-ip-address>:/usr/bin/
  ssh root@<target-ip-address>
  chmod -R 777 /usr/bin/
3. Validate the new SDK version:
  snpe-net-run --version
For more information about downloading Qualcomm AI Runtime SDK, see Qualcomm package manager.
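The per-chipset substitution in step 2 can be expressed as a small lookup. This is only a sketch: SOC is a hypothetical variable introduced for this example, the mapping is taken from the notes in step 2, and the command is printed for review rather than executed.

```shell
# SOC is a hypothetical variable naming the target chipset.
SOC=QCS6490

# Map the chipset to its Hexagon library folder, per the notes in step 2.
case "$SOC" in
  QCS6490) HEXAGON_DIR=hexagon-v68 ;;
  QCS8275) HEXAGON_DIR=hexagon-v75 ;;
  QCS9075) HEXAGON_DIR=hexagon-v73 ;;
  *) echo "Unknown chipset: $SOC" >&2; HEXAGON_DIR= ;;
esac

# Print the DSP library copy command instead of running it.
echo "scp lib/$HEXAGON_DIR/unsigned/* root@<target-ip-address>:/usr/lib/rfsa/adsp/"
```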
Debug steps to take a preprocessed tensor from the QIM SDK and run inference using SNPE.
1. Generate preprocessed raw tensors from a video file.
  1. Create a folder to save the raw tensors to.
    mkdir -p /opt/frames/
  2. Run the GStreamer pipeline to dump the raw tensors to the folder.
    gst-launch-1.0 -v -e filesrc location=/etc/media/video.mp4 ! qtdemux ! queue ! h264parse ! v4l2h264dec capture-io-mode=4 output-io-mode=4 ! queue ! qtimlvconverter ! queue ! neural-network/tensors,type=FLOAT32,rate=30000/1000,dimensions="<<1,520,520,3>>" ! queue ! multifilesink location="/opt/frames/frame_%03d.raw"
    Adjust the rate (FPS of the input video) and the dimensions (input dimensions of the model) as needed for the video and model you are using.
2. Run inference with SNPE or QNN.
  1. Create an input_list.txt file containing the absolute paths to the raw files, one path per line. Inference is performed on each file listed in input_list.txt. Example content of input_list.txt:
    /opt/frames/frame_000.raw
    /opt/frames/frame_001.raw
  2. Run the SNPE DLC or QNN model on the HTP backend.
    - SNPE
      snpe-net-run --container <model>.dlc --input_list input_list.txt --output_dir output_htp --use_dsp
    - QNN
      qnn-net-run --model <model>.so --backend libQnnHtp.so --input_list input_list.txt --output_dir output_htp
  For more details, see deploy a model using Neural Processing Engine or AI Engine Direct.
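Writing input_list.txt by hand becomes tedious with many frames. The following is a minimal sketch that builds it with find and sort. A scratch directory with empty placeholder files stands in for /opt/frames/ so the example is self-contained. On the device, set FRAMES_DIR=/opt/frames and skip the touch line.

```shell
# Scratch stand-in for /opt/frames/ with three placeholder frame files.
FRAMES_DIR=$(mktemp -d)
touch "$FRAMES_DIR/frame_000.raw" "$FRAMES_DIR/frame_001.raw" "$FRAMES_DIR/frame_002.raw"

# List every dumped frame by absolute path, one per line, in frame order.
find "$FRAMES_DIR" -maxdepth 1 -name 'frame_*.raw' | sort > "$FRAMES_DIR/input_list.txt"

cat "$FRAMES_DIR/input_list.txt"
```

Because FRAMES_DIR is an absolute path, find emits absolute paths, which matches what the example input_list.txt above contains.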

Further support

Ask your question on the Qualcomm support forum.