# Runtime performance tuning

Qualcomm<sup>®</sup> Linux exposes sysfs interfaces, kernel parameters, and device tree controls for tuning runtime performance. This page covers CPU frequency scaling, memory subsystem knobs, block I/O scheduling, Qualcomm-specific DCVS and DDR bandwidth controls, and hardware performance profiling.

## **CPU performance tuning**

### Select the CPU frequency governor

Qualcomm Linux uses `schedutil` as the default CPUfreq governor, with one policy per CPU cluster. On QCS6490 there are three clusters: little (cpu0–cpu3, Cortex-A55), big (cpu4–cpu6, Cortex-A78), and prime (cpu7, Cortex-A78). On IQ-9075 there are four clusters.

Read the current governor for all policies:

```bash theme={null}
cat /sys/devices/system/cpu/cpufreq/policy*/scaling_governor
```

Change the governor for all policies:

```bash theme={null}
# Maximum throughput: pins each cluster at scaling_max_freq
for p in /sys/devices/system/cpu/cpufreq/policy*/; do
    echo performance > "${p}scaling_governor"
done

# Restore adaptive scheduling
for p in /sys/devices/system/cpu/cpufreq/policy*/; do
    echo schedutil > "${p}scaling_governor"
done
```
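A governor name that the kernel does not offer for a policy is rejected on write, so it can help to check `scaling_available_governors` first. The following is a minimal sketch rather than a tool shipped with Qualcomm Linux: the `set_governor` helper and its optional root-directory argument (handy for trying it against a mock tree) are hypothetical names introduced here.

```bash
# Hypothetical helper: set one governor on every CPUfreq policy that offers it.
# $1 = governor name, $2 = cpufreq root (defaults to the real sysfs path).
set_governor() {
    gov=$1
    root=${2:-/sys/devices/system/cpu/cpufreq}
    for p in "$root"/policy*/; do
        [ -d "$p" ] || continue
        # Skip policies that do not list the requested governor.
        if ! grep -qw -- "$gov" "${p}scaling_available_governors" 2>/dev/null; then
            echo "skip ${p}: $gov not available" >&2
            continue
        fi
        echo "$gov" > "${p}scaling_governor"
    done
}

# Example (run as root on the target):
#   set_governor performance
```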

**Table: CPUfreq sysfs frequency controls**

| sysfs path                         | Description                                 |
| ---------------------------------- | ------------------------------------------- |
| `cpufreq/policyN/scaling_max_freq` | Maximum frequency cap for cluster N (kHz)   |
| `cpufreq/policyN/scaling_min_freq` | Minimum frequency floor for cluster N (kHz) |
| `cpufreq/policyN/scaling_cur_freq` | Currently selected frequency (read-only)    |
| `cpufreq/policyN/cpuinfo_max_freq` | Hardware maximum frequency (read-only)      |

Paths are relative to `/sys/devices/system/cpu/`.
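
The CPUfreq sysfs files report and accept frequencies in kHz. As a rough illustration, the sketch below prints the limits of each policy in MHz. The `show_policy_freqs` helper and its optional root argument are hypothetical; the commented cap value reuses the 1516.8 MHz OPP shown later on this page and may not exist on every target.

```bash
# Hypothetical helper: print the frequency limits of each policy in MHz.
# $1 = cpufreq root (defaults to the real sysfs path).
show_policy_freqs() {
    root=${1:-/sys/devices/system/cpu/cpufreq}
    for p in "$root"/policy*/; do
        [ -d "$p" ] || continue
        min=$(cat "${p}scaling_min_freq")
        max=$(cat "${p}scaling_max_freq")
        hw=$(cat "${p}cpuinfo_max_freq")
        # sysfs reports kHz; integer-divide by 1000 for MHz.
        echo "$(basename "$p"): min=$((min / 1000)) max=$((max / 1000)) hw_max=$((hw / 1000)) MHz"
    done
}

# Example: cap policy0 at 1516.8 MHz (value in kHz, run as root):
#   echo 1516800 > /sys/devices/system/cpu/cpufreq/policy0/scaling_max_freq
```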

### Tune schedutil responsiveness

The `rate_limit_us` tunable prevents schedutil from changing frequency more often than the specified interval in microseconds. Lower values improve responsiveness at the cost of more frequency transitions:

```bash theme={null}
echo 200 > /sys/devices/system/cpu/cpufreq/policy4/schedutil/rate_limit_us
```

For Energy Aware Scheduling (EAS), utilization clamping (uclamp), and CPU topology details, see [Configure the scheduler](./configure-the-scheduler).

## **CPU isolation and IRQ affinity**

For latency-sensitive workloads such as real-time audio or deterministic sensor pipelines, isolate CPUs from the general scheduler and redirect interrupts away from them.

### Isolate CPUs at boot

Add the following parameters to the kernel command line:

```text theme={null}
isolcpus=6,7 rcu_nocbs=6,7 nohz_full=6,7 irqaffinity=0-5
```

<Note>
  `isolcpus` is a static boot-time mechanism. For dynamic CPU isolation without a reboot, use the cgroup `cpuset` controller.
</Note>
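
After rebooting with these parameters, the kernel reports the resulting CPU sets under `/sys/devices/system/cpu/`. The sketch below reads them back; the `show_isolation` helper and its optional root argument are hypothetical, and the `nohz_full` file is only present on kernels built with `CONFIG_NO_HZ_FULL`.

```bash
# Hypothetical helper: print the kernel's view of isolated and tickless CPUs.
# $1 = CPU sysfs root (defaults to the real path; override only for testing).
show_isolation() {
    root=${1:-/sys/devices/system/cpu}
    for f in isolated nohz_full; do
        if [ -r "$root/$f" ]; then
            # An empty value means no CPU is in that set.
            echo "$f: $(cat "$root/$f")"
        else
            echo "$f: not available"
        fi
    done
}

# Example (on the target):
#   show_isolation
#   cat /proc/cmdline    # confirm the parameters reached the kernel
```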

### Configure IRQ affinity at runtime

List all IRQs with their current CPU affinity:

```bash theme={null}
for i in /proc/irq/*/smp_affinity_list; do
    echo "$i: $(cat "$i")"
done
```

Move a specific IRQ to a designated CPU:

```bash theme={null}
echo 4 > /proc/irq/<irq-number>/smp_affinity_list
```

Replace the following:

* `<irq-number>` by the interrupt number from `/proc/interrupts`.
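
To keep interrupts away from isolated CPUs after boot, the same write can be repeated for every IRQ. The sketch below is one possible approach, not a Qualcomm-provided script: `steer_irqs` and its optional root argument are hypothetical. Some interrupts (for example per-CPU ones) refuse the write, so failures are counted out rather than treated as errors.

```bash
# Hypothetical helper: steer every IRQ that accepts it onto the given CPU list.
# $1 = CPU list such as 0-5, $2 = IRQ root (defaults to /proc/irq).
steer_irqs() {
    cpus=$1
    root=${2:-/proc/irq}
    moved=0
    for f in "$root"/*/smp_affinity_list; do
        [ -e "$f" ] || continue
        # Interrupts that cannot be moved reject the write; skip them quietly.
        if { echo "$cpus" > "$f"; } 2>/dev/null; then
            moved=$((moved + 1))
        fi
    done
    echo "moved $moved IRQs to CPUs $cpus"
}

# Example (run as root on the target):
#   steer_irqs 0-5
```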

### Run processes on isolated CPUs

```bash theme={null}
taskset -c <cpu-list> <command>
```

Replace the following:

* `<cpu-list>` by a comma-separated list or a range of CPUs, for example `6,7` or `4-7`.
* `<command>` by the process name or path to execute.

Change the CPU affinity of a running process:

```bash theme={null}
taskset -pc <cpu-list> <pid>
```

Replace the following:

* `<pid>` by the process ID to modify.
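
To confirm that an affinity change took effect, the kernel's own record in `/proc/<pid>/status` can be read back. A minimal sketch follows; the `allowed_cpus` helper and its optional proc-root argument are hypothetical names used for illustration.

```bash
# Hypothetical helper: print the CPUs a process may run on.
# $1 = PID (defaults to the current shell), $2 = proc root (defaults to /proc).
allowed_cpus() {
    pid=${1:-$$}
    root=${2:-/proc}
    # Cpus_allowed_list holds the affinity mask in list form, for example 6-7.
    awk '/^Cpus_allowed_list:/ { print $2 }' "$root/$pid/status"
}

# Example (on the target):
#   taskset -c 6,7 sleep 60 &
#   allowed_cpus $!
```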

## **Memory performance tuning**

### Transparent Huge Pages (THP)

Transparent Huge Pages (THP) reduce TLB pressure for memory-intensive workloads. The default Qualcomm kernel sets THP to `madvise` mode (application opt-in).

```bash theme={null}
# Check current mode
cat /sys/kernel/mm/transparent_hugepage/enabled

# Enable system-wide
echo always > /sys/kernel/mm/transparent_hugepage/enabled

# Disable to reduce latency spikes from huge page allocation
echo never > /sys/kernel/mm/transparent_hugepage/enabled
```

Required Kconfig:

```text theme={null}
CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
```
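
Reading the `enabled` file returns all modes, with the active one in square brackets (for example `always [madvise] never`). The sketch below extracts just the active mode, which is convenient in scripts; `thp_mode` and its optional file argument are hypothetical.

```bash
# Hypothetical helper: print only the active THP mode (the bracketed word).
# $1 = file to read (defaults to the real sysfs path).
thp_mode() {
    f=${1:-/sys/kernel/mm/transparent_hugepage/enabled}
    sed -n 's/.*\[\(.*\)\].*/\1/p' "$f"
}

# Example (on the target):
#   [ "$(thp_mode)" = "madvise" ] && echo "THP is application opt-in"
```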

### Virtual memory sysctl knobs

**Table: Key /proc/sys/vm tuning parameters**

| Parameter                   | Effect                                                                                                                                                         |
| --------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `vm/swappiness`             | Controls tendency to swap anonymous memory versus reclaiming page cache. Lower (10–20) on RAM-constrained boards to keep anonymous memory in RAM. Default: 60. |
| `vm/dirty_ratio`            | Percent of memory that may hold dirty pages before writers are throttled until writeback catches up. Increase (30–40) for write-heavy workloads. Default: 20. |
| `vm/dirty_background_ratio` | Percentage at which background writeback starts. Default: 10.                                                                                                  |
| `vm/vfs_cache_pressure`     | Tendency to reclaim dentries and inodes. Lower (50) to keep file system metadata cached; raise (200) for embedded targets with limited RAM. Default: 100.      |
| `vm/min_free_kbytes`        | Minimum free-memory reserve that sets the zone watermarks. Raise for real-time workloads to reduce latency spikes from direct reclaim.                         |
| `vm/watermark_boost_factor` | Temporarily boosts watermarks after a fragmentation event so that more memory is reclaimed. Set to 0 to disable. Default: 15000 (150%).                        |

Paths are relative to `/proc/sys/`.
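
Before changing any of these values, it is useful to record what the target currently uses. The sketch below prints the knobs from the table; `show_vm_knobs` and its optional root argument are hypothetical. The commented lines show one way to apply and persist a value, assuming the image loads `/etc/sysctl.d` at boot (for example through `systemd-sysctl`); the file name is an arbitrary choice.

```bash
# Hypothetical helper: print the current value of each knob in the table.
# $1 = sysctl root (defaults to /proc/sys).
show_vm_knobs() {
    root=${1:-/proc/sys}
    for k in swappiness dirty_ratio dirty_background_ratio \
             vfs_cache_pressure min_free_kbytes watermark_boost_factor; do
        # Knobs missing from this kernel are skipped silently.
        [ -r "$root/vm/$k" ] && echo "vm.$k = $(cat "$root/vm/$k")"
    done
    return 0
}

# Example: apply a value now, then persist it across reboots (run as root):
#   sysctl -w vm.swappiness=10
#   echo "vm.swappiness = 10" > /etc/sysctl.d/90-tuning.conf
```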

### ZRAM and CMA

For ZRAM swap configuration and CMA region tuning, see [Configure and manage memory](./configure-and-manage-memory).

## **Block I/O tuning**

### Select the I/O scheduler

Qualcomm platforms typically use UFS or eMMC storage behind the multi-queue block layer (blk-mq). View the current scheduler for all block devices:

```bash theme={null}
for dev in /sys/block/*/queue/scheduler; do
    echo "$dev: $(cat "$dev")"
done
```

**Table: Recommended I/O schedulers by storage type**

| Storage type | Scheduler         | Rationale                                                                                              |
| ------------ | ----------------- | ------------------------------------------------------------------------------------------------------ |
| UFS          | `none`            | UFS controller handles its own command queueing (UFS Command Queue). Passthrough gives lowest latency. |
| eMMC         | `mq-deadline`     | Deadline-based ordering prevents read starvation on mixed workloads.                                   |
| NVMe         | `none` or `kyber` | `none` minimizes overhead on fast devices; `kyber` adds token-based latency targeting.                 |

Set the scheduler for a device:

```bash theme={null}
echo none > /sys/block/<device>/queue/scheduler
```

Replace the following:

* `<device>` by the block device name, for example `sda`.
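
The table can be turned into a small selection helper. This is a sketch under stated assumptions, not a shipped utility: `pick_scheduler` and `active_scheduler` are hypothetical names, and the mapping relies on the usual Linux naming where UFS LUNs appear as `sd*`, eMMC as `mmcblk*`, and NVMe namespaces as `nvme*`. Note that USB and SATA disks also appear as `sd*`.

```bash
# Hypothetical helper: map a block device name to the scheduler from the table.
pick_scheduler() {
    case $1 in
        sd*)     echo none ;;          # SCSI disks, which is how UFS LUNs appear
        mmcblk*) echo mq-deadline ;;   # eMMC and SD cards
        nvme*)   echo none ;;          # NVMe namespaces
        *)       return 1 ;;           # unknown type: leave the scheduler alone
    esac
}

# Hypothetical helper: print the active scheduler (the bracketed entry).
# $1 = device name, $2 = block sysfs root (defaults to /sys/block).
active_scheduler() {
    sed -n 's/.*\[\(.*\)\].*/\1/p' "${2:-/sys/block}/$1/queue/scheduler"
}

# Example (run as root on the target):
#   for d in /sys/block/*; do
#       name=$(basename "$d")
#       s=$(pick_scheduler "$name") && echo "$s" > "$d/queue/scheduler"
#   done
```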

### Tune read-ahead

```bash theme={null}
# Increase for sequential workloads such as firmware update or logging
echo 512 > /sys/block/<device>/queue/read_ahead_kb

# Decrease for random access workloads such as databases
echo 128 > /sys/block/<device>/queue/read_ahead_kb
```

Replace the following:

* `<device>` by the block device name, for example `sda`.

## **Qualcomm DCVS and DDR bandwidth**

Qualcomm Dynamic Clock and Voltage Scaling (DCVS) manages the frequencies of the L3 cache, the Last Level Cache Controller (LLCC), and DDR. Two mechanisms drive memory frequency votes: a static OPP table that ties each CPU frequency to memory bandwidth values, and the bandwidth monitor (BWMON) governor that votes based on measured bus traffic.

### Static CPU-to-memory mapping

The `qcom-cpufreq-hw` driver reads `opp-peak-kBps` from each CPU OPP entry in the DTSI. When the CPU runs at a given frequency, the driver votes for the corresponding L3 and DDR bandwidth. Example entry from a QCS6490 DTSI:

```text theme={null}
cpu0_opp_1516mhz: opp-1516800000 {
    opp-hz        = /bits/ 64 <1516800000>;
    opp-peak-kBps = <5400000 51200000>;  /* DDR kBps, L3 kBps */
};
```

Increasing `opp-peak-kBps` values votes for higher memory frequencies at lower CPU frequencies, trading power for lower memory latency. The driver source is `drivers/cpufreq/qcom-cpufreq-hw.c`.

### BWMON governor

The `icc-bwmon` driver (`drivers/soc/qcom/icc-bwmon.c`) samples CPU-to-LLCC and CPU-to-DDR bandwidth counters and casts ICC bandwidth votes accordingly. Tune the sampling period and thresholds in the device tree:

```text theme={null}
&bwmon_cpu {
    qcom,count-unit     = <0x1000>;  /* 4 KB per count */
    qcom,threshold-high = <400>;
    qcom,threshold-med  = <200>;
    qcom,threshold-low  = <50>;
    qcom,sample-ms      = <4>;       /* 4 ms sampling period */
};
```

Monitor active ICC bandwidth votes at runtime:

```bash theme={null}
cat /sys/kernel/debug/interconnect/interconnect_summary
```

For DVFS governor selection and cache frequency mapping details, see [Configure the dynamic voltage and frequency scaling (DVFS) governors](./configure-the-dynamic-voltage-and-frequency-scaling-dvfs-governors).

## **Hardware performance profiling**

ARM64 Qualcomm SoCs expose hardware performance counters through the ARM PMUv3 interface. The `perf` tool uses these to profile CPU utilization, cache misses, and memory stalls.

### Required Kconfig

```text theme={null}
CONFIG_PERF_EVENTS=y
CONFIG_HW_PERF_EVENTS=y
CONFIG_ARM_PMU=y
CONFIG_QCOM_L2_PMU=y
CONFIG_QCOM_L3_PMU=y
```

### Collect a system-wide performance snapshot

```bash theme={null}
perf stat -a -e cycles,instructions,cache-misses,cache-references sleep 10
```
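
Two ratios are commonly derived from these four counters: instructions per cycle (IPC) and the cache miss rate. `perf stat` prints some derived values itself, but when the raw counts are collected in logs (for example with `perf stat -x,`), they can be recomputed as sketched below. The `perf_ratios` helper is hypothetical and the sample counts are made up.

```bash
# Hypothetical helper: derive IPC and cache miss rate from four raw counts.
# Arguments: cycles instructions cache-misses cache-references
perf_ratios() {
    awk -v c="$1" -v i="$2" -v m="$3" -v r="$4" 'BEGIN {
        # Guard against division by zero when a counter was not collected.
        if (c > 0) printf "IPC: %.2f\n", i / c
        if (r > 0) printf "cache miss rate: %.1f%%\n", 100 * m / r
    }'
}

# Example with made-up counts:
#   perf_ratios 2000000 3000000 5000 100000
```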

### Profile a specific process

```bash theme={null}
perf record -g -p <pid> -- sleep 5
perf report
```

Replace the following:

* `<pid>` by the process ID to profile.

### Qualcomm L3 PMU events

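The Qualcomm L3 PMU driver exposes per-slice L3 cache events with the `qcom_l3_cache/` prefix. If `perf` does not find the event, check the PMU names registered under `/sys/bus/event_source/devices/`. Example: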

```bash theme={null}
perf stat -e qcom_l3_cache/event=0x21/ -a sleep 5
```

### Live profiling with perf top

```bash theme={null}
perf top --call-graph dwarf
```
