Basic information
- Board URL (official): https://radxa.com/products/dragon/q6a/
- Board purchased from: AliExpress
- Board purchase date: October 27, 2025
- Board specs (as tested): 8GB RAM
- Board price (as tested): $141.39
Linux/system information
# output of `screenfetch`
./+o+- radxa@radxa-dragon-q6a
yyyyy- -yyyyyy+ OS: Ubuntu 24.04 noble
://+//////-yyyyyyo Kernel: aarch64 Linux 6.17.1-3-qcom
.++ .:/++++++/-.+sss/` Uptime: 1m
.:++o: /++++++++/:--:/- Packages: 1871
o:+o+:++.`..```.-/oo+++++/ Shell: dash
.:+o:+o/. `+sssoo+/ Disk: 6.3G / 30G (23%)
.++/+:+oo+o:` /sssooo. CPU: ARM Cortex-A55 Cortex-A78 @ 8x 1.9584GHz
/+++//+:`oo+o /::--:. GPU:
\+/+o+++`o++o ++////. RAM: 683MiB / 7604MiB
.++.o+++oo+:` /dddhhh.
.+.o+oo:. `oddhhhh+
\+.++o+o``-````.:ohdhhhhh+
`:o+++ `ohhhhhhhhyo++os:
.o:`.syhhhhhhh/.oo++o`
/osyyyyyyo++ooo+++/
````` +oo+++o\:
`oo++.
# output of `uname -a`
Linux radxa-dragon-q6a 6.17.1-3-qcom #3 SMP PREEMPT_DYNAMIC Wed Nov 5 14:13:05 UTC 2025 aarch64 aarch64 aarch64 GNU/Linux
System topology
Note: lstopo results may be missing some information on new and strange SoCs.
Benchmark results
CPU
- Geekbench 6: (1176 single / 3103 multi - https://browser.geekbench.com/v6/cpu/15371815)
- 48.408 Gflops at 10.1W for 4.79 Gflops/W (geerlingguy/top500-benchmark HPL result)
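
The Gflops/W figure above is the HPL result divided by the wall power measured during the run. A minimal Python sketch of that arithmetic, where the helper name `gflops_per_watt` is my own and not part of top500-benchmark:

```python
def gflops_per_watt(gflops: float, watts: float) -> float:
    """Return compute efficiency in Gflops per watt."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return gflops / watts


# Values taken from the HPL result reported above.
hpl_gflops = 48.408  # HPL score in Gflops
hpl_watts = 10.1     # wall power during the HPL run, in watts

efficiency = gflops_per_watt(hpl_gflops, hpl_watts)
print(f"{efficiency:.2f} Gflops/W")  # prints "4.79 Gflops/W"
```

The same helper can be reused to compare other boards, as long as power is measured at the wall in the same way.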
Power
- Idle power draw (at wall): 2.9 W
- Maximum simulated power draw (`stress-ng --matrix 0`): 9.3 W
- During Geekbench multicore benchmark: 9.5 W
- During `top500` HPL benchmark: 10.1 W
Disk
Samsung Pro+ 32GB microSD
| Benchmark | Result |
|---|---|
| iozone 4K random read | 8.91 MB/s |
| iozone 4K random write | 0.80 MB/s |
| iozone 1M random read | 72.64 MB/s |
| iozone 1M random write | 2.32 MB/s |
| iozone 1M sequential read | 73.70 MB/s |
| iozone 1M sequential write | 61.45 MB/s |
Network
iperf3 results:
Built-in Ethernet (Realtek 1 Gbps)
- `iperf3 -c $SERVER_IP`: 943 Mbps
- `iperf3 -c $SERVER_IP --reverse`: 942 Mbps
- `iperf3 -c $SERVER_IP --bidir`: 920 Mbps up, 937 Mbps down
Built-in WiFi (AICSemi AIC 8800D80 WiFi 6)
- `iperf3 -c $SERVER_IP`: 187 Mbps
- `iperf3 -c $SERVER_IP --reverse`: 177 Mbps
- `iperf3 -c $SERVER_IP --bidir`: 113 Mbps up, 107 Mbps down
GPU
glmark2
glmark2-es2 results:
ATTENTION: default value of option force_gl_vendor overridden by environment.
=======================================================
glmark2 2023.01
=======================================================
OpenGL Information
GL_VENDOR: notfreedreno
GL_RENDERER: FD643
GL_VERSION: 4.6 (Compatibility Profile) Mesa 25.0.7-0ubuntu0.24.04.2
Surface Config: buf=32 r=8 g=8 b=8 a=8 depth=24 stencil=0 samples=0
Surface Size: 800x600 windowed
=======================================================
[build] use-vbo=false: FPS: 1884 FrameTime: 0.531 ms
[build] use-vbo=true: FPS: 2078 FrameTime: 0.481 ms
[texture] texture-filter=nearest: FPS: 1914 FrameTime: 0.523 ms
[texture] texture-filter=linear: FPS: 1827 FrameTime: 0.548 ms
[texture] texture-filter=mipmap: FPS: 1814 FrameTime: 0.551 ms
[shading] shading=gouraud: FPS: 1865 FrameTime: 0.536 ms
[shading] shading=blinn-phong-inf: FPS: 1856 FrameTime: 0.539 ms
[shading] shading=phong: FPS: 1832 FrameTime: 0.546 ms
[shading] shading=cel: FPS: 1810 FrameTime: 0.553 ms
[bump] bump-render=high-poly: FPS: 1392 FrameTime: 0.719 ms
[bump] bump-render=normals: FPS: 2008 FrameTime: 0.498 ms
[bump] bump-render=height: FPS: 1907 FrameTime: 0.525 ms
[effect2d] kernel=0,1,0;1,-4,1;0,1,0;: FPS: 1673 FrameTime: 0.598 ms
[effect2d] kernel=1,1,1,1,1;1,1,1,1,1;1,1,1,1,1;: FPS: 1249 FrameTime: 0.801 ms
[pulsar] light=false:quads=5:texture=false: FPS: 1901 FrameTime: 0.526 ms
[desktop] blur-radius=5:effect=blur:passes=1:separable=true:windows=4: FPS: 1194 FrameTime: 0.838 ms
[desktop] effect=shadow:windows=4: FPS: 1672 FrameTime: 0.598 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 330 FrameTime: 3.033 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=subdata: FPS: 553 FrameTime: 1.811 ms
[buffer] columns=200:interleave=true:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 730 FrameTime: 1.370 ms
[ideas] speed=duration: FPS: 1269 FrameTime: 0.788 ms
[jellyfish] <default>: FPS: 1668 FrameTime: 0.600 ms
[terrain] <default>: FPS: 218 FrameTime: 4.592 ms
[shadow] <default>: FPS: 1304 FrameTime: 0.767 ms
[refract] <default>: FPS: 224 FrameTime: 4.472 ms
[conditionals] fragment-steps=0:vertex-steps=0: FPS: 1840 FrameTime: 0.544 ms
[conditionals] fragment-steps=5:vertex-steps=0: FPS: 1803 FrameTime: 0.555 ms
[conditionals] fragment-steps=0:vertex-steps=5: FPS: 1795 FrameTime: 0.557 ms
[function] fragment-complexity=low:fragment-steps=5: FPS: 1788 FrameTime: 0.559 ms
[function] fragment-complexity=medium:fragment-steps=5: FPS: 1777 FrameTime: 0.563 ms
[loop] fragment-loop=false:fragment-steps=5:vertex-steps=5: FPS: 1782 FrameTime: 0.561 ms
[loop] fragment-steps=5:fragment-uniform=false:vertex-steps=5: FPS: 1747 FrameTime: 0.573 ms
[loop] fragment-steps=5:fragment-uniform=true:vertex-steps=5: FPS: 1758 FrameTime: 0.569 ms
=======================================================
glmark2 Score: 1528
=======================================================
vkmark
vkmark results:
ERROR: VkInstanceCreateInfo::pApplicationInfo::apiVersion has value of 0 which is not permitted. If apiVersion is not 0, then it must be greater than or equal to the value of VK_API_VERSION_1_0 [VUID-VkApplicationInfo-apiVersion]
ATTENTION: default value of option force_gl_vendor overridden by environment.
=======================================================
vkmark 2017.08
=======================================================
Vendor ID: 0x5143
Device ID: 0x6030500
Device Name: Turnip Adreno (TM) 643
Driver Version: 104857607
Device UUID: cc360f788b9de32391c52de0238d6409
=======================================================
[vertex] device-local=true: FPS: 3384 FrameTime: 0.296 ms
[vertex] device-local=false: FPS: 3572 FrameTime: 0.280 ms
[texture] anisotropy=0: FPS: 2996 FrameTime: 0.334 ms
[texture] anisotropy=16: FPS: 3010 FrameTime: 0.332 ms
[shading] shading=gouraud: FPS: 2196 FrameTime: 0.455 ms
[shading] shading=blinn-phong-inf: FPS: 2219 FrameTime: 0.451 ms
[shading] shading=phong: FPS: 2213 FrameTime: 0.452 ms
[shading] shading=cel: FPS: 2235 FrameTime: 0.447 ms
[effect2d] kernel=edge: FPS: 4825 FrameTime: 0.207 ms
[effect2d] kernel=blur: FPS: 2804 FrameTime: 0.357 ms
[desktop] <default>: FPS: 3122 FrameTime: 0.320 ms
[cube] <default>: FPS: 5063 FrameTime: 0.198 ms
[clear] <default>: FPS: 5321 FrameTime: 0.188 ms
=======================================================
vkmark Score: 3304
=======================================================
GravityMark
GravityMark results:
M: 0 us: ../data.zip: 313 files
M: 350 us: Temporal antialiasing
M: 374 us: Render Statistics
M: 48.97 ms: Build Date: Jun 20 2025
M: 49.11 ms: Build Info: version=20250429; linux; arm64; release; vk=1; gl=45; gles=32; cu=1; fusion
M: 49.15 ms: Build Version: 1.89
M: 52.98 ms: Name: Radxa Dragon Q6A
M: 53.12 ms: System: Ubuntu 24.04.3 LTS
M: 53.17 ms: Kernel: Linux 6.17.1-3-qcom aarch64
M: 53.21 ms: Memory: 7.43 GB
M: 53.27 ms: Uptime: 13 m 4 s
M: 53.30 ms: CPU: arm64
M: 62.04 ms: Desktop: 1920x1080 1.0
M: 65.64 ms: Screen 0: 1920x1080 0 0 HDMI-1
M: 65.87 ms: Creating 1600x900 OpenGL Window
M: 217.50 ms: Render Size: 1600x900
M: 217.65 ms: Using Fetch Mode
Segmentation fault
E: 24.376 s: Session::read(): can't exec "./GravityMark.arm64 -gl -ta 1 -a 200000 -fps 1 -info 1 -sensors 1 -name "geerlingguy" -benchmark 1" command
GravityMark wouldn't run with either Vulkan or OpenGL on the official image.
AI / LLM Inference
tinyllama-1.1b-1t-openorca.Q4_K_M.gguf
| model | size | params | backend | threads | test | t/s |
|---|---|---|---|---|---|---|
| llama 1B Q4_K - Medium | 636.18 MiB | 1.10 B | CPU | 8 | pp512 | 25.25 ± 0.11 |
| llama 1B Q4_K - Medium | 636.18 MiB | 1.10 B | CPU | 8 | pp4096 | 19.08 ± 0.02 |
| llama 1B Q4_K - Medium | 636.18 MiB | 1.10 B | CPU | 8 | tg128 | 15.23 ± 0.33 |
| llama 1B Q4_K - Medium | 636.18 MiB | 1.10 B | CPU | 8 | pp4096+tg128 | 18.22 ± 0.03 |
Power consumption: 7.8W
Llama-3.2-3B-Instruct-Q4_K_M.gguf
| model | size | params | backend | threads | test | t/s |
|---|---|---|---|---|---|---|
| llama 3B Q4_K - Medium | 1.87 GiB | 3.21 B | CPU | 8 | pp512 | 8.76 ± 0.13 |
| llama 3B Q4_K - Medium | 1.87 GiB | 3.21 B | CPU | 8 | pp4096 | 7.36 ± 0.03 |
| llama 3B Q4_K - Medium | 1.87 GiB | 3.21 B | CPU | 8 | tg128 | 4.81 ± 0.14 |
| llama 3B Q4_K - Medium | 1.87 GiB | 3.21 B | CPU | 8 | pp4096+tg128 | 7.00 ± 0.01 |
Power consumption: 7.9W
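
For a rough efficiency comparison between the two models, token generation speed can be divided by power draw to get tokens per joule. This is only a sketch: it assumes the single power figure listed for each model also applies during the `tg128` test, which the results above do not break out separately, and the helper name `tokens_per_joule` is hypothetical.

```python
def tokens_per_joule(tokens_per_second: float, watts: float) -> float:
    """Tokens generated per joule of energy (t/s divided by W)."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return tokens_per_second / watts


# (tg128 tokens/s, power in watts) taken from the tables above.
# Assumption: the reported power applies to the tg128 phase.
results = {
    "tinyllama-1.1b Q4_K_M": (15.23, 7.8),
    "Llama-3.2-3B-Instruct Q4_K_M": (4.81, 7.9),
}

efficiency = {
    name: tokens_per_joule(tps, watts)
    for name, (tps, watts) in results.items()
}

for name, value in efficiency.items():
    print(f"{name}: {value:.2f} tokens/J")
```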
Memory
tinymembench results:
tinymembench v0.4.10 (simple benchmark for memory throughput and latency)
==========================================================================
== Memory bandwidth tests ==
== ==
== Note 1: 1MB = 1000000 bytes ==
== Note 2: Results for 'copy' tests show how many bytes can be ==
== copied per second (adding together read and writen ==
== bytes would have provided twice higher numbers) ==
== Note 3: 2-pass copy means that we are using a small temporary buffer ==
== to first fetch data into it, and only then write it to the ==
== destination (source -> L1 cache, L1 cache -> destination) ==
== Note 4: If sample standard deviation exceeds 0.1%, it is shown in ==
== brackets ==
==========================================================================
C copy backwards : 8392.6 MB/s (0.3%)
C copy backwards (32 byte blocks) : 8391.2 MB/s
C copy backwards (64 byte blocks) : 8394.7 MB/s
C copy : 8392.8 MB/s (0.4%)
C copy prefetched (32 bytes step) : 8390.4 MB/s
C copy prefetched (64 bytes step) : 8391.6 MB/s
C 2-pass copy : 8463.9 MB/s
C 2-pass copy prefetched (32 bytes step) : 8423.7 MB/s
C 2-pass copy prefetched (64 bytes step) : 8443.1 MB/s (0.6%)
C fill : 19527.4 MB/s (0.3%)
C fill (shuffle within 16 byte blocks) : 19535.2 MB/s (0.3%)
C fill (shuffle within 32 byte blocks) : 19538.5 MB/s
C fill (shuffle within 64 byte blocks) : 19508.1 MB/s (0.2%)
NEON 64x2 COPY : 8393.2 MB/s
NEON 64x2x4 COPY : 8400.5 MB/s
NEON 64x1x4_x2 COPY : 8391.8 MB/s
NEON 64x2 COPY prefetch x2 : 8168.1 MB/s
NEON 64x2x4 COPY prefetch x1 : 8131.8 MB/s
NEON 64x2 COPY prefetch x1 : 8210.2 MB/s
NEON 64x2x4 COPY prefetch x1 : 8131.2 MB/s
---
standard memcpy : 8245.0 MB/s (0.6%)
standard memset : 19482.1 MB/s
---
NEON LDP/STP copy : 8413.0 MB/s
NEON LDP/STP copy pldl2strm (32 bytes step) : 8419.2 MB/s
NEON LDP/STP copy pldl2strm (64 bytes step) : 8412.2 MB/s
NEON LDP/STP copy pldl1keep (32 bytes step) : 8389.6 MB/s
NEON LDP/STP copy pldl1keep (64 bytes step) : 8393.5 MB/s
NEON LD1/ST1 copy : 8409.5 MB/s
NEON STP fill : 19529.1 MB/s (0.8%)
NEON STNP fill : 19544.8 MB/s (0.3%)
ARM LDP/STP copy : 8397.4 MB/s
ARM STP fill : 19553.9 MB/s
ARM STNP fill : 19529.9 MB/s
==========================================================================
== Framebuffer read tests. ==
== ==
== Many ARM devices use a part of the system memory as the framebuffer, ==
== typically mapped as uncached but with write-combining enabled. ==
== Writes to such framebuffers are quite fast, but reads are much ==
== slower and very sensitive to the alignment and the selection of ==
== CPU instructions which are used for accessing memory. ==
== ==
== Many x86 systems allocate the framebuffer in the GPU memory, ==
== accessible for the CPU via a relatively slow PCI-E bus. Moreover, ==
== PCI-E is asymmetric and handles reads a lot worse than writes. ==
== ==
== If uncached framebuffer reads are reasonably fast (at least 100 MB/s ==
== or preferably >300 MB/s), then using the shadow framebuffer layer ==
== is not necessary in Xorg DDX drivers, resulting in a nice overall ==
== performance improvement. For example, the xf86-video-fbturbo DDX ==
== uses this trick. ==
==========================================================================
NEON LDP/STP copy (from framebuffer) : 1661.2 MB/s
NEON LDP/STP 2-pass copy (from framebuffer) : 1341.1 MB/s
NEON LD1/ST1 copy (from framebuffer) : 1660.8 MB/s
NEON LD1/ST1 2-pass copy (from framebuffer) : 1340.9 MB/s
ARM LDP/STP copy (from framebuffer) : 757.7 MB/s
ARM LDP/STP 2-pass copy (from framebuffer) : 745.0 MB/s
==========================================================================
== Memory latency test ==
== ==
== Average time is measured for random memory accesses in the buffers ==
== of different sizes. The larger is the buffer, the more significant ==
== are relative contributions of TLB, L1/L2 cache misses and SDRAM ==
== accesses. For extremely large buffer sizes we are expecting to see ==
== page table walk with several requests to SDRAM for almost every ==
== memory access (though 64MiB is not nearly large enough to experience ==
== this effect to its fullest). ==
== ==
== Note 1: All the numbers are representing extra time, which needs to ==
== be added to L1 cache latency. The cycle timings for L1 cache ==
== latency can be usually found in the processor documentation. ==
== Note 2: Dual random read means that we are simultaneously performing ==
== two independent memory accesses at a time. In the case if ==
== the memory subsystem can't handle multiple outstanding ==
== requests, dual random read has the same timings as two ==
== single reads performed one after another. ==
==========================================================================
block size : single random read / dual random read, [MADV_NOHUGEPAGE]
1024 : 0.0 ns / 0.0 ns
2048 : 0.0 ns / 0.0 ns
4096 : 0.0 ns / 0.0 ns
8192 : 0.0 ns / 0.0 ns
16384 : 0.0 ns / 0.0 ns
32768 : 0.0 ns / 0.0 ns
65536 : 0.9 ns / 1.3 ns
131072 : 1.5 ns / 1.8 ns
262144 : 1.9 ns / 2.2 ns
524288 : 8.4 ns / 12.4 ns
1048576 : 12.6 ns / 16.0 ns
2097152 : 14.9 ns / 17.1 ns
4194304 : 59.4 ns / 88.8 ns
8388608 : 99.1 ns / 131.0 ns
16777216 : 121.0 ns / 145.0 ns
33554432 : 138.5 ns / 157.4 ns
67108864 : 151.5 ns / 168.1 ns
block size : single random read / dual random read, [MADV_HUGEPAGE]
1024 : 0.0 ns / 0.0 ns
2048 : 0.0 ns / 0.0 ns
4096 : 0.0 ns / 0.0 ns
8192 : 0.0 ns / 0.0 ns
16384 : 0.0 ns / 0.0 ns
32768 : 0.0 ns / 0.0 ns
65536 : 1.2 ns / 1.7 ns
131072 : 1.6 ns / 2.0 ns
262144 : 1.9 ns / 2.2 ns
524288 : 8.3 ns / 11.8 ns
1048576 : 11.7 ns / 14.8 ns
2097152 : 13.5 ns / 15.7 ns
4194304 : 58.2 ns / 87.2 ns
8388608 : 98.0 ns / 129.3 ns
16777216 : 118.3 ns / 142.3 ns
33554432 : 128.5 ns / 146.8 ns
67108864 : 133.5 ns / 148.5 ns
Core to Core Memory Latency
sbc-bench results
sbc-bench results: https://github.com/ThomasKaiser/sbc-bench/blob/master/results/reviews/Radxa-Dragon-Q6A.md (linking to existing results, as my run was similar and most numbers were within 1%)
Phoronix Test Suite
Results from pi-general-benchmark.sh:
- pts/encode-mp3: 10.467 sec
- pts/x264 4K: 3.63 fps
- pts/x264 1080p: 22.93 fps
- pts/phpbench: 543386
- pts/build-linux-kernel (defconfig): 3554.437 sec
