DC-ROMA AI PC - RISC-V Mainboard II #82

@geerlingguy

Basic information

Linux/system information

# output of `screenfetch`
$ screenfetch
                          ./+o+-       roma@roma
                  yyyyy- -yyyyyy+      OS: Ubuntu 24.04 noble
               ://+//////-yyyyyyo      Kernel: riscv64 Linux 6.6.92-eic7x-2025.07
           .++ .:/++++++/-.+sss/`      Uptime: 9m
         .:++o:  /++++++++/:--:/-      Packages: 1954
        o:+o+:++.`..```.-/oo+++++/     Shell: bash 5.2.21
       .:+o:+o/.          `+sssoo+/    Disk: 25G / 468G (6%)
  .++/+:+oo+o:`             /sssooo.   CPU: Unknown @ 8x 1.8GHz
 /+++//+:`oo+o               /::--:.   GPU: 
 \+/+o+++`o++o               ++////.   RAM: 1529MiB / 14682MiB
  .++.o+++oo+:`             /dddhhh.  
       .+.o+oo:.          `oddhhhh+   
        \+.++o+o``-````.:ohdhhhhh+    
         `:o+++ `ohhhhhhhhyo++os:     
           .o:`.syhhhhhhh/.oo++o`     
               /osyyyyyyo++ooo+++/    
                   ````` +oo+++o\:    
                          `oo++.    

# output of `uname -a`
Linux roma 6.6.92-eic7x-2025.07 #2025.09.25.06.45+ SMP Thu Sep 25 06:53:10 UTC 2025 riscv64 riscv64 riscv64 GNU/Linux

Benchmark results

CPU

Power

  • Sleep power draw (at wall): 2.2 W
  • Idle power draw (charging, battery 80%): 41.4 W
  • Idle power draw (at wall, battery 100%): 25.1 W
  • Maximum simulated power draw (stress-ng --matrix 0): 31.7 W
  • During Geekbench multicore benchmark: 32.1 W
  • During top500 HPL benchmark: 32.9 W

Disk

ZHITAI TiPlus7100 512GB

| Benchmark | Result |
| --- | --- |
| iozone 4K random read | 60.31 MB/s |
| iozone 4K random write | 126.92 MB/s |
| iozone 1M random read | 1042.36 MB/s |
| iozone 1M random write | 1306.80 MB/s |
| iozone 1M sequential read | 1076.06 MB/s |
| iozone 1M sequential write | 1301.79 MB/s |
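
The 4K random figures are easier to compare against drive spec sheets as IOPS. A minimal sketch of the conversion, assuming the MB/s values are iozone's KiB/s output divided by 1024 (an assumption about how the numbers were produced; the helper name `iops` is only illustrative):

```python
# Convert 4K random throughput into approximate IOPS.
# Assumption: "MB/s" here means KiB/s divided by 1024, so 1 MB = 1024 KiB.
RECORD_KIB = 4  # record size used by the 4K tests


def iops(throughput_mb_s: float, record_kib: int = RECORD_KIB) -> float:
    """Operations per second for a given throughput and record size."""
    return throughput_mb_s * 1024 / record_kib


# Values copied from the iozone results above.
results = {"4K random read": 60.31, "4K random write": 126.92}

for name, mb_s in results.items():
    print(f"{name}: ~{iops(mb_s):,.0f} IOPS")
```

Under that unit assumption, this works out to roughly 15,000 read and 32,000 write operations per second.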

Network

iperf3 results:

WLAN (WiFi 6, built-in Intel AX200)

  • iperf3 -c $SERVER_IP: 637 Mbps
  • iperf3 -c $SERVER_IP --reverse: 293 Mbps
  • iperf3 -c $SERVER_IP --bidir: 510 Mbps up, 141 Mbps down

GPU

The device includes a PowerVR A-Series AXM-8-256 GPU with precompiled OpenGL and Vulkan drivers, but compatibility was hit or miss: glmark2-es2 ran to completion, while vkmark failed to run.

glmark2

glmark2-es2-wayland results:

=======================================================
    glmark2 2023.01
=======================================================
    OpenGL Information
    GL_VENDOR:      Imagination Technologies
    GL_RENDERER:    PowerVR A-Series AXM-8-256
    GL_VERSION:     OpenGL ES 3.2 build 24.2@6643903
    Surface Config: buf=32 r=8 g=8 b=8 a=8 depth=24 stencil=0 samples=0
    Surface Size:   800x600 windowed
=======================================================
[build] use-vbo=false: FPS: 205 FrameTime: 4.884 ms
[build] use-vbo=true: FPS: 475 FrameTime: 2.106 ms
[texture] texture-filter=nearest: FPS: 525 FrameTime: 1.907 ms
[texture] texture-filter=linear: FPS: 542 FrameTime: 1.846 ms
[texture] texture-filter=mipmap: FPS: 534 FrameTime: 1.875 ms
[shading] shading=gouraud: FPS: 471 FrameTime: 2.126 ms
[shading] shading=blinn-phong-inf: FPS: 493 FrameTime: 2.029 ms
[shading] shading=phong: FPS: 513 FrameTime: 1.953 ms
[shading] shading=cel: FPS: 475 FrameTime: 2.107 ms
[bump] bump-render=high-poly: FPS: 546 FrameTime: 1.832 ms
[bump] bump-render=normals: FPS: 548 FrameTime: 1.825 ms
[bump] bump-render=height: FPS: 535 FrameTime: 1.871 ms
[effect2d] kernel=0,1,0;1,-4,1;0,1,0;: FPS: 492 FrameTime: 2.035 ms
[effect2d] kernel=1,1,1,1,1;1,1,1,1,1;1,1,1,1,1;: FPS: 602 FrameTime: 1.661 ms
[pulsar] light=false:quads=5:texture=false: FPS: 536 FrameTime: 1.868 ms
[desktop] blur-radius=5:effect=blur:passes=1:separable=true:windows=4: FPS: 116 FrameTime: 8.670 ms
[desktop] effect=shadow:windows=4: FPS: 956 FrameTime: 1.046 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 243 FrameTime: 4.121 ms
[buffer] columns=200:interleave=false:update-dispersion=0.9:update-fraction=0.5:update-method=subdata: FPS: 242 FrameTime: 4.144 ms
[buffer] columns=200:interleave=true:update-dispersion=0.9:update-fraction=0.5:update-method=map: FPS: 362 FrameTime: 2.763 ms
[ideas] speed=duration: FPS: 693 FrameTime: 1.443 ms
[jellyfish] <default>: FPS: 1729 FrameTime: 0.579 ms
[terrain] <default>: FPS: 117 FrameTime: 8.606 ms
[shadow] <default>: FPS: 1505 FrameTime: 0.665 ms
[refract] <default>: FPS: 196 FrameTime: 5.122 ms
[conditionals] fragment-steps=0:vertex-steps=0: FPS: 2185 FrameTime: 0.458 ms
[conditionals] fragment-steps=5:vertex-steps=0: FPS: 2212 FrameTime: 0.452 ms
[conditionals] fragment-steps=0:vertex-steps=5: FPS: 1473 FrameTime: 0.679 ms
[function] fragment-complexity=low:fragment-steps=5: FPS: 2198 FrameTime: 0.455 ms
[function] fragment-complexity=medium:fragment-steps=5: FPS: 2294 FrameTime: 0.436 ms
[loop] fragment-loop=false:fragment-steps=5:vertex-steps=5: FPS: 2337 FrameTime: 0.428 ms
[loop] fragment-steps=5:fragment-uniform=false:vertex-steps=5: FPS: 2297 FrameTime: 0.435 ms
[loop] fragment-steps=5:fragment-uniform=true:vertex-steps=5: FPS: 2287 FrameTime: 0.437 ms
=======================================================
                                  glmark2 Score: 936 
=======================================================
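
To pick out the scenes that stress this GPU the most, the per-scene lines can be parsed and sorted by FPS. A minimal sketch, assuming the line format shown in the output above; only a handful of lines are pasted in as sample data, and the regular expression and the `parse` helper are illustrative rather than part of glmark2:

```python
import re

# A few lines copied from the glmark2 output above.
# Paste the full log here to analyse every scene.
LOG = """\
[build] use-vbo=false: FPS: 205 FrameTime: 4.884 ms
[desktop] blur-radius=5:effect=blur:passes=1:separable=true:windows=4: FPS: 116 FrameTime: 8.670 ms
[terrain] <default>: FPS: 117 FrameTime: 8.606 ms
[refract] <default>: FPS: 196 FrameTime: 5.122 ms
[jellyfish] <default>: FPS: 1729 FrameTime: 0.579 ms
"""

# [scene] options: FPS: N FrameTime: X ms
LINE = re.compile(
    r"^\[(?P<scene>[^\]]+)\] (?P<options>.*): FPS: (?P<fps>\d+) "
    r"FrameTime: (?P<ms>[\d.]+) ms$"
)


def parse(log: str) -> list[tuple[str, str, int, float]]:
    """Return (scene, options, fps, frame_time_ms) for each result line."""
    rows = []
    for line in log.splitlines():
        m = LINE.match(line.strip())
        if m:
            rows.append((m["scene"], m["options"], int(m["fps"]), float(m["ms"])))
    return rows


rows = parse(LOG)
slowest = sorted(rows, key=lambda r: r[2])[:3]

for scene, options, fps, ms in slowest:
    print(f"{scene:10s} {fps:5d} FPS  {ms:.3f} ms")
```

In the full run above, the blurred desktop, terrain, and refract scenes are the slowest, all under 200 FPS, while most shader-only scenes exceed 2000 FPS.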

vkmark

vkmark results:

$ DISPLAY=:0 vkmark --debug
Debug: WindowSystemLoader: Looking in /usr/lib/riscv64-linux-gnu/vkmark for window system plugins
Debug: WindowSystemLoader: Loading options from /usr/lib/riscv64-linux-gnu/vkmark/kms.so... ok
Debug: WindowSystemLoader: Loading options from /usr/lib/riscv64-linux-gnu/vkmark/wayland.so... ok
Debug: WindowSystemLoader: Loading options from /usr/lib/riscv64-linux-gnu/vkmark/xcb.so... ok
Debug: WindowSystemLoader: Probing /usr/lib/riscv64-linux-gnu/vkmark/kms.so... succeeded with priority 255
Debug: WindowSystemLoader: Probing /usr/lib/riscv64-linux-gnu/vkmark/wayland.so... succeeded with priority 255
Authorization required, but no authorization protocol specified

Debug: WindowSystemLoader: Probing /usr/lib/riscv64-linux-gnu/vkmark/xcb.so... succeeded with priority 0
Debug: WindowSystemLoader: Selected window system plugin /usr/lib/riscv64-linux-gnu/vkmark/kms.so (best match)
Debug: KMSWindowSystemPlugin: Using legacy modesetting
Segmentation fault (core dumped)

With a version compiled from source, I got:

Error: No suitable Vulkan physical devices found

Here is the vulkaninfo for the board:

==========
VULKANINFO
==========

Vulkan Instance Version: 1.3.275


Instance Extensions: count = 23
-------------------------------
VK_EXT_acquire_drm_display             : extension revision 1
VK_EXT_acquire_xlib_display            : extension revision 1
VK_EXT_debug_report                    : extension revision 10
VK_EXT_debug_utils                     : extension revision 2
VK_EXT_direct_mode_display             : extension revision 1
VK_EXT_display_surface_counter         : extension revision 1
VK_EXT_surface_maintenance1            : extension revision 1
VK_EXT_swapchain_colorspace            : extension revision 4
VK_KHR_device_group_creation           : extension revision 1
VK_KHR_display                         : extension revision 23
VK_KHR_external_fence_capabilities     : extension revision 1
VK_KHR_external_memory_capabilities    : extension revision 1
VK_KHR_external_semaphore_capabilities : extension revision 1
VK_KHR_get_display_properties2         : extension revision 1
VK_KHR_get_physical_device_properties2 : extension revision 2
VK_KHR_get_surface_capabilities2       : extension revision 1
VK_KHR_portability_enumeration         : extension revision 1
VK_KHR_surface                         : extension revision 25
VK_KHR_surface_protected_capabilities  : extension revision 1
VK_KHR_wayland_surface                 : extension revision 6
VK_KHR_xcb_surface                     : extension revision 6
VK_KHR_xlib_surface                    : extension revision 6
VK_LUNARG_direct_driver_loading        : extension revision 1

Instance Layers: count = 2
--------------------------
VK_LAYER_MESA_device_select Linux device selection layer 1.3.211  version 1
VK_LAYER_MESA_overlay       Mesa Overlay layer           1.3.211  version 1

Devices:
========
GPU0:
	apiVersion         = 1.3.277
	driverVersion      = 1.598.191
	vendorID           = 0x1010
	deviceID           = 0x30010101
	deviceType         = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
	deviceName         = PowerVR A-Series AXM-8-256
	driverID           = DRIVER_ID_IMAGINATION_PROPRIETARY
	driverName         = PowerVR A-Series Vulkan Driver
	driverInfo         = 24.2@6643903
	conformanceVersion = 1.3.8.1
	deviceUUID         = 33302033-2034-3038-2031-303100000000
	driverUUID         = 36363433-3930-3300-0000-000000000000

GravityMark

GravityMark results:

1. Download the latest version of GravityMark: https://gravitymark.tellusim.com
2. Run `chmod +x [downloaded_filename].run`
3. Run `sudo ./[downloaded_filename].run` and press `y` to accept the terms.
4. Open the link it prints, and run the Benchmark defaults, changing to 720p resolution and 50,000 asteroids.

Note: These benchmarks require an active display on the device. Not all devices may be able to run glmark2-es2, so in that case, make a note and move on!

AI / LLM Inference

ollama LLM model inference results:

NPU Inference

| System | CPU/GPU | Model | Eval Rate | Power (Peak) |
| --- | --- | --- | --- | --- |
| DC-ROMA Mainboard II (8-core RISC-V) | NPU | deepseek-r1:7b | 4.9 Tokens/s | 38.9 W |

CPU Inference

| System | CPU/GPU | Model | Eval Rate | Power (Peak) |
| --- | --- | --- | --- | --- |
| DC-ROMA Mainboard II (8-core RISC-V) | CPU | deepseek-r1:1.5b | 0.59 Tokens/s | 32.0 W |
| DC-ROMA Mainboard II (8-core RISC-V) | CPU | llama3.2:3b | 0.31 Tokens/s | 30.6 W |

More results: geerlingguy/ai-benchmarks#28
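
Dividing power by eval rate gives a rough energy cost per token, which makes the NPU and CPU runs easier to compare. A minimal sketch using the figures above; because these are peak whole-system readings, not averages, the results should be read as upper-bound estimates, and the `joules_per_token` helper is only illustrative:

```python
# (backend, model, eval rate in tokens/s, peak power in W), from the results above.
runs = [
    ("NPU", "deepseek-r1:7b", 4.9, 38.9),
    ("CPU", "deepseek-r1:1.5b", 0.59, 32.0),
    ("CPU", "llama3.2:3b", 0.31, 30.6),
]


def joules_per_token(tokens_per_s: float, watts: float) -> float:
    """Energy per generated token: W / (tokens/s) = J/token."""
    return watts / tokens_per_s


for backend, model, rate, watts in runs:
    print(f"{backend} {model}: ~{joules_per_token(rate, watts):.1f} J/token")
```

By this estimate the NPU run costs about 8 J per token, against roughly 54 J and 99 J for the two CPU runs, even though the NPU run uses a larger model.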

Memory

tinymembench results:

tinymembench v0.4.10 (simple benchmark for memory throughput and latency)

==========================================================================
== Memory bandwidth tests                                               ==
==                                                                      ==
== Note 1: 1MB = 1000000 bytes                                          ==
== Note 2: Results for 'copy' tests show how many bytes can be          ==
==         copied per second (adding together read and writen           ==
==         bytes would have provided twice higher numbers)              ==
== Note 3: 2-pass copy means that we are using a small temporary buffer ==
==         to first fetch data into it, and only then write it to the   ==
==         destination (source -> L1 cache, L1 cache -> destination)    ==
== Note 4: If sample standard deviation exceeds 0.1%, it is shown in    ==
==         brackets                                                     ==
==========================================================================

 C copy backwards                                     :   4818.5 MB/s (0.2%)
 C copy backwards (32 byte blocks)                    :   4826.0 MB/s
 C copy backwards (64 byte blocks)                    :   4842.2 MB/s (0.2%)
 C copy                                               :   4800.1 MB/s
 C copy prefetched (32 bytes step)                    :   4825.5 MB/s (32.0%)
 C copy prefetched (64 bytes step)                    :    864.5 MB/s (0.2%)
 C 2-pass copy                                        :    717.7 MB/s
 C 2-pass copy prefetched (32 bytes step)             :    716.7 MB/s
 C 2-pass copy prefetched (64 bytes step)             :    716.8 MB/s
 C fill                                               :   7789.3 MB/s (28.7%)
 C fill (shuffle within 16 byte blocks)               :   7794.9 MB/s
 C fill (shuffle within 32 byte blocks)               :   7837.7 MB/s (0.4%)
 C fill (shuffle within 64 byte blocks)               :   7804.5 MB/s
 ---
 standard memcpy                                      :   4335.0 MB/s
 standard memset                                      :   7837.1 MB/s (0.2%)

==========================================================================
== Memory latency test                                                  ==
==                                                                      ==
== Average time is measured for random memory accesses in the buffers   ==
== of different sizes. The larger is the buffer, the more significant   ==
== are relative contributions of TLB, L1/L2 cache misses and SDRAM      ==
== accesses. For extremely large buffer sizes we are expecting to see   ==
== page table walk with several requests to SDRAM for almost every      ==
== memory access (though 64MiB is not nearly large enough to experience ==
== this effect to its fullest).                                         ==
==                                                                      ==
== Note 1: All the numbers are representing extra time, which needs to  ==
==         be added to L1 cache latency. The cycle timings for L1 cache ==
==         latency can be usually found in the processor documentation. ==
== Note 2: Dual random read means that we are simultaneously performing ==
==         two independent memory accesses at a time. In the case if    ==
==         the memory subsystem can't handle multiple outstanding       ==
==         requests, dual random read has the same timings as two       ==
==         single reads performed one after another.                    ==
==========================================================================

block size : single random read / dual random read, [MADV_NOHUGEPAGE]
      1024 :    0.0 ns          /     0.0 ns 
      2048 :    0.0 ns          /     0.0 ns 
      4096 :    0.0 ns          /     0.0 ns 
      8192 :    0.0 ns          /     0.0 ns 
     16384 :    0.0 ns          /     0.0 ns 
     32768 :    0.0 ns          /     0.0 ns 
     65536 :    2.5 ns          /     3.9 ns 
    131072 :    3.8 ns          /     5.2 ns 
    262144 :    7.3 ns          /    10.5 ns 
    524288 :   15.8 ns          /    22.3 ns 
   1048576 :   20.2 ns          /    26.5 ns 
   2097152 :   22.8 ns          /    28.3 ns 
   4194304 :   51.9 ns          /    79.3 ns 
   8388608 :  117.8 ns          /   165.3 ns 
  16777216 :  153.1 ns          /   194.1 ns 
  33554432 :  172.3 ns          /   208.8 ns 
  67108864 :  185.7 ns          /   222.5 ns 

block size : single random read / dual random read, [MADV_HUGEPAGE]
      1024 :    0.0 ns          /     0.0 ns 
      2048 :    0.0 ns          /     0.0 ns 
      4096 :    0.0 ns          /     0.0 ns 
      8192 :    0.0 ns          /     0.0 ns 
     16384 :    0.0 ns          /     0.0 ns 
     32768 :    0.0 ns          /     0.0 ns 
     65536 :    2.5 ns          /     3.9 ns 
    131072 :    3.8 ns          /     5.2 ns 
    262144 :    4.7 ns          /     6.1 ns 
    524288 :   12.1 ns          /    16.4 ns 
   1048576 :   16.1 ns          /    19.4 ns 
   2097152 :   17.7 ns          /    20.3 ns 
   4194304 :   45.2 ns          /    66.7 ns 
   8388608 :  106.8 ns          /   146.5 ns 
  16777216 :  137.7 ns          /   168.9 ns 
  33554432 :  153.1 ns          /   175.6 ns 
  67108864 :  164.9 ns          /   183.9 ns 
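
The two latency tables differ mainly at large block sizes. A minimal sketch that quantifies the difference, using the single random read values for 4 MiB and larger copied from the tables above (the `reduction_pct` helper is only illustrative):

```python
# block size in bytes -> single random read latency in ns, from the tables above
no_hugepage = {
    4194304: 51.9,
    8388608: 117.8,
    16777216: 153.1,
    33554432: 172.3,
    67108864: 185.7,
}
hugepage = {
    4194304: 45.2,
    8388608: 106.8,
    16777216: 137.7,
    33554432: 153.1,
    67108864: 164.9,
}


def reduction_pct(before_ns: float, after_ns: float) -> float:
    """Percentage by which latency drops going from before_ns to after_ns."""
    return (before_ns - after_ns) / before_ns * 100


for size, before in no_hugepage.items():
    after = hugepage[size]
    print(f"{size // (1024 * 1024):3d} MiB: {before:6.1f} -> {after:6.1f} ns "
          f"({reduction_pct(before, after):.1f}% lower)")
```

With huge pages enabled, latency at these sizes is around 9 to 13 percent lower, which fits the tool's own note that TLB misses contribute more as the buffer grows.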

Core to Core Memory Latency

[Image: core-to-core memory latency results]

See discussion about memory access improvements on this system.

sbc-bench results

See: ThomasKaiser/sbc-bench#125

Phoronix Test Suite

Results from pi-general-benchmark.sh:

  • pts/encode-mp3: TODO sec
  • pts/x264 4K: TODO fps
  • pts/x264 1080p: TODO fps
  • pts/phpbench: 104733
  • pts/build-linux-kernel (defconfig): 2852.438 sec
