feat(x86_64/mm): log page tables #1818
Merged
Benchmark Results
This comment was automatically generated by github-action-benchmark.
Misc
| Benchmark | Current: 12f3c0e | Previous: be0a653 | Performance Ratio |
|---|---|---|---|
| micro_benchmarks Build Time | 71.77 s | 70.42 s | 1.02 |
| micro_benchmarks File Size | 0.97 MB | 0.96 MB | 1.01 |
| Scheduling time - 1 thread | 62.36 ticks (±3.76 ticks) | 60.24 ticks (±3.06 ticks) | 1.04 |
| Scheduling time - 2 threads | 32.66 ticks (±1.81 ticks) | 31.86 ticks (±1.93 ticks) | 1.02 |
| Micro - Time for syscall (getpid) | 14.72 ticks (±1.38 ticks) | 14.90 ticks (±1.32 ticks) | 0.99 |
| Memcpy speed - (built_in) block size 4096 | 75644.11 MByte/s (±52451.99 MByte/s) | 81525.14 MByte/s (±56187.03 MByte/s) | 0.93 |
| Memcpy speed - (built_in) block size 1048576 | 42344.05 MByte/s (±29346.41 MByte/s) | 42926.86 MByte/s (±29715.31 MByte/s) | 0.99 |
| Memcpy speed - (built_in) block size 16777216 | 28462.40 MByte/s (±23474.79 MByte/s) | 28315.38 MByte/s (±23305.84 MByte/s) | 1.01 |
| Memset speed - (built_in) block size 4096 | 75930.43 MByte/s (±52663.60 MByte/s) | 81983.14 MByte/s (±56500.99 MByte/s) | 0.93 |
| Memset speed - (built_in) block size 1048576 | 42584.61 MByte/s (±29508.21 MByte/s) | 43135.37 MByte/s (±29857.76 MByte/s) | 0.99 |
| Memset speed - (built_in) block size 16777216 | 29153.26 MByte/s (±23851.90 MByte/s) | 29073.62 MByte/s (±23753.59 MByte/s) | 1.00 |
| Memcpy speed - (rust) block size 4096 | 70912.77 MByte/s (±49349.33 MByte/s) | 71810.42 MByte/s (±50069.06 MByte/s) | 0.99 |
| Memcpy speed - (rust) block size 1048576 | 42559.34 MByte/s (±29504.36 MByte/s) | 43009.84 MByte/s (±29784.76 MByte/s) | 0.99 |
| Memcpy speed - (rust) block size 16777216 | 28357.51 MByte/s (±23312.52 MByte/s) | 28887.29 MByte/s (±23767.73 MByte/s) | 0.98 |
| Memset speed - (rust) block size 4096 | 70912.77 MByte/s (±49349.33 MByte/s) | 72206.11 MByte/s (±50360.23 MByte/s) | 0.98 |
| Memset speed - (rust) block size 1048576 | 42816.03 MByte/s (±29673.72 MByte/s) | 43219.94 MByte/s (±29927.79 MByte/s) | 0.99 |
| Memset speed - (rust) block size 16777216 | 29114.60 MByte/s (±23749.88 MByte/s) | 29651.26 MByte/s (±24214.31 MByte/s) | 0.98 |
| alloc_benchmarks Build Time | 69.04 s | 65.53 s | 1.05 |
| alloc_benchmarks File Size | 0.93 MB | 0.92 MB | 1.01 |
| Allocations - Allocation success | 100.00 % | 100.00 % | 1 |
| Allocations - Deallocation success | 69.99 % (±0.30 %) | 69.94 % (±0.30 %) | 1.00 |
| Allocations - Pre-fail Allocations | 100.00 % | 100.00 % | 1 |
| Allocations - Average Allocation time | 13504.78 Ticks (±200.99 Ticks) | 13283.14 Ticks (±172.79 Ticks) | 1.02 |
| Allocations - Average Allocation time (no fail) | 13504.78 Ticks (±200.99 Ticks) | 13283.14 Ticks (±172.79 Ticks) | 1.02 |
| Allocations - Average Deallocation time | 849.93 Ticks (±124.58 Ticks) | 841.59 Ticks (±118.38 Ticks) | 1.01 |
| mutex_benchmark Build Time | 66.58 s | 65.60 s | 1.01 |
| mutex_benchmark File Size | 0.97 MB | 0.96 MB | 1.01 |
| Mutex Stress Test Average Time per Iteration - 1 Threads | 12.42 ns (±0.57 ns) | 12.62 ns (±0.91 ns) | 0.98 |
| Mutex Stress Test Average Time per Iteration - 2 Threads | 14.72 ns (±1.55 ns) | 15.36 ns (±6.61 ns) | 0.96 |
General
| Benchmark | Current: 12f3c0e | Previous: be0a653 | Performance Ratio |
|---|---|---|---|
| startup_benchmark Build Time | 73.25 s | 73.75 s | 0.99 |
| startup_benchmark File Size | 0.86 MB | 0.85 MB | 1.01 |
| Startup Time - 1 core | 1.00 s (±0.03 s) | 1.01 s (±0.02 s) | 0.99 |
| Startup Time - 2 cores | 1.02 s (±0.02 s) | 0.99 s (±0.03 s) | 1.03 |
| Startup Time - 4 cores | 1.03 s (±0.03 s) | 1.01 s (±0.04 s) | 1.02 |
| multithreaded_benchmark Build Time | 75.86 s | 73.77 s | 1.03 |
| multithreaded_benchmark File Size | 0.96 MB | 0.96 MB | 1.01 |
| Multithreaded Pi Efficiency - 2 Threads | 90.16 % (±11.66 %) | 93.09 % (±6.52 %) | 0.97 |
| Multithreaded Pi Efficiency - 4 Threads | 60.78 % (±10.16 %) | 64.83 % (±5.56 %) | 0.94 |
| Multithreaded Pi Efficiency - 8 Threads | 42.71 % (±3.59 %) | 45.19 % (±3.25 %) | 0.95 |
Misc
| Benchmark | Current: 12f3c0e | Previous: b5b5c19 | Performance Ratio |
|---|---|---|---|
| micro_benchmarks Build Time | 91.42 s | 74.72 s | 1.22 |
| micro_benchmarks File Size | 0.97 MB | 0.97 MB | 1.00 |
| Scheduling time - 1 thread | 74.48 ticks (±2.44 ticks) | 66.83 ticks (±3.10 ticks) | 1.11 |
| Scheduling time - 2 threads | 38.81 ticks (±1.76 ticks) | 35.64 ticks (±3.40 ticks) | 1.09 |
| Micro - Time for syscall (getpid) | 16.09 ticks (±1.21 ticks) | 15.71 ticks (±1.36 ticks) | 1.02 |
| Memcpy speed - (built_in) block size 4096 | 73216.96 MByte/s (±50701.98 MByte/s) | 73338.41 MByte/s (±50542.56 MByte/s) | 1.00 |
| Memcpy speed - (built_in) block size 1048576 | 41106.91 MByte/s (±28564.86 MByte/s) | 41237.34 MByte/s (±28631.07 MByte/s) | 1.00 |
| Memcpy speed - (built_in) block size 16777216 | 27251.59 MByte/s (±22502.76 MByte/s) | 26159.73 MByte/s (±21200.42 MByte/s) | 1.04 |
| Memset speed - (built_in) block size 4096 | 73258.41 MByte/s (±50731.01 MByte/s) | 73371.89 MByte/s (±50563.99 MByte/s) | 1.00 |
| Memset speed - (built_in) block size 1048576 | 41354.92 MByte/s (±28738.66 MByte/s) | 41509.80 MByte/s (±28815.09 MByte/s) | 1.00 |
| Memset speed - (built_in) block size 16777216 | 27972.69 MByte/s (±22916.87 MByte/s) | 26803.32 MByte/s (±21575.74 MByte/s) | 1.04 |
| Memcpy speed - (rust) block size 4096 | 65243.78 MByte/s (±45697.76 MByte/s) | 66444.83 MByte/s (±46279.47 MByte/s) | 0.98 |
| Memcpy speed - (rust) block size 1048576 | 41441.16 MByte/s (±28769.38 MByte/s) | 41301.42 MByte/s (±28653.77 MByte/s) | 1.00 |
| Memcpy speed - (rust) block size 16777216 | 26803.14 MByte/s (±22238.38 MByte/s) | 26198.19 MByte/s (±21210.46 MByte/s) | 1.02 |
| Memset speed - (rust) block size 4096 | 65758.74 MByte/s (±46033.51 MByte/s) | 66823.77 MByte/s (±46541.34 MByte/s) | 0.98 |
| Memset speed - (rust) block size 1048576 | 41689.27 MByte/s (±28940.09 MByte/s) | 41550.11 MByte/s (±28821.87 MByte/s) | 1.00 |
| Memset speed - (rust) block size 16777216 | 27479.30 MByte/s (±22613.58 MByte/s) | 26858.59 MByte/s (±21598.51 MByte/s) | 1.02 |
| alloc_benchmarks Build Time | 90.21 s | 72.59 s | 1.24 |
| alloc_benchmarks File Size | 0.93 MB | 0.92 MB | 1.00 |
| Allocations - Allocation success | 100.00 % | 100.00 % | 1 |
| Allocations - Deallocation success | 69.98 % (±0.33 %) | 70.01 % (±0.26 %) | 1.00 |
| Allocations - Pre-fail Allocations | 100.00 % | 100.00 % | 1 |
| Allocations - Average Allocation time | 13483.79 Ticks (±393.70 Ticks) | 11052.09 Ticks (±196.08 Ticks) | 1.22 |
| Allocations - Average Allocation time (no fail) | 13483.79 Ticks (±393.70 Ticks) | 11052.09 Ticks (±196.08 Ticks) | 1.22 |
| Allocations - Average Deallocation time | 873.61 Ticks (±92.21 Ticks) | 833.82 Ticks (±17.80 Ticks) | 1.05 |
| mutex_benchmark Build Time | 88.68 s | 73.82 s | 1.20 |
| mutex_benchmark File Size | 0.97 MB | 0.97 MB | 1.00 |
| Mutex Stress Test Average Time per Iteration - 1 Threads | 14.34 ns (±0.62 ns) | 14.18 ns (±0.65 ns) | 1.01 |
| Mutex Stress Test Average Time per Iteration - 2 Threads | 17.92 ns (±7.71 ns) | 16.92 ns (±1.06 ns) | 1.06 |
General
| Benchmark | Current: 12f3c0e | Previous: b5b5c19 | Performance Ratio |
|---|---|---|---|
| startup_benchmark Build Time | 75.24 s | 88.81 s | 0.85 |
| startup_benchmark File Size | 0.86 MB | 0.86 MB | 1.00 |
| Startup Time - 1 core | 0.99 s (±0.02 s) | 1.00 s (±0.05 s) | 0.99 |
| Startup Time - 2 cores | 0.99 s (±0.02 s) | 1.02 s (±0.04 s) | 0.98 |
| Startup Time - 4 cores | 1.03 s (±0.03 s) | 1.04 s (±0.04 s) | 0.99 |
| multithreaded_benchmark Build Time | 73.22 s | 89.82 s | 0.82 |
| multithreaded_benchmark File Size | 0.96 MB | 0.96 MB | 1.00 |
| Multithreaded Pi Efficiency - 2 Threads | 91.52 % (±6.66 %) | 93.59 % (±11.09 %) | 0.98 |
| Multithreaded Pi Efficiency - 4 Threads | 63.82 % (±4.38 %) | 65.34 % (±7.96 %) | 0.98 |
| Multithreaded Pi Efficiency - 8 Threads | 43.67 % (±3.86 %) | 30.99 % (±2.66 %) | 1.41 |
Contributor
Why does the CI fail? The PR itself looks good to me.
stlankes
approved these changes
Jul 8, 2025
Member
Author
The benchmark CI fails because printing the whole page table is slow, and we get a timeout. Printing on
This PR removes the unused debugging functions

- `paging::disect<PT: Translate>(pt: PT, virt_addr: x86_64::VirtAddr)`,
- `paging::print_page_table_entries(page_table_indices: &[PageTableIndex])`, and
- `paging::print_page_tables(levels: usize)`;

and adds `paging::log_page_tables()`, which logs the complete page table as a compact table at `log::Level::Info` before and after it is modified.

This PR adds parts of the `x86_64` crate's private `PageTableWalker` and `PhysOffset`.

We may want to try to upstream this PR's `MappedPageTableIter` and `MappedPageTableRangeInclusiveIter`, but that might face a lot of bikeshedding, so I propose first adding them to our code before upstreaming and migrating in the future.

This depends on rust-osdev/x86_64#556.
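To illustrate why a range-based iterator helps produce a compact table, here is a minimal sketch of coalescing consecutive mappings into inclusive ranges. It is not the code from this PR: `Mapping`, `MappingRange`, `RangeIter`, and `coalesce` are hypothetical names, and page and frame numbers are plain integers rather than the `x86_64` crate's address types. It assumes the input is already sorted by virtual page.

```rust
use core::iter::Peekable;

/// Hypothetical, simplified stand-in for one mapped page:
/// virtual page number, physical frame number, and flag bits.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Mapping {
    page: u64,
    frame: u64,
    flags: u8,
}

/// An inclusive run of mappings that are contiguous in both the
/// virtual and the physical address space and share the same flags.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct MappingRange {
    first: Mapping,
    last: Mapping,
}

/// Iterator adapter that merges adjacent mappings into inclusive ranges.
struct RangeIter<I: Iterator<Item = Mapping>> {
    inner: Peekable<I>,
}

impl<I: Iterator<Item = Mapping>> RangeIter<I> {
    fn new(iter: I) -> Self {
        Self { inner: iter.peekable() }
    }
}

impl<I: Iterator<Item = Mapping>> Iterator for RangeIter<I> {
    type Item = MappingRange;

    fn next(&mut self) -> Option<MappingRange> {
        let first = self.inner.next()?;
        let mut last = first;
        // Extend the range while the next mapping directly follows the
        // current one in both address spaces and has identical flags.
        while let Some(&next) = self.inner.peek() {
            let contiguous = Some(next.page) == last.page.checked_add(1)
                && Some(next.frame) == last.frame.checked_add(1)
                && next.flags == last.flags;
            if !contiguous {
                break;
            }
            last = next;
            self.inner.next();
        }
        Some(MappingRange { first, last })
    }
}

/// Convenience wrapper: collect all ranges of a sorted slice of mappings.
fn coalesce(mappings: &[Mapping]) -> Vec<MappingRange> {
    RangeIter::new(mappings.iter().copied()).collect()
}
```

With this shape, a logger would print one line per `MappingRange` instead of one line per page, so a large contiguous mapping collapses into a single table row.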