[GC] Could SVR::memcopy be more efficient?

The Server GC threads in my application spend approximately 50% of their time in `SVR::memcopy`:

https://github.com/dotnet/runtime/blob/731a96b0018cda0c67bb3fab7a4d06538fe2ead5/src/coreclr/gc/gc.cpp#L1746-L1755

On x64, the relevant loop compiles to this:
```asm
@loop:
mov rax, qword ptr [r10+r9*1]
mov qword ptr [r9], rax
lea r9, ptr [r9+0x8]
sub r11, 0x1
jnz 0x18016b901 <loop>
```

If I'm not mistaken, this is a plain memory copy that moves one 8-byte word per iteration, with no vectorization or other optimizations. Could `SVR::memcopy` be implemented on top of `memcpy` instead, which is typically much more heavily optimized?
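To make the comparison concrete, here is a minimal sketch of the two approaches. `copy_words` and `copy_with_memcpy` are hypothetical names I made up for illustration. The first only mimics the shape of the loop in the disassembly above (one pointer-sized load and store per iteration); it is not the actual code from gc.cpp. It assumes `size` is a multiple of the pointer size and that the buffers do not overlap.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the loop in the disassembly: copies one
// pointer-sized word per iteration. NOT the real SVR::memcopy.
// Assumptions: size is a multiple of sizeof(std::uintptr_t) and the
// source and destination ranges do not overlap.
inline void copy_words(std::uint8_t* dst, const std::uint8_t* src, std::size_t size)
{
    for (std::size_t offset = 0; offset < size; offset += sizeof(std::uintptr_t))
    {
        std::uintptr_t word;
        // Load one word from the source, then store it to the destination.
        // The small fixed-size memcpy calls avoid aliasing problems.
        std::memcpy(&word, src + offset, sizeof(word));
        std::memcpy(dst + offset, &word, sizeof(word));
    }
}

// The alternative suggested in this issue: defer to the C runtime's memcpy,
// which commonly uses wide vector moves for larger sizes.
inline void copy_with_memcpy(std::uint8_t* dst, const std::uint8_t* src, std::size_t size)
{
    std::memcpy(dst, src, size);
}
```

Both functions produce the same bytes for non-overlapping buffers. Whether the GC can actually rely on `memcpy` here depends on guarantees that are not visible from the disassembly alone, for example whether each pointer-sized slot must be written with a single store.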


[GC] Could SVR::memcopy be more efficient? #110571
