When calling convention conversion is not needed, each hook consists of:
- a 5 byte jmp into your function
- a 5 byte jmp to the original function
- 5-16 bytes stolen from the original function
If 6 or fewer bytes were stolen, which is likely to be common on x86, the instructions used to call the original function can be safely inlined into the swap space, saving 16 bytes of memory.
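
For illustration, here is a minimal sketch of how a 5 byte jmp could be encoded. It assumes the jmp is the x86 near relative form (opcode `E9` followed by a 32-bit little-endian displacement) and that addresses fit in 32 bits. The helper name `encode_jmp_rel32` and its parameters are hypothetical and are not taken from this project.

```c
#include <stdint.h>

/* Hypothetical helper: writes `jmp rel32` (E9 + 32-bit displacement) into buf.
 * from: address at which the jmp instruction itself will be placed.
 * to:   address the jmp should transfer control to. */
static void encode_jmp_rel32(uint8_t buf[5], uint32_t from, uint32_t to)
{
    /* The displacement is measured from the end of the jmp (from + 5).
     * Unsigned wraparound yields the correct two's complement value
     * for backward jumps. */
    uint32_t rel = to - (from + 5u);

    buf[0] = 0xE9;                          /* opcode: jmp rel32 */
    buf[1] = (uint8_t)(rel & 0xFFu);        /* displacement, little-endian */
    buf[2] = (uint8_t)((rel >> 8) & 0xFFu);
    buf[3] = (uint8_t)((rel >> 16) & 0xFFu);
    buf[4] = (uint8_t)((rel >> 24) & 0xFFu);
}
```

Under these assumptions, a jmp placed at `0x00401000` that targets `0x00402000` would be encoded as `E9 FB 0F 00 00`. The same encoding would apply to both the jmp into your function and the jmp back to the original function.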