
Create 2024-07-28-Analysis-memory-access-pattern-has-never-been-easie…#12

Open
torusrxxx wants to merge 4 commits into gh-pages from torusrxxx-patch-1

Conversation

@torusrxxx

Member

…r-now.md

@mrexodia

Member

I think it's a nice post, but it would be good to have some example application/crackme and some screenshots/diagrams. I'm happy to work on that, just need a good example use case that people would resonate with.

@torusrxxx

Member Author

I don't have time to write a crackme tutorial now, and there isn't any crackme on my GitHub profile. Finding a good example use case shouldn't be hard. For example, it makes it possible to dump a self-modifying executable even when you don't know the OEP before tracing. It also accelerates many common operations, like jumping to the last iteration of a loop.

This feature aligns with the general trend that you just leave the computer to record everything, have a cup of coffee, and then analyze the trace recording with more powerful tools available: the default xref analysis can never be better than this one based on tracing.

@mrexodia

mrexodia commented Aug 5, 2024

Member

Didn't forget about this, just need to find some time to create screenshots to add to the post and then share it on social media.

@torusrxxx

Member Author

Half a year later, it's no longer news

@mrexodia

Member

My apologies, I completely forgot about this post 🤦‍♂️.
I think it’s still relevant to explain new features and workflows, many people do not keep up with the latest developments and I think it will be news for them!

@torusrxxx

Member Author

It looks like more and more users are interested in this technology!

@mrexodia

Member

Totally forgot about it again 😅 If you're good with publishing it as-is I can adjust the filename and publish it. For me it reads a bit like a 'wall of text' without any concrete use case or screenshots etc, but up to you!

@torusrxxx

Member Author

If you have a better idea, I'm looking forward to that. I'm not a good crackme tutorial writer.

@mrexodia

mrexodia commented Jul 14, 2025

Member

For me personally the issue is that I have never actually used x64dbg's tracing myself outside of testing it/reproducing bugs, so it's a bit difficult to come up with examples 😅

I obviously see the use of the memory view. We could demo that easily with some basic XOR encryption thingy and then show that you can see the memory update 'live' as you click through the trace.

For the cross-reference search I already struggle a bit to find a realistic use case. Would the idea be to breakpoint on some API (WriteFile) where you get the address of the buffer and then do a cross-reference to that address in the trace? In practice nobody will trace more than 1-10 million instructions (which takes 2-20 minutes). You would instead look at the call stack on the breakpoint to try and reduce the scope of the analysis.

For the pattern search one idea I came up with is where a syscall instruction is decrypted, executed, re-encrypted, so you could look for the opcode to see when it first appears. But practically it would be easier to search for the instruction in the trace? Just difficult to see where you would get the pattern to search for without knowing the address. Maybe you have an API that is dynamically resolved, then stored (encrypted) for later, then a bunch of unrelated/fake code and then API address fetched from memory, decrypted and executed?

Ages ago there was a 'Run trace' tutorial posted on the OllyDbg website (https://ollydbg.de/Tut_rtr.htm). There the use case is that the stack was wiped and the crash was difficult to recover from, but even this post mentions needing 10 minutes to collect the trace 😬

Did you use the tracing for some personal reverse engineering projects? Could be helpful to come up with a simpler example that could fit in the post...

Edit: a friend of mine also mentioned 'catching a buffer overrun' as a potentially practical example. Could be a simple arena allocator and then something like:

overflow_buf = arena.allocate(100)
hello_buf = arena.allocate(25)
strcpy(hello_buf, "Hello!")
puts(hello_buf)
something_long(overflow_buf)
puts(hello_buf) # prints something else now

Then you can jump to the last write on hello_buf[0] to find where the overflow happened.
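
The pseudocode above can be made concrete in C. This is only a minimal sketch: the `Arena`, `arena_allocate`, `something_long`, and `run_overflow_demo` names are hypothetical and not taken from x64dbg or the post. The overflow deliberately stays inside the arena's own backing array, so the program stays well-defined while still clobbering the neighboring allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bump allocator: every allocation is carved out of one array. */
typedef struct Arena {
    unsigned char storage[256];
    size_t used;
} Arena;

/* Returns the next `size` bytes of the arena, or NULL when it is full. */
static void *arena_allocate(Arena *arena, size_t size)
{
    void *p;
    if (size > sizeof(arena->storage) - arena->used)
        return NULL;
    p = arena->storage + arena->used;
    arena->used += size;
    return p;
}

/* Writes `len` bytes of 'A' without knowing how large `dst` really is.
   The buggy caller below passes more than it allocated. */
static void something_long(char *dst, size_t len)
{
    memset(dst, 'A', len);
}

/* Runs the scenario and copies what hello_buf holds afterwards into `out`. */
static void run_overflow_demo(char out[8])
{
    Arena arena = { {0}, 0 };
    char *overflow_buf = arena_allocate(&arena, 100);
    char *hello_buf = arena_allocate(&arena, 25);

    assert(overflow_buf != NULL && hello_buf != NULL);
    strcpy(hello_buf, "Hello!");
    something_long(overflow_buf, 103); /* 3 bytes past the 100 allocated */
    memcpy(out, hello_buf, 7);         /* "Hello!" plus its terminator */
    out[7] = '\0';
}
```

Calling `run_overflow_demo` leaves a greeting whose first three bytes were replaced by the overflow. In a trace, following the last write to `hello_buf[0]` would land inside `something_long` rather than at the `strcpy`.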

@torusrxxx

Member Author

So what happened?

@torusrxxx

Member Author

I intended to use the pattern search in the trace with a private signature database, like Yara. It instantly identifies known features no matter where they appear. But first you need to compile such a list (cryptographic constants etc.), which is outside the scope of a tutorial. Cross-reference search is for interactively following the access pattern of a variable, like finding references in Visual Studio.
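
To illustrate the signature-database idea, here is a rough sketch in C. It assumes a database is just a table of named byte patterns that is scanned against a memory dump. The `Signature`, `find_bytes`, and `first_matching_signature` names are hypothetical; this is neither x64dbg's nor Yara's API, and the two patterns are simply the ones used by the demo README in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical signature entry: a name plus the bytes to look for. */
typedef struct Signature {
    const char *name;
    const uint8_t *bytes;
    size_t len;
} Signature;

static const uint8_t sig_rc4_identity[] = { 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07 };
static const uint8_t sig_flag_prefix[] = { 0x78, 0x36, 0x34, 0x64, 0x62, 0x67, 0x7B }; /* "x64dbg{" */

static const Signature signature_db[] = {
    { "rc4-identity-table", sig_rc4_identity, sizeof(sig_rc4_identity) },
    { "flag-prefix", sig_flag_prefix, sizeof(sig_flag_prefix) },
};

/* Returns the offset of the first occurrence of needle in haystack, or -1. */
static long find_bytes(const uint8_t *haystack, size_t hay_len,
                       const uint8_t *needle, size_t needle_len)
{
    size_t i;
    if (needle_len == 0 || needle_len > hay_len)
        return -1;
    for (i = 0; i + needle_len <= hay_len; i++) {
        if (memcmp(haystack + i, needle, needle_len) == 0)
            return (long)i;
    }
    return -1;
}

/* Returns the name of the first database entry found in the dump and stores
   its offset in *offset, or returns NULL when nothing matches. */
static const char *first_matching_signature(const uint8_t *dump, size_t dump_len, long *offset)
{
    size_t s;
    assert(offset != NULL);
    for (s = 0; s < sizeof(signature_db) / sizeof(signature_db[0]); s++) {
        long at = find_bytes(dump, dump_len, signature_db[s].bytes, signature_db[s].len);
        if (at >= 0) {
            *offset = at;
            return signature_db[s].name;
        }
    }
    return NULL;
}
```

A real database would hold many more entries and likely use a multi-pattern matcher instead of one linear scan per signature; the linear scan is only chosen here to keep the sketch short.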

@mrexodia

Member

I have been busy with work and other projects and completely forgot about the post. Without a clearly-motivated use case it feels a bit lacking, but if you want I'm okay with publishing it as-is.

@torusrxxx

Member Author

I'm not creating a crackme for this as I'm busy with other priorities. If you are busy as well, then nothing can be done :( .

@mrexodia

mrexodia commented May 31, 2026

Member

Good that we have clankers!

trace-memory.zip

Confirmed the flag is found for the trace_rc4.exe, there is also the initial RC4 state of 01 02 03 04...

image

I also tried the buffer overflow example, but I think there is something broken about the xref functionality right now:

image

The entry at 0x1CF is actually writing to the address 0000000140076C70, but the earlier instructions do not write there. Are they just repeated because there is one xref from 00000001400010AF to 0000000140076C70 and then all instructions at 00000001400010AF are shown?

There is also a bug/unclear behavior where, if an exception happens, the instruction triggering it is not recorded to the trace right away. For the overflow example I had to step in one more time to reach KiUserExceptionDispatcher and then follow in dump from that one.


x64dbg trace-memory crackmes

These programs are small inputs for the trace dump, trace Xref, and trace pattern-search features.

Build

From this directory in Git Bash:

mkdir -p build
"/c/Program Files/LLVM/bin/clang.exe" -O0 -gcodeview -fuse-ld=lld -Xlinker -debug -Xlinker -dynamicbase:no -Xlinker -highentropyva:no -o build/trace_rc4.exe trace_rc4.c
"/c/Program Files/LLVM/bin/clang.exe" -O0 -gcodeview -fuse-ld=lld -Xlinker -debug -Xlinker -dynamicbase:no -Xlinker -highentropyva:no -o build/trace_overflow.exe trace_overflow.c

The linker flags -dynamicbase:no -highentropyva:no are the lld-link equivalents of /DYNAMICBASE:NO /HIGHENTROPYVA:NO. They keep module addresses stable across runs.

The examples export marker functions for convenient breakpoints. trace_rc4.exe uses trace_begin and trace_end. trace_overflow.exe uses only trace_begin; the access violation stops the trace.

Example 1: trace_rc4.exe

Correct serial:

x64dbg{trace-memory}

Suggested run:

trace_rc4.exe wrong

Trace workflow:

  1. Open trace_rc4.exe in x64dbg.
  2. Set the command line to trace_rc4.exe wrong if needed.
  3. Open the Symbols view for the main module.
  4. Set breakpoints on the exported functions trace_begin and trace_end.
  5. Run to trace_begin.
  6. Start Trace into from trace_begin.
  7. The trace stops when trace_end is reached.
  8. Open the Trace view, select the last traced instruction, and load the trace dump if x64dbg has not loaded it automatically.

Things to show:

  • The console prints the stack addresses of S and expected.
  • In the trace dump, go to the expected address and step through the trace rows around rc4_crypt; the buffer changes into the plaintext serial.
  • Search the trace dump for this ASCII pattern:
78 36 34 64 62 67 7B

That is the byte pattern for x64dbg{. It is not stored as plaintext in the program; it appears only after the RC4 decrypt loop writes it.

  • Search for this RC4 initialization pattern:
00 01 02 03 04 05 06 07

The result points at the temporary S[256] table created during rc4_init.

  • Select a byte in the expected buffer and use the trace dump Xref action. The useful references are the RC4 write, the compare read, and the wipe write.
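
The claim that the x64dbg{ pattern "appears only after the RC4 decrypt loop writes it" can be modeled with a small replay loop. The sketch below is a toy model, not how x64dbg implements the search: the `ByteWrite` record and both function names are hypothetical, and memory is reduced to a 32-byte buffer that starts out zeroed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_SIZE 32

/* Hypothetical trace record: one byte written at an offset into a small buffer. */
typedef struct ByteWrite {
    size_t offset;
    uint8_t value;
} ByteWrite;

/* Returns 1 when pattern occurs anywhere in mem, 0 otherwise. */
static int contains(const uint8_t *mem, size_t mem_len,
                    const uint8_t *pattern, size_t pattern_len)
{
    size_t i;
    for (i = 0; i + pattern_len <= mem_len; i++) {
        if (memcmp(mem + i, pattern, pattern_len) == 0)
            return 1;
    }
    return 0;
}

/* Replays the writes in order and returns the index of the first write after
   which the whole pattern is present in memory, or -1 if it never appears. */
static long first_index_with_pattern(const ByteWrite *writes, size_t count,
                                     const uint8_t *pattern, size_t pattern_len)
{
    uint8_t mem[MODEL_SIZE] = {0};
    size_t i;

    assert(pattern_len > 0 && pattern_len <= MODEL_SIZE);
    for (i = 0; i < count; i++) {
        assert(writes[i].offset < MODEL_SIZE);
        mem[writes[i].offset] = writes[i].value;
        if (contains(mem, sizeof(mem), pattern, pattern_len))
            return (long)i;
    }
    return -1;
}
```

With seven single-byte writes that spell x64dbg{ one character at a time, the reported index is that of the write storing the final { byte; a trace that stops after x64dbg yields no match.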

Example 2: trace_overflow.exe

Trace workflow:

  1. Open trace_overflow.exe in x64dbg.
  2. Set a breakpoint on the exported function trace_begin.
  3. Run to trace_begin.
  4. Start Trace into from trace_begin. Set the max trace count to 100000 if x64dbg asks.
  5. The trace stops on the access violation caused by call_callback.
  6. Open the Trace view, select the last trace row, and load the trace dump if needed.

Things to show:

  • The console prints the addresses of overflow, greeting, and the callback field.
  • The crash happens because g_arena.callback was overwritten with 43 43 43 43 43 43 43 43, the ASCII bytes for CCCCCCCC.
  • In the trace dump, go to the printed callback field address, select the first byte, and use the trace dump Xref action.
  • Jump to the last write before the crash. It lands inside copy_unbounded(g_arena.overflow, ...), after the copy has crossed the end of overflow.
  • Repeat the same action on the printed greeting address. Its bytes changed to greeting_is_gone, showing that the same overflow corrupted neighboring data before it corrupted the function pointer.

This is the cleanest motivating example for trace Xref: it answers "who last wrote the byte that caused this crash?" without needing a hardware breakpoint before the bug happens.
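
As a rough illustration of the "who last wrote this byte?" question, the lookup can be thought of as a backwards scan over recorded memory accesses. The `TraceRow` layout and `last_write_before` below are hypothetical and much simpler than a real trace file (one access per row, no index); they only sketch the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical trace row: at most one memory access per row, for simplicity. */
typedef struct TraceRow {
    uint64_t address; /* first byte touched */
    uint32_t size;    /* number of bytes touched */
    int is_write;     /* 1 for a write, 0 for a read */
} TraceRow;

/* Returns the index of the last row at or before `from` that wrote the byte
   at `target`, or -1 when no such row exists. */
static long last_write_before(const TraceRow *rows, size_t count,
                              size_t from, uint64_t target)
{
    size_t i;
    if (count == 0)
        return -1;
    if (from >= count)
        from = count - 1;
    /* Walk backwards from `from` down to row 0. */
    for (i = from + 1; i-- > 0; ) {
        const TraceRow *r = &rows[i];
        if (r->is_write && target >= r->address && target - r->address < r->size)
            return (long)i;
    }
    return -1;
}
```

Starting the scan at the crashing row and passing the callback field address as `target` would return the row inside the copy loop that stored the last C byte, which is the row the workflow above jumps to.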


// trace_rc4.c
// A tiny crackme for x64dbg trace dump, trace xrefs, and trace pattern search.
// Build: clang.exe -O0 -gcodeview -fuse-ld=lld -Xlinker -debug -Xlinker -dynamicbase:no -Xlinker -highentropyva:no -o trace_rc4.exe trace_rc4.c

#include <stdint.h>
#include <stdio.h>
#include <stddef.h>

#if defined(_MSC_VER) || defined(__clang__)
#define NOINLINE __declspec(noinline)
#define EXPORT __declspec(dllexport)
#else
#define NOINLINE __attribute__((noinline))
#define EXPORT
#endif

#define SECRET_LEN 20

// Encrypted RC4 output for the plaintext serial used by this demo.
// The plaintext does not appear as a string in the executable.
static const uint8_t encrypted_serial[SECRET_LEN] = {
    0x7F, 0x20, 0xC2, 0xBC, 0xB3, 0xF4, 0x87, 0x9F, 0xFE, 0xA6,
    0x06, 0x34, 0x6E, 0x18, 0x62, 0x54, 0x42, 0xC4, 0x73, 0x27
};

static const uint8_t rc4_key[] = "x64dbg-trace-demo";

EXPORT NOINLINE void trace_begin(void)
{
    // Set a breakpoint here in x64dbg, then start Trace Into.
}

EXPORT NOINLINE void trace_end(void)
{
    // Set a breakpoint here in x64dbg, so tracing stops after the demo.
}

NOINLINE void rc4_init(uint8_t s[256], const uint8_t *key, size_t key_len)
{
    uint32_t i;
    uint8_t j = 0;

    for (i = 0; i < 256; i++)
        s[i] = (uint8_t)i;

    for (i = 0; i < 256; i++) {
        uint8_t tmp;
        j = (uint8_t)(j + s[i] + key[i % key_len]);
        tmp = s[i];
        s[i] = s[j];
        s[j] = tmp;
    }
}

NOINLINE void rc4_crypt(uint8_t s[256], const uint8_t *input, uint8_t *output, size_t len)
{
    size_t n;
    uint8_t i = 0;
    uint8_t j = 0;

    for (n = 0; n < len; n++) {
        uint8_t tmp;
        uint8_t k;

        i = (uint8_t)(i + 1);
        j = (uint8_t)(j + s[i]);

        tmp = s[i];
        s[i] = s[j];
        s[j] = tmp;

        k = s[(uint8_t)(s[i] + s[j])];
        output[n] = (uint8_t)(input[n] ^ k);
    }
}

NOINLINE size_t bounded_strlen(const char *s, size_t max_len)
{
    size_t n = 0;
    while (n <= max_len && s[n] != 0)
        n++;
    return n;
}

NOINLINE int slow_equals(const char *user, const uint8_t *expected, size_t expected_len)
{
    size_t i;
    size_t user_len = bounded_strlen(user, expected_len);
    uint8_t diff = 0;

    for (i = 0; i < expected_len; i++) {
        uint8_t c = 0;
        if (i < user_len)
            c = (uint8_t)user[i];
        diff |= (uint8_t)(c ^ expected[i]);
    }

    return diff == 0 && user_len == expected_len;
}

NOINLINE void wipe_bytes(void *ptr, size_t len)
{
    volatile uint8_t *p = (volatile uint8_t *)ptr;
    size_t i;
    for (i = 0; i < len; i++)
        p[i] = 0;
}

int main(int argc, char **argv)
{
    uint8_t s[256];
    uint8_t expected[SECRET_LEN + 1];
    const char *user = argc > 1 ? argv[1] : "wrong";
    int ok;
    size_t i;

    for (i = 0; i < sizeof(expected); i++)
        expected[i] = 0;

    printf("trace_rc4: hidden serial check\n");
    printf("  S table:         %p (256 bytes)\n", (void *)s);
    printf("  expected serial: %p (%u bytes)\n", (void *)expected, (unsigned)SECRET_LEN);
    printf("  user input:      %s\n", user);
    printf("Set breakpoints on trace_begin and trace_end, then trace from trace_begin.\n");

    trace_begin();

    rc4_init(s, rc4_key, sizeof(rc4_key) - 1);
    rc4_crypt(s, encrypted_serial, expected, SECRET_LEN);
    expected[SECRET_LEN] = 0;

    ok = slow_equals(user, expected, SECRET_LEN);

    wipe_bytes(expected, sizeof(expected));
    wipe_bytes(s, 256);

    trace_end();

    puts(ok ? "correct" : "wrong");
    return ok ? 0 : 1;
}

// trace_overflow.c
// A tiny crash-investigation demo for x64dbg trace dump and trace xrefs.
// Build: clang.exe -O0 -gcodeview -fuse-ld=lld -Xlinker -debug -Xlinker -dynamicbase:no -Xlinker -highentropyva:no -o trace_overflow.exe trace_overflow.c

#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

#if defined(_MSC_VER) || defined(__clang__)
#define NOINLINE __declspec(noinline)
#define EXPORT __declspec(dllexport)
#else
#define NOINLINE __attribute__((noinline))
#define EXPORT
#endif

typedef void (*Callback)(void);

typedef struct Arena {
    char overflow[16];
    char greeting[16];
    Callback callback;
    char scratch[8];
} Arena;

static Arena g_arena;

EXPORT NOINLINE void trace_begin(void)
{
    // Set a breakpoint here in x64dbg, then start Trace Into.
    // The access violation stops the trace; this demo does not need trace_end.
}

NOINLINE void safe_callback(void)
{
    puts("safe callback");
}

NOINLINE void zero_arena(Arena *arena)
{
    volatile uint8_t *p = (volatile uint8_t *)arena;
    size_t i;
    for (i = 0; i < sizeof(*arena); i++)
        p[i] = 0;
}

NOINLINE void copy_unbounded(char *dst, const char *src)
{
    while ((*dst++ = *src++) != 0) {
        // Deliberately no bounds check.
    }
}

NOINLINE void call_callback(void)
{
    g_arena.callback();
}

int main(void)
{
    setvbuf(stdout, NULL, _IONBF, 0);

    zero_arena(&g_arena);
    copy_unbounded(g_arena.greeting, "Hello!");
    g_arena.callback = safe_callback;

    printf("trace_overflow: crash after a buffer overflow\n");
    printf("  overflow buffer: %p (16 bytes)\n", (void *)g_arena.overflow);
    printf("  greeting buffer: %p (16 bytes)\n", (void *)g_arena.greeting);
    printf("  callback field:  %p (8 bytes)\n", (void *)&g_arena.callback);
    printf("  callback value:  %p\n", (void *)g_arena.callback);
    printf("  greeting before overflow: %s\n", g_arena.greeting);
    printf("Set a breakpoint on trace_begin, then trace until the access violation.\n");

    trace_begin();

    copy_unbounded(g_arena.overflow,
        "AAAAAAAAAAAAAAAA"
        "greeting_is_gone"
        "CCCCCCCC");

    call_callback();

    return 0;
}

@torusrxxx

Member Author

Currently, the x64dbg trace viewer ignores the actual access size of the instruction and always treats the access as pointer-sized. This could be improved by relating the memory address to the disassembled operand, but that is not supported yet and would likely make building the index slower.
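
One plausible reading of how this produces extra xref rows, sketched in C: if every access is assumed to span a full pointer (8 bytes on a 64-bit target, which is an assumption of this sketch), a 1-byte write a few bytes below the selected address still counts as touching it. The function names are hypothetical and this is not the viewer's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 when an access of `size` bytes starting at `address` covers `target`. */
static int access_covers(uint64_t address, uint32_t size, uint64_t target)
{
    assert(size > 0);
    return target >= address && target - address < size;
}

/* Assumed model of the behavior described above: the real operand size is
   ignored and every access is treated as 8 bytes wide. */
static int covers_pointer_sized(uint64_t address, uint64_t target)
{
    return access_covers(address, (uint32_t)sizeof(uint64_t), target);
}
```

Under this model a byte store to 0000000140076C6D is reported as a reference to 0000000140076C70, while a check using the true size of 1 reports only the store to 0000000140076C70 itself.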

Proper exception handling was on the roadmap, but work on it never started. It remains unsupported, but since it still works, the current status is more like buggy and untested.

@torusrxxx

Member Author

Thank you for your samples and contributions!

@mrexodia

Member

I just pushed two attempts at fixing both issues and it seems to work much better now!

@mrexodia

Member

Also spent some time to update the post, fix some grammar and add a section with the concrete examples in the front. I think I found another bug, because searching for x64dbg{ will find the index where only x64dbg exists in memory and the { is not written until quite a bit later.

@torusrxxx

Member Author

The tracer doesn't capture the entire memory dump at the start of the trace file. It assumes memory is unchanged if the traced instruction doesn't touch it. This interacts with partial dump indexing in an interesting way: the memory dump shows 00's for a buffer that hasn't been used yet, but if you click on the last instruction to force indexing of the entire trace and then come back, the content isn't 00's anymore because the tracer now knows the content from the future. In practice this difference is mostly harmless, as it only happens in unused buffers.

The tracer partially mitigates this issue by always capturing pointer-sized accesses. If the memory access size is 1 byte, the tracer still records the remaining bytes in the trace file, so it knows the content past the end of the buffer. When the actual access size of the operand is enforced, the tracer discards that saved information, and this changes the behavior because the tracer no longer knows the unused bytes past the end of the buffer.
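
The indexing effect can be reproduced with a toy model. The sketch below assumes each record stores the content of one byte before and after the traced instruction; the `Access` struct and `view_byte` are hypothetical and are not the trace file format or x64dbg code. A byte with no earlier record is filled in from the first later record that has been indexed, and shown as 00 otherwise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one byte observed by a traced instruction.
   old_value is the content before the instruction and new_value the content
   after it (the two are equal for a pure read). */
typedef struct Access {
    size_t offset;
    uint8_t old_value;
    uint8_t new_value;
} Access;

/* Reconstructs the byte at `offset` as seen right after row `at`, using only
   the first `indexed` rows. Bytes with no known content are shown as 0x00. */
static uint8_t view_byte(const Access *rows, size_t indexed, size_t at, size_t offset)
{
    size_t i;
    assert(at < indexed);
    /* Past: the latest row at or before `at` that touched the byte wins. */
    for (i = at + 1; i-- > 0; ) {
        if (rows[i].offset == offset)
            return rows[i].new_value;
    }
    /* Future: the byte is assumed unchanged until it is first touched. */
    for (i = at + 1; i < indexed; i++) {
        if (rows[i].offset == offset)
            return rows[i].old_value;
    }
    return 0x00;
}
```

Viewing the same row with `indexed` set to a prefix of the trace and then to the full trace gives 00 first and the real content second, which mirrors the "content from the future" effect described above.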

@mrexodia

mrexodia commented Jun 1, 2026

Member

The issue with x64dbg{ specifically was fixed. It looks like it was a minor problem with reporting the result, not with the actual search. The extra memory being recorded is still used; only the search/reference logic was changed a bit.

