Skip to content

Protect patching with an open handle #1264

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Merged
Show file tree
Hide file tree
Changes from 18 commits
Commits
Show all changes
100 commits
Select commit Hold shift + click to select a range
a210c87
#1250 validate that the obtained dlopen handle is same as the origina…
Baraa-Hasheesh Apr 24, 2025
295b14e
rename varaiable
Baraa-Hasheesh Apr 24, 2025
165cd18
Merge branch 'master' into protect-symbol-parsing
Baraa-Hasheesh Apr 25, 2025
4fc704a
Add useful doc comment
Baraa-Hasheesh Apr 25, 2025
547a727
functions for main exec, ld check, original handle
fandreuz Apr 25, 2025
4bda095
protect patches
fandreuz Apr 25, 2025
46d7a25
nn
fandreuz Apr 25, 2025
5fb3a15
nn
fandreuz Apr 25, 2025
9897f7e
remove comment
fandreuz Apr 25, 2025
08e8e05
nn
fandreuz Apr 25, 2025
1c5c9ae
fix compilation on macos
fandreuz Apr 25, 2025
0c74ccf
move include
fandreuz Apr 25, 2025
003224b
fix macos
fandreuz Apr 25, 2025
4a10fe5
nn
fandreuz Apr 25, 2025
5abfd2f
include syms
fandreuz Apr 25, 2025
5c6a2c9
stuff
fandreuz Apr 25, 2025
e6ce338
change variable under check
fandreuz Apr 25, 2025
5653968
move to cpp
fandreuz Apr 25, 2025
03d8f79
move stuff away from h
fandreuz Apr 25, 2025
ef47cb2
op
fandreuz Apr 25, 2025
b2b8b68
known lib crash nativemem
fandreuz Apr 25, 2025
cc0b9d9
spurious bracket
fandreuz Apr 25, 2025
efae089
nn
fandreuz Apr 25, 2025
1feaf12
nn
fandreuz Apr 25, 2025
19a552c
Update test/test/nativemem/nativemem_known_lib_crash.c
fandreuz Apr 25, 2025
dbb4d67
nn
fandreuz Apr 28, 2025
79d93a0
raii
fandreuz Apr 28, 2025
108598c
Merge remote-tracking branch 'fandreuz/sim-294-keep-handle-while-patc…
fandreuz Apr 28, 2025
85d2ebb
isvalid
fandreuz Apr 28, 2025
964d025
const. import
fandreuz Apr 28, 2025
aa02f9e
move dtor away
fandreuz Apr 28, 2025
842e2da
move ctor
fandreuz Apr 28, 2025
92944da
move ass. delete cp
fandreuz Apr 28, 2025
a38d144
nn
fandreuz Apr 28, 2025
683825d
Merge branch 'master' into sim-294-keep-handle-while-patching
fandreuz Apr 28, 2025
c2e56c8
needed import
fandreuz Apr 28, 2025
3dd17e6
rename
fandreuz Apr 29, 2025
646cbf7
Merge branch 'master' into sim-294-keep-handle-while-patching
fandreuz Apr 29, 2025
5378fcf
move 'makeUnloadProtection' into UP ctor
fandreuz Apr 29, 2025
b479c11
fix test for alpine
fandreuz Apr 29, 2025
2c69485
move make imports patchable
fandreuz Apr 29, 2025
d837bc9
set flag
fandreuz Apr 29, 2025
20f5f81
cc
fandreuz Apr 29, 2025
79da179
nn
fandreuz Apr 29, 2025
457d59a
Merge branch 'master' into sim-294-keep-handle-while-patching
fandreuz Apr 30, 2025
fa529c8
Merge branch 'master' into sim-294-keep-handle-while-patching
fandreuz May 1, 2025
15eeade
use image_base
fandreuz May 1, 2025
d58c662
Update test/test/nativemem/nativemem_known_lib_crash.c
fandreuz May 1, 2025
cda6051
stop ap
fandreuz May 1, 2025
a6b3804
log
fandreuz May 1, 2025
bc30666
Revert "use image_base"
fandreuz May 1, 2025
bc19fc3
remove tood
fandreuz May 1, 2025
15da753
trigger
fandreuz May 1, 2025
14a8a55
trigger
fandreuz May 1, 2025
8de5e01
debug
fandreuz May 2, 2025
8f57df7
unloadprotection to symbols
fandreuz May 2, 2025
1edc7f6
missing image base
fandreuz May 2, 2025
4e5fb57
nn
fandreuz May 2, 2025
5cff23d
clean lib name
fandreuz May 2, 2025
50ead5b
Merge branch 'master' into sim-294-keep-handle-while-patching
fandreuz May 2, 2025
0eacfab
cc
fandreuz May 2, 2025
7cdfa80
nn
fandreuz May 2, 2025
b310dba
move patchimprot
fandreuz May 2, 2025
fa729eb
nn
fandreuz May 2, 2025
c17d592
cc
fandreuz May 2, 2025
94dd779
quote
fandreuz May 2, 2025
7d4cb93
trigger
fandreuz May 2, 2025
45e9622
check musl
fandreuz May 2, 2025
b5d4e19
inline
fandreuz May 7, 2025
5e10cdd
use the stuff on macos
fandreuz May 7, 2025
70c3800
patch from cc
fandreuz May 7, 2025
4542d1d
trigger
fandreuz May 7, 2025
2abe2aa
nn
fandreuz May 7, 2025
8974647
no cleanname
fandreuz May 7, 2025
dcd97c7
missing const
fandreuz May 7, 2025
1d049c5
nn
fandreuz May 10, 2025
7a1895a
stack-allocated
fandreuz May 10, 2025
7e575b4
stuff
fandreuz May 14, 2025
6f3e498
fix
fandreuz May 14, 2025
42879e3
bara's fix
fandreuz May 14, 2025
2e1c15b
cc
fandreuz May 14, 2025
a5c24b0
ops
fandreuz May 14, 2025
7e0a452
new stuff
fandreuz May 14, 2025
703950b
trigger
fandreuz May 14, 2025
1e83770
Merge branch 'coredump-gha' into sim-294-keep-handle-while-patching
fandreuz May 14, 2025
ff43185
test fix
fandreuz May 15, 2025
b220a20
comment
fandreuz May 15, 2025
a7103d0
op
fandreuz May 15, 2025
4d5645f
not here
fandreuz May 16, 2025
873ce86
dlinfo+dladdr
fandreuz May 16, 2025
ffdeb52
revert
fandreuz May 16, 2025
01fc8eb
review comments
fandreuz May 16, 2025
017ff0d
nn
fandreuz May 16, 2025
30364f0
initialize when needed
fandreuz May 16, 2025
9ae2c0f
cc
fandreuz May 19, 2025
7acb726
no dwarf
fandreuz May 19, 2025
d93698b
nn
fandreuz May 19, 2025
00ac27e
amend comment
fandreuz May 19, 2025
41d89d1
rename isValidHandle
fandreuz May 19, 2025
9191dc2
Cleanup
apangin May 20, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
17 changes: 16 additions & 1 deletion src/codeCache.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -29,12 +29,14 @@ size_t NativeFunc::usedMemory(const char* name) {


CodeCache::CodeCache(const char* name, short lib_index, bool imports_patchable,
const void* min_address, const void* max_address) {
const void* min_address, const void* max_address,
const char* image_base) {
_name = NativeFunc::create(name, -1);
_lib_index = lib_index;
_min_address = min_address;
_max_address = max_address;
_text_base = NULL;
_image_base = image_base;

_plt_offset = 0;
_plt_size = 0;
Expand Down Expand Up @@ -311,3 +313,16 @@ size_t CodeCache::usedMemory() {
}
return bytes;
}

bool CodeCache::isValidHandle(void* handle) const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is only called from the UnloadProtection

I think it makes sense to move this function implementation into that class as a private method

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this rather belongs to CodeCache, it's just a check against a private field of CodeCache. Also, it's called in the UnloadProtection ctor, thus the object is not fully constructed when we need it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could eliminate this inversion by introducing a factory method:

-UnloadProtection handle = UnloadProtection(cc);
+UnloadProtection handle = cc->protectUnload();

Then, cc is free to access its private fields before constructing the UnloadProtection.

Copy link
Contributor Author

@fandreuz fandreuz Apr 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was exactly the previous implementation, before I got this comment.

See 5378fcf

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this assignment? UnloadProtection handle(cc) is simpler.

I'd say that checking dlopen handle in CodeCache looks unnatural. Keep in mind that CodeCache is an abstraction that is not necessarily related to a loaded library (e.g. CodeCache for JVM runtime stubs or CodeCache for kernel symbols).

Copy link
Contributor Author

@fandreuz fandreuz Apr 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#ifdef __linux__
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CodeCache is just a data structure, it's not good to have platform-dependent code here.
dlopen, dlinfo and even more so, mapping-related stuff are more suitable for Symbols class, whose implementation is platform-dependent. Moreover, in your implementation, that responsibility is already split across CodeCache and Symbols: the code calls Symbols::isMainExecutable and Symbols::isLoader which should not be public functions in the first place, since these are platform-specific implementation details.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

struct link_map* map;
// validate that the current loaded library is the same library that was observed during the /proc/self/maps processing
if (handle != NULL && dlinfo(handle, RTLD_DI_LINKMAP, &map) == 0) {
return _image_base == (const char*)map->l_addr;
}
return false;
#else
return handle != NULL;
#endif
}
16 changes: 15 additions & 1 deletion src/codeCache.h
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,11 @@
#define _CODECACHE_H

#include <jvmti.h>
#include <dlfcn.h>

#ifdef __linux__
#include <link.h>
#endif


#define NO_MIN_ADDRESS ((const void*)-1)
Expand Down Expand Up @@ -107,6 +112,8 @@ class CodeCache {
const void* _min_address;
const void* _max_address;
const char* _text_base;
// TODO: Cleanup usages of lib.image_base
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this leftover?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is something we should do, but could pollute this PR. Basically there are some cases where we pass both cc and lib.image_base to some function, which we could now simplify to passing only cc. What's your preference?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since image_base is added to the CodeCache in the current PR, we could include this instead of keeping a TODO comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const char* _image_base;

unsigned int _plt_offset;
unsigned int _plt_size;
Expand All @@ -131,7 +138,8 @@ class CodeCache {
short lib_index = -1,
bool imports_patchable = false,
const void* min_address = NO_MIN_ADDRESS,
const void* max_address = NO_MAX_ADDRESS);
const void* max_address = NO_MAX_ADDRESS,
const char* image_base = NULL);

~CodeCache();

Expand All @@ -147,6 +155,10 @@ class CodeCache {
return _max_address;
}

const char* imageBase() const {
return _image_base;
}

bool contains(const void* address) const {
return address >= _min_address && address < _max_address;
}
Expand Down Expand Up @@ -201,6 +213,8 @@ class CodeCache {
void setDwarfTable(FrameDesc* table, int length);
FrameDesc* findFrameDesc(const void* pc);

bool isValidHandle(void* handle) const;

size_t usedMemory();
};

Expand Down
10 changes: 10 additions & 0 deletions src/hooks.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@
#include "cpuEngine.h"
#include "mallocTracer.h"
#include "profiler.h"
#include "symbols.h"


#define ADDRESS_OF(sym) ({ \
Expand Down Expand Up @@ -182,11 +183,20 @@ void Hooks::patchLibraries() {

while (_patched_libs < native_lib_count) {
CodeCache* cc = (*native_libs)[_patched_libs++];
void* handle;
if (!Symbols::isSafeToPatch(cc, &handle)) {
continue;
}

if (!cc->contains((const void*)Hooks::init)) {
// Let libasyncProfiler always use original dlopen
cc->patchImport(im_dlopen, (void*)dlopen_hook);
}
cc->patchImport(im_pthread_create, (void*)pthread_create_hook);
cc->patchImport(im_pthread_exit, (void*)pthread_exit_hook);

if (handle != NULL) {
dlclose(handle);
}
}
}
9 changes: 9 additions & 0 deletions src/mallocTracer.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@
#include "os.h"
#include "profiler.h"
#include "tsc.h"
#include "symbols.h"
#include <dlfcn.h>
#include <string.h>

Expand Down Expand Up @@ -153,6 +154,10 @@ void MallocTracer::patchLibraries() {

while (_patched_libs < native_lib_count) {
CodeCache* cc = (*native_libs)[_patched_libs++];
void* handle;
if (!Symbols::isSafeToPatch(cc, &handle)) {
continue;
}

cc->patchImport(im_malloc, (void*)malloc_hook);
cc->patchImport(im_realloc, (void*)realloc_hook);
Expand All @@ -169,6 +174,10 @@ void MallocTracer::patchLibraries() {
cc->patchImport(im_calloc, (void*)calloc_hook);
cc->patchImport(im_posix_memalign, (void*)posix_memalign_hook);
}

if (handle != NULL) {
dlclose(handle);
}
}
}

Expand Down
37 changes: 37 additions & 0 deletions src/symbols.h
Original file line number Diff line number Diff line change
Expand Up @@ -8,13 +8,24 @@

#include "codeCache.h"
#include "mutex.h"
#include <dlfcn.h>


class Symbols {
private:
static Mutex _parse_lock;
static bool _have_kernel_symbols;
static bool _libs_limit_reported;
static const void* _main_phdr;
static const char* _ld_base;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are Linux-specific and not part of Symbols public interface.
Can be just static variables in symbols_linux.cpp.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


static bool isMainExecutable(const char* image_base, const void* map_end) {
return _main_phdr != NULL && _main_phdr >= image_base && _main_phdr < map_end;
}

static bool isLoader(const char* image_base) {
return _ld_base != NULL && _ld_base == image_base;
}

public:
static void parseKernelSymbols(CodeCache* cc);
Expand All @@ -23,6 +34,32 @@ class Symbols {
static bool haveKernelSymbols() {
return _have_kernel_symbols;
}

static bool isSafeToPatch(CodeCache* cc, void **handle_ptr) {
*handle_ptr = NULL;

if (isMainExecutable(cc->imageBase(), cc->maxAddress()) || isLoader(cc->imageBase())) {
return true;
}

// Protect library from unloading while parsing in-memory ELF program headers.
// Also, dlopen() ensures the library is fully loaded.
*handle_ptr = dlopen(cc->name(), RTLD_LAZY | RTLD_NOLOAD);
if (cc->isValidHandle(*handle_ptr)) {
// Up to the user to dlclose the handle
return true;
}

if (*handle_ptr != NULL) {
dlclose(*handle_ptr);
*handle_ptr = NULL;
}
return false;
}

Symbols();
};

static Symbols symbols;

#endif // _SYMBOLS_H
41 changes: 17 additions & 24 deletions src/symbols_linux.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,6 @@
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <dlfcn.h>
#include <elf.h>
#include <errno.h>
#include <unistd.h>
Expand Down Expand Up @@ -64,6 +63,18 @@ static void applyPatch(CodeCache* cc) {}

#endif

Symbols::Symbols() {
_ld_base = (const char*)getauxval(AT_BASE);
if (!_ld_base) {
Log::warn("Cannot determine base address of the loader");
}

_main_phdr = NULL;
dl_iterate_phdr([](struct dl_phdr_info* info, size_t size, void* data) {
*(const void**)data = info->dlpi_phdr;
return 1;
}, &_main_phdr);
}

class SymbolDesc {
private:
Expand Down Expand Up @@ -653,6 +664,8 @@ void ElfParser::addRelocationSymbols(ElfSection* reltab, const char* plt) {
Mutex Symbols::_parse_lock;
bool Symbols::_have_kernel_symbols = false;
bool Symbols::_libs_limit_reported = false;
const void* Symbols::_main_phdr = NULL;
const char* Symbols::_ld_base = NULL;
static std::unordered_set<u64> _parsed_inodes;

void Symbols::parseKernelSymbols(CodeCache* cc) {
Expand Down Expand Up @@ -776,23 +789,12 @@ void Symbols::parseLibraries(CodeCacheArray* array, bool kernel_symbols) {
std::unordered_map<u64, SharedLibrary> libs;
collectSharedLibraries(libs, MAX_NATIVE_LIBS - array->count());

const char* ld_base = (const char*)getauxval(AT_BASE);
if (!ld_base) {
Log::warn("Cannot determine base address of the loader");
}

const void* main_phdr = NULL;
dl_iterate_phdr([](struct dl_phdr_info* info, size_t size, void* data) {
*(const void**)data = info->dlpi_phdr;
return 1;
}, &main_phdr);

for (auto& it : libs) {
u64 inode = it.first;
_parsed_inodes.insert(inode);

SharedLibrary& lib = it.second;
CodeCache* cc = new CodeCache(lib.file, array->count(), false, lib.map_start, lib.map_end);
CodeCache* cc = new CodeCache(lib.file, array->count(), false, lib.map_start, lib.map_end, lib.image_base);

// Strip " (deleted)" suffix so that removed library can be reopened
size_t len = strlen(lib.file);
Expand All @@ -812,18 +814,9 @@ void Symbols::parseLibraries(CodeCacheArray* array, bool kernel_symbols) {
// Parse debug symbols first
ElfParser::parseFile(cc, lib.image_base, lib.file, true);

dlerror(); // reset any error from previous dl function calls

// Protect library from unloading while parsing in-memory ELF program headers.
// Also, dlopen() ensures the library is fully loaded.
// Main executable and ld-linux interpreter cannot be dlopen'ed, but dlerror() returns NULL for them on some systems.
void* handle = dlopen(lib.file, RTLD_LAZY | RTLD_NOLOAD);

void* handle;
// Parse main executable and the loader (ld.so) regardless of dlopen result, since they cannot be unloaded.
bool is_main_exe = main_phdr >= lib.image_base && main_phdr < lib.map_end;
bool is_loader = ld_base == lib.image_base;

if (handle != NULL || is_main_exe || is_loader) {
if (Symbols::isSafeToPatch(cc, &handle)) {
ElfParser::parseProgramHeaders(cc, lib.image_base, lib.map_end, OS::isMusl());
}

Expand Down
7 changes: 6 additions & 1 deletion src/symbols_macos.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -6,14 +6,17 @@
#ifdef __APPLE__

#include <unordered_set>
#include <dlfcn.h>
#include <string.h>
#include <mach-o/dyld.h>
#include <mach-o/loader.h>
#include <mach-o/nlist.h>
#include "symbols.h"
#include "log.h"

Symbols::Symbols() {
_ld_base = NULL;
_main_phdr = NULL;
}

class MachOParser {
private:
Expand Down Expand Up @@ -127,6 +130,8 @@ class MachOParser {
Mutex Symbols::_parse_lock;
bool Symbols::_have_kernel_symbols = false;
bool Symbols::_libs_limit_reported = false;
const void* Symbols::_main_phdr = NULL;
const char* Symbols::_ld_base = NULL;
static std::unordered_set<const void*> _parsed_libraries;

void Symbols::parseKernelSymbols(CodeCache* cc) {
Expand Down
Loading