@QwertyJack
Contributor

@QwertyJack QwertyJack commented Jan 9, 2026

Issue

The server crashes with a segmentation fault when loading models at startup.

Error stack:

#0  0x0000ffff8050b190 in ?? () from /usr/lib64/libjemalloc.so.2
#1  0x0000000000b7e5e0 in destroy_memory_mapping(MemoryMapping*) ()
#2  0x0000000000b7e7f4 in xllm::StateDictFromSafeTensor::~StateDictFromSafeTensor() ()

Root Cause

destroy_memory_mapping() calls free() on memory allocated by std::make_unique (i.e., new). Mixing C and C++ memory management this way is undefined behavior, and here it causes jemalloc to crash with a segmentation fault.

Fix

Change free(mapping) to delete mapping.
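
The mismatch and its fix can be sketched roughly as follows. This is a minimal illustration, not the actual code from state_dict.cpp: the MemoryMapping struct body, the create_memory_mapping() helper, the g_live_mappings counter, and live_after_round_trip() are all hypothetical names introduced here to make the example self-contained. Only destroy_memory_mapping(), MemoryMapping, and the free-to-delete change come from this PR.

```cpp
#include <cassert>
#include <memory>

// Hypothetical counter used only to observe constructor/destructor calls.
int g_live_mappings = 0;

// Hypothetical stand-in; the real MemoryMapping fields are not shown in this PR.
struct MemoryMapping {
  MemoryMapping() { ++g_live_mappings; }
  ~MemoryMapping() { --g_live_mappings; }
};

// Assumed allocation path: std::make_unique allocates with new,
// then release() hands the raw pointer to the caller.
MemoryMapping* create_memory_mapping() {
  return std::make_unique<MemoryMapping>().release();
}

// The fix: pair new with delete. delete runs the destructor and returns
// the memory through operator delete. Calling free(mapping) instead is
// undefined behavior, because the pointer did not come from malloc.
void destroy_memory_mapping(MemoryMapping* mapping) {
  delete mapping;
}

// Allocates and destroys one mapping, then reports how many are still alive.
int live_after_round_trip() {
  MemoryMapping* mapping = create_memory_mapping();
  assert(g_live_mappings == 1);
  destroy_memory_mapping(mapping);
  return g_live_mappings;
}
```

With delete, the destructor runs and the live count returns to zero. With free(), the destructor would never run and the allocator would receive a pointer it may not recognize, which is consistent with the crash inside libjemalloc shown in the stack above.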

Impact

  • ✅ Server starts successfully
  • ✅ Models load correctly (e.g., GLM-4.7-MTP)
  • ✅ Multi-threaded safetensor loading works properly

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

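This change fixes a serious memory management bug: destroy_memory_mapping incorrectly used free() to release memory allocated by std::make_unique (which uses new internally). Changing free(mapping) to delete mapping is entirely correct, and it resolves the server crash caused by mixing C and C++ memory management. The fix is direct and effective; no other issues were found.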

@QwertyJack
Contributor Author

/gemini review

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

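This pull request correctly identifies and resolves a serious memory management bug in state_dict.cpp. Allocating memory with new (via std::make_unique) and then releasing it with free is a classic C++ pitfall that leads to undefined behavior. Changing free to delete is the correct fix, and the change effectively resolves the segmentation fault described in the report.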

@zhang-minchao
Collaborator

Hello, for style consistency, please change the commit message to English (bugfix: xxx), including the PR title.

Collaborator

@RobbieLeung RobbieLeung left a comment

LGTM

@RobbieLeung RobbieLeung changed the title "修复:修正 state_dict.cpp 中的内存管理错误" (Fix: correct the memory management error in state_dict.cpp) to "bugfix: reslove realease error of MemoryMapping." Jan 9, 2026
Issue: destroy_memory_mapping() uses free() to release memory allocated
by new (std::make_unique), causing segmentation fault when loading models.

Fix: Change free(mapping) to delete mapping to use correct memory
deallocation method.

Impact: Fixes server crash during model loading at startup.
@QwertyJack QwertyJack force-pushed the fix/memory-management-bug branch from 5aebda5 to 8ed84a0 Compare January 9, 2026 08:09
@zhang-minchao zhang-minchao self-requested a review January 9, 2026 08:27
@zhang-minchao zhang-minchao changed the title bugfix: reslove realease error of MemoryMapping. bugfix: reslove release error of MemoryMapping. Jan 9, 2026
@liutongxuan liutongxuan merged commit ed5dba4 into jd-opensource:main Jan 9, 2026
9 of 13 checks passed

4 participants