Python crash caused by memory access violation (0xc0000005) in Rust backend of safetensors, failing to hand off a memory pointer to PyTorch #693

@Ark-kun

System Info

The PyTorch team has pointed to the Rust part of the safetensors library as the root cause of the crash: pytorch/pytorch#145864 (comment)

Please investigate the issue. If you conclude that it does not come from safetensors, then please respond to the PyTorch team.

safetensors version: 0.6.2

Information

  • The official example scripts
  • My own modified scripts

Reproduction

Use ComfyUI with Flux or Qwen image models.
The Python process simply crashes.
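
To help narrow this down, it may be worth ruling out a truncated or corrupted model file before suspecting the loader. The sketch below uses only the standard library and relies on the published safetensors layout (an 8-byte little-endian header length, a JSON header, then the tensor data, with each tensor's `data_offsets` relative to the start of the data section). The function name `check_safetensors_layout` is made up for this illustration and is not part of safetensors. A clean result does not prove the file is fine, and a failure does not prove it caused this crash.

```python
import json
import struct


def check_safetensors_layout(path):
    """Return a list of layout problems found in a .safetensors file.

    Hypothetical helper for triage only; it never touches the tensor bytes.
    """
    with open(path, "rb") as f:
        f.seek(0, 2)                      # jump to the end to learn the file size
        file_size = f.tell()
        f.seek(0)
        prefix = f.read(8)
        if len(prefix) != 8:
            return ["file is shorter than the 8-byte length prefix"]
        (header_len,) = struct.unpack("<Q", prefix)   # little-endian u64
        if 8 + header_len > file_size:
            return [f"header length {header_len} exceeds file size {file_size}"]
        try:
            header = json.loads(f.read(header_len))
        except ValueError as exc:         # covers bad UTF-8 and bad JSON
            return [f"header is not valid JSON: {exc}"]

    if not isinstance(header, dict):
        return ["header is not a JSON object"]

    problems = []
    data_size = file_size - 8 - header_len
    for name, info in header.items():
        if name == "__metadata__":        # optional free-form metadata entry
            continue
        begin, end = info["data_offsets"]
        if not (0 <= begin <= end <= data_size):
            problems.append(
                f"{name}: offsets [{begin}, {end}) fall outside "
                f"the {data_size}-byte data section"
            )
    return problems
```

Calling `check_safetensors_layout(...)` on the Flux or Qwen checkpoint that ComfyUI loads should return an empty list if the header and offsets are consistent with the file size.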

Example stack trace:

EXCEPTION_RECORD:  (.exr -1)
ExceptionAddress: 00007ff9465e3f94 (torch_cpu!at::native::_rowwise_prune+0x0000000000000314)
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 2
   Parameter[0]: 0000000000000000
   Parameter[1]: 0000000000024840
Attempt to read from address 0000000000024840

PROCESS_NAME:  python.exe

READ_ADDRESS:  0000000000024840 

ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%p referenced memory at 0x%p. The memory could not be %s.

EXCEPTION_CODE_STR:  c0000005

EXCEPTION_PARAMETER1:  0000000000000000

EXCEPTION_PARAMETER2:  0000000000024840

STACK_TEXT:  
00000069`bc9ed460 00007ff9`465e487d     : 00000008`00008001 00000069`bc9ed680 00000069`bc9eda40 00000069`bc9ed6e0 : torch_cpu!at::native::_rowwise_prune+0x314
00000069`bc9ed520 00007ff9`470ebc0e     : 00000069`bc9ed680 00007ff9`49220000 00000000`00000000 00000000`00000000 : torch_cpu!at::native::_local_scalar_dense_cpu+0xbd
00000069`bc9ed580 00007ff9`470e1871     : 00000266`8fc7a430 00000069`bc9eda40 00000069`bc9ed680 00000069`bc9ed6e0 : torch_cpu!at::cpu::vdot+0x6ee
00000069`bc9ed5b0 00007ff9`46e190e2     : 00000069`00000000 00000000`00008001 00000000`00000000 00000000`00000000 : torch_cpu!at::cpu::bitwise_xor_outf+0xe561
00000069`bc9ed5e0 00007ff9`487d920a     : 00000069`bc9eda80 00000020`00008001 00000069`bc9ed740 00000000`04000000 : torch_cpu!at::_ops::_local_scalar_dense::redispatch+0x82
00000069`bc9ed650 00007ff9`48812354     : 00000069`bc9ed9d0 00000069`bc9ed9d0 00000069`bc9ed9d0 00000069`bc9ed740 : torch_cpu!torch::autograd::UndefinedGradBackward::apply_with_saved+0x1c7c3a
00000069`bc9ed6b0 00007ff9`46d77eef     : 00000020`00008001 00000069`bc9eda40 00000069`bc9ed9d0 00000269`7a1c2180 : torch_cpu!torch::autograd::UndefinedGradBackward::apply_with_saved+0x200d84
00000069`bc9ed6e0 00007ff9`465e4a2f     : 00000000`00000001 00000069`bc9ed9d0 00000069`bc9ed9d0 00000028`00008001 : torch_cpu!at::_ops::_local_scalar_dense::call+0xef
00000069`bc9ed7d0 00007ff9`474534de     : 00000069`bc9ed858 000067fb`8e7a3a22 00000069`bc9ed9d0 00000000`00000000 : torch_cpu!at::native::item+0x17f
00000069`bc9ed860 00007ff9`474335c1     : 00000266`8fc71ed0 00000069`bc9eda40 00000069`bc9ed9d0 00000069`bc9ed920 : torch_cpu!at::compositeimplicitautograd::where+0x3b0e
00000069`bc9ed890 00007ff9`46c321bf     : 00000020`00008001 00000069`bc9eda40 00000069`bc9ed9d0 00000269`7a1c20b0 : torch_cpu!at::compositeimplicitautograd::broadcast_to_symint+0x33de1
00000069`bc9ed8c0 00007ff9`47878b33     : 00000267`1e033ad8 00000069`bc9eda40 00000069`bc9eda30 00007ff9`460b2597 : torch_cpu!at::_ops::item::call+0xef
00000069`bc9ed9b0 00007ff9`f4ae7607     : 00000069`bc9eda30 00000000`00000000 00000267`1cafcd60 00000266`da1e81c0 : torch_cpu!at::Tensor::item<unsigned char>+0x13
00000069`bc9eda00 00007ff9`f466187d     : 00000000`00000010 00000267`1cafcd60 00000000`00000000 00000269`7a066b70 : torch_python!THPPointer<PyCodeObject>::none+0x287
00000069`bc9eda70 00007ffb`b87ff832     : 00000266`955ab970 00000266`94492e10 00000266`ddbeecb0 00000266`94492e10 : torch_python!isMainPyInterpreter+0x3afd
00000069`bc9ede30 00007ffb`b87b337c     : 00007ffb`b87ff800 00000267`1d0513c0 00000266`dd9265d0 00000000`0000003f : python312!PyErr_SetString+0x8e
00000069`bc9ede60 00007ffb`b8773972     : 00007ff9`f51d4df0 00007ffb`b8d396f0 00000266`90220c98 00000266`956404e2 : python312!PyCodec_DecodeText+0xe0
00000069`bc9ede90 00007ffb`b872e192     : 00000267`1dc1a900 00000069`bc9edfe0 00000267`1d0513c0 00000000`00000002 : python312!PyObject_Call+0xb6
00000069`bc9edee0 00007ffb`b872874c     : 00000267`05a094e0 00000266`956684a0 00000266`90220c18 00000069`bc9ee1a0 : python312!PyEval_EvalFrameDefault+0x4352
00000069`bc9ee110 00007ffb`b877bea6     : 00000069`bc9ee1a0 00000266`956684a0 00000267`05a094e0 00007ffb`b8d2b998 : python312!PyFunction_Vectorcall+0x17c
00000069`bc9ee170 00007ffb`b877a29f     : 00000000`00000000 00000267`1e033ac0 00000267`00000001 00000000`00000001 : python312!PyUnicode_InternFromString+0x1de
00000069`bc9ee1c0 00007ff9`f4b30601     : 00000069`bc9ee440 00000000`00000001 00000267`1e0304a0 00000267`10245850 : python312!PySequence_GetItem+0x47
00000069`bc9ee1f0 00007ff9`f4b31afa     : 00007ffb`b8d3d4a0 00000000`0000ff00 00000000`0000ff00 00000267`1e033ac0 : torch_python!torch::utils::getTHPMemoryFormat+0x2881
00000069`bc9ee370 00007ff9`f4b2fb7c     : 00000069`bc9ee6e0 00000000`0006ff00 00000000`00000000 00000000`00000001 : torch_python!torch::utils::getTHPMemoryFormat+0x3d7a
00000069`bc9ee660 00007ff9`f46ef1a5     : 00000069`bc9ee8f8 00000000`0000ff00 00000000`00000100 00000000`00000000 : torch_python!torch::utils::getTHPMemoryFormat+0x1dfc
00000069`bc9ee880 00007ffb`b8773d08     : 00000000`00000000 00000000`00000ee3 00000000`00000fff 00000266`9470d090 : torch_python!torch::autograd::registerFunctionPreHook+0x1ba25
00000069`bc9eeab0 00007ffb`b8773972     : 00000266`da3d0100 00000000`00000000 00000266`95488040 00000000`0000003f : python312!PyObject_Call+0x44c
00000069`bc9eeae0 00007ffa`a8e30eab     : 00000267`1dc1a7c0 00000069`bc9eebb0 00000069`bc9eedf0 80608a38`e55dbee3 : python312!PyObject_Call+0xb6
00000069`bc9eeb30 00007ffa`a8e2fbba     : 00000069`bc9eec00 00000069`bc9eedf0 00000266`95488040 00000069`bc9eedf0 : _safetensors_rust+0x20eab
00000069`bc9eebd0 00007ffa`a8e219b7     : 00000000`00290004 00000000`00000064 00000269`b29ecd40 00000269`7a1c6970 : _safetensors_rust+0x1fbba
00000069`bc9eec40 00007ffa`a8e29c2f     : 00000269`7a1c6970 00000000`00000308 00000267`10351550 00007ffc`7ae347b1 : _safetensors_rust+0x119b7
00000069`bc9eef40 00007ffa`a8e2afc8     : 00000266`9d4c8b3c 00000266`90220c00 00000069`bc9ef1b0 00007ffa`a8e29e8d : _safetensors_rust+0x19c2f
00000069`bc9ef110 00007ffa`a8e1b752     : 00000266`a3bc1420 00000266`90220c08 00000267`1cff8070 00007ffb`b871d415 : _safetensors_rust+0x1afc8
00000069`bc9ef260 00007ffb`b87a65cb     : 00000000`000000ab 00007ffa`a8e2b263 00000000`00000057 00000000`00000001 : _safetensors_rust+0xb752
00000069`bc9ef2e0 00007ffb`b8729290     : 00000000`000000ab 00000069`bc9ef4f0 00007ffb`b8722048 80000000`00000002 : python312!PyType_Modified+0x863
00000069`bc9ef330 00007ffb`b87291f5     : 00000000`000000ab 00007ffb`b871fb4f 00000069`bc9ef4f0 00000266`9d57ad40 : python312!PyObject_Vectorcall+0xd0
00000069`bc9ef3b0 00007ffb`b872a74f     : 00000266`9d57acf0 00000267`0681be00 00000069`bc9ef4f0 00000266`9d4c8b26 : python312!PyObject_Vectorcall+0x35
00000069`bc9ef3f0 00007ffb`b872874c     : 00000267`05a094e0 00000267`069deb60 00000266`90220188 00000267`06ff30c0 : python312!PyEval_EvalFrameDefault+0x90f
00000069`bc9ef620 00007ffb`b8723b14     : ffffffff`ffffffff 00000000`00000000 00000267`05a094e0 00000267`069deb60 : python312!PyFunction_Vectorcall+0x17c
00000069`bc9ef680 00007ffb`b8773a37     : 00007ffb`b87239ac 00000069`bc9ef8d0 00000267`06acd6c0 00000000`00000000 : python312!PyArg_CheckPositional+0x65c
00000069`bc9ef740 00007ffb`b877392b     : 00000000`00000000 00000069`bc9ef8d0 00000267`06acd6c0 00000267`05a094e0 : python312!PyObject_Call+0x17b
00000069`bc9ef780 00007ffb`b872e192     : 00000267`06ff9c80 00000069`bc9ef8d0 00000267`06acd6c0 00000000`00000001 : python312!PyObject_Call+0x6f
00000069`bc9ef7d0 00007ffb`b872874c     : 00000267`05a094e0 00000266`dab27ce0 00000266`90220020 00000069`bc9efa90 : python312!PyEval_EvalFrameDefault+0x4352
00000069`bc9efa00 00007ffb`b8723b6f     : 00000000`00000000 00000000`00000000 00000267`05a094e0 00000266`dab27ce0 : python312!PyFunction_Vectorcall+0x17c
00000069`bc9efa60 00007ffb`b87739e1     : 00000267`05a094e0 00007ffb`b87b132a 00000000`00000000 00007ffb`b8d3d610 : python312!PyArg_CheckPositional+0x6b7
00000069`bc9efb20 00007ffb`b877392b     : 00000000`00000000 00000000`00000000 00000267`06ffaa00 00000267`05a094e0 : python312!PyObject_Call+0x125
00000069`bc9efb60 00007ffb`b873f9f4     : 00000266`a44b5c60 00000000`00000000 00007ffb`b873f9ac 00000000`00000000 : python312!PyObject_Call+0x6f
00000069`bc9efbb0 00007ffb`b873f85a     : 00000266`a4dc9fa0 00000266`a4dc9fa0 00000000`00000000 00000000`00000000 : python312!PyThreadState_Bind+0x11c
00000069`bc9efbe0 00007ffc`789a1bb2     : 00000266`a44b5ab0 00000000`00000000 00000000`00000000 00000000`00000000 : python312!PyThreadState_Clear+0x23a
00000069`bc9efc10 00007ffc`78e47374     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ucrtbase!thread_start<unsigned int (__cdecl*)(void *),1>+0x42
00000069`bc9efc40 00007ffc`7ae5cc91     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : kernel32!BaseThreadInitThunk+0x14
00000069`bc9efc70 00000000`00000000     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x21


IP_IN_PAGED_CODE: 
torch_cpu!at::native::_rowwise_prune+314
00007ff9`465e3f94 0fb610          movzx   edx,byte ptr [rax]

SYMBOL_NAME:  torch_cpu+6013f94

MODULE_NAME: torch_cpu

IMAGE_NAME:  torch_cpu.dll

STACK_COMMAND: ~58s; .ecxr ; kb

FAILURE_BUCKET_ID:  INVALID_POINTER_READ_c0000005_torch_cpu.dll!Unknown

OS_VERSION:  10.0.19041.1

BUILDLAB_STR:  vb_release

OSPLATFORM_TYPE:  x64

OSNAME:  Windows 10
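
The dump above only shows native frames. A Python-level traceback for the crashing thread could help identify which call into safetensors and PyTorch is active at the time of the access violation. One way to get it is the standard library's `faulthandler` module, which on Windows also installs a handler for fatal Windows exceptions. This is a minimal sketch; the helper name and the log file name are placeholders chosen for this example.

```python
import faulthandler
import os
import tempfile


def enable_crash_traceback(log_path):
    """Ask Python to dump the tracebacks of all threads to log_path on a fatal error.

    Returns the open file object; it must stay open for as long as the
    handler is enabled, because faulthandler writes to its file descriptor.
    """
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


# Placeholder location for the log; any writable path works.
DEFAULT_LOG = os.path.join(tempfile.gettempdir(), "python_crash_traceback.log")
```

Calling `enable_crash_traceback(DEFAULT_LOG)` at the very start of the process, before any model is loaded, should leave a traceback in the log when the crash occurs. The same handler can be enabled without code changes by starting Python with `-X faulthandler` or by setting the `PYTHONFAULTHANDLER` environment variable, in which case the traceback goes to stderr.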

Expected behavior

I expect PyTorch/safetensors not to crash the Python process with memory access violation errors.