skip setproctitle in task_runner on Mac OS #45124
On some newer versions of Mac OS setproctitle can cause segfault benoitc/gunicorn#3021
Yeah, I've started noticing this causing tests on OSX to fail on occasion, but hadn't noticed any runtime issues. Makes sense though.
There are probably a few other cases where we call setproctitle (in the DAG parser code I just landed inside airflow/dag_processor/; could you update those too?)
else:
    from setproctitle import setproctitle
    setproctitle("airflow scheduler -- DagFileProcessorManager")
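A minimal sketch of the kind of guard this change describes, assuming the platform check is done with `sys.platform`. The helper name `set_process_title_safely` and its `platform` parameter are hypothetical and are not taken from the Airflow code; the exact condition used in the PR is not shown in this conversation.

```python
import sys


def set_process_title_safely(title: str, platform: str = sys.platform) -> bool:
    """Set the process title unless running on macOS.

    Returns True if the title was set, False if it was skipped.
    Hypothetical helper for illustration only.
    """
    if platform == "darwin":
        # Skip on macOS: calling setproctitle in a forked child has been
        # reported to segfault there (see the discussion in this PR).
        return False
    try:
        # Imported lazily, mirroring the diff above.
        from setproctitle import setproctitle
    except ImportError:
        # setproctitle is an optional third-party package.
        return False
    setproctitle(title)
    return True
```

Routing every call site through one helper like this would keep the macOS check in a single place, which is one way to handle the "other cases" mentioned above.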
I am in contact with the setproctitle maintainer through the "Airflow Beach Cleaning" project. I can ask him to comment.
After a short discussion with @dvarrazzo: it's likely that dvarrazzo/py-setproctitle#144 is going to fix it (not released yet).
It would be great, though, to get some more details about those segfaults, @jaketf @ashb, when you see them happening again.
Running pytest task_sdk (locally, not in breeze) would trigger it about 10-25% of the time.
@potiuk setproctitle creates a thread internally at import time on macOS.
@ashb I've run pytest task_sdk (clean clone from the master branch) hundreds of times on both Intel and Apple Silicon Macs (macOS 15.1.1 and 15.0.1 respectively), but it didn't crash for me. Obviously something is different with my setup, but what? Puzzled.
@gershnik That's odd.
I can now reproduce this error almost 100% of the time. M2 MacBook Pro here, on Sonoma 14.7.
@gershnik Native stack trace from Console.app:
VM Region Info: 0x104d60a8e is not in any region. Bytes after previous region: 2703 Bytes before following region: 128370
REGION TYPE START - END [ VSIZE] PRT/MAX SHRMOD REGION DETAIL
MALLOC metadata 104d5c000-104d60000 [ 16K] rw-/rwx SM=COW
---> GAP OF 0x20000 BYTES
__TEXT 104d80000-104d84000 [ 16K] r-x/rwx SM=COW /Users/USER/*/_setproctitle.cpython-312-darwin.so
Application Specific Information:
crashed on child side of fork pre-exec
Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 libsystem_kernel.dylib 0x1931c95d0 __pthread_kill + 8
1 libsystem_pthread.dylib 0x193201c20 pthread_kill + 288
2 libsystem_c.dylib 0x1930d81e0 raise + 32
3 libpython3.12.dylib 0x1068f0c68 faulthandler_fatal_error + 384
4 libsystem_platform.dylib 0x193232584 _sigtramp + 56
5 libsystem_trace.dylib 0x192f4f0a4 _os_log_preferences_refresh + 40
6 libsystem_trace.dylib 0x192f4fb20 os_log_type_enabled + 712
7 CoreFoundation 0x1932cbab8 _CFBundleCopyLoadedImagePathForPointer + 84
8 CoreFoundation 0x1933898b0 _CFBundleGetBundleWithIdentifier + 164
9 _setproctitle.cpython-312-darwin.so 0x104d82ffc darwin_set_process_title + 84
10 _setproctitle.cpython-312-darwin.so 0x104d838b8 init_ps_display + 208
11 _setproctitle.cpython-312-darwin.so 0x104d8359c spt_setup + 400
12 _setproctitle.cpython-312-darwin.so 0x104d83254 spt_getproctitle + 16
13 libpython3.12.dylib 0x1061f5170 cfunction_vectorcall_NOARGS.llvm.1866380503741956643 + 104
14 libpython3.12.dylib 0x105fe5dfc _PyEval_EvalFrameDefault + 156008
15 libpython3.12.dylib 0x10608b3a0 PyEval_EvalCode + 220
16 libpython3.12.dylib 0x1062148d4 builtin_exec + 396
17 libpython3.12.dylib 0x1061f50b4 cfunction_vectorcall_FASTCALL_KEYWORDS.llvm.1866380503741956643 + 92
18 libpython3.12.dylib 0x105fe9a10 _PyEval_EvalFrameDefault + 171388
19 libpython3.12.dylib 0x105f5c928 _PyObject_VectorcallTstate.llvm.2292412377633951376 + 84
20 libpython3.12.dylib 0x105fa3cbc object_vacall.llvm.2292412377633951376 + 240
21 libpython3.12.dylib 0x105fa3518 PyObject_CallMethodObjArgs + 108
22 libpython3.12.dylib 0x105f6353c PyImport_ImportModuleLevelObject + 3100
23 libpython3.12.dylib 0x105fdcf04 _PyEval_EvalFrameDefault + 119408
24 libpython3.12.dylib 0x1061dd854 method_vectorcall.llvm.12955693216709424543 + 296
25 libpython3.12.dylib 0x105fea118 _PyEval_EvalFrameDefault + 173188
26 libpython3.12.dylib 0x105ffb244 _PyObject_Call_Prepend + 296
27 libpython3.12.dylib 0x105ffac48 slot_tp_call + 116
28 libpython3.12.dylib 0x105fe615c _PyEval_EvalFrameDefault + 156872
29 libpython3.12.dylib 0x105ffb244 _PyObject_Call_Prepend + 296
30 libpython3.12.dylib 0x105ffac48 slot_tp_call + 116
31 libpython3.12.dylib 0x105fe9b48 _PyEval_EvalFrameDefault + 171700
32 libpython3.12.dylib 0x105ffb244 _PyObject_Call_Prepend + 296
33 libpython3.12.dylib 0x105ffac48 slot_tp_call + 116
34 libpython3.12.dylib 0x105fe615c _PyEval_EvalFrameDefault + 156872
35 libpython3.12.dylib 0x105ffb244 _PyObject_Call_Prepend + 296
36 libpython3.12.dylib 0x105ffac48 slot_tp_call + 116
37 libpython3.12.dylib 0x105fe615c _PyEval_EvalFrameDefault + 156872
38 libpython3.12.dylib 0x105ffb244 _PyObject_Call_Prepend + 296
39 libpython3.12.dylib 0x105ffac48 slot_tp_call + 116
40 libpython3.12.dylib 0x105fe615c _PyEval_EvalFrameDefault + 156872
41 libpython3.12.dylib 0x10608b3a0 PyEval_EvalCode + 220
42 libpython3.12.dylib 0x10608b1f4 run_mod.llvm.6674925059613253997 + 284
43 libpython3.12.dylib 0x106106730 pyrun_file + 156
44 libpython3.12.dylib 0x106105e70 _PyRun_SimpleFileObject + 268
45 libpython3.12.dylib 0x1061000fc _PyRun_AnyFileObject + 80
46 libpython3.12.dylib 0x1060fedd4 pymain_run_file_obj + 164
47 libpython3.12.dylib 0x1060fe438 pymain_run_file + 72
48 libpython3.12.dylib 0x1060fc774 Py_RunMain + 1124
49 libpython3.12.dylib 0x1060dcf7c pymain_main + 456
50 libpython3.12.dylib 0x1060dcda8 Py_BytesMain + 40
51 dyld 0x192e77154 start + 2476
and
-----------
Full Report
-----------
{"app_name":"python3.12","timestamp":"2025-01-07 15:12:30.00 +0000","app_version":"","slice_uuid":"4c4c44ea-5555-3144-a121-b75170b036a4","build_version":"","platform":1,"share_with_app_devs":0,"is_first_party":1,"bug_type":"309","os_version":"macOS 14.7 (23H124)","roots_installed":0,"incident_id":"38B0B6A5-AC4B-438D-9ED5-DDD274E676C1","name":"python3.12"}
{
"uptime" : 510000,
"procRole" : "Unspecified",
"version" : 2,
"userID" : 501,
"deployVersion" : 210,
"modelCode" : "Mac14,5",
"coalitionID" : 745,
"osVersion" : {
"train" : "macOS 14.7",
"build" : "23H124",
"releaseType" : "User"
},
"captureTime" : "2025-01-07 15:12:30.1013 +0000",
"codeSigningMonitor" : 1,
"incident" : "38B0B6A5-AC4B-438D-9ED5-DDD274E676C1",
"pid" : 51654,
"translated" : false,
"cpuType" : "ARM-64",
"roots_installed" : 0,
"bug_type" : "309",
"procLaunch" : "2025-01-07 15:12:30.0878 +0000",
"procStartAbsTime" : 12276109475883,
"procExitAbsTime" : 12276109794958,
"procName" : "python3.12",
"procPath" : "\/Users\/USER\/*\/python3.12",
"parentProc" : "python",
"parentPid" : 51646,
"coalitionName" : "com.github.wez.wezterm",
"crashReporterKey" : "5BB32C2E-55C1-DCA5-AE8B-3531EFE6FBEE",
"responsiblePid" : 804,
"responsibleProc" : "wezterm-gui",
"codeSigningID" : "-",
"codeSigningTeamID" : "",
"codeSigningFlags" : 570556961,
"codeSigningValidationCategory" : 10,
"codeSigningTrustLevel" : 4294967295,
"instructionByteStream" : {"beforePC":"fyMD1f17v6n9AwCRd+D\/l78DAJH9e8Go\/w9f1sADX9YQKYDSARAA1A==","atPC":"AwEAVH8jA9X9e7+p\/QMAkWzg\/5e\/AwCR\/XvBqP8PX9bAA1\/WcAqA0g=="},
"wakeTime" : 1751,
"sleepWakeUUID" : "92850A44-01E1-4036-91AC-9437DCBBCA46",
"sip" : "enabled",
"vmRegionInfo" : "0x104d60a8e is not in any region. Bytes after previous region: 2703 Bytes before following region: 128370\n REGION TYPE START - END [ VSIZE] PRT\/MAX SHRMOD REGION DETAIL\n MALLOC metadata 104d5c000-104d60000 [ 16K] rw-\/rwx SM=COW \n---> GAP OF 0x20000 BYTES\n __TEXT 104d80000-104d84000 [ 16K] r-x\/rwx SM=COW \/Users\/USER\/*\/_setproctitle.cpython-312-darwin.so",
"exception" : {"codes":"0x0000000000000001, 0x0000000104d60a8e","rawCodes":[1,4376103566],"type":"EXC_BAD_ACCESS","signal":"SIGSEGV","subtype":"KERN_INVALID_ADDRESS at 0x0000000104d60a8e"},
"termination" : {"flags":0,"code":11,"namespace":"SIGNAL","indicator":"Segmentation fault: 11","byProc":"python3.12","byPid":51654},
"vmregioninfo" : "0x104d60a8e is not in any region. Bytes after previous region: 2703 Bytes before following region: 128370\n REGION TYPE START - END [ VSIZE] PRT\/MAX SHRMOD REGION DETAIL\n MALLOC metadata 104d5c000-104d60000 [ 16K] rw-\/rwx SM=COW \n---> GAP OF 0x20000 BYTES\n __TEXT 104d80000-104d84000 [ 16K] r-x\/rwx SM=COW \/Users\/USER\/*\/_setproctitle.cpython-312-darwin.so",
"asi" : {"libsystem_c.dylib":["crashed on child side of fork pre-exec"]},
"extMods" : {"caller":{"thread_create":0,"thread_set_state":0,"task_for_pid":0},"system":{"thread_create":0,"thread_set_state":0,"task_for_pid":0},"targeted":{"thread_create":0,"thread_set_state":0,"task_for_pid":0},"warnings":0},
"faultingThread" : 0,
@ashb Thank you, this is super helpful! (I am still unable to reproduce this even once - just tried it again after updating macOS)
With regards to threads, setproctitle doesn't by itself create any threads. Apple frameworks do so internally for their XPC with Launch Services but those sit dormant unless functionality that is using them is invoked. Also see below.
The crash happens in a child process, post-fork, on the main thread, early during setproctitle initialization, in the CFBundleGetBundleWithIdentifier call. setproctitle calls it with a static argument [1]:
CFBundleGetBundleWithIdentifier(CFSTR("com.apple.LaunchServices"))
so it references no caller-supplied memory that can become invalid somehow. Thus, it is the internal memory of CoreFoundation that is somehow corrupted at the time of this call. In other words the crash is "impossible" unless CoreFoundation itself is in a broken state.
Also note that any threads Apple might create haven't been started yet - this call happens long before such functionality is invoked.
All of this, combined with the fact that the crash is very non-deterministic suggests that setproctitle is a victim here of something (potentially itself) using Apple APIs on another thread in parallel with fork.
So the question is whether this is what is going on. Are there any calls to setproctitle (including importing it) or to any other Apple-using library in the parent process that can happen in parallel with fork?
[Update]
@potiuk - just realized that your comment indicates that this is actually a known issue that has other manifestations, correct?
Footnotes
[1] CFSTR is an Apple macro that produces a statically allocated CFString constant.
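To probe the "crashed on child side of fork pre-exec" behaviour in isolation, a small harness along these lines may help. It is only a sketch under stated assumptions: `run_in_fork` is a hypothetical helper, not part of Airflow or setproctitle, and it assumes a POSIX system where `os.fork` is available. It forks, runs a callable in the child, and reports whether the child exited normally or was killed by a signal.

```python
import os
import signal


def run_in_fork(fn):
    """Run fn() in a forked child and report how the child ended.

    Returns ("exit", code) for a normal exit, or ("signal", name) if the
    child was terminated by a signal such as SIGSEGV.
    Hypothetical helper for illustration only.
    """
    pid = os.fork()
    if pid == 0:
        # Child: run the callable, then leave without running parent cleanup.
        try:
            fn()
        except BaseException:
            os._exit(1)
        os._exit(0)
    # Parent: wait for the child and decode its wait status.
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return ("signal", signal.Signals(os.WTERMSIG(status)).name)
    return ("exit", os.WEXITSTATUS(status))
```

Passing a callable that imports setproctitle and calls `getproctitle()` (the entry point visible in the stack trace) and looping over `run_in_fork` would give a rough crash rate on an affected machine. Whether it reproduces at all depends on the setup, as the comments above show.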
So this one is slightly odd. It's definitely triggerable 100% of the time for me.
According to the Python threading module, there is only a single thread at the point of calling os.fork.
The odd thing here is that if I import setproctitle eagerly before the fork, then the SIGSEGV goes away, but Py 3.12 now starts complaining: "This process (pid=96305) is multi-threaded, use of fork() may lead to deadlocks in the child."
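For reference, a check of the kind described above can be sketched as follows; the function name is hypothetical. One caveat worth keeping in mind: the `threading` module only lists threads that were started or registered through Python, so a native thread created inside a C extension or a system framework would not appear in this list. That may be why the two observations above seem to disagree, though this is an assumption and not something confirmed in this thread.

```python
import threading


def python_visible_thread_names():
    """Return the names of the threads the threading module knows about.

    Native threads created outside Python (for example by a system
    framework) are not included in this list.
    """
    return sorted(t.name for t in threading.enumerate())
```

Calling this immediately before `os.fork()` shows what Python itself considers to be running. Comparing it against an OS-level thread count (for example from a process inspector) would show whether any native threads exist that Python does not track.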
Co-authored-by: Ash Berlin-Taylor <[email protected]>