MINOR Capture heap dump after OOM on CI #19031

Open
wants to merge 43 commits into
base: trunk
43 commits
4baadb0
ignore the GC pause thing for now
mumrah Feb 25, 2025
d1e4af0
set the jvm arg in the right place
mumrah Feb 26, 2025
641c0df
increase worker heap size
mumrah Feb 26, 2025
76147b9
reduce heap to try to get a GC pause
mumrah Feb 26, 2025
95240c1
speed up testing a bit for now
mumrah Feb 26, 2025
a22eb2a
disable caching
mumrah Feb 26, 2025
f4121f1
disable cache
mumrah Feb 26, 2025
d6e1e56
increase JUnit heap
mumrah Feb 26, 2025
fe16ad0
empty
mumrah Feb 26, 2025
75dfd65
only run flaky 17
mumrah Feb 26, 2025
ce1d399
really disable the build cache
mumrah Feb 26, 2025
59f5244
try getting a heap dump
mumrah Feb 26, 2025
3c3bbdb
empty
mumrah Feb 26, 2025
7a056c9
empty
mumrah Feb 27, 2025
3d7b910
empty
mumrah Feb 27, 2025
ac2357b
force an oom
mumrah Feb 27, 2025
5b5669b
checkstyle
mumrah Feb 27, 2025
b1d6a6e
checkstyle
mumrah Feb 27, 2025
9eddadd
always archive heap dump
mumrah Feb 27, 2025
3812530
dont retain heap dumps very long
mumrah Feb 27, 2025
dec43a7
find
mumrah Feb 27, 2025
8f977fc
wip
mumrah Feb 27, 2025
fc49738
remove forced OOM
mumrah Feb 27, 2025
98131ff
Merge remote-tracking branch 'origin/trunk' into tmp-oom-in-flaky-test
mumrah Feb 27, 2025
c86117b
wip
mumrah Feb 27, 2025
d44ea8c
empty
mumrah Feb 27, 2025
32b6815
back to 2g heap
mumrah Feb 27, 2025
46a8dad
add a bunch of runs
mumrah Feb 28, 2025
5d0dceb
fix conflict
mumrah Feb 28, 2025
dfe417d
keep trying
mumrah Feb 28, 2025
ef377e5
still trying
mumrah Feb 28, 2025
dfa1359
lower count
mumrah Feb 28, 2025
b8d1c46
remove ls
mumrah Mar 3, 2025
3d1bf90
empty
mumrah Mar 3, 2025
8f6c75c
empty
mumrah Mar 3, 2025
6829676
empty
mumrah Mar 3, 2025
9dcbe73
revert some things
mumrah Mar 3, 2025
603d6a0
empty
mumrah Mar 3, 2025
b8f66a4
Merge remote-tracking branch 'origin/trunk' into tmp-oom-in-flaky-test
mumrah Mar 3, 2025
ef070f3
increase retention
mumrah Mar 3, 2025
6d64a9b
remove redundant jvm arg
mumrah Mar 7, 2025
a3e0227
Merge branch 'trunk-cached' into tmp-oom-in-flaky-test
mumrah Mar 7, 2025
70fdd70
Merge branch 'trunk-cached' into tmp-oom-in-flaky-test
mumrah Mar 12, 2025
7 changes: 6 additions & 1 deletion .github/actions/run-gradle/action.yml
@@ -86,7 +86,10 @@ runs:
run: |
set +e
./.github/scripts/thread-dump.sh &
-timeout ${TIMEOUT_MINUTES}m ./gradlew --build-cache --continue --no-scan \
+mkdir -p heap-dumps
+HEAP_DUMP_DIR=$(readlink -f heap-dumps)
+timeout ${TIMEOUT_MINUTES}m ./gradlew --continue --no-scan \
+-Dorg.gradle.jvmargs="-Xmx4g -Xss4m -XX:+UseParallelGC -XX:+ExitOnOutOfMemoryError -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$HEAP_DUMP_DIR" \
Review comment (Member): Maybe we can add a comment to gradle.properties reminding us to keep the memory configs consistent?
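Beyond a reminder comment, one option would be a small check that compares the heap settings directly. The following is only a sketch under assumptions: the helper names `extract_xmx` and `same_max_heap` are hypothetical, the `properties_args` value is a made-up stand-in for a gradle.properties entry, and `ci_args` is abridged from the `-Dorg.gradle.jvmargs` line in this diff.

```shell
#!/usr/bin/env bash
set -u

# Print the first -Xmx value found in a string of JVM args (empty if none).
extract_xmx() {
  printf '%s\n' "$1" | grep -oE -- '-Xmx[0-9]+[kKmMgG]?' | head -n 1
}

# Return 0 when both arg strings request the same max heap, 1 otherwise.
# Note: two strings with no -Xmx at all also compare as equal.
same_max_heap() {
  [ "$(extract_xmx "$1")" = "$(extract_xmx "$2")" ]
}

# Hypothetical value standing in for an org.gradle.jvmargs entry in gradle.properties.
properties_args="-Xmx2g -Xss4m -XX:+UseParallelGC"
# Abridged from the -Dorg.gradle.jvmargs string passed in action.yml.
ci_args="-Xmx4g -Xss4m -XX:+UseParallelGC -XX:+ExitOnOutOfMemoryError -XX:+HeapDumpOnOutOfMemoryError"

if same_max_heap "$properties_args" "$ci_args"; then
  echo "max heap settings match"
else
  echo "max heap differs: $(extract_xmx "$properties_args") vs $(extract_xmx "$ci_args")"
fi
```

A check like this only compares `-Xmx`; whether the two configs are meant to be identical or merely reviewed together is a question for the maintainers.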

-PtestLoggingEvents=started,passed,skipped,failed \
-PmaxParallelForks=4 \
-PmaxTestRetries=$TEST_RETRIES -PmaxTestRetryFailures=10 \
@@ -96,9 +99,11 @@ runs:
-Pkafka.test.xml.output.dir=$TEST_XML_OUTPUT_DIR \
-Pkafka.cluster.test.repeat=$TEST_REPEAT \
-Pkafka.test.verbose=$TEST_VERBOSE \
+-PheapDumpPath=$HEAP_DUMP_DIR \
-PcommitId=xxxxxxxxxxxxxxxx \
$TEST_TASK
exitcode="$?"
+find heap-dumps
Review comment (Member): Is this used for debugging? If so, maybe ls would give more useful output?
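For comparison, a minimal sketch of what the two commands print. The temporary directory and the file name `java_pid12345.hprof` are made up for illustration and are not part of the PR. `find` prints paths only and recurses, while `ls -lh` adds sizes and modification times.

```shell
#!/usr/bin/env bash
set -u

# Hypothetical stand-in for the CI workspace; not part of the PR.
workdir=$(mktemp -d)
mkdir -p "$workdir/heap-dumps"
# Fake 1 KiB "heap dump" so both commands have something to list.
head -c 1024 /dev/zero > "$workdir/heap-dumps/java_pid12345.hprof"

list_with_find() {
  # Paths only, recursing into any subdirectories.
  (cd "$1" && find heap-dumps)
}

list_with_ls() {
  # Long listing with human-readable sizes and modification times.
  (cd "$1" && ls -lh heap-dumps)
}

echo "--- find heap-dumps"
list_with_find "$workdir"
echo "--- ls -lh heap-dumps"
list_with_ls "$workdir"
```

One practical difference: with `set +e` in effect, as in the step above, neither command fails the step if the directory is empty, but only the `ls -lh` form shows how large each dump is.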

echo "exitcode=$exitcode" >> $GITHUB_OUTPUT
- name: Archive build scan (${{ inputs.test-task }})
if: always()
11 changes: 11 additions & 0 deletions .github/workflows/build.yml
@@ -260,6 +260,17 @@ jobs:
compression-level: 9
if-no-files-found: ignore

+- name: Archive Heap Dumps
+id: archive-heap-dump
+uses: actions/upload-artifact@v4
+with:
+name: junit-heap-dumps-${{ env.job-variation }}
+path: |
+heap-dumps/*
+compression-level: 9
+retention-days: 5
+if-no-files-found: ignore
+
- name: Parse JUnit tests
env:
GITHUB_WORKSPACE: ${{ github.workspace }}
4 changes: 3 additions & 1 deletion build.gradle
@@ -53,8 +53,10 @@ ext {

buildVersionFileName = "kafka-version.properties"

+heapDumpPath = project.hasProperty('heapDumpPath') ? heapDumpPath : "/tmp"
+heapDumpPathArg = "-XX:HeapDumpPath=$heapDumpPath".toString()
defaultMaxHeapSize = "2g"
-defaultJvmArgs = ["-Xss4m", "-XX:+UseParallelGC"]
+defaultJvmArgs = ["-Xss4m", "-XX:+UseParallelGC", "-XX:+ExitOnOutOfMemoryError", "-XX:+HeapDumpOnOutOfMemoryError", heapDumpPathArg]

// "JEP 403: Strongly Encapsulate JDK Internals" causes some tests to fail when they try
// to access internals (often via mocking libraries). We use `--add-opens` as a workaround