20 changes: 14 additions & 6 deletions .github/workflows/test.yml
@@ -12,7 +12,7 @@ permissions: read-all

jobs:
pre-commit:
runs-on: ubuntu-22.04
runs-on: ubuntu-latest-64-cores
P2 Over-provisioned runner for pre-commit

The pre-commit job runs linters and formatters (tools like black, flake8, isort, etc.), which are inherently single-threaded and typically complete in under a minute. Allocating a 64-core runner here is unlikely to provide any speedup and significantly increases per-minute cost (64-core GitHub-hosted runners cost ~32× more than the standard 2-core runner).

Consider keeping pre-commit on a smaller runner (e.g. ubuntu-latest or ubuntu-22.04) while reserving the larger runner for the heavier Python and R test jobs:

Suggested change
runs-on: ubuntu-latest-64-cores
runs-on: ubuntu-latest
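As a rough sanity check on the ~32× figure, a back-of-envelope sketch (the per-minute rates below are assumptions based on typical GitHub-hosted Linux runner pricing, not values from this PR; verify against current pricing):

```shell
# Assumed per-minute rates, in tenths of a cent so the arithmetic stays
# integral: standard 2-core Linux ~$0.008/min, 64-core Linux ~$0.256/min.
standard_rate=8     # $0.008/min
large_rate=256      # $0.256/min
multiplier=$(( large_rate / standard_rate ))
echo "64-core runner: ${multiplier}x the standard per-minute rate"
```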

steps:
- uses: actions/checkout@8e8c483db84b4bee98b60c0593521ed34d9990e8 # v6.0.1
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
@@ -24,7 +24,7 @@ jobs:
strategy:
fail-fast: false
matrix:
os: [ubuntu-22.04]
os: [ubuntu-latest-64-cores]
ver: ['3.7', '3.8', '3.9', '3.10', '3.11', '3.12', '3.13', '3.14']
include:
- os: macos-latest
@@ -45,6 +45,7 @@ jobs:
uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: ${{ matrix.ver }}
cache: 'pip'
env:
PIP_TRUSTED_HOST: "pypi.python.org pypi.org files.pythonhosted.org"

@@ -62,7 +63,7 @@ jobs:
strategy:
fail-fast: false
matrix:
os: [ubuntu-22.04]
os: [ubuntu-latest-64-cores]
ver: ['4.4.1']

steps:
@@ -72,19 +73,26 @@ jobs:
with:
r-version: ${{ matrix.ver }}

- name: Cache R packages
uses: actions/cache@v4
with:
path: ~/R/library
key: ${{ runner.os }}-r-${{ matrix.ver }}-${{ hashFiles('R/DESCRIPTION') }}
restore-keys: |
${{ runner.os }}-r-${{ matrix.ver }}-

- name: Install R ${{ matrix.ver }} system dependencies
if: matrix.os == 'ubuntu-22.04'
if: startsWith(matrix.os, 'ubuntu')
run: sudo apt-get update; sudo apt-get install -y libcurl4-openssl-dev qpdf libgit2-dev libharfbuzz-dev libfribidi-dev libwebp-dev

- name: Install R ${{ matrix.ver }} Rlang dependencies
run: |
python3 -m venv path/to/venv
source path/to/venv/bin/activate
python3 -m pip install .
Rscript -e 'install.packages(c("devtools", "remotes"), repos="https://cloud.r-project.org", Ncpus=8)'
Rscript -e 'install.packages(c("devtools", "remotes", "data.table", "caret", "glmnet", "Matrix", "rjson"), repos="https://cloud.r-project.org", Ncpus=64)'
Rscript -e 'devtools::install_deps("R", dependencies=TRUE, repos="https://cloud.r-project.org", upgrade="default")'
R CMD INSTALL R
Comment on lines 68 to 86
P2 Advertised R library caching not implemented

The PR description states "R library caching: Cache ~/R/library across runs to avoid recompiling packages", but no actions/cache step (or equivalent) for ~/R/library appears anywhere in the R job. Recompiling all R packages from source on every run is one of the primary bottlenecks in R CI, so the missing cache means the stated optimization was not applied.

Adding a step like the following before the dependency install would realize the advertised benefit:

- name: Cache R packages
  uses: actions/cache@v4
  with:
    path: ~/R/library
    key: ${{ runner.os }}-r-${{ matrix.ver }}-${{ hashFiles('R/DESCRIPTION') }}
    restore-keys: |
      ${{ runner.os }}-r-${{ matrix.ver }}-

Rscript -e 'install.packages(c("data.table", "caret", "glmnet", "Matrix", "rjson"), repos="https://cloud.r-project.org", Ncpus=8)'
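If maintaining the cache key by hand proves brittle, an alternative sketch (not part of this PR) is r-lib's `setup-r-dependencies` action, which installs the packages declared in `DESCRIPTION` and manages its own package cache, replacing the manual `actions/cache` + `install.packages` steps; the `working-directory` value below assumes the package lives under `R/` as in this repo:

```yaml
# Sketch only: installs DESCRIPTION dependencies with built-in caching.
- uses: r-lib/actions/setup-r-dependencies@v2
  with:
    working-directory: R
```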

- name: Execute R tests
run: |
4 changes: 2 additions & 2 deletions test_runner
@@ -16,15 +16,15 @@ install_extensions() {
}

run_tests() {
cd test/core && PYTHONPATH=`pwd`/../../ python3 run_tests.py --num-parallel 8 && cd ../../
cd test/core && PYTHONPATH=`pwd`/../../ python3 run_tests.py --num-parallel 64 && cd ../../
}

# We run the realtime card tests separately because they validate asynchronous updates to the
# information stored in the datastore. If other processes are starving resources, these tests will
# surely fail, since many of their checks have timeouts.
run_runtime_card_tests() {
CARD_GRAPHS="small-foreach,small-parallel,nested-branches,single-linear-step,simple-foreach"
cd test/core && PYTHONPATH=`pwd`/../../ python3 run_tests.py --num-parallel 8 --contexts python3-all-local-cards-realtime --graphs $CARD_GRAPHS && cd ../../
cd test/core && PYTHONPATH=`pwd`/../../ python3 run_tests.py --num-parallel 64 --contexts python3-all-local-cards-realtime --graphs $CARD_GRAPHS && cd ../../
Comment on lines 26 to +27
P2 Timing-sensitive card tests at max parallelism

The comment directly above this function warns that the realtime card tests "will surely fail" if other processes are starving resources, since the tests rely on timeouts for asynchronous datastore validation. While CPU starvation is much less likely with 64 cores, running 64 parallel test processes can still introduce memory pressure, I/O contention, and contention on the local datastore (which handles the async card updates being checked). This may cause exactly the flaky failures the comment warns about.

Consider using a more conservative value for this specific function (e.g. --num-parallel 16 or --num-parallel 32) to leave headroom for I/O and datastore operations:

Suggested change
CARD_GRAPHS="small-foreach,small-parallel,nested-branches,single-linear-step,simple-foreach"
cd test/core && PYTHONPATH=`pwd`/../../ python3 run_tests.py --num-parallel 8 --contexts python3-all-local-cards-realtime --graphs $CARD_GRAPHS && cd ../../
cd test/core && PYTHONPATH=`pwd`/../../ python3 run_tests.py --num-parallel 64 --contexts python3-all-local-cards-realtime --graphs $CARD_GRAPHS && cd ../../
cd test/core && PYTHONPATH=`pwd`/../../ python3 run_tests.py --num-parallel 16 --contexts python3-all-local-cards-realtime --graphs $CARD_GRAPHS && cd ../../
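Rather than hard-coding either value, the parallelism could also be derived from the runner's core count. A minimal sketch, assuming `nproc` is available (GNU coreutils, present on ubuntu runners); the divisor of 4 and the floor of 8 are illustrative choices, not values from this PR:

```shell
# Derive a conservative parallelism level from the available core count,
# leaving headroom for datastore and I/O work.
par_for_cores() {
  par=$(( $1 / 4 ))          # use a quarter of the cores
  [ "$par" -lt 8 ] && par=8  # but never drop below the original value of 8
  echo "$par"
}
par_for_cores "$(nproc)"
```

On a 64-core runner this yields 16, matching the suggested change above, while degrading gracefully on smaller machines.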

}

install_deps && install_extensions && run_tests && run_runtime_card_tests