Skip to content

[moe] Add multi-scale AdamH vs Adam isoflop experiment #4610

[moe] Add multi-scale AdamH vs Adam isoflop experiment

[moe] Add multi-scale AdamH vs Adam isoflop experiment #4610

Workflow file for this run

name: Marin - Integration Test
on:
push:
branches:
- main
pull_request:
workflow_dispatch:
jobs:
marin-itest:
if: github.event_name == 'push' || github.event.pull_request.head.repo.full_name == github.repository
runs-on: ubuntu-latest
# uv has to resolve + download large binary wheels (jax/torch). 10 minutes wasn't
# enough for cache misses on GitHub's runners, so give ourselves more runway.
timeout-minutes: 25
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
steps:
- name: Checkout code
uses: actions/checkout@v3
- name: Set up Python 3.12
uses: actions/setup-python@v4
with:
python-version: "3.12"
- name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: "22"
- name: Install uv
uses: astral-sh/setup-uv@v7
with:
enable-cache: true
- name: Install dependencies
# `--no-default-groups` keeps uv from also installing every workspace package's
# dev/docs/test groups (e.g. `levanter[docs,test,dev]`), which saves multiple
# minutes and avoids re-downloading CUDA wheels unnecessarily on cold caches.
run: uv sync --all-packages --extra=cpu --extra=dedup --no-default-groups
- name: Check df -h
run: df -h
- name: Give Ray tmp space
run: sudo mkdir -p /mnt/ray && sudo chmod 777 /mnt/ray
- name: Run the quickstart script
shell: bash -l {0}
# N.B. You _must not_ use `uv run` here, as that triggers weird behavior from Ray
# https://github.com/ray-project/ray/issues/54344
run: .venv/bin/python tests/integration_test.py
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
JAX_TRACEBACK_FILTERING: off
WANDB_MODE: offline
RAY_TMPDIR: /mnt/ray/