Commit fac4cbc

Merge branch 'main' into restore_storage_options
2 parents ee34d7c + d8bf265 commit fac4cbc

113 files changed

Lines changed: 3660 additions & 918 deletions

.claude/skills/memray/SKILL.md

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
1+
---
2+
name: memray
3+
description: Profile the memory usage of a Python script using memray and visualize a temporal flamegraph in the browser. Use when the user wants to investigate memory consumption, find leaks, or understand allocation patterns.
4+
compatibility: Requires the pixi profiling environment (pixi run -e profiling). Supports Linux and macOS.
5+
allowed-tools: Bash(pixi run -e profiling memray-run:*) Bash(pixi run -e profiling memray-flame:*) Bash(open:*) Bash(xdg-open:*) Bash(python -m webbrowser:*)
6+
---
7+
8+
## Steps
9+
10+
1. Ask the user which script to profile (full or relative path).
11+
12+
2. Run the script under memray:
13+
14+
```bash
15+
pixi run -e profiling memray-run script.py
16+
```
17+
18+
This produces a binary file named `memray-script.py.<pid>.bin` in the current directory.
19+
20+
3. Generate the flamegraph HTML report from the `.bin` file:
21+
22+
```bash
23+
pixi run -e profiling memray-flame memray-script.py.<pid>.bin
24+
```
25+
26+
Replace `<pid>` with the actual PID shown in the filename. This writes `memray-flamegraph-script.py.<pid>.html`.
27+
28+
4. Open the report in the browser:
29+
- macOS: `open memray-flamegraph-script.py.<pid>.html`
30+
- Linux: `xdg-open memray-flamegraph-script.py.<pid>.html`
31+
- Either: `python -m webbrowser memray-flamegraph-script.py.<pid>.html`
32+
33+
## Notes
34+
35+
- The `--temporal` flag (included in `memray-flame`) shows memory over time, not just peak — use this to spot leaks and allocation bursts.
36+
- To find the `.bin` file if unsure of the name: `ls memray-*.bin`
37+
- To compare runs, save the previous report: `cp memray-flamegraph-script.py.<pid>.html memray-flamegraph-before.html`
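The file-naming convention used in the steps above can be captured in a small helper. This is a minimal sketch, assuming the default names described in this skill (`memray-<script>.<pid>.bin` for the capture and `memray-flamegraph-<script>.<pid>.html` for the report); the function names `flamegraph_name` and `latest_capture` are hypothetical and are not part of memray or this repository.

```python
from __future__ import annotations

from pathlib import Path


def flamegraph_name(bin_name: str) -> str:
    """Map a capture file name to the report name described in step 3.

    Assumes the `memray-<script>.<pid>.bin` convention; raises otherwise.
    """
    prefix, suffix = "memray-", ".bin"
    if not (bin_name.startswith(prefix) and bin_name.endswith(suffix)):
        raise ValueError(f"not a memray capture file name: {bin_name!r}")
    stem = bin_name[len(prefix):-len(suffix)]
    return f"memray-flamegraph-{stem}.html"


def latest_capture(directory: str = ".") -> Path | None:
    """Return the most recently modified `memray-*.bin` file, or None."""
    captures = sorted(
        Path(directory).glob("memray-*.bin"),
        key=lambda p: p.stat().st_mtime,
    )
    return captures[-1] if captures else None
```

`latest_capture()` plays the same role as `ls memray-*.bin` when the PID is not known in advance.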

.claude/skills/profimp/SKILL.md

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
1+
---
2+
name: profimp
3+
description: Profile Python import time using profimp and open a waterfall HTML report. Use when investigating slow startup or wanting to identify which imports are most expensive.
4+
compatibility: Requires profimp (available as a pixi dependency). macOS or Linux.
5+
allowed-tools: Bash(profimp:*) Bash(open:*) Bash(xdg-open:*) Bash(python -m webbrowser:*)
6+
---
7+
8+
## Steps
9+
10+
1. Ask what to profile. Suggest common patterns for this repo:
11+
- `import spatialdata`
12+
- `from spatialdata import SpatialData`
13+
- `from spatialdata_io import xenium`
14+
15+
2. Run:
16+
17+
```bash
18+
profimp --html "<import_stmt>" > /tmp/profimp.html
19+
```
20+
21+
3. Open the report:
22+
- macOS: `open /tmp/profimp.html`
23+
- Linux: `xdg-open /tmp/profimp.html`
24+
- Either: `python -m webbrowser /tmp/profimp.html`
25+
26+
## Notes
27+
28+
- The report is a waterfall chart showing every sub-import and its timing.
29+
- To compare before/after: `cp /tmp/profimp.html /tmp/profimp-before.html` before re-running.
30+
- `pixi run python -m profimp` also works if `profimp` is not on PATH.
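The per-platform choice in step 3 could be scripted as follows. This is only a sketch: `open_command` is a hypothetical helper name, and it assumes that `platform.system()` identifies the host ("Darwin" for macOS, "Linux" for Linux), falling back to `python -m webbrowser` everywhere else.

```python
from __future__ import annotations

import platform


def open_command(report: str, system: str | None = None) -> list[str]:
    """Return the command from step 3 that opens `report` in a browser."""
    system = system or platform.system()
    if system == "Darwin":  # platform.system() reports macOS as "Darwin"
        return ["open", report]
    if system == "Linux":
        return ["xdg-open", report]
    # Portable fallback listed under "Either" in step 3
    return ["python", "-m", "webbrowser", report]
```

The returned list can be handed to `subprocess.run(..., check=True)`.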

.claude/skills/pyspy/SKILL.md

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
1+
---
2+
name: pyspy
3+
description: Profile the execution time of a Python script using py-spy and visualize the result with speedscope. Use when the user wants to benchmark performance, find slow code paths, or profile CPU time.
4+
compatibility: Requires the pixi profiling environment (pixi run -e profiling). Speedscope must be installed separately (npm install -g speedscope). sudo is required on macOS.
5+
allowed-tools: Bash(pixi run -e profiling pyspy:*) Bash(pixi run -e profiling speedscope:*) Bash(sudo pixi run -e profiling pyspy:*)
6+
---
7+
8+
## Steps
9+
10+
1. Ask the user which script to profile (full or relative path).
11+
12+
2. Run py-spy to record the profile. The output is always written to `profile.speedscope.json` in the current directory.
13+
14+
**Linux** (no sudo needed):
15+
16+
```bash
17+
pixi run -e profiling pyspy script.py
18+
```
19+
20+
**macOS** (sudo required — py-spy needs to attach to the process):
21+
22+
```bash
23+
sudo pixi run -e profiling pyspy script.py
24+
```
25+
26+
If sudo fails to find the pixi environment, use absolute paths:
27+
28+
```bash
29+
sudo /path/to/.pixi/envs/profiling/bin/py-spy record --gil \
30+
-o profile.speedscope.json --format speedscope \
31+
-- /path/to/.pixi/envs/profiling/bin/python script.py
32+
```
33+
34+
3. Open the result in speedscope:
35+
```bash
36+
pixi run -e profiling speedscope
37+
```
38+
This opens `profile.speedscope.json` in the browser via the local speedscope CLI.
39+
40+
## Notes
41+
42+
- If the speedscope view is blank, switch threads using the thread selector in the top-right corner.
43+
- To save a profile before overwriting: `cp profile.speedscope.json profile-before.speedscope.json`
44+
- `--gil` records only time when the GIL is held (Python-level CPU time). Drop it to include C extension time.
45+
- speedscope must be installed globally: `npm install -g speedscope`
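The Linux/macOS split in step 2 can be expressed as a small command builder. A minimal sketch, assuming the `pyspy` pixi task shown above; `pyspy_command` is a hypothetical helper name, not something defined in this repository.

```python
from __future__ import annotations

import platform


def pyspy_command(script: str, system: str | None = None) -> list[str]:
    """Build the py-spy invocation from step 2, prefixing sudo on macOS."""
    system = system or platform.system()
    command = ["pixi", "run", "-e", "profiling", "pyspy", script]
    # platform.system() reports macOS as "Darwin"; py-spy needs sudo there
    if system == "Darwin":
        command = ["sudo", *command]
    return command
```

For example, `subprocess.run(pyspy_command("script.py"), check=True)` would record the profile on either platform; the absolute-path fallback for sudo is not covered by this sketch.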

.github/workflows/release.yaml

Lines changed: 2 additions & 2 deletions
@@ -9,9 +9,9 @@ jobs:
99
runs-on: ubuntu-latest
1010
if: startsWith(github.ref, 'refs/tags/v')
1111
steps:
12-
- uses: actions/checkout@v3
12+
- uses: actions/checkout@v6
1313
- name: Set up Python 3.12
14-
uses: actions/setup-python@v4
14+
uses: actions/setup-python@v6
1515
with:
1616
python-version: "3.12"
1717
cache: pip

.github/workflows/test.yaml

Lines changed: 17 additions & 19 deletions
@@ -13,57 +13,55 @@ jobs:
1313
runs-on: ${{ matrix.os }}
1414
defaults:
1515
run:
16-
shell: bash -e {0}
16+
shell: bash # bash also on windows
1717

1818
strategy:
1919
fail-fast: false
2020
matrix:
2121
include:
22-
- {os: windows-latest, python: "3.11", dask-version: "2025.2.0", name: "Dask 2025.2.0"}
23-
- {os: windows-latest, python: "3.13", dask-version: "latest", name: "Dask latest"}
24-
- {os: ubuntu-latest, python: "3.11", dask-version: "latest", name: "Dask latest"}
25-
- {os: ubuntu-latest, python: "3.13", dask-version: "latest", name: "Dask latest"}
26-
- {os: macos-latest, python: "3.11", dask-version: "latest", name: "Dask latest"}
27-
- {os: macos-latest, python: "3.13", prerelease: "allow", name: "Python 3.13 (pre-release)"}
22+
- {os: windows-latest, python: "3.11", dask-version: "2025.12.0", name: "min dask"}
23+
- {os: windows-latest, python: "3.14", dask-version: "latest"}
24+
- {os: ubuntu-latest, python: "3.11", dask-version: "latest"}
25+
- {os: ubuntu-latest, python: "3.14", dask-version: "latest"}
26+
- {os: macos-latest, python: "3.11", dask-version: "latest"}
27+
- {os: macos-latest, python: "3.14", prerelease: "allow", name: "prerelease"}
2828
env:
2929
OS: ${{ matrix.os }}
3030
PYTHON: ${{ matrix.python }}
3131
DASK_VERSION: ${{ matrix.dask-version }}
3232
PRERELEASE: ${{ matrix.prerelease }}
3333

3434
steps:
35-
- uses: actions/checkout@v2
36-
- uses: astral-sh/setup-uv@v5
35+
- uses: actions/checkout@v6
36+
- uses: astral-sh/setup-uv@v7
3737
id: setup-uv
3838
with:
3939
version: "latest"
4040
python-version: ${{ matrix.python }}
4141
- name: Install dependencies
4242
run: |
4343
if [[ "${PRERELEASE}" == "allow" ]]; then
44-
uv sync --extra test
45-
: # uv sync --extra test --prerelease ${PRERELEASE}
46-
uv pip install git+https://github.com/scverse/anndata.git
47-
uv pip install --prerelease allow pandas
48-
else
49-
uv sync --extra test
44+
sed -i '' 's/requires-python.*//' pyproject.toml # otherwise uv complains that anndata requires python>=3.12 and we only do >=3.11 😱
45+
uv add git+https://github.com/scverse/anndata.git
46+
uv add "pandas>=3.dev0"
5047
fi
5148
if [[ -n "${DASK_VERSION}" ]]; then
5249
if [[ "${DASK_VERSION}" == "latest" ]]; then
53-
uv pip install --upgrade dask
50+
uv add dask
5451
else
55-
uv pip install dask==${DASK_VERSION}
52+
uv add dask==${DASK_VERSION}
5653
fi
5754
fi
55+
uv sync --group=test
5856
- name: Test
5957
env:
6058
MPLBACKEND: agg
6159
PLATFORM: ${{ matrix.os }}
6260
DISPLAY: :42
6361
run: |
64-
uv run pytest --cov --color=yes --cov-report=xml
62+
uv run pytest --cov --color=yes --cov-report=xml -n auto --dist worksteal
6563
- name: Upload coverage to Codecov
66-
uses: codecov/codecov-action@v4
64+
uses: codecov/codecov-action@v5
6765
with:
6866
name: coverage
6967
verbose: true

.gitignore

Lines changed: 11 additions & 2 deletions
@@ -20,6 +20,7 @@ __pycache__/
2020
docs/_build
2121
!docs/api/.md
2222
docs/**/generated
23+
docs/_static/datasets_data.js
2324

2425
# IDEs
2526
/.idea/
@@ -45,10 +46,18 @@ spatialdata-sandbox
4546
# version file
4647
_version.py
4748

48-
# other
49-
node_modules/
49+
# agents configurations
50+
.claude/settings.local.json
5051

52+
# benchmarking and profiling
5153
.asv/
54+
profile.speedscope.json
55+
56+
# other
57+
node_modules/
5258

5359
.mypy_cache
5460
.ruff_cache
61+
uv.lock
62+
pixi.lock
63+

.pre-commit-config.yaml

Lines changed: 3 additions & 3 deletions
@@ -9,18 +9,18 @@ ci:
99
skip: []
1010
repos:
1111
- repo: https://github.com/rbubley/mirrors-prettier
12-
rev: v3.7.4
12+
rev: v3.8.3
1313
hooks:
1414
- id: prettier
1515
exclude: ^.github/workflows/test.yaml
1616
- repo: https://github.com/pre-commit/mirrors-mypy
17-
rev: v1.19.1
17+
rev: v2.0.0
1818
hooks:
1919
- id: mypy
2020
additional_dependencies: [numpy, types-requests]
2121
exclude: tests/|docs/
2222
- repo: https://github.com/astral-sh/ruff-pre-commit
23-
rev: v0.14.10
23+
rev: v0.15.12
2424
hooks:
2525
- id: ruff
2626
args: [--fix, --exit-non-zero-on-fix]

.readthedocs.yaml

Lines changed: 15 additions & 12 deletions
@@ -1,19 +1,22 @@
11
# https://docs.readthedocs.io/en/stable/config-file/v2.html
22
version: 2
33
build:
4-
os: ubuntu-20.04
4+
os: ubuntu-24.04
55
tools:
6-
python: "3.11"
7-
sphinx:
8-
configuration: docs/conf.py
9-
fail_on_warning: true
10-
python:
11-
install:
12-
- method: pip
13-
path: .
14-
extra_requirements:
15-
- docs
16-
- torch
6+
python: "3.13"
7+
jobs:
8+
post_checkout:
9+
# unshallow so version can be derived from tag
10+
- git fetch --unshallow || true
11+
create_environment:
12+
- asdf plugin add uv
13+
- asdf install uv latest
14+
- asdf global uv latest
15+
build:
16+
html:
17+
- uv sync --group=docs --extra=torch
18+
- uv run make --directory=docs html
19+
- mv docs/_build $READTHEDOCS_OUTPUT
1720
submodules:
1821
include:
1922
- "docs/tutorials/notebooks"

benchmarks/README.md

Lines changed: 49 additions & 11 deletions
@@ -14,25 +14,63 @@ pip install -e '.[docs,test,benchmark]'
1414

1515
## Usage
1616

17-
Running all the benchmarks is usually not needed. You run the benchmark using `asv run`. See the [asv documentation](https://asv.readthedocs.io/en/stable/commands.html#asv-run) for interesting arguments, like selecting the benchmarks you're interested in by providing a regex pattern `-b` or `--bench` that links to a function or class method e.g. the option `-b timeraw_import_inspect` selects the function `timeraw_import_inspect` in `benchmarks/spatialdata_benchmark.py`. You can run the benchmark in your current environment with `--python=same`. Some example benchmarks:
17+
Running the full benchmark suite is rarely necessary. Benchmarks are run with `asv run`; see the [asv documentation](https://asv.readthedocs.io/en/stable/commands.html#asv-run) for useful arguments, such as `-b` or `--bench`, which takes a regex pattern that selects benchmarks by function or class-method name. To run the benchmarks in your current environment, pass `--python=same`. Some example benchmarks:
1818

19-
Importing the SpatialData library can take around 4 seconds:
19+
### Import time benchmarks
20+
21+
Import benchmarks live in `benchmarks/benchmark_imports.py`. Each `timeraw_*` function returns a Python code snippet that asv runs in a fresh interpreter (cold import, empty module cache).
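As an illustration of that pattern, a `timeraw_*` benchmark might look like the sketch below. The function name matches the one used in the `-b` example in this section, but the body is an assumption and may differ from what `benchmarks/benchmark_imports.py` actually contains.

```python
def timeraw_import_spatialdata() -> str:
    """Cold-import benchmark (illustrative body, not copied from the repo).

    asv does not time this function itself; it times the returned source
    string, which it executes in a new Python process.
    """
    return "import spatialdata\n"
```

Because the function only returns a string, it stays cheap to discover and does not import `spatialdata` in the asv process.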
22+
23+
Run all import benchmarks in your current environment:
2024

2125
```
22-
PYTHONWARNINGS="ignore" asv run --python=same --show-stderr -b timeraw_import_inspect
23-
Couldn't load asv.plugins._mamba_helpers because
24-
No module named 'conda'
25-
· Discovering benchmarks
26-
· Running 1 total benchmarks (1 commits * 1 environments * 1 benchmarks)
27-
[ 0.00%] ·· Benchmarking existing-py_opt_homebrew_Caskroom_mambaforge_base_envs_spatialdata2_bin_python3.12
28-
[50.00%] ··· Running (spatialdata_benchmark.timeraw_import_inspect--).
29-
[100.00%] ··· spatialdata_benchmark.timeraw_import_inspect 3.65±0.2s
26+
asv run --python=same --show-stderr -b timeraw
27+
```
28+
29+
Or a single one:
30+
31+
```
32+
asv run --python=same --show-stderr -b timeraw_import_spatialdata
33+
```
34+
35+
### Comparing the current branch against `main`
36+
37+
The simplest way is `asv continuous`, which builds both commits, runs the benchmarks, and prints the comparison in one shot:
38+
39+
```bash
40+
asv continuous --show-stderr -v -b timeraw main faster-import
3041
```
3142

43+
Replace `faster-import` with any branch name or commit hash. The `-v` flag prints per-sample timings; drop it for a shorter summary.
44+
45+
Alternatively, collect results separately and compare afterwards:
46+
47+
```bash
48+
# 1. Collect results for the tip of main and the tip of your branch
49+
asv run --show-stderr -b timeraw main
50+
asv run --show-stderr -b timeraw HEAD
51+
52+
# 2. Print a side-by-side comparison
53+
asv compare main HEAD
54+
```
55+
56+
Both approaches build isolated environments from scratch. If you prefer to skip the rebuild and reuse your current environment (faster, less accurate):
57+
58+
```bash
59+
asv run --python=same --show-stderr -b timeraw HEAD
60+
61+
git stash && git checkout main
62+
asv run --python=same --show-stderr -b timeraw HEAD
63+
git checkout - && git stash pop
64+
65+
asv compare main HEAD
66+
```
67+
68+
### Querying benchmarks
69+
3270
Bounding-box queries without a spatial index are slowed far more by a large number of points (transcripts) than by a large number of table rows (cells).
3371

3472
```
35-
$ PYTHONWARNINGS="ignore" asv run --python=same --show-stderr -b time_query_bounding_box
73+
$ asv run --python=same --show-stderr -b time_query_bounding_box
3674
3775
[100.00%] ··· ======== ============ ============= ============= ==============
3876
-- filter_table / n_transcripts_per_cell
