
Conversation

felix0097
Collaborator

This is a draft of a quickstart tutorial notebook. The idea is to walk the user through all the steps needed to use the package, e.g. from creating the collection to actually using the dataloader, and to explain the important settings along the way.

Let me know what you think and whether I missed anything.

@felix0097 felix0097 requested a review from ilan-gold October 1, 2025 16:54

codecov bot commented Oct 1, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.63%. Comparing base (e42d367) to head (b6b731f).

Additional details and impacted files
@@           Coverage Diff           @@
##             main      #48   +/-   ##
=======================================
  Coverage   83.63%   83.63%           
=======================================
  Files           8        8           
  Lines         605      605           
=======================================
  Hits          506      506           
  Misses         99       99           


@ilan-gold ilan-gold left a comment


Generally looks very good

Comment on lines 45 to 47
" \"threading.max_workers\": 5,\n",
" \"codec_pipeline.path\": \"zarrs.ZarrsCodecPipeline\",\n",
" \"concurrency\": 4,\n",
Collaborator


Either don't set `max_workers` and `concurrency`, or make them dependent on `os.cpu_count()`. I just wouldn't set them, to be honest.
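The `os.cpu_count()`-dependent alternative might look like the sketch below. The config keys are the ones quoted in the diff above; deriving the values from the core count is only the reviewer's suggestion, not a documented default of the package.

```python
import os

# Sketch only: derive the settings from the machine instead of hard-coding them.
n_cpus = os.cpu_count() or 1

settings = {
    "threading.max_workers": max(1, n_cpus - 1),  # leave a core for the main process
    "codec_pipeline.path": "zarrs.ZarrsCodecPipeline",
    "concurrency": n_cpus,
}
```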

Collaborator


Although if you're going to have this big section, maybe you should explain the settings.

Collaborator Author


I've just double-checked this; for me it doesn't give much of a speed increase. I'd be fine with removing it on that basis.

Collaborator


I'm also fine with removing this, yeah.

Comment on lines 21 to 23
"# Download an example dataset from CELLxGENE\n",
"!wget https://datasets.cellxgene.cziscience.com/866d7d5e-436b-4dbd-b7c1-7696487d452e.h5ad"
],
Collaborator


Maybe we should use two datasets? It's easy enough, and it highlights that we can handle the `var` space.

" \"866d7d5e-436b-4dbd-b7c1-7696487d452e.h5ad\",\n",
" ],\n",
" # Path to store the output collection\n",
" output_path=\"tahoe100_FULL\",\n",
Collaborator


A different name?

"metadata": {},
"cell_type": "markdown",
"source": [
"IMPORTANT:\n",
Collaborator


I think "IMPORTANT" should at least be bold

Comment on lines +164 to +168
"* The `ZarrSparseDataset` yields batches of sparse tensors.\n",
"* The conversion to dense tensors should be done on the GPU, as shown in the example below.\n",
" * First call `.cuda()` and then `.to_dense()`\n",
" * E.g. `x = x.cuda().to_dense()`\n",
" * This is significantly faster than doing the dense conversion on the CPU.\n"
Collaborator


Maybe mention `preload_to_gpu` here, i.e., if you have a GPU and can spare some extra memory, you should use `preload_to_gpu`, and then you don't need to call `.cuda()`.

Collaborator Author


I've added the `preload_to_gpu` option. I would leave the `.cuda()` call in, though: removing it might just confuse the user, and if everything is already on the right device the `.cuda()` call doesn't do anything.

From the torch documentation:

"If this object is already in CUDA memory and on the correct device, then no copy is performed and the original object is returned."

Collaborator


"If this object is already in CUDA memory and on the correct device, then no copy is performed and the original object is returned."

Oh, rad. Then I would have been OK leaving out `preload_to_gpu`, but now that this is moving in the direction of a guide in the docs rather than in the README.md, the extra detail is good.
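The no-op behaviour quoted from the torch documentation can be sketched like this. The stand-in sparse CSR batch is hypothetical (it mimics what a `ZarrSparseDataset`-style loader would yield), and the example runs on CPU when no GPU is present.

```python
import torch

# Hypothetical stand-in for a sparse batch yielded by the dataloader:
# dense equivalent is [[1, 0, 2], [0, 3, 0]].
x = torch.sparse_csr_tensor(
    crow_indices=torch.tensor([0, 2, 3]),
    col_indices=torch.tensor([0, 2, 1]),
    values=torch.tensor([1.0, 2.0, 3.0]),
    size=(2, 3),
)

if torch.cuda.is_available():
    # Per the torch docs: a no-op if x is already on the correct CUDA device,
    # which is why the call is harmless when preload_to_gpu is used.
    x = x.cuda()

# Densify after the device transfer; on a GPU this is much faster than on CPU.
dense = x.to_dense()
```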

"source": [
"The conversion code will take care of the following things:\n",
"* Align the gene spaces across all datasets listed in `adata_paths`\n",
" * The gene spaces are aligned based on the gene names provided in the `var_names` field of the individual `AnnData` objects.\n",
Collaborator


It's specifically an outer join, though; "aligned" is ambiguous.
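What the outer join over `var_names` boils down to, as a pure-Python sketch with toy gene lists (not the package's actual implementation):

```python
# Toy var_names for two hypothetical datasets.
datasets = {
    "a.h5ad": ["gene1", "gene2", "gene3"],
    "b.h5ad": ["gene2", "gene4"],
}

# Outer join: the union of all gene names, in a deterministic order.
joint_var_names = sorted(set().union(*datasets.values()))

# Each dataset is then reindexed to this joint gene space; genes a
# dataset lacks are filled with zeros.
```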

@felix0097 felix0097 requested a review from ilan-gold October 6, 2025 10:42

@ilan-gold ilan-gold left a comment


With these changes, it looks good. You'll probably want to make sure it still renders correctly.

"metadata": {},
"outputs": [],
"source": [
"from arrayloaders import create_anndata_collection\n",
Collaborator


Update imports :)))


@ilan-gold ilan-gold force-pushed the quickstart-tutorial branch from 7e3d9ea to ef2545c Compare October 9, 2025 10:33