Commit 38e77af

katosh and claude committed
docs: add 3D tensor scaling section to PR description
Explains factored storage pattern for large tensors: register_section for compact rank-R factors + register_anndata_namespace for on-demand tensor reconstruction and O(rank) point queries. Includes compression ratio (1.5M× for 1M cells) and sparse.COO note.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 68777b5 commit 38e77af

1 file changed

Lines changed: 61 additions & 1 deletion

PR4_DESCRIPTION.md

@@ -191,6 +191,66 @@ True
| `repr(adata)` | Shows when non-empty |
| View copy-on-write | Writing to a view triggers copy |

### Scaling 3D tensors: factored storage + accessor

A dense `(n_obs × n_obs × n_vars)` tensor is infeasible for large datasets (1M cells × 1M cells × 30K genes ≈ 3 × 10^16 entries). The practical pattern is to store compact rank-R factors and reconstruct on demand:

```python
# Register factor storage (tiny: n_obs × rank and n_vars × rank)
@register_section("comm_obs", alignment="obs")
class CommObs:
    pass

@register_section("comm_var", alignment="var")
class CommVar:
    pass

# Register accessor for tensor reconstruction
@register_anndata_namespace("comm")
class CellCommAccessor:
    def __init__(self, adata: ad.AnnData):
        self._adata = adata

    def tensor(self, key="default"):
        """Reconstruct (obs × obs × var) tensor from factors."""
        U = self._adata.comm_obs[key]  # (n_obs, rank)
        V = self._adata.comm_var[key]  # (n_vars, rank)
        return np.einsum("ir,jr,kr->ijk", U, U, V)

    def query(self, sender, receiver, gene, key="default"):
        """O(rank) point query without materializing tensor."""
        U = self._adata.comm_obs[key]
        V = self._adata.comm_var[key]
        i = self._adata.obs_names.get_loc(sender)
        j = self._adata.obs_names.get_loc(receiver)
        k = self._adata.var_names.get_loc(gene)
        return float(U[i] @ (U[j] * V[k]))
```

```python
>>> adata.comm_obs["lr"] = np.random.rand(100, 10)  # factors: 12 KB
>>> adata.comm_var["lr"] = np.random.rand(50, 10)

>>> adata.comm.tensor("lr").shape  # dense tensor: 4 MB
(100, 100, 50)

>>> adata.comm.query("cell_0", "cell_1", "CD8A", "lr")  # O(rank), no tensor
0.7386

>>> t_cells = adata[adata.obs["cell_type"] == "T"]
>>> t_cells.comm.tensor("lr").shape  # factors were subsetted
(50, 50, 50)

>>> adata.write("test.h5ad")  # only factors written
>>> adata2 = ad.read_h5ad("test.h5ad")
>>> adata2.comm.tensor("lr").shape  # reconstructs from factors
(100, 100, 50)
```

This combines `@register_section` (factor storage with automatic subsetting and IO) with `@register_anndata_namespace` (tensor API and point queries). For 1M cells and 30K genes with rank 20, the float64 factors take ~165 MB, while the dense tensor would take ~240 PB: a compression ratio of roughly 1.5 billion×.

For moderately sized datasets, `sparse.COO` from the PyData sparse package also works directly in registered sections (subsetting handles N-D sparse arrays).

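To illustrate what subsetting an N-D coordinate-format array along both obs axes involves, here is a sketch that uses plain NumPy coordinate arrays in place of the `sparse` package. The helper `subset_obs_axes` is hypothetical and is not a description of how `sparse.COO` or this PR implements subsetting.

```python
import numpy as np


def subset_obs_axes(coords, data, shape, keep):
    """Subset a 3-D COO-style array on axes 0 and 1 (both obs-aligned).

    coords: (3, nnz) integer array of coordinates
    data:   (nnz,) array of stored values
    keep:   unique obs indices to retain, in output order
    """
    keep = np.asarray(keep)
    # Map old obs index -> new obs index; -1 marks dropped observations.
    remap = np.full(shape[0], -1, dtype=np.int64)
    remap[keep] = np.arange(keep.size)

    new_i = remap[coords[0]]
    new_j = remap[coords[1]]
    mask = (new_i >= 0) & (new_j >= 0)  # entry survives only if both cells are kept

    new_coords = np.stack([new_i[mask], new_j[mask], coords[2][mask]])
    new_shape = (keep.size, keep.size, shape[2])
    return new_coords, data[mask], new_shape


# Build a small sparse tensor in coordinate form from a mostly-zero dense array.
rng = np.random.default_rng(0)
dense = rng.random((6, 6, 4))
dense[dense < 0.8] = 0.0
coords = np.stack(np.nonzero(dense))  # shape (3, nnz)
data = dense[tuple(coords)]

keep = np.array([0, 2, 5])
sub_coords, sub_data, sub_shape = subset_obs_axes(coords, data, dense.shape, keep)

# Densify the result and compare with slicing the dense array directly.
rebuilt = np.zeros(sub_shape)
rebuilt[tuple(sub_coords)] = sub_data
expected = dense[np.ix_(keep, keep, np.arange(dense.shape[2]))]
assert np.array_equal(rebuilt, expected)
```
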
### Also in this PR

- **`@register_anndata_namespace`** — custom accessor APIs (`adata.spatial.images`)
@@ -199,7 +259,7 @@ True
### Test coverage

-67 tests covering all alignment patterns, custom validation, custom IO (JSON, xarray), 3D tensor subsetting, copy-on-write, and end-to-end workflows for TreeData-like, SpatialData-like, CellChat-like, and SCENIC-like scenarios.
+73 tests covering all alignment patterns, custom validation, custom IO (JSON, xarray), 3D tensor subsetting, factored tensor with accessor, copy-on-write, and end-to-end workflows for TreeData-like, SpatialData-like, CellChat-like, SCENIC-like, and factored communication scenarios.

### Future direction
