Commit a09c405

committed
Updates to day1-1b
1 parent 059f600 commit a09c405

6 files changed

Lines changed: 92 additions & 408 deletions

_freeze/day1/day1-1a_SpatialExperiment/execute-results/html.json

Lines changed: 2 additions & 2 deletions
Large diffs are not rendered by default.

day1/day1-1a_SpatialExperiment.qmd

Lines changed: 6 additions & 6 deletions
Original file line number | Diff line number | Diff line change
@@ -55,7 +55,8 @@ library(HDF5Array)
5555

5656
We will work on a **Visium HD** dataset from a human colorectal cancer study by [Oliveira _et al_](https://www.nature.com/articles/s41588-025-02193-3). The dataset contains normal adjacent tissue (NAT) and colorectal carcinoma (CRC) from 5 patients.
5757

58-
We will focus on a Visium HD slide from one patient (called "P2CRC"), available from the [10X website](https://www.10xgenomics.com/datasets/visium-hd-cytassist-gene-expression-libraries-of-human-crc). We will focus on the "binned output" available as an output of the Space Ranger pipeline (version 3.0). The full output is very large, so for practical reasons we use a region of interest, selected to include an interesting part of the tissue.:
58+
We will focus on a single Visium HD slide from a patient called "P2", available from the [10X website](https://www.10xgenomics.com/datasets/visium-hd-cytassist-gene-expression-libraries-of-human-crc). We use the "binned output" produced by the Space Ranger pipeline (version 3.0). The full output is very large, so for practical reasons we use a region of interest, selected because it contains gland-like epithelial structures together with stromal or lower-density areas:
59+
5960

6061
```{r day1-1a-SpatialExperiment-4}
6162
roi_visium <- c(
@@ -106,7 +107,6 @@ if (!dir.exists("data/Human_Colon_Cancer_P2/")) {
106107

107108
We will import the 16 um binned Visium HD output into a `SpatialExperiment` object and then subset it to the course region. The coordinates are in the native Visium HD image coordinate system returned by `spatialCoords()`.
108109

109-
110110
```{r day1-1a-SpatialExperiment-6}
111111
spe <- TENxVisiumHD(
112112
spacerangerOut = "data/Human_Colon_Cancer_P2/",
@@ -210,7 +210,7 @@ rowData(spe) |> head()
210210
nrow(spe)
211211
```
212212

213-
The assay stores the count matrix:
213+
The assay stores the UMI count matrix:
214214

215215
```{r day1-1a-SpatialExperiment-10}
216216
assay(spe)
@@ -246,7 +246,7 @@ imgRaster(spe) |> plot()
246246

247247
## Visualizing the tissue region
248248

249-
To get a visual overview, we use `ggspavis::plotVisium`. First we plot the tissue image without bins:
249+
To get a visual overview, we use the `plotVisium()` function from the `ggspavis` package. First we plot the tissue image without bins:
250250

251251
```{r day1-1a-SpatialExperiment-15}
252252
plotVisium(spe, spots = FALSE)
@@ -270,7 +270,7 @@ Why do we see only a specific region of the slide?
270270
The full Visium HD output is much larger than needed for this introductory exercise. At the start of this exercise, we subsetted the object to one representative tissue region so that it remains fast to plot and process. This same region is also used as the anchor for selecting a comparable Xenium region in `Exercise 1 B`.
271271
:::
272272

273-
It is possible to colour bins according to a gene or a column in `colData`, for example `PIGR`:
273+
It is possible to colour bins according to the expression of a gene, for example `PIGR`, or a column in `colData`. We renamed the rows to gene symbols after import, which is why we can refer to this marker as `"PIGR"` rather than by its Ensembl ID. At this stage we plot the number of UMIs, since log-normalized counts will be introduced later in the course.
274274

275275
```{r day1-1a-SpatialExperiment-17}
276276
#| warning: false
@@ -339,7 +339,7 @@ We save the filtered `SpatialExperiment` object for the next steps. Remember tha
339339
```{r day1-1a-SpatialExperiment-21}
340340
dir.create("results/day1", showWarnings = FALSE, recursive = TRUE)
341341
saveHDF5SummarizedExperiment(spe,
342-
dir = "results/day1", prefix = "01.1_spe", replace = TRUE,
342+
dir = "results/day1", prefix = "01.1_spe_", replace = TRUE,
343343
chunkdim = NULL, level = NULL, as.sparse = NA,
344344
verbose = NA
345345
)

day1/day1-1b_SpatialFeatureExperiment.qmd

Lines changed: 45 additions & 61 deletions
@@ -30,17 +30,17 @@ library(Voyager)
3030
library(ggplot2)
3131
library(patchwork)
3232
library(scuttle)
33+
library(HDF5Array)
3334
```
3435

3536
## Data for the course
3637

37-
We will work with the **Xenium In Situ, sample P2 CRC** dataset from the same human colorectal cancer study used in `Exercise 1 A`. The Xenium section is a serial section of the Visium HD P2 CRC sample.
38+
We will work on a **Xenium** dataset from a human colorectal cancer study by [Oliveira _et al_](https://www.nature.com/articles/s41588-025-02193-3). The dataset contains normal adjacent tissue (NAT) and colorectal carcinoma (CRC) from 5 patients.
3839

39-
This means the two datasets are biologically related but not the exact same physical tissue section. We therefore subset Xenium to an **approximately matching tissue region**, guided by the Visium HD region and by morphology. This makes the two exercises comparable while avoiding a full image-registration workflow during the practical.
40+
We will focus on a single Xenium slide from a patient called "P2", available from the [10X website](https://www.10xgenomics.com/platforms/visium/product-family/dataset-human-crc). This slide is a serial section of the sample used in `Exercise 1 A`, so the two datasets are biologically related but not the exact same physical tissue section.
4041

41-
The full Xenium output is large and includes cell boundaries, nucleus boundaries, transcript locations, morphology images, and expression matrices. For this exercise, we use a compact course data folder that contains the required Xenium output files.
42+
The full Xenium output is very large and includes cell boundaries, nucleus boundaries, transcript locations, morphology images, and expression matrices. For practical reasons, this exercise uses a region of interest selected to **approximately match the region of interest used in `Exercise 1 A`**, and the compact course folder includes only some of the output layers. This makes the two exercises comparable while avoiding a full image-registration workflow during the practical. This region was chosen from the overview plot because it contains gland-like epithelial structures together with stromal or lower-density areas.
4243

43-
The selected Xenium region is:
4444

4545
```{r day1-1b-SFE-2}
4646
roi_xenium <- c(
@@ -52,9 +52,7 @@ roi_xenium <- c(
5252
roi_xenium
5353
```
5454

55-
We import the compact Xenium output with `flip = "none"` so the exercise uses the native Xenium coordinate orientation. The same visual region may appear with negative y coordinates if the geometry has been flipped during import with morphology images. Because the compact course folder does not include morphology images, we use the positive y coordinates shown above.
56-
57-
This region was chosen from the overview plot because it contains gland-like epithelial structures together with stromal or lower-density areas.
55+
We import the Xenium output with `flip = "none"` so the exercise uses the native Xenium coordinate orientation. The same visual region may appear with negative y coordinates if the geometry has been flipped during import with morphology images. Because the compact course folder does not include morphology images, we use the positive y coordinates shown above.
5856

5957
```{r day1-1b-SFE-3}
6058
options(timeout = 600)
@@ -83,10 +81,6 @@ We import the compact Xenium output into a `SpatialFeatureExperiment` object:
8381
```{r day1-1b-SFE-4}
8482
sfe_full <- readXenium("data/Human_Colon_Cancer_P2/xenium/outs", flip = "none")
8583
86-
if (ncol(sfe_full) == 0) {
87-
stop("The Xenium object has 0 cells before ROI subsetting. Check that cell_feature_matrix.h5 is present in data/Human_Colon_Cancer_P2/xenium/outs/.")
88-
}
89-
9084
rownames(sfe_full) <- uniquifyFeatureNames(
9185
ID = rowData(sfe_full)$ID,
9286
names = rowData(sfe_full)$Symbol
@@ -97,27 +91,23 @@ range(spatialCoords(sfe_full)[, 1])
9791
range(spatialCoords(sfe_full)[, 2])
9892
```
9993

100-
Then we subset it to the course region:
94+
Then we subset it to the region of interest:
10195

10296
```{r day1-1b-SFE-5}
10397
sfe <- sfe_full[, spatialCoords(sfe_full)[, 1] >= roi_xenium[["xmin"]] &
10498
spatialCoords(sfe_full)[, 1] <= roi_xenium[["xmax"]] &
10599
spatialCoords(sfe_full)[, 2] >= roi_xenium[["ymin"]] &
106100
spatialCoords(sfe_full)[, 2] <= roi_xenium[["ymax"]]]
107101
108-
if (ncol(sfe) == 0) {
109-
stop("The Xenium ROI selected 0 cells. Check range(spatialCoords(sfe)) and revise roi_xenium.")
110-
}
111-
112102
sfe
113103
```
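The bounding-box logic used for the subset can be tried on toy data first. The following is a minimal sketch in base R, assuming made-up coordinates and a hypothetical `roi_toy` vector with the same named layout as `roi_xenium`; it is not part of the course material.

```r
# Toy coordinates standing in for spatialCoords(sfe_full); the values are made up.
coords <- cbind(
  x = c(10, 25, 40, 55, 70),
  y = c(5, 20, 35, 50, 65)
)

# Hypothetical ROI using the same named-vector layout as roi_xenium.
roi_toy <- c(xmin = 20, xmax = 60, ymin = 10, ymax = 40)

# Inclusive bounding-box test: one logical value per cell.
in_roi <- coords[, "x"] >= roi_toy[["xmin"]] &
  coords[, "x"] <= roi_toy[["xmax"]] &
  coords[, "y"] >= roi_toy[["ymin"]] &
  coords[, "y"] <= roi_toy[["ymax"]]

# Count the selected cells before subsetting.
sum(in_roi)

# Keep only the rows (cells) inside the ROI.
coords[in_roi, , drop = FALSE]
```

Counting the selected cells with `sum()` before subsetting is a quick way to notice an ROI that selects nothing, for example when the coordinates were flipped at import.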
114104

115105
::: callout-important
116106
## Exercise 1
117107

118-
Why is a `SpatialFeatureExperiment` useful for Xenium data?
108+
Why is an object of the `SpatialFeatureExperiment` class used to store Xenium data?
119109

120-
Which information does Xenium provide that is not naturally represented by a simple spot- or bin-level `SpatialExperiment`?
110+
Which information could not be represented by a simple spot- or bin-level `SpatialExperiment`?
121111
:::
122112

123113
::: {.callout-tip collapse="true"}
@@ -130,7 +120,7 @@ Xenium is an image-based spatial transcriptomics technology. In addition to a co
130120

131121
## Exploring the object
132122

133-
`SpatialFeatureExperiment` extends `SpatialExperiment`, so familiar accessors still apply:
123+
`SpatialFeatureExperiment` extends `SpatialExperiment`, which extends `SingleCellExperiment`, so familiar accessors still apply:
134124

135125
```{r day1-1b-SFE-6}
136126
dim(sfe)
@@ -148,24 +138,33 @@ Questions:
148138

149139
- What do the columns represent?
150140
- What do the rows represent?
151-
- How is this different from the Visium HD `spe` object in Exercise 1 A?
141+
- What differs compared to the Visium HD `spe` object in `Exercise 1 A`?
142+
- What is stored in the main assay?
152143
:::
153144

154145
::: {.callout-tip collapse="true"}
155146
## Answer
156147

157-
In this object, columns represent Xenium cells, while rows represent measured genes/features. This differs from the Visium HD binned `SpatialExperiment`, where columns represent spatial bins rather than segmented cells.
148+
In this object, columns represent segmented cells, while rows represent genes. This differs from the Visium HD slide stored in a `SpatialExperiment` object, where columns represent spatial bins rather than segmented cells.
158149

159150
```{r day1-1b-SFE-7}
160-
dim(sfe)
161151
ncol(sfe)
162152
nrow(sfe)
163153
```
154+
155+
Note that the number of features is small (541) because the Xenium technology measures a targeted panel rather than the whole transcriptome.
156+
157+
```{r day1-1b-SFE-7b}
158+
assay(sfe)
159+
```
160+
161+
The count matrix indicates the number of observed molecules for each of the 541 genes in each cell. This matrix is analogous to the UMI count matrix in the Visium HD dataset.
162+
164163
:::
165164

166165
## Inspecting geometries
167166

168-
One of the key features of `SpatialFeatureExperiment` is that it can store geometries associated with cells or features.
167+
In addition to the slots inherited from `SpatialExperiment`, a key feature of `SpatialFeatureExperiment` is that it can store geometries associated with cells or features.
169168

170169
```{r day1-1b-SFE-8}
171170
colGeometries(sfe)
@@ -174,24 +173,24 @@ colGeometries(sfe)
174173
::: callout-important
175174
## Exercise 3
176175

177-
Inspect the available column geometries in the Xenium object.
178-
179-
Which geometries are available? What biological structures do they correspond to?
176+
Inspect the available column geometries in the Xenium object. Which biological structures do they correspond to? In which class are these objects stored?
180177
:::
181178

182179
::: {.callout-tip collapse="true"}
183180
## Answer
184181

185-
For Xenium data, the available geometries commonly include cell centroids, cell boundaries, and nucleus boundaries. The exact names depend on the reader and package versions.
182+
The available geometries include cell centroids, cell boundaries, and nucleus boundaries. These are commonly available for imaging-based spatial transcriptomics technologies, but the exact names depend on the reader functions and packages used to import the data.
186183

187184
```{r day1-1b-SFE-9}
188-
colGeometries(sfe)
185+
colGeometries(sfe)[["centroids"]]
189186
```
187+
The geometries are stored as "Simple Features" objects, which encode spatial data and can be easily manipulated with the [`sf` package](https://r-spatial.github.io/sf/index.html).
188+
190189
:::
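The Simple Features representation mentioned in the answer can be illustrated on a toy polygon. This is a minimal sketch, assuming the `sf` package is installed; the square and the `cell_id` column are hypothetical and only stand in for a segmented cell boundary.

```r
# Run only when the sf package is available.
if (requireNamespace("sf", quietly = TRUE)) {
  # A closed 2 x 2 square: the first and last vertices must be identical.
  square <- sf::st_polygon(list(rbind(
    c(0, 0), c(2, 0), c(2, 2), c(0, 2), c(0, 0)
  )))

  # Wrap the polygon in an sf data frame, one row per (toy) cell.
  toy_cells <- sf::st_sf(cell_id = "toy_cell", geometry = sf::st_sfc(square))

  # An sf object is a data frame with a geometry column.
  print(class(toy_cells))

  # Geometric operations such as area work directly on the object.
  toy_area <- sf::st_area(toy_cells)
  print(toy_area)
}
```

The same kind of operation should apply to the geometries returned by `colGeometries(sfe)`, since they are stored in the same class.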
191190

192191
## Visualizing Xenium cells
193192

194-
We can start with a simple coordinate plot of the selected Xenium region:
193+
We can start with a simple scatterplot of the cells in the selected region of interest:
195194

196195
```{r day1-1b-SFE-10}
197196
xy <- as.data.frame(spatialCoords(sfe))
@@ -207,7 +206,7 @@ ggplot(xy, aes(x = x, y = y)) +
207206
::: callout-important
208207
## Exercise 4
209208

210-
Compare this point-based visualization with the Visium HD plot from Exercise 1 A.
209+
Compare this point-based visualization with the Visium HD plot from `Exercise 1 A`.
211210

212211
What does one point represent in each dataset?
213212
:::
@@ -218,9 +217,7 @@ What does one point represent in each dataset?
218217
In the Visium HD exercise, each point or square represents a 16 um spatial bin. In the Xenium exercise, each point represents a segmented cell centroid. This is one of the key conceptual differences between binned sequencing-based spatial data and image-based cell-resolved spatial data.
219218
:::
220219

221-
We can also visualize expression for a marker gene. For example, `PIGR` is present in the Xenium panel and shows a clear spatial pattern in this region. We renamed the rows to gene symbols after import, which is why we can refer to this marker as `"PIGR"` rather than by its Ensembl ID.
222-
223-
At this stage of the course we plot `raw counts`. `Log-normalized` values will be introduced later, in the normalization part of the course.
220+
We can also visualize expression for a marker gene. For example, `PIGR` is present in the Xenium panel and shows a clear spatial pattern in this region. We renamed the rows to gene symbols after import, which is why we can refer to this marker as `"PIGR"` rather than by its Ensembl ID. At this stage we plot the number of molecules, since log-normalized counts will be introduced later in the course. The `plotSpatialFeature()` function from the `Voyager` package is handy for this:
224221

225222
```{r day1-1b-SFE-11}
226223
plotSpatialFeature(sfe, "PIGR", exprs_values = "counts")
@@ -240,47 +237,34 @@ How does the interpretation differ from plotting a gene on Visium HD bins?
240237
The Xenium signal is measured over segmented cells, while the Visium HD binned signal aggregates molecules within fixed spatial bins. Xenium can therefore represent cell-level heterogeneity more directly, but it measures a targeted panel rather than the whole transcriptome.
241238
:::
242239

243-
## Subsetting with coordinates
244-
245-
At the start of the exercise, we subsetted the full imported Xenium object to a course region. We can still apply coordinate-based subsetting again in the same spirit:
246-
247-
```{r day1-1b-SFE-12}
248-
xy <- spatialCoords(sfe)
249-
250-
roi <- c(
251-
xmin = unname(quantile(xy[, 1], 0.25)),
252-
xmax = unname(quantile(xy[, 1], 0.75)),
253-
ymin = unname(quantile(xy[, 2], 0.25)),
254-
ymax = unname(quantile(xy[, 2], 0.75))
255-
)
256-
257-
keep <- xy[, 1] > roi[["xmin"]] & xy[, 1] < roi[["xmax"]] &
258-
xy[, 2] > roi[["ymin"]] & xy[, 2] < roi[["ymax"]]
259-
260-
sfe_sub <- sfe[, keep]
261-
dim(sfe)
262-
dim(sfe_sub)
263-
```
264-
265-
::: callout-important
266240
## Exercise 6
267241

268-
Why did we subset Xenium at the start of the exercise instead of using the full public output directly throughout the practical?
269-
:::
242+
Use the `plotSpatialFeature()` function to visualize the cell segmentation mask, colored by cell area, and the nuclei segmentation mask, colored by nucleus area.
270243

271244
::: {.callout-tip collapse="true"}
272245
## Answer
273246

274-
The full Xenium output is large and contains many cells, boundaries, transcript coordinates, and morphology images. A course-sized subset keeps the exercise responsive while preserving the structure of a real Xenium object.
247+
```{r day1-1b-SFE-12}
248+
plotSpatialFeature(sfe,
249+
colGeometryName = "cellSeg",
250+
features = "cell_area")
275251
276-
The subset is also chosen to approximately match the Visium HD P2 CRC region from Exercise 1 A. Because Visium HD and Xenium are serial sections, this is a morphology-matched comparison rather than an exact same-cell comparison.
252+
plotSpatialFeature(sfe,
253+
colGeometryName = "nucSeg",
254+
features = "nucleus_area")
255+
```
277256
:::
278257

279258
## Save the object
280259

281260
```{r day1-1b-SFE-13}
282261
dir.create("results/day1", showWarnings = FALSE, recursive = TRUE)
283-
saveRDS(sfe, "results/day1/01.1b_sfe_p2.rds")
262+
263+
saveHDF5SummarizedExperiment(sfe,
264+
dir = "results/day1", prefix = "01.1b_sfe_", replace = TRUE,
265+
chunkdim = NULL, level = NULL, as.sparse = NA,
266+
verbose = NA
267+
)
284268
```
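The trailing underscore in the prefix matters because `saveHDF5SummarizedExperiment()` prepends the prefix directly to the file names `se.rds` and `assays.h5`. The sketch below builds the expected file names and reloads the object with `loadHDF5SummarizedExperiment()` from `HDF5Array` when the files exist. The variable names are hypothetical, and whether every geometry survives the round trip is an assumption that has not been checked here.

```r
# Same directory and prefix as in the save call above.
out_dir <- "results/day1"
prefix <- "01.1b_sfe_"

# The prefix is pasted directly in front of "se.rds" and "assays.h5".
expected_files <- file.path(out_dir, paste0(prefix, c("se.rds", "assays.h5")))
expected_files

# Reload only if the HDF5Array package and the saved files are available.
if (requireNamespace("HDF5Array", quietly = TRUE) &&
    all(file.exists(expected_files))) {
  sfe_reloaded <- HDF5Array::loadHDF5SummarizedExperiment(
    dir = out_dir, prefix = prefix
  )
  print(dim(sfe_reloaded))
}
```

Without the underscore, the files would be named `01.1b_sfese.rds` and `01.1b_sfeassays.h5`, which are harder to read.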
285269

286270
Clear your environment:
