Commit 3e6f236

Polish: British English, add Python 3.10 to CI, metadata, remove dead fixture
1 parent 1ccb226 commit 3e6f236

7 files changed

Lines changed: 17 additions & 35 deletions

.github/workflows/ci.yml

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ jobs:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        python-version: ["3.11", "3.12"]
+        python-version: ["3.10", "3.11", "3.12"]

     steps:
       - uses: actions/checkout@v4
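
The quotes around each matrix entry matter for the new version. Unquoted, `3.10` is a YAML float and collapses to `3.1`, so the job would no longer be asking for Python 3.10. The sketch below imitates that numeric conversion with Python's built-in `float()`; it is a stand-in for YAML scalar resolution, not the parser GitHub Actions actually uses.

```python
# Stand-in for YAML scalar resolution: an unquoted 3.10 is read as a float.
unquoted = float("3.10")  # what an unquoted matrix entry would become
quoted = "3.10"           # what the quoted entry stays as

print(unquoted)                 # the trailing zero is lost
print(str(unquoted) == quoted)  # the two values no longer match
```

Keeping every entry quoted, as this hunk does, avoids the problem for all versions at once.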

CITATION.cff

Lines changed: 2 additions & 0 deletions
@@ -13,3 +13,5 @@ keywords:
 authors:
   - family-names: "Kahraman"
     given-names: "Ekin"
+    affiliation: "University of East Anglia"
+    email: "evk23umu@uea.ac.uk"

README.md

Lines changed: 5 additions & 5 deletions
@@ -1,6 +1,6 @@
 # Single-Cell RNA-seq Immune Cell Profiling

-End-to-end single-cell RNA-seq analysis pipeline in Python using [scanpy](https://scanpy.readthedocs.io/). Demonstrates quality control, normalization, dimensionality reduction, clustering with automated resolution selection, and marker-based cell type annotation on human PBMC data.
+End-to-end single-cell RNA-seq analysis pipeline in Python using [scanpy](https://scanpy.readthedocs.io/). Demonstrates quality control, normalisation, dimensionality reduction, clustering with automated resolution selection, and marker-based cell type annotation on human PBMC data.

 <p align="center">
 <img src="docs/umap_3d_rotation.gif" alt="3D UMAP rotation showing PBMC immune cell clusters" width="600">
@@ -24,8 +24,8 @@ End-to-end single-cell RNA-seq analysis pipeline in Python using [scanpy](https:
 | Step | Script | Description |
 |------|--------|-------------|
 | 01 | `scripts/01_load_and_qc.py` | Download data, calculate QC metrics (genes/cell, counts, mito %), filter |
-| 02 | `scripts/02_preprocess.py` | Normalize (10k/cell), log-transform, select 2,000 HVGs, regress covariates, scale |
-| 03 | `scripts/03_reduce_dimensions.py` | PCA (40 components), neighbor graph, UMAP embedding |
+| 02 | `scripts/02_preprocess.py` | Normalise (10k/cell), log-transform, select 2,000 HVGs, regress covariates, scale |
+| 03 | `scripts/03_reduce_dimensions.py` | PCA (40 components), neighbour graph, UMAP embedding |
 | 04 | `scripts/04_cluster.py` | Leiden clustering at 5 resolutions, silhouette-based selection (min 5 clusters) |
 | 05 | `scripts/05_annotate_cell_types.py` | Wilcoxon DE, marker gene scoring, automated cell type assignment |
 | 06 | `scripts/06_publication_figures.py` | Multi-panel publication figure (UMAP, composition, heatmap) |
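
For readers unfamiliar with the step 02 row, the "Normalise (10k/cell), log-transform" arithmetic can be sketched in plain Python on a toy count matrix. In the pipeline this work is done by scanpy on an AnnData object (`sc.pp.normalize_total` appears in `scripts/02_preprocess.py`); the function name `normalise_and_log` and the toy counts below are hypothetical, and the use of a natural `log(1 + x)` is an assumption about the log-transform.

```python
import math

TARGET_SUM = 10_000  # counts per cell after scaling, as in the table row


def normalise_and_log(counts, target_sum=TARGET_SUM):
    """Scale each cell (row) to target_sum total counts, then apply log(1 + x)."""
    result = []
    for cell in counts:
        total = sum(cell)
        # Cells with zero counts are left as zeros to avoid dividing by zero.
        scale = target_sum / total if total else 0.0
        result.append([math.log1p(value * scale) for value in cell])
    return result


# Hypothetical toy data: 2 cells (rows) x 4 genes (columns).
toy_counts = [
    [1, 3, 0, 6],     # 10 counts in total
    [20, 0, 20, 60],  # 100 counts in total
]
normalised = normalise_and_log(toy_counts)
```

After scaling, both cells carry the same total of 10,000 counts before the log is applied, which is what makes cells with different sequencing depth comparable.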
@@ -93,9 +93,9 @@ pytest -v

 - **Automated cell type annotation**: Clusters are assigned to cell types by scoring against curated PBMC marker gene sets, not manual inspection.
 - **Multi-resolution clustering**: Leiden is run at 5 resolutions (0.3-1.2) and the best is selected by silhouette score with a biological floor of 5 clusters.
-- **Colorblind-friendly palette**: Publication figures use the Okabe-Ito palette for accessibility.
+- **Colourblind-friendly palette**: Publication figures use the Okabe-Ito palette for accessibility.
 - **Modular scripts**: Each step reads the previous step's output from disk. Steps can be re-run independently.

-## License
+## Licence

 MIT
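
The multi-resolution clustering bullet in this hunk describes a selection rule: take the resolution with the best silhouette score, but only among runs with at least 5 clusters. A minimal sketch of that rule follows. The function name `select_resolution`, the tuple layout, and every number in `toy_results` are hypothetical; the fallback used when no run reaches the floor is an assumption, not something the README states.

```python
MIN_CLUSTERS = 5  # the "biological floor" from the README


def select_resolution(results, min_clusters=MIN_CLUSTERS):
    """Pick the resolution with the highest silhouette among eligible runs.

    results: list of (resolution, n_clusters, silhouette) tuples.
    """
    eligible = [r for r in results if r[1] >= min_clusters]
    # Assumption: if no run reaches the floor, fall back to all runs.
    candidates = eligible or results
    return max(candidates, key=lambda r: r[2])[0]


# Illustrative numbers only, not output from the pipeline.
toy_results = [
    (0.3, 4, 0.31),
    (0.5, 6, 0.27),
    (0.8, 8, 0.29),
    (1.0, 9, 0.22),
    (1.2, 11, 0.18),
]
best = select_resolution(toy_results)
```

In this toy input the run at resolution 0.3 has the highest silhouette but only 4 clusters, so it is excluded and the choice falls to the best of the remaining runs.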

pyproject.toml

Lines changed: 4 additions & 1 deletion
@@ -9,8 +9,11 @@ description = "Single-cell RNA-seq immune cell profiling pipeline using scanpy"
 readme = "README.md"
 license = {text = "MIT"}
 requires-python = ">=3.10"
-authors = [{name = "Ekin Kahraman"}]
+authors = [{name = "Ekin Kahraman", email = "evk23umu@uea.ac.uk"}]
 keywords = ["bioinformatics", "single-cell", "RNA-seq", "scanpy", "immune profiling"]
+
+[project.urls]
+Repository = "https://github.com/Ekin-Kahraman/single-cell-rnaseq-immune-profiling"
 classifiers = [
     "Development Status :: 4 - Beta",
     "Intended Audience :: Science/Research",
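
One point worth checking in this hunk: in TOML, every key after a table header belongs to that table until the next header. With `[project.urls]` placed above `classifiers`, the classifiers list would be read as `project.urls.classifiers` rather than `project.classifiers`, assuming no other header sits between them in the full file. The sketch below parses a trimmed-down, hypothetical fragment (not the real `pyproject.toml`) to show where the key lands.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # backport with the same API for Python 3.10

# Hypothetical fragment mirroring the key order after this commit.
fragment = '''
[project]
name = "example"
keywords = ["bioinformatics"]

[project.urls]
Repository = "https://example.org/repo"
classifiers = ["Development Status :: 4 - Beta"]
'''

data = tomllib.loads(fragment)
print("classifiers" in data["project"])          # False
print("classifiers" in data["project"]["urls"])  # True
```

If the full file matches this order, moving `[project.urls]` and its entry below the last `[project]` key would keep `classifiers` in the `[project]` table.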

scripts/02_preprocess.py

Lines changed: 3 additions & 3 deletions
@@ -1,4 +1,4 @@
-"""Step 02: Normalize, find highly variable genes, and scale."""
+"""Step 02: Normalise, find highly variable genes, and scale."""

 import scanpy as sc
 import matplotlib.pyplot as plt
@@ -13,10 +13,10 @@


 def preprocess(adata):
-    """Normalize, log-transform, select HVGs, regress, and scale."""
+    """Normalise, log-transform, select HVGs, regress, and scale."""
     print(f"Input: {adata.n_obs} cells, {adata.n_vars} genes")

-    # Normalize to target_sum counts per cell
+    # Normalise to target_sum counts per cell
     sc.pp.normalize_total(adata, target_sum=TARGET_SUM)
     print(f"Normalized to {TARGET_SUM:.0f} counts per cell")

scripts/06_publication_figures.py

Lines changed: 2 additions & 2 deletions
@@ -9,7 +9,7 @@
 RESULTS_DIR = Path("results")
 FIG_DIR = RESULTS_DIR / "figures"

-# Color palette (colorblind-friendly)
+# Colour palette (colourblind-friendly)
 PALETTE = {
     "CD4+ T cells": "#E69F00",
     "CD8+ T cells": "#56B4E9",
@@ -82,7 +82,7 @@ def make_figure(adata):
     raw_df["cell_type"] = adata.obs["cell_type"].values
     mean_expr = raw_df.groupby("cell_type")[markers_present].mean()

-    # Z-score per gene for visualisation
+    # Z-score per gene for visualisation
     from scipy.stats import zscore
     z_expr = mean_expr.apply(zscore, axis=0)
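
The z-scoring in the second hunk standardises each gene (column) across cell types (rows), so the heatmap shows relative rather than absolute expression. A plain-Python sketch of the same arithmetic follows; `zscore_columns` and the toy values are hypothetical, and the population standard deviation is used because that is the default (`ddof=0`) of `scipy.stats.zscore`.

```python
from statistics import fmean, pstdev


def zscore_columns(table):
    """Z-score each column (gene) across rows (cell types)."""
    columns = list(zip(*table))
    scaled_columns = []
    for column in columns:
        mean, sd = fmean(column), pstdev(column)
        # A constant column would divide by zero here; not handled in this sketch.
        scaled_columns.append([(value - mean) / sd for value in column])
    # Transpose back so rows are cell types again.
    return [list(row) for row in zip(*scaled_columns)]


# Illustrative mean expression: 3 cell types (rows) x 2 genes (columns).
toy_means = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 60.0],
]
z = zscore_columns(toy_means)
```

Each column of `z` has mean zero, so genes with very different expression ranges end up on one colour scale.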

tests/conftest.py

Lines changed: 0 additions & 23 deletions
@@ -2,7 +2,6 @@

 import pytest
 import scanpy as sc
-import numpy as np


 @pytest.fixture(scope="session")
@@ -11,25 +10,3 @@ def pbmc3k_raw():
     adata = sc.datasets.pbmc3k()
     adata.var_names_make_unique()
     return adata
-
-
-@pytest.fixture
-def small_adata():
-    """Create a small synthetic AnnData for fast unit tests."""
-    rng = np.random.default_rng(42)
-    n_cells, n_genes = 200, 500
-    X = rng.poisson(1, size=(n_cells, n_genes)).astype(np.float32)
-
-    adata = sc.AnnData(X)
-    adata.var_names = [f"Gene_{i}" for i in range(n_genes)]
-    adata.obs_names = [f"Cell_{i}" for i in range(n_cells)]
-
-    # Add some "mitochondrial" genes
-    mt_genes = [f"MT-{c}" for c in ["ND1", "ND2", "CO1", "CO2", "ATP6"]]
-    for i, name in enumerate(mt_genes):
-        if i < n_genes:
-            adata.var_names = adata.var_names.tolist()
-            adata.var_names = [name if j == i else adata.var_names[j] for j in range(n_genes)]
-
-    adata.var_names_make_unique()
-    return adata
