MERFISHEYES

Web-based 3D visualization platform for spatial transcriptomics data. Supports both single cell and single molecule datasets.

Features

Single Cell Visualization

Multiple format support: .h5ad (AnnData), MERSCOPE, and Xenium
3D visualization of cell-level spatial data using Three.js
Color cells by gene expression or cell type annotations
Combined gene + celltype visualization: Show gene expression gradient on selected celltypes only
Numerical metadata support: Continuous metadata (e.g., QC metrics) visualized with gradient coloring
Interactive legends panel (top-right): Live display of active selections
- Gene badge with one-click removal
- Embedded gradient scale bar (compact w-8 h-32)
- Color-coded celltype badges with palette colors
- "Clear All" header to deselect all celltypes at once
- 70% opacity badges that become solid on hover
Interactive gene expression scalebar: Real-time visual scale with draggable min/max controls
- Auto-scales to 95th percentile when gene changes
- Manual override via vertical drag scrubbers
- Glassmorphism design with smooth debounced updates
- Works for both gene expression and numerical columns
Glassmorphism UI: Sidebar panels and controls with frosted glass effect
- Click-outside-to-close for selection panels
- Smooth rounded corners (rounded-3xl)
Interactive filtering and selection of cell populations
Automatic column type detection (categorical vs numerical)
Mutual exclusivity: Gene and numerical columns cannot be selected simultaneously
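
The column type detection above could work roughly as follows. This is a minimal sketch, not the project's implementation: `detectColumnType` and `MAX_CATEGORIES` are hypothetical names, the 100-value threshold is taken from the Viewer Controls section, and the extra "all values are numeric" check is an assumption of this example.

```typescript
// Hypothetical helper: classify a metadata column as categorical or numerical.
type ColumnType = "categorical" | "numerical";

// Threshold from the Viewer Controls section (<=100 unique values = categorical).
const MAX_CATEGORIES = 100;

function detectColumnType(
  values: ReadonlyArray<string | number | null>,
): ColumnType {
  const unique = new Set<string | number>();
  let allNumeric = true;

  for (const v of values) {
    if (v === null || v === "") continue; // skip missing entries
    unique.add(v);
    // Assumption: a column only counts as numerical if every value parses as a number.
    if (typeof v !== "number" && !Number.isFinite(Number(v))) {
      allNumeric = false;
    }
  }

  return allNumeric && unique.size > MAX_CATEGORIES ? "numerical" : "categorical";
}
```

With this rule, a QC metric with hundreds of distinct values gets gradient coloring, while a cluster label column (or an integer cluster ID with few distinct values) stays categorical.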

Single Molecule Visualization

Parquet and CSV file support for molecule coordinates
3D point cloud visualization with one cloud per gene
Lazy loading from S3 for efficient memory usage
Gene-based filtering and multi-gene overlay
Automatic control gene filtering: Removes negative controls, unassigned probes, and codewords
2D/3D view mode toggle
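
As an illustration of the control gene filtering above, the sketch below drops names that look like control probes. The patterns are assumptions based on common Xenium and MERSCOPE naming (for example `NegControlProbe_...`, `UnassignedCodeword_...`, `BLANK_...`, `Blank-...`); the actual list in `lib/utils/gene-filters.ts` may differ, and `isControlGene` and `filterControlGenes` are hypothetical names.

```typescript
// Assumed name patterns for non-gene features; adjust to your panel.
const CONTROL_PATTERNS: RegExp[] = [
  /^NegControl/i, // negative control probes and codewords
  /^Unassigned/i, // unassigned codewords
  /^Deprecated/i, // deprecated codewords
  /^BLANK/i,      // blank codewords (matches BLANK_0006 and Blank-12)
  /codeword/i,    // any remaining codeword entries
];

function isControlGene(name: string): boolean {
  return CONTROL_PATTERNS.some((pattern) => pattern.test(name));
}

// Returns the gene list with control features removed, preserving order.
function filterControlGenes(genes: readonly string[]): string[] {
  return genes.filter((gene) => !isControlGene(gene));
}
```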

General

Web worker processing: Non-blocking background processing for all data parsing
Cloud storage with AWS S3 integration and lazy loading
Duplicate detection via dataset fingerprinting
Email notifications with shareable links and dataset metadata (cell count, gene count, platform)
Dark mode
Works on desktop and tablet
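
One way to picture duplicate detection via fingerprinting: derive a short, order-independent string from dataset metadata and compare it against stored fingerprints. The sketch below is an assumption-heavy stand-in. The inputs (file names, sizes, cell and gene counts) and the FNV-1a hash are chosen for illustration only, and `lib/utils/fingerprint.ts` may use different inputs and a cryptographic hash.

```typescript
// Hypothetical description of one uploaded file.
interface FileDescriptor {
  name: string;
  size: number; // bytes
}

// 32-bit FNV-1a over UTF-16 code units, returned as 8 hex characters.
// Stand-in for whatever hash the project actually uses.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Sorting by file name makes the fingerprint independent of upload order.
function fingerprintDataset(
  files: readonly FileDescriptor[],
  cellCount: number,
  geneCount: number,
): string {
  const parts = [...files]
    .sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0))
    .map((f) => `${f.name}:${f.size}`);
  return fnv1a([`cells=${cellCount}`, `genes=${geneCount}`, ...parts].join("|"));
}
```

The resulting string is what a client would pass as `{fingerprint}` to the check-duplicate routes listed under API Routes.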

Tech Stack

Next.js 15 - React framework with Turbopack
HeroUI v2 - UI components
Three.js - 3D visualization
TypeScript - Type safety
Tailwind CSS - Styling
Zustand - State management
Prisma - Database ORM (PostgreSQL)
AWS S3 - Cloud file storage
h5wasm - HDF5/H5AD file reading
Hyparquet - Pure JavaScript parquet parsing
Comlink - Web worker communication
Pako - Gzip compression/decompression
PapaParse - CSV parsing
SendGrid - Email notifications

Getting Started

Requires Node.js 18+

Installation

git clone <repository-url>
cd merfisheyes-heroui
npm install

Environment Setup

Copy .env.example to .env.local and configure:

cp .env.example .env.local

Required environment variables:

DATABASE_URL - PostgreSQL connection string
AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION, AWS_S3_BUCKET - S3 storage
NEXT_PUBLIC_BASE_URL - Base URL for the application
SENDGRID_API_KEY, SENDGRID_FROM_EMAIL - Email notifications

See .env.example for full list and examples.
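
A small startup check can catch missing variables before the first request fails. The sketch below is not part of the repository: `findMissingEnvVars` is a hypothetical helper, and the list simply mirrors the variables named above.

```typescript
// Variables listed in this README; .env.example may define more.
const REQUIRED_ENV_VARS = [
  "DATABASE_URL",
  "AWS_ACCESS_KEY_ID",
  "AWS_SECRET_ACCESS_KEY",
  "AWS_REGION",
  "AWS_S3_BUCKET",
  "NEXT_PUBLIC_BASE_URL",
  "SENDGRID_API_KEY",
  "SENDGRID_FROM_EMAIL",
] as const;

type Env = Record<string, string | undefined>;

// Returns the names of required variables that are unset or blank.
function findMissingEnvVars(env: Env): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

On the server this could be called as `findMissingEnvVars(process.env)` and the result logged or thrown if it is non-empty.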

Database Setup

npx prisma generate    # Generate Prisma client
npx prisma migrate dev # Run database migrations

Development

npm run dev            # Start dev server (http://localhost:3000)

Production Build

npm run build          # Build for production (requires 4GB RAM)
npm start              # Start production server

Low-memory servers: If the build fails with SIGBUS errors on a server with limited RAM, use the low-memory build:

npm run build:low-memory  # Build with 2GB memory limit

Or manually set memory limit:

NODE_OPTIONS='--max-old-space-size=2048' npm run build

Deployment

For deploying to remote servers:

Build locally on a high-memory machine
Deploy using the included script:
```
./deploy.sh
```

The deploy script:

Builds the project locally
Transfers .next, public, package.json, and prisma to remote server
Runs npx prisma generate on production
Restarts the application with PM2

Note: The script does NOT transfer .env.local. The production server must have its own environment variables configured.

Usage

Uploading Data

Single Cell Data

Navigate to /viewer and upload:

H5AD: Single .h5ad file
Xenium: Folder with cells.csv and related files
MERSCOPE: Folder with cell_metadata.csv and related files

Single Molecule Data

Navigate to /sm-viewer and upload:

Parquet: .parquet file with columns for gene names and x/y/z coordinates
CSV: .csv file with same column structure
Supports Xenium and MERSCOPE column naming conventions

Drag and drop or click to upload. After processing, you'll receive an email with a shareable link to view your dataset.

Viewer Controls

Single Cell Viewer (`/viewer/[id]`)

Loading Progress: Real-time progress bar showing dataset loading from S3 (0-100% with status messages)
Rotate: Left click + drag
Pan: Right click + drag or middle click + drag
Zoom: Mouse wheel
Hover: Mouse over points to see tooltip with cluster and gene information (shows original palette colors even when filtered)
Double-click: Double-click points to toggle cluster selection and switch to celltype mode
Panel Navigation: Click Celltype/Gene buttons to open panels without changing visualization
Filter: Use side panel to filter by cell type (categorical columns only)
Color: Select gene from dropdown to color by expression, or choose cluster column:
- Categorical columns (≤100 unique values): Discrete colors with checkbox filtering
- Numerical columns (>100 unique values): Coolwarm gradient, no filtering UI
Gene Expression Scalebar: Appears when gene or numerical column is active
- Gradient Bar: Blue (low) → White (mid) → Red (high)
- Number Scrubbers: Drag vertically to adjust min/max scale values
- Auto-scales to 0 and 95th percentile on gene/column change
- Manual adjustments persist until gene/column changes
Combined Mode: Select a gene + toggle celltypes to see gene expression only on those cell populations
- Automatically activates when both gene and celltypes are selected
- Non-selected celltypes appear grey
- Selected celltypes show gene expression gradient (coolwarm)
Automatic Mode Switching:
- Selecting a numerical column clears gene selection (mutual exclusivity)
- Selecting a gene while a numerical column is active switches back to a categorical column
Visualization updates only when actively selecting a gene or toggling a celltype
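
The scalebar and combined-mode behavior described above can be sketched as three small functions: a percentile for the default maximum, a blue-white-red ramp, and a per-cell color rule. All names here (`percentile`, `coolwarm`, `colorCell`, `GREY`) are hypothetical, the percentile uses the nearest-rank method, and the ramp is a simple linear approximation; the viewer's actual colormap and grey value may differ.

```typescript
type RGB = [number, number, number]; // channels in [0, 1]

const GREY: RGB = [0.5, 0.5, 0.5]; // assumed color for non-selected celltypes

// Nearest-rank percentile over finite values; p in [0, 100].
function percentile(values: readonly number[], p: number): number {
  const sorted = values.filter((v) => Number.isFinite(v)).sort((a, b) => a - b);
  if (sorted.length === 0) return 0;
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.min(sorted.length - 1, Math.max(0, rank - 1))];
}

// Blue (low) -> white (mid) -> red (high); t is clamped to [0, 1].
function coolwarm(t: number): RGB {
  const c = Math.min(1, Math.max(0, t));
  return c < 0.5 ? [c * 2, c * 2, 1] : [1, (1 - c) * 2, (1 - c) * 2];
}

// Combined mode: gradient on selected celltypes, grey elsewhere.
// An empty selection means every cell shows the gradient.
function colorCell(
  expression: number,
  cellType: string,
  selected: ReadonlySet<string>,
  min: number,
  max: number,
): RGB {
  if (selected.size > 0 && !selected.has(cellType)) return GREY;
  const span = max - min;
  return coolwarm(span > 0 ? (expression - min) / span : 0);
}
```

On a gene change, the defaults would be `min = 0` and `max = percentile(expressionValues, 95)`; dragging a scrubber replaces either value until the next gene or column change.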

Single Molecule Viewer (`/sm-viewer/[id]`)

Auto-selection: The first 3 genes are automatically selected and displayed when loading from an S3 link
Rotate: Left click + drag (disabled in 2D mode, enabled with TrackballControls in 3D mode)
Pan: Right click + drag
Zoom: Mouse wheel
Select Genes: Search and check genes to display
- Toast notifications only appear when loading new genes, not when adjusting colors/sizes
2D/3D Toggle: Switch between top-down and perspective views
- Scene automatically reinitializes with appropriate controls (OrbitControls for 2D, TrackballControls for 3D)
Scale: Adjust point size with global and per-gene local scales

Data Format Requirements

Single Cell Formats

H5AD

Standard AnnData format
Requires obsm['X_spatial'] for coordinates
Optional: obsm['X_umap'], cell type annotations in obs

Xenium (Cell-level)

Required: cells.csv with centroids
Optional: transcripts.csv, features.tsv
Detects cell type columns automatically

MERSCOPE (Cell-level)

Required: cell_metadata.csv with coordinates
Optional: cell_categories.csv, cell_numeric_categories.csv, cell_by_gene.csv

Single Molecule Formats

Parquet

Columnar binary format (most efficient for large datasets)
Required columns (configurable):
- Gene name: feature_name (Xenium) or gene (MERSCOPE)
- X coordinate: x_location (Xenium) or global_x (MERSCOPE)
- Y coordinate: y_location (Xenium) or global_y (MERSCOPE)
- Z coordinate: z_location (Xenium) or global_z (MERSCOPE) - optional for 2D data
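
The column conventions above lend themselves to a lookup table keyed by platform. The sketch below resolves a file header to one of the two conventions and reports whether a z column is present. The column names come from the list above; `COLUMN_MAPPINGS` and `resolveColumns` are hypothetical names, and the real configuration in `lib/config/moleculeColumnMappings.ts` may be structured differently.

```typescript
interface ColumnMapping {
  gene: string;
  x: string;
  y: string;
  z: string; // optional in the data; absent for 2D datasets
}

type Platform = "xenium" | "merscope";

const COLUMN_MAPPINGS: Record<Platform, ColumnMapping> = {
  xenium: { gene: "feature_name", x: "x_location", y: "y_location", z: "z_location" },
  merscope: { gene: "gene", x: "global_x", y: "global_y", z: "global_z" },
};

interface ResolvedColumns {
  platform: Platform;
  mapping: ColumnMapping;
  hasZ: boolean; // false means the dataset is treated as 2D
}

// Returns the first convention whose gene, x, and y columns all appear in the header.
function resolveColumns(header: readonly string[]): ResolvedColumns | null {
  const present = new Set(header);
  for (const platform of ["xenium", "merscope"] as const) {
    const mapping = COLUMN_MAPPINGS[platform];
    if ([mapping.gene, mapping.x, mapping.y].every((col) => present.has(col))) {
      return { platform, mapping, hasZ: present.has(mapping.z) };
    }
  }
  return null; // unknown convention: the caller would need an explicit mapping
}
```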

CSV

Text format with same column requirements as parquet
Automatically infers 2D vs 3D based on z column presence
Less memory-efficient than parquet for large datasets (millions of molecules)

API Routes

The application provides RESTful API endpoints for dataset upload and management:

Single Cell Endpoints

| Route | Method | Purpose |
| --- | --- | --- |
| `/api/datasets/check-duplicate/{fingerprint}` | GET | Check if dataset already exists |
| `/api/datasets/initiate` | POST | Start upload, get presigned S3 URLs |
| `/api/datasets/{datasetId}/complete` | POST | Mark upload as complete |
| `/api/datasets/{datasetId}` | GET | Get dataset info + download URLs |

Single Molecule Endpoints

| Route | Method | Purpose |
| --- | --- | --- |
| `/api/single-molecule/check-duplicate/{fingerprint}` | GET | Check if single molecule dataset exists |
| `/api/single-molecule/initiate` | POST | Start upload, get presigned S3 URLs |
| `/api/single-molecule/{id}/files/{key}/complete` | POST | Mark individual file as uploaded |
| `/api/single-molecule/{id}/complete` | POST | Finalize upload, send email |
| `/api/single-molecule/{id}` | GET | Get dataset metadata and manifest URL |
| `/api/single-molecule/{id}/gene/{geneName}` | GET | Get presigned URL for specific gene file |
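
The per-gene route is what makes lazy loading possible: a gene's file is only requested when the gene is first selected. A client-side cache along the following lines would avoid fetching the same gene twice. This is a sketch under assumptions: `LazyGeneCache`, `GeneFetcher`, and `geneUrlPath` are hypothetical names, and how the fetched bytes are decoded into coordinates is left to the injected fetcher.

```typescript
// Builds the route shown in the table above for one gene.
function geneUrlPath(datasetId: string, geneName: string): string {
  return `/api/single-molecule/${datasetId}/gene/${encodeURIComponent(geneName)}`;
}

// Injected loader: resolves a gene name to its decoded coordinates.
type GeneFetcher = (geneName: string) => Promise<Float32Array>;

class LazyGeneCache {
  private readonly pending = new Map<string, Promise<Float32Array>>();
  private readonly fetchGene: GeneFetcher;

  constructor(fetchGene: GeneFetcher) {
    this.fetchGene = fetchGene;
  }

  // Concurrent and repeated requests for the same gene share one promise.
  load(geneName: string): Promise<Float32Array> {
    let entry = this.pending.get(geneName);
    if (!entry) {
      entry = this.fetchGene(geneName).catch((err: unknown) => {
        this.pending.delete(geneName); // allow a retry after a failed load
        throw err;
      });
      this.pending.set(geneName, entry);
    }
    return entry;
  }
}
```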

Upload Flow

Single Cell Upload

Check for duplicates - GET /api/datasets/check-duplicate/{fingerprint}
Initiate upload - POST /api/datasets/initiate with metadata and file list
- Creates database records (Dataset, UploadSession, UploadFile)
- Returns presigned S3 URLs for file upload
Upload files - Use presigned URLs to upload directly to S3
Complete upload - POST /api/datasets/{datasetId}/complete
- Finalizes the upload session, sends email notification
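
Put together, the four steps above might look like this from a client. The routes are taken from this README, but the request and response body shapes (`exists`, `datasetId`, `uploadUrls`) are assumptions made for the example, as are the names `HttpClient`, `LocalFile`, and `uploadSingleCell`. The HTTP client is injected so the sketch stays independent of the runtime; in a browser it would be a thin wrapper around `fetch`.

```typescript
interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

type HttpClient = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string | Uint8Array },
) => Promise<HttpResponse>;

interface LocalFile {
  key: string; // file name within the dataset
  bytes: Uint8Array;
}

async function uploadSingleCell(
  http: HttpClient,
  fingerprint: string,
  files: readonly LocalFile[],
): Promise<string> {
  // 1. Check for duplicates (assumed response: { exists, datasetId }).
  const dup = await http(`/api/datasets/check-duplicate/${encodeURIComponent(fingerprint)}`);
  const dupBody = (await dup.json()) as { exists?: boolean; datasetId?: string };
  if (dupBody.exists && dupBody.datasetId) return dupBody.datasetId;

  // 2. Initiate the upload (assumed response: { datasetId, uploadUrls }).
  const init = await http("/api/datasets/initiate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      fingerprint,
      files: files.map((f) => ({ key: f.key, size: f.bytes.byteLength })),
    }),
  });
  if (!init.ok) throw new Error(`initiate failed with status ${init.status}`);
  const { datasetId, uploadUrls } = (await init.json()) as {
    datasetId: string;
    uploadUrls: Record<string, string>;
  };

  // 3. Upload each file directly to S3 through its presigned URL.
  for (const file of files) {
    const url = uploadUrls[file.key];
    if (!url) throw new Error(`no presigned URL for ${file.key}`);
    const put = await http(url, { method: "PUT", body: file.bytes });
    if (!put.ok) throw new Error(`upload of ${file.key} failed with status ${put.status}`);
  }

  // 4. Mark the upload complete; the server then sends the email.
  const done = await http(`/api/datasets/${datasetId}/complete`, { method: "POST" });
  if (!done.ok) throw new Error(`complete failed with status ${done.status}`);
  return datasetId;
}
```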

Single Molecule Upload

Process locally - Client processes dataset into manifest + gene files
Check for duplicates - GET /api/single-molecule/check-duplicate/{fingerprint}
Initiate upload - POST /api/single-molecule/initiate
- Returns presigned S3 URLs for manifest and all gene files
Upload files - Upload manifest.json.gz and genes/{gene}.bin.gz files to S3
Mark files complete - POST /api/single-molecule/{id}/files/{key}/complete for each file
Complete upload - POST /api/single-molecule/{id}/complete
- Sends email with link to /sm-viewer/{id}
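
The single molecule flow differs from the single cell flow mainly in its per-file completion calls. The sketch below derives the file keys and the ordered completion requests for a dataset. The key layout (`manifest.json.gz`, `genes/{gene}.bin.gz`) and routes come from the steps above; the helper names are hypothetical, and URL-encoding the key (so the `/` in `genes/...` fits a single path segment) is an assumption about how the `{key}` parameter is handled.

```typescript
interface ApiCall {
  method: "GET" | "POST";
  path: string;
}

// File keys uploaded for one dataset: the manifest plus one file per gene.
function fileKeys(genes: readonly string[]): string[] {
  return ["manifest.json.gz", ...genes.map((gene) => `genes/${gene}.bin.gz`)];
}

// Ordered completion requests: one per file, then the final dataset-level call.
function completionCalls(id: string, genes: readonly string[]): ApiCall[] {
  const perFile = fileKeys(genes).map(
    (key): ApiCall => ({
      method: "POST",
      // Assumption: the key is URL-encoded into one path segment.
      path: `/api/single-molecule/${id}/files/${encodeURIComponent(key)}/complete`,
    }),
  );
  return [...perFile, { method: "POST", path: `/api/single-molecule/${id}/complete` }];
}
```

The final call is kept last because it triggers the email, which should only go out once every file has been marked as uploaded.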

Project Structure

├── app/                           # Next.js app directory
│   ├── api/                      # API routes
│   │   ├── datasets/             # Single cell endpoints
│   │   ├── single-molecule/      # Single molecule endpoints
│   │   └── send-email*/          # Email notification services
│   ├── viewer/                   # Single cell viewer
│   │   └── [id]/                 # Dynamic dataset routes
│   ├── sm-viewer/                # Single molecule viewer
│   │   └── [id]/                 # Dynamic dataset routes with S3 lazy loading
│   ├── explore/                  # Example datasets page
│   └── about/                    # About page
├── components/                   # React components
│   ├── three-scene.tsx           # Single cell Three.js scene
│   ├── single-molecule-three-scene.tsx  # Single molecule Three.js scene
│   ├── visualization-controls.tsx       # Single cell controls
│   ├── single-molecule-controls.tsx     # Single molecule controls
│   ├── gene-scalebar.tsx         # Interactive gene expression scalebar
│   ├── ui/
│   │   └── number-scrubber.tsx   # Draggable number input component
│   └── file-upload.tsx           # Unified upload component
├── lib/                          # Core logic
│   ├── adapters/                # Single cell format adapters
│   │   ├── H5adAdapter.ts
│   │   ├── XeniumAdapter.ts
│   │   ├── MerscopeAdapter.ts
│   │   └── ChunkedDataAdapter.ts  # S3 loading adapter
│   ├── config/                  # Configuration files
│   │   ├── visualization.config.ts  # Centralized visualization parameters
│   │   └── moleculeColumnMappings.ts  # Column naming conventions
│   ├── workers/                 # Web workers for background processing
│   │   ├── standardized-dataset.worker.ts  # Single cell parsing worker (H5AD/Xenium/MERSCOPE)
│   │   ├── standardizedDatasetWorkerManager.ts  # Single cell worker manager
│   │   ├── single-molecule.worker.ts  # Parquet/CSV parsing worker
│   │   └── singleMoleculeWorkerManager.ts  # Single molecule worker manager
│   ├── stores/                  # Zustand state stores
│   │   ├── datasetStore.ts      # Single cell datasets
│   │   ├── singleMoleculeStore.ts  # Single molecule datasets
│   │   ├── visualizationStore.ts   # Single cell viz state
│   │   └── singleMoleculeVisualizationStore.ts  # Single molecule viz state
│   ├── services/                # Data processing services
│   │   └── hyparquetService.ts  # Hyparquet parquet reader
│   ├── utils/
│   │   ├── SingleMoleculeProcessor.ts  # S3 upload processing
│   │   ├── fingerprint.ts       # Dataset fingerprinting
│   │   ├── gene-filters.ts      # Shared gene filtering (control probes, etc.)
│   │   └── color-palette.ts     # Centralized color palette (40+ colors)
│   ├── webgl/                   # WebGL/Three.js utilities (single cell)
│   ├── s3.ts                    # S3 client utilities
│   ├── prisma.ts                # Database client
│   ├── StandardizedDataset.ts   # Single cell dataset class
│   └── SingleMoleculeDataset.ts # Single molecule dataset class (lazy loading)
├── prisma/                      # Database schema
│   └── schema.prisma            # Supports both dataset types
└── public/                      # Static assets

Contributing

Pull requests welcome.

Acknowledgments

Built at the Bogdan Bintu Lab, UCSD.

License

MIT
