A caching proxy for fetching LaTeX packages from TexLive and CTAN. Extracts packages and serves them as JSON for browser-based LaTeX compilers.
packages/
├── ctan-core.ts # Shared extraction logic
├── ctan-proxy.ts # Production server (disk cache)
└── cache/ # Disk cache directory
serve-local.ts # Dev server (memory cache)
ctan-core.ts contains the shared extraction logic used by both servers:
- `LRUCache<T>` - Bounded memory cache with eviction
- `processZipData()` - Extract ZIP archives
- `processRawFileData()` - Handle single-file packages
- `processExtractedFiles()` - Process TAR/ZIP contents into a virtual filesystem
serve-local.ts is the development server on port 8787. It fetches packages from the local TexLive archive or CTAN mirrors and uses a memory cache only (cleared on restart).
ctan-proxy.ts is the production server on port 8081. It adds disk caching (persists across restarts), request deduplication, and a file index for fast reverse lookups.
bun packages/ctan-proxy.ts

The proxy runs on http://localhost:8081 by default.
When a package is requested:
- Check cache - Memory first, then disk
- Try TexLive - Pre-built packages from TexLive 2025 archives
- Try CTAN - Falls back to CTAN mirrors if not in TexLive
- Extract & cache - Extracts `.sty`, `.cls`, fonts, etc. and caches to disk
Packages are cached permanently to disk. The memory cache is a bounded LRU cache to reduce disk reads.
Request → Memory Cache (LRU) → Disk Cache (permanent) → TexLive/CTAN
CTAN is only contacted once per package, ever. Subsequent requests are served from disk.
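The lookup order above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the proxy's actual code: `getPackage`, `fetchUpstream`, and the two `Map`s standing in for the memory and disk tiers are hypothetical names introduced only for this example.

```typescript
type PackageData = { name: string; source: string };

// Hypothetical stand-ins for the tiers. The real proxy uses a bounded LRU
// cache, JSON files on disk, and HTTP requests to TexLive/CTAN.
const memory = new Map<string, PackageData>();
const disk = new Map<string, PackageData>();
let upstreamCalls = 0;

// Placeholder for the TexLive/CTAN fetch; it only counts how often it runs.
async function fetchUpstream(name: string): Promise<PackageData> {
  upstreamCalls++;
  return { name, source: "texlive" };
}

async function getPackage(name: string): Promise<PackageData> {
  // 1. Memory cache
  const inMemory = memory.get(name);
  if (inMemory) return inMemory;

  // 2. Disk cache: promote the hit into memory for later requests
  const onDisk = disk.get(name);
  if (onDisk) {
    memory.set(name, onDisk);
    return onDisk;
  }

  // 3. Upstream: store the result in both tiers
  const fetched = await fetchUpstream(name);
  disk.set(name, fetched);
  memory.set(name, fetched);
  return fetched;
}
```

In this sketch, clearing `memory` (as a restart would) and requesting the same package again is served from `disk`, so `fetchUpstream` still runs only once per package.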
Download and extract a package. Returns JSON with file contents.
curl http://localhost:8081/api/fetch/enumitem

Response:
{
  "name": "enumitem",
  "files": {
    "/texlive/texmf-dist/tex/latex/enumitem/enumitem.sty": {
      "path": "/texlive/texmf-dist/tex/latex/enumitem",
      "content": "\\ProvidesPackage{enumitem}..."
    }
  },
  "totalFiles": 1,
  "dependencies": ["keyval"],
  "source": "texlive"
}

Get package metadata from CTAN (cached).
curl http://localhost:8081/api/pkg/enumitem

Get recursive dependencies for a package.
curl http://localhost:8081/api/deps/tikz

Get current cache statistics.
curl http://localhost:8081/api/stats

Response:
{
  "memory": {
    "packages": { "current": 23, "max": 100 },
    "info": { "current": 45, "max": 500 },
    "aliases": { "current": 12, "max": 1000 }
  },
  "disk": {
    "cacheDir": "./packages/cache",
    "packages": 87,
    "fileIndex": 156
  },
  "inFlight": 0
}

All settings are configurable via environment variables:
| Variable | Default | Description |
|---|---|---|
| `CTAN_PROXY_PORT` | `8081` | Server port |
| `CTAN_PROXY_CACHE_DIR` | `./packages/cache` | Disk cache directory |
| `CTAN_PROXY_MEMORY_CACHE_SIZE` | `100` | Max packages in memory LRU cache |
| `CTAN_PROXY_INFO_CACHE_SIZE` | `500` | Max CTAN info entries in memory |
| `CTAN_PROXY_ALIAS_CACHE_SIZE` | `1000` | Max package aliases in memory |
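
One way such settings could be read with their defaults is sketched below. The variable names and defaults come from the table; `ProxyConfig`, `intFromEnv`, and `loadConfig` are hypothetical names for this example, and the proxy's real parsing may differ.

```typescript
interface ProxyConfig {
  port: number;
  cacheDir: string;
  memoryCacheSize: number;
  infoCacheSize: number;
  aliasCacheSize: number;
}

type Env = Record<string, string | undefined>;

// Parse an integer setting, falling back to the default when the variable
// is unset or not a number.
function intFromEnv(env: Env, key: string, fallback: number): number {
  const parsed = Number.parseInt(env[key] ?? "", 10);
  return Number.isNaN(parsed) ? fallback : parsed;
}

// Hypothetical loader: defaults mirror the table above.
function loadConfig(env: Env): ProxyConfig {
  return {
    port: intFromEnv(env, "CTAN_PROXY_PORT", 8081),
    cacheDir: env["CTAN_PROXY_CACHE_DIR"] ?? "./packages/cache",
    memoryCacheSize: intFromEnv(env, "CTAN_PROXY_MEMORY_CACHE_SIZE", 100),
    infoCacheSize: intFromEnv(env, "CTAN_PROXY_INFO_CACHE_SIZE", 500),
    aliasCacheSize: intFromEnv(env, "CTAN_PROXY_ALIAS_CACHE_SIZE", 1000),
  };
}

// Only two variables are set here; the rest fall back to their defaults.
const config = loadConfig({
  CTAN_PROXY_PORT: "9000",
  CTAN_PROXY_MEMORY_CACHE_SIZE: "500",
});
```

Under Bun, `process.env` could be passed in place of the literal object. Taking the environment as a parameter keeps the function easy to test.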
Example with custom settings:
CTAN_PROXY_PORT=9000 \
CTAN_PROXY_MEMORY_CACHE_SIZE=500 \
CTAN_PROXY_CACHE_DIR=/var/cache/ctan \
bun packages/ctan-proxy.ts

The memory cache prevents repeated disk reads. For production with many concurrent users:
- Small memory (default 100): Lower RAM, more disk reads
- Large memory (500-1000): Higher RAM, fewer disk reads
CTAN has ~6000 packages. If your users access a wide variety, increase the cache size. If most users compile similar documents (academic papers, resumes), the default is fine.
The disk cache is unlimited and permanent. Packages are never evicted from disk.
Located in CTAN_PROXY_CACHE_DIR (default: packages/cache/):
packages/cache/
├── enumitem.json # Extracted package data
├── geometry.json
├── tikz.json
├── _aliases.json # Package name aliases (e.g., tikz → pgf)
└── _file_index.json # Reverse index: filename → package
The disk cache is permanent. To clear it, delete the cache directory.
Three LRU caches in memory:
- Package cache - Extracted package data (largest)
- Info cache - CTAN metadata responses
- Alias cache - Package name mappings
All are bounded and evict least-recently-used entries when full.
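As an illustration of bounded LRU eviction, here is a minimal sketch built on a `Map`, which iterates in insertion order. `SketchLRU` is a hypothetical class written for this example; it is not the `LRUCache<T>` exported by ctan-core.ts, whose interface may differ.

```typescript
// Hypothetical sketch of a bounded LRU cache.
class SketchLRU<T> {
  // Map preserves insertion order, so the first key is the least recently used.
  private entries = new Map<string, T>();
  private maxSize: number;

  constructor(maxSize: number) {
    this.maxSize = maxSize;
  }

  get(key: string): T | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as T;
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: T): void {
    if (this.entries.has(key)) this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxSize) {
      // Evict the least recently used entry (the oldest key in the Map).
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}

const cache = new SketchLRU<string>(2);
cache.set("enumitem", "a");
cache.set("geometry", "b");
cache.get("enumitem"); // touch enumitem so geometry becomes the oldest entry
cache.set("tikz", "c"); // over capacity: geometry is evicted
```

Reads count as use in this sketch, so a frequently requested package stays in memory even if it was cached long ago.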
Concurrent requests for the same package share a single fetch. If 10 users request tikz simultaneously, only one CTAN request is made.
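A common way to get this behavior is to keep a map of in-flight promises keyed by package name. The sketch below assumes that approach. `fetchDeduplicated`, `slowFetch`, and the `inFlight` map are hypothetical names for this example, and `slowFetch` is a placeholder for the real network request.

```typescript
const inFlight = new Map<string, Promise<string>>();
let fetchCount = 0;

// Placeholder for the upstream request; it only counts invocations.
async function slowFetch(name: string): Promise<string> {
  fetchCount++;
  await Promise.resolve(); // yield once, as a real network call would
  return `contents of ${name}`;
}

function fetchDeduplicated(name: string): Promise<string> {
  // Reuse the pending promise if a fetch for this package is already running.
  const pending = inFlight.get(name);
  if (pending) return pending;

  // Otherwise start the fetch and remove the entry once it settles,
  // whether it succeeded or failed.
  const promise = slowFetch(name).finally(() => inFlight.delete(name));
  inFlight.set(name, promise);
  return promise;
}
```

Removing the entry in `finally` means a failed fetch does not stay cached as a rejected promise, so the next request can retry.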
The proxy handles several edge cases:
Some packages are distributed as part of larger packages. For example, pgfkeys is part of pgf. The proxy:
- Queries CTAN for package info
- Detects a `texlive` or `miktex` field pointing to the parent
- Fetches the parent package instead
- Caches an alias for future requests
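The steps above might look like the following sketch. The `CtanInfo` shape is an assumption that contains only the two fields named above, and `resolvePackageName` and `aliases` are hypothetical names. The proxy's actual detection logic may check more conditions.

```typescript
// Assumed shape: only the fields mentioned above, both optional.
interface CtanInfo {
  texlive?: string;
  miktex?: string;
}

const aliases = new Map<string, string>();

// Return the package that should actually be fetched for `name`.
function resolvePackageName(name: string, info: CtanInfo): string {
  // A cached alias answers the question without inspecting CTAN info again.
  const cached = aliases.get(name);
  if (cached) return cached;

  // If CTAN points at a different (parent) package, remember the alias.
  const parent = info.texlive ?? info.miktex;
  if (parent && parent !== name) {
    aliases.set(name, parent);
    return parent;
  }
  return name;
}

const target = resolvePackageName("pgfkeys", { texlive: "pgf" });
```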
Some CTAN packages are single .sty files (not archives). The proxy detects ctan.file === true and fetches the raw file directly.
When CTAN doesn't recognize a package name (e.g., pgfkeys), the proxy searches its file index to find which cached package contains that file.
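A reverse lookup of this kind can be sketched as a map from filename to owning package. Everything here is illustrative: `indexPackage` and `findOwningPackage` are hypothetical names, the example path is invented, and trying `.sty` and `.cls` as extensions is an assumption about how a package name is matched to a file.

```typescript
// filename -> package that contains it
const fileIndex = new Map<string, string>();

// Record every file of an extracted package under its bare filename.
function indexPackage(pkg: string, filePaths: string[]): void {
  for (const path of filePaths) {
    const filename = path.slice(path.lastIndexOf("/") + 1);
    fileIndex.set(filename, pkg);
  }
}

// Look for a cached package that ships a file named after the request.
function findOwningPackage(requested: string): string | undefined {
  for (const ext of [".sty", ".cls"]) {
    const owner = fileIndex.get(requested + ext);
    if (owner) return owner;
  }
  return undefined;
}

// Hypothetical path, for illustration only.
indexPackage("pgf", ["/texlive/texmf-dist/tex/latex/pgf/pgfkeys.sty"]);
const owner = findOwningPackage("pgfkeys");
```

This only finds files from packages that are already cached, which matches the description above: the index is built from cached packages.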
FROM oven/bun:1
WORKDIR /app
# Copy package files and install dependencies
COPY package.json bun.lock* ./
RUN bun install --production
# Copy proxy files
COPY packages/ctan-core.ts packages/
COPY packages/ctan-proxy.ts packages/
RUN mkdir -p /var/cache/ctan
ENV CTAN_PROXY_CACHE_DIR=/var/cache/ctan
ENV CTAN_PROXY_MEMORY_CACHE_SIZE=500
EXPOSE 8081
CMD ["bun", "packages/ctan-proxy.ts"]

The proxy is designed to work in Cloudflare Workers with modifications:
- Replace disk cache with KV or R2 storage
- Replace `exec` (tar extraction) with pure JS/WASM decompression
[Unit]
Description=CTAN Proxy
After=network.target
[Service]
Type=simple
User=www-data
WorkingDirectory=/opt/siglum-engine
Environment=CTAN_PROXY_CACHE_DIR=/var/cache/ctan
Environment=CTAN_PROXY_MEMORY_CACHE_SIZE=500
ExecStart=/usr/local/bin/bun packages/ctan-proxy.ts
Restart=always
[Install]
WantedBy=multi-user.target

Note: The working directory must contain both packages/ctan-proxy.ts and packages/ctan-core.ts, plus node_modules with dependencies (fflate).
If a package isn't found:
- Check if it exists on CTAN: https://ctan.org/pkg/PACKAGENAME
- Check if it's part of a parent package (e.g., `pgfkeys` → `pgf`)
- Some packages are in TexLive but not CTAN (or vice versa)
The first fetch for a package may take 2-5 seconds (network latency to TexLive/CTAN mirrors). Subsequent requests are served from cache without contacting upstream.
If you see errors, try clearing the cache:
rm -rf packages/cache/*