time-causal-vae is a compact research package for no-anticipation financial time-series
generation. It keeps the continuous Time-Causal VAE baseline and adds a public discrete pipeline:
a causal VQ tokenizer followed by a causal autoregressive token prior.
The GitHub repository may remain TimeCausalVQVAE because it hosts the VQ-discrete extension of the TimeCausalVAE package. The Python distribution remains time-causal-vae, and the import path remains time_causal_vae. The public branch focuses on S&P500/VIX, with Black-Scholes, Heston, and path-dependent-volatility configs retained as small baseline and smoke workflows. Generated outputs, local data, checkpoints, W&B runs, arrays, pickles, and notebooks with outputs are not committed.
The package is organised around five steps:
- Continuous TC-VAE baseline: train and evaluate no-anticipation continuous VAE models for the benchmark financial datasets.
- Causal VQ tokenizer: encode each time step into discrete latent codes using causal convolutional encoders and decoders.
- Causal token prior: model token sequences with a causal autoregressive prior conditioned on available scalar context such as VIX.
- Financial diagnostics: compare generated and real paths with distributional, autocorrelation, volatility, tail, drawdown, and VIX-bucket diagnostics.
- Discrete latent geometry: inspect codebook usage, codebook projections, VIX-bucket usage, and example token trajectories.
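As a rough illustration of the tokenizer step listed above, the sketch below assigns each time step's latent vector to its nearest codebook entry. It is a minimal pure-Python sketch, not the package's implementation: `quantize_path`, the toy codebook, and the toy latents are hypothetical names and values introduced here, and the real tokenizer wraps VQ backends behind its own adapters.

```python
from typing import List, Sequence


def quantize_path(
    latents: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> List[int]:
    """Map each per-step latent vector to the index of its nearest code."""
    tokens = []
    for z in latents:
        # Squared Euclidean distance from this step's latent to every code.
        dists = [sum((zi - ci) ** 2 for zi, ci in zip(z, code)) for code in codebook]
        tokens.append(min(range(len(codebook)), key=dists.__getitem__))
    return tokens


# Hypothetical toy values: three 2-D codes and a four-step latent path.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
latents = [[0.1, -0.2], [0.9, 1.2], [-0.8, 0.7], [0.4, 0.4]]
tokens = quantize_path(latents, codebook)
print(tokens)  # one code index per time step
```

Because the lookup is applied step by step, the token at time t depends only on the latent at time t. Any dependence on the past has to come from the causal encoder that produced the latents.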
The project uses Poetry. For the full development environment:
poetry install
For a lean runtime install:
poetry install --only main
Dependency groups are defined for development, notebooks, and tracking:
- dev: Ruff, mypy, pre-commit, Commitizen, nbstripout, and IPython kernel support.
- notebooks: Jupyter, seaborn, torchview, and graphviz.
- tracking: optional W&B logging support.
Inspect the public configs without importing model code:
poetry run python scripts/inspect_selected_configs.py
Inspect the registered selected-model metadata used by notebooks and scripts:
poetry run python scripts/select_registered_model.py --experiment sp500_vix --family discrete
poetry run python scripts/select_registered_model.py --experiment sp500_vix --family discrete --metric mmd
The registry is trained_models/model_registry.yaml. It records selected continuous and discrete
config paths, local checkpoint conventions, selection profiles, visible metrics, and missing
metrics. These entries are the current registry selections for public workflows, not universal
mathematical optima. The registry does not contain weights. Commands that train or evaluate models
create local outputs/ directories for checkpoints and artefacts.
Run a minimal continuous S&P500/VIX smoke command:
poetry run tcvae-train \
--config configs/experiments/sp500_vix_beta_cvae.yaml \
--output-dir outputs/sp500_vix_continuous \
--epochs 1 \
--no-wandb \
--dry-run
Remove --dry-run only when you intentionally want to train.
The public S&P500/VIX discrete baseline is:
configs/experiments/sp500_vix_causal_vq_tokenizer.yaml
configs/experiments/sp500_vix_causal_token_prior_additive.yaml
Train the tokenizer, extract tokens, train the prior, and run diagnostics:
poetry run tcvae-train-tokenizer \
--config configs/experiments/sp500_vix_causal_vq_tokenizer.yaml \
--base-data-dir data/processed \
--output-dir outputs/sp500_vix_discrete/tokenizer \
--no-wandb
poetry run python scripts/extract_token_indices.py \
--config configs/experiments/sp500_vix_causal_vq_tokenizer.yaml \
--tokenizer-dir <tokenizer-dir> \
--output-dir outputs/sp500_vix_discrete/token_prior/tokens_codebook64_codebookdim16 \
--base-data-dir data/processed
poetry run tcvae-train-token-prior \
--config configs/experiments/sp500_vix_causal_token_prior_additive.yaml \
--output-dir outputs/sp500_vix_discrete/token_prior/additive \
--no-wandb
poetry run python scripts/evaluate_sp500_vix_paper_style.py \
--discrete-config configs/experiments/sp500_vix_causal_token_prior_additive.yaml \
--discrete-prior-dir <prior-dir> \
--discrete-tokenizer-dir <tokenizer-dir> \
--continuous-config configs/experiments/sp500_vix_beta_cvae.yaml \
--continuous-model-dir <continuous-final-model-dir> \
--output-dir outputs/sp500_vix_discrete/paper_style \
--base-data-dir data/processed \
--n-sample 1000 \
--temperature 0.8 \
--top-k 40
The S&P500/VIX data file is expected at:
data/processed/sp500vix/sp500vix_normalized.npy
This file is local and is not committed. Training and evaluation commands create local outputs/
directories.
The Hawkes/SVMHJD synthetic benchmark adds clustered, marked jump paths for stress-testing
rare-event behaviour. The Ogata backend simulates continuous-time marked Hawkes events before
projecting them to the fixed observation grid, while the fixed-grid backend remains available for
fast smoke tests. Jump-specific diagnostics and leakage checks are included under scripts/, and
the public note is docs/benchmarks/hawkes_jump.md.
Hawkes/SVMHJD has an optional research-candidate registry entry with status: research_candidate and public_default: false. The S&P500/VIX workflow remains the public default
demo. The selected Hawkes/SVMHJD discrete research candidate remains the hidden128 log-return cb64
tokenizer + causal conv-transformer k3 prior. The additive AR prior is the required jump-profile
ablation, and the tiny conv-transformer is an efficiency candidate that improves
jump-count/inter-arrival means but loses the balanced smooth profile. Continuous comparators use
the log-return BetaCVAE and InfoCVAE configurations.
The benchmark remains a scenario-data stress test, not an arbitrage-free pricing model. No Hawkes/SVMHJD trained weights, checkpoints, token tensors, generated samples, W&B exports, or output summaries are committed.
All modelling paths preserve no anticipation: at time t, encoders, tokenizers, priors, and
diagnostics should only use observations and conditions available up to that point. The causal
convolution checks and token-prior checks in scripts/ are small public guards for this contract.
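The no-anticipation contract can be tested by perturbing future inputs and confirming that outputs up to time t do not change. The sketch below does this for a toy causal convolution and for a centred moving average that deliberately violates the contract. It is a minimal illustration, assuming plain Python lists; `causal_conv1d`, `centred_average`, and `leaks_future` are hypothetical names and are not the checks shipped in scripts/.

```python
from typing import Callable, List, Sequence


def causal_conv1d(x: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """y[t] = sum_k kernel[k] * x[t - k], treating x as zero before t = 0."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            if t - k >= 0:
                acc += w * x[t - k]
        out.append(acc)
    return out


def centred_average(x: Sequence[float]) -> List[float]:
    """Three-point centred average; uses x[t + 1], so it is not causal."""
    n = len(x)
    return [
        ((x[t - 1] if t > 0 else 0.0) + x[t] + (x[t + 1] if t + 1 < n else 0.0)) / 3.0
        for t in range(n)
    ]


def leaks_future(
    fn: Callable[[Sequence[float]], List[float]],
    x: Sequence[float],
    t: int,
    bump: float = 1.0,
) -> bool:
    """Return True if changing x after time t alters fn's outputs up to and including t."""
    perturbed = list(x)
    for s in range(t + 1, len(perturbed)):
        perturbed[s] += bump
    return fn(x)[: t + 1] != fn(perturbed)[: t + 1]


x = [0.2, -0.1, 0.4, 0.0, 0.3, -0.2]
causal_leaks = leaks_future(lambda v: causal_conv1d(v, [0.5, 0.3, 0.2]), x, t=2)
centred_leaks = leaks_future(centred_average, x, t=2)
print(causal_leaks, centred_leaks)
```

The same perturbation test applies to any component that maps a sequence to a sequence, including an encoder or a token prior, provided its outputs are deterministic for a fixed input.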
The public discrete baseline uses:
- a standard VQ tokenizer with one code per time step;
- causal convolutional encoder and decoder stacks;
- a scalar-conditioned causal autoregressive token prior;
- additive conditioning for the S&P500/VIX public prior.
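The sketch below shows one way a scalar-conditioned autoregressive prior could be sampled with temperature and top-k filtering, the two knobs that appear in the evaluation command. It is a hedged toy, not the package's sampler: `sample_top_k`, `sample_tokens`, and `toy_logits` are hypothetical names, and the way `toy_logits` adds a condition-dependent bias to the logits is only an assumed reading of "additive conditioning".

```python
import math
import random
from typing import Callable, List, Sequence


def sample_top_k(
    logits: Sequence[float], temperature: float, top_k: int, rng: random.Random
) -> int:
    """Sample one index from the top_k largest logits after temperature scaling."""
    k = max(1, min(top_k, len(logits)))
    kept = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    scaled = [logits[i] / temperature for i in kept]
    peak = max(scaled)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(kept, weights=weights, k=1)[0]


def sample_tokens(
    logits_fn: Callable[[List[int], Sequence[float]], List[float]],
    conditions: Sequence[float],
    temperature: float = 0.8,
    top_k: int = 40,
    seed: int = 0,
) -> List[int]:
    """Draw tokens left to right; step t sees only past tokens and conditions up to t."""
    rng = random.Random(seed)
    tokens: List[int] = []
    for t in range(len(conditions)):
        logits = logits_fn(tokens, conditions[: t + 1])
        tokens.append(sample_top_k(logits, temperature, top_k, rng))
    return tokens


def toy_logits(
    prev_tokens: List[int], cond_so_far: Sequence[float], n_codes: int = 4
) -> List[float]:
    """Hypothetical logits: favour repeating the last token, plus an additive condition bias."""
    logits = [0.0] * n_codes
    if prev_tokens:
        logits[prev_tokens[-1]] += 1.0
    return [value + cond_so_far[-1] * code for code, value in enumerate(logits)]


conditions = [-1.0, 1.0, -1.0]  # toy scalar context, e.g. a standardised VIX level
greedy = sample_tokens(toy_logits, conditions, top_k=1)  # top_k=1 reduces to argmax
sampled = sample_tokens(toy_logits, conditions, temperature=0.8, top_k=40, seed=0)
```

Lower temperatures and smaller `top_k` values concentrate sampling on the most likely codes, which generally trades diversity for stability in the generated paths.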
The hidden128 causal conv-transformer k3 prior is a stronger research variant, not the default quickstart path. Its closest architectural reference is TCCT, which combines causal convolutional locality with Transformer modelling for time-series forecasting. This project uses the idea only as context for an autoregressive discrete-token prior, not as a forecasting model. RVQ q2 was evaluated on research branches and is not part of the public baseline. Diffusion and transition-constrained sampling were also evaluated during development, but they remain deferred.
- time_causal_vae.data: synthetic and market dataset loaders, transforms, and data-pipeline helpers.
- time_causal_vae.models.continuous: continuous TC-VAE encoders, decoders, conditioners, objectives, priors, factory helpers, losses, transforms, and distance helpers.
- time_causal_vae.models.discrete: causal VQ tokenizer encoders and decoders, VQ-family quantizer adapters, token-prior configs, masks, autoregressive priors, and sampling utilities.
- time_causal_vae.models.layers: shared causal layers used across model families.
- time_causal_vae.evaluation: financial diagnostics, plotting, model selection, checkpoint compatibility, token diagnostics, and latent-geometry helpers.
- time_causal_vae.experiments: portable experiment config loading, continuous-config adaptation, selection profiles, and trained-model registry selection.
- notebooks: continuous, discrete, and report-facing demonstration notebooks.
- scripts: config inspection, reproduction wrappers, token extraction, evaluation, latent geometry, and no-leakage checks.
This project refactors and preserves selected parts of the original Time-Causal VAE implementation for public reproducibility. In particular, it keeps the continuous TC-VAE baseline structure and selected configs, the dataset conventions for Black-Scholes, Heston, PDV, and S&P500/VIX, and a subset of the financial diagnostics used for generated-vs-real path comparison.
Some optional or external evaluation helpers are retained under
src/time_causal_vae/evaluation/external/ and src/time_causal_vae/evaluation/finance/. These
helpers are included for compatibility and comparison; they should not be read as entirely new
work in this repository.
Notebook demos are grouped by role:
- notebooks/benchmarks/: Ogata's Modified Thinning Algorithm and fixed-grid approximation comparison.
- notebooks/continuous/: continuous TC-VAE demos for Black-Scholes, Heston, PDV, Hawkes/SVMHJD, and S&P500/VIX.
- notebooks/discrete/: public discrete tokenizer and token-prior demos, including latent geometry.
- notebooks/report/: figure-manifest notebooks that read from local outputs/ paths.
Committed notebooks should stay output-stripped. The notebooks print guarded commands by default and should not train or evaluate unless their run flags are deliberately enabled.
Small public demo figures are committed under assets/figures/. They are curated from local
TimeCausalVQVAE runs and are not copied from the original TC-VAE repository.
The report notebook notebooks/report/final_sample_geometry_report.ipynb creates t-SNE and
KDE/ECDF diagnostics for registered continuous and discrete candidates when local output batches
are available. t-SNE is qualitative only; KDE/ECDF panels are the preferred view for financial
feature retention. Generated report figures are written locally under
outputs/final_sample_geometry_report/ and are not committed by default.
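To make the ECDF comparison concrete, the sketch below computes one path-level feature (maximum drawdown) for real and generated paths and measures the largest gap between the two empirical CDFs. It is an illustrative sketch only: the function names and the toy price paths are assumptions introduced here, and the report notebook may use different features and estimators.

```python
from bisect import bisect_right
from typing import Callable, List, Sequence


def max_drawdown(path: Sequence[float]) -> float:
    """Largest peak-to-trough decline as a fraction of the running peak (positive prices)."""
    peak = path[0]
    worst = 0.0
    for price in path:
        peak = max(peak, price)
        worst = max(worst, (peak - price) / peak)
    return worst


def ecdf(sample: Sequence[float]) -> Callable[[float], float]:
    """Return F(x) = fraction of sample values less than or equal to x."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def ks_distance(real: Sequence[float], generated: Sequence[float]) -> float:
    """Largest vertical gap between the two ECDFs, evaluated at every observed value."""
    f, g = ecdf(real), ecdf(generated)
    grid = sorted(set(real) | set(generated))
    return max(abs(f(x) - g(x)) for x in grid)


# Toy price paths, purely illustrative.
real_paths = [[100.0, 120.0, 90.0, 110.0], [100.0, 105.0, 100.0, 102.0]]
generated_paths = [[100.0, 110.0, 99.0, 101.0], [100.0, 101.0, 100.0, 104.0]]

real_feature = [max_drawdown(p) for p in real_paths]
generated_feature = [max_drawdown(p) for p in generated_paths]
gap = ks_distance(real_feature, generated_feature)
```

A gap near zero means the generated paths reproduce the feature's distribution. A gap near one means the two distributions barely overlap.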
Trained-model metadata is documented in trained_models/model_registry.yaml and summarised by
the model cards under trained_models/<experiment>/model_card.md. Notebooks can use the registry
to auto-select the registered metadata instead of hard-coding an optimal checkpoint or config.
Weights remain local or external and are not committed.
- Time-Causal VAE: Robust Financial Time Series Generator - Beatrice Acciaio, Stephan Eckstein, and Songyan Hou. arXiv DOI: 10.48550/arXiv.2411.02947; code: justinhou95/TimeCausalVAE. Adapted part: no-anticipation TC-VAE baseline, selected financial diagnostics, conditional PDV and S&P500/VIX setup.
- Neural Discrete Representation Learning - Aaron van den Oord, Oriol Vinyals, and Koray Kavukcuoglu. arXiv DOI: 10.48550/arXiv.1711.00937. Adapted part: VQ-VAE-style discrete latent codes, commitment loss, and tokenizer-prior separation.
- Vector Quantized Time Series Generation with a Bidirectional Prior Model - Daesoo Lee, Sara Malacarne, and Erlend Aune. PMLR 206; arXiv DOI: 10.48550/arXiv.2303.04743; code: ML4ITS/TimeVQVAE. Adapted part: two-stage VQ time-series generation reference. The prior in this package remains causal.
- vector-quantize-pytorch - lucidrains. Repository: lucidrains/vector-quantize-pytorch. Adapted part: VQ-family backend implementations wrapped behind local tokenizer adapters.
- Residual Quantization with Implicit Neural Codebooks - arXiv DOI: 10.48550/arXiv.2401.14732. Adapted part: residual-quantization background only.
- MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization - arXiv DOI: 10.48550/arXiv.2507.07997. Adapted part: future grouped-tokenizer motivation only.
- DeepVol: Volatility Forecasting from High-Frequency Data with Dilated Causal Convolutions - arXiv DOI: 10.48550/arXiv.2210.04797. Adapted part: dilated causal convolution motivation for financial time-series encoders.
- TCCT: Tightly-Coupled Convolutional Transformer on Time Series Forecasting - Li Shen and Yangzhu Wang. DOI: 10.1016/j.neucom.2022.01.039. Adapted part: architectural context for local causal convolution plus Transformer modelling in the hidden128 conv-transformer research prior.
- Vector Quantized Diffusion Model for Text-to-Image Synthesis - arXiv DOI: 10.48550/arXiv.2111.14822. Status: deferred; used only as discrete diffusion background.
- Causal Diffusion Transformers for Generative Modeling - arXiv DOI: 10.48550/arXiv.2412.12095. Status: deferred; used only as causal diffusion background.
- aotnumerics - Stephan Eckstein. Repository: stephaneckstein/aotnumerics. Adapted part: adapted/causal optimal transport background; no vendored implementation is used in the public baseline.
- Chronos: Learning the Language of Time Series - arXiv DOI: 10.48550/arXiv.2403.07815, code: amazon-science/chronos-forecasting. Adapted part: contrast with forecasting foundation models and scalar-value tokenisation.
- Sig-Wasserstein-GANs - Ni et al. (2021). Adapted part: optional expected-signature metric background inherited from upstream evaluation helpers.
- Randomised-Signature-TimeSeries-Generation - repository: niklaswalter/Randomised-Signature-TimeSeries-Generation. Adapted part: optional signature-metric and neural-SDE background. These routines are not part of the public quickstart path.




