keshelto/causal-xdomain-engine

Causal X-Domain Engine

The causal-xdomain-engine (cxde) is a research demonstrator that links distal phenomena (sports, climate, macroeconomics) with proximal financial outcomes through a structural causal model. The project focuses on the ETH token, but the infrastructure is generic.

⚠️ This repository is strictly for educational and research purposes. Nothing here constitutes trading or investment advice.

Conceptual Overview

cxde implements a linear-Gaussian Structural Causal Model (SCM) defined by a Directed Acyclic Graph (DAG). Each node \(X_i\) is expressed as

\[ X_i = \alpha_i + \sum_{j \in \mathrm{pa}(i)} \beta_{ij} X_j + \epsilon_i, \qquad \epsilon_i \sim \mathcal{N}(0, \sigma_i^2) \]

where \(\mathrm{pa}(i)\) denotes the parents of node \(i\). Two diagnostics are emphasised:

  • Identifiability penalty (\(\Pi\)) – a heuristic score based on d-separation that indicates whether a causal effect can be recovered (1.0 = identifiable, 0.5 = partially identifiable, 0 = unidentified).
  • Reverse Path Influence (RPI) – measures how an observed proximal shock propagates back through distal causes:

\[ \mathrm{RPI}_{Y \to X_0} = \left(\prod_k \tilde{\beta}_k\right) \frac{\sigma_{X_0}}{\sigma_Y} \, z_Y \, \Pi. \]

Here \(\tilde{\beta}_k\) are the standardized coefficients along the path, \(\sigma_{X_0}\) and \(\sigma_Y\) are the standard deviations of the distal node \(X_0\) and the proximal node \(Y\), \(z_Y\) is the observed z-score at the proximal node, and \(\Pi\) is the identifiability penalty.
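To make the RPI formula concrete, the following is a minimal sketch that evaluates it term by term. The function name `reverse_path_influence`, its parameter names, and all numeric inputs are hypothetical illustrations; they are not taken from `cxde.metrics`.

```python
from math import prod

def reverse_path_influence(std_betas, sigma_x0, sigma_y, z_y, penalty):
    """RPI = (product of standardized betas) * (sigma_x0 / sigma_y) * z_y * penalty."""
    if sigma_y <= 0:
        raise ValueError("sigma_y must be positive")
    return prod(std_betas) * (sigma_x0 / sigma_y) * z_y * penalty

# Hypothetical two-edge path with made-up numbers.
rpi = reverse_path_influence(
    std_betas=[0.5, 0.4],  # standardized coefficients along the path
    sigma_x0=2.0,          # standard deviation of the distal node
    sigma_y=4.0,           # standard deviation of the proximal node
    z_y=1.5,               # observed z-score at the proximal node
    penalty=0.5,           # partially identifiable
)
print(rpi)  # approximately 0.075
```

Because every factor is multiplicative, an unidentified effect (penalty 0) forces the RPI to zero regardless of how strong the path coefficients are.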

Project Layout

causal-xdomain-engine/
├── pyproject.toml
├── README.md
├── src/
│   └── cxde/
│       ├── config.py         # Dataclass DAG specification models
│       ├── dag.py            # Pure Python DAG helpers
│       ├── identification.py # Identifiability heuristics and penalties
│       ├── inference.py      # Linear-Gaussian inference routines
│       ├── metrics.py        # Connection strength and RPI metrics
│       ├── data/             # Synthetic data connectors and transforms
│       ├── examples/         # ETH toy DAG and runner script
│       └── server/api.py     # FastAPI service exposing the engine
└── tests/                    # Pytest suite covering core components

Installation

The project targets Python 3.10+. Install dependencies via pip or poetry:

pip install -e .
# or
poetry install

Heavyweight libraries such as pymc and dowhy are declared as dependencies, but the current implementation relies on lightweight heuristics, so the code remains runnable in restricted environments.

Usage

Python API

from cxde.config import load_dag_spec
from cxde.dag import CausalDAG
from cxde.identification import IdentifiabilityEngine
from cxde.inference import InferenceEngine

# Load the toy ETH DAG specification and build the graph.
spec = load_dag_spec("src/cxde/examples/eth_toy.yaml")
dag = CausalDAG.from_spec(spec)

# Report the identifiability assessment for the US_Energy_Load / ETH_Price pair.
ident = IdentifiabilityEngine(dag)
print(ident.assess("US_Energy_Load", "ETH_Price"))

# Query the posterior relating ETH_Price and Arctic_Ice_Melt for an observed value of 1.2.
inference = InferenceEngine(dag)
posterior = inference.posterior("ETH_Price", "Arctic_Ice_Melt", observed_value=1.2)
print(posterior)

Example Script

python -m cxde.examples.run_eth_toy --arctic-z 1.5

The script prints identifiability checks, computes RPI for the Arctic_Ice_Melt → ETH_Price path, derives the posterior \(P(\mathrm{ETH\_Price} \mid \mathrm{Arctic\_Ice\_Melt} = z^*)\), and displays a bar chart of the RPI components.
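Under the linear-Gaussian assumption, a posterior of this kind has a closed form. The sketch below applies the textbook bivariate Gaussian conditioning identity to a single hypothetical edge; the helper `condition_gaussian` and all numbers are illustrative assumptions, and whether `InferenceEngine` computes its posterior in exactly this way is not confirmed here.

```python
def condition_gaussian(mu_target, var_target, mu_obs, var_obs, cov, observed):
    """Return (mean, variance) of target | obs = observed for a bivariate Gaussian."""
    if var_obs <= 0:
        raise ValueError("var_obs must be positive")
    gain = cov / var_obs                       # regression slope of target on obs
    mean = mu_target + gain * (observed - mu_obs)
    variance = var_target - gain * cov         # variance shrinks after conditioning
    return mean, variance

# Hypothetical single edge X -> Y with Y = 0.5 * X + eps, X ~ N(0, 4), eps ~ N(0, 1).
beta, var_x, var_eps = 0.5, 4.0, 1.0
var_y = beta**2 * var_x + var_eps   # marginal variance of Y: 2.0
cov_xy = beta * var_x               # covariance of X and Y: 2.0

# Posterior of the upstream node X after observing Y = 3.0.
mean, variance = condition_gaussian(0.0, var_x, 0.0, var_y, cov_xy, observed=3.0)
print(mean, variance)  # prints: 3.0 2.0
```

The same function conditions in either direction: swapping the roles of target and observed node gives the forward prediction instead of the reverse one.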

FastAPI Service

Run the API with uvicorn:

uvicorn cxde.server.api:app --reload

Key endpoints:

  • POST /dag/load – load a DAG from YAML path or inline spec.
  • GET /dag/summary – inspect nodes, edges, and topological order.
  • POST /inference/forward – forward sample from the SCM.
  • POST /inference/reverse – compute posterior of a distal node given an observation.
  • POST /metrics/rpi – calculate Reverse Path Influence for a path.
  • POST /identify/check – identifiability report with \(\Pi\) and bounds.
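As a rough, standalone illustration of what forward sampling from a linear-Gaussian SCM involves (it is not the cxde implementation and does not call the API), the sketch below orders the nodes topologically with the standard-library `graphlib` module and then evaluates each structural equation in turn. The three-node chain, its coefficients, and the helper names `topological_order` and `forward_sample` are hypothetical.

```python
import random
from graphlib import TopologicalSorter
from statistics import fmean

# Hypothetical chain A -> B -> C (not the cxde ETH toy DAG).
# node -> (alpha, {parent: beta}, sigma); deliberately listed out of order.
SPEC = {
    "C": (-1.0, {"B": 1.5}, 0.3),
    "A": (0.0, {}, 1.0),
    "B": (0.5, {"A": 0.8}, 0.5),
}

def topological_order(spec):
    """Order nodes so that every parent precedes its children."""
    parents = {node: set(p) for node, (_, p, _) in spec.items()}
    return list(TopologicalSorter(parents).static_order())

def forward_sample(spec, n, seed=0):
    """Draw n joint samples from X_i = alpha_i + sum_j beta_ij * X_j + eps_i."""
    rng = random.Random(seed)
    order = topological_order(spec)
    rows = []
    for _ in range(n):
        row = {}
        for node in order:
            alpha, parents, sigma = spec[node]
            # Parents are already in `row` because of the topological order.
            mean = alpha + sum(beta * row[p] for p, beta in parents.items())
            row[node] = mean + rng.gauss(0.0, sigma)
        rows.append(row)
    return rows

order = topological_order(SPEC)
draws = forward_sample(SPEC, n=20_000)
mean_c = fmean(row["C"] for row in draws)
print(order)
# Analytic mean of C: -1.0 + 1.5 * (0.5 + 0.8 * 0.0) = -0.25
print(f"empirical mean of C: {mean_c:.3f} (analytic: -0.250)")
```

Sampling in topological order is what guarantees that each node's parents have values before the node itself is evaluated.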

Testing

Run the automated tests with:

pytest

The suite covers DAG utilities, identifiability penalties, RPI computations, and analytic posterior inference.

Limitations & Roadmap

  • Identifiability checks use heuristics instead of full do-calculus.
  • Reverse influence assumes linear-Gaussian relationships.
  • Data connectors emit synthetic series; production integrations should source live data (EIA, NOAA, Glassnode, etc.).
  • No automatic MCMC fallback when analytic conditioning fails, although the project structure leaves room for a pymc implementation.

Contributions via issues or pull requests are welcome. Please remember this is a sandbox, not a production trading engine.
