A validation toolkit for building custom database connectors. You implement a set of base classes — providing connection logic and Jinja SQL templates for your database dialect — then run the included test suite to verify correctness and discover which metrics and capabilities your connector supports. The end result is a generic agent image that you host, deploy, and then register in Monte Carlo.
Supports multiple connectors side by side so you can build and test several at once.
An AI coding agent can handle the entire workflow — from scaffolding and driver installation to implementing all ~100 template methods, running tests, and building the deployable image. You just provide the database credentials.
The repo includes four skills that automate the full workflow end-to-end:
| Step | Skill | What it does |
|---|---|---|
| 1 | /create-connector <name> | Scaffold a new connector directory |
| 2 | /setup-connection <name> | Install driver, implement connection methods, stub credentials.json, then pause for you to fill in credentials |
| 3 | /implement-connector <name> [hybrid] | Implement all template methods section by section |
| 4 | /build-agent-image <name> [--mode MODE] | Export capabilities and build deployable Docker image |
The only manual step is filling in credentials.json when /setup-connection pauses. Everything else — scaffolding, driver installation, template implementation, testing, and image building — is handled by the skills.
If you're not using Claude Code, complete steps 1–6 of Quick Start below to set up connectivity, then provide AGENTS.md as context to your LLM along with the connector name. The agent will implement all remaining template methods, run tests iteratively, and export capabilities. Resume at step 10 to build the deployable image.
python scripts/create_connector.py <name>
This creates connectors/<name>/ with:
- connector.py: base classes to implement (copy of the canonical template)
- manifest.json: unique connection_type identifier
- credentials.json: database credentials (gitignored)
- requirements.txt: database driver dependencies
- Dockerfile.extra: system dependency instructions (empty by default)
Edit connectors/<name>/connector.py and fill in the base classes:
| Class | Purpose |
|---|---|
| BaseConnector | Connection lifecycle: create_connection, create_cursor, execute_query, fetch_all_results, close_connection |
| MetadataQueryTemplates | Jinja templates for discovering databases, schemas, tables, and columns |
| QueryLogCollectionTemplates | Jinja template for fetching query logs |
| CustomSQLMonitorTemplates | Jinja templates for custom SQL monitor operations (count wrapping, row limits) |
| QueryLanguageTemplates | ~90 Jinja templates covering type casting, date/time functions, aggregations, comparisons, string operations, and more |
| FunctionalTestOperations | (Optional) Jinja templates for functional validation: DDL/DML operations (create/drop table, insert rows, add/drop columns) that let the test suite run metadata collection before and after each mutation to confirm metrics actually update. This validates that your metadata sources reflect real-time changes. See Functional Validation Tests for details. |
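To make the BaseConnector lifecycle concrete, here is a minimal sketch that uses Python's built-in sqlite3 module in place of a real database driver. The five method names come from the table above; the method signatures, the credentials dict, and the class name SqliteConnectorSketch are assumptions made for illustration, so check the docstrings in connectors/_base/connector.py for the actual contract.

```python
import sqlite3


class SqliteConnectorSketch:
    """Hypothetical stand-in for a BaseConnector implementation (signatures assumed)."""

    def __init__(self, credentials: dict):
        self.credentials = credentials

    def create_connection(self):
        # Open the database named in the (assumed) credentials dict.
        return sqlite3.connect(self.credentials["database"])

    def create_cursor(self, connection):
        return connection.cursor()

    def execute_query(self, cursor, query: str):
        cursor.execute(query)

    def fetch_all_results(self, cursor) -> list:
        return cursor.fetchall()

    def close_connection(self, connection):
        connection.close()


connector = SqliteConnectorSketch({"database": ":memory:"})
conn = connector.create_connection()
cur = connector.create_cursor(conn)
connector.execute_query(cur, "SELECT 1 + 1")
rows = connector.fetch_all_results(cur)
connector.close_connection(conn)
print(rows)  # [(2,)]
```

Keeping each lifecycle step in its own method mirrors the split that the connection tests exercise (connection creation, then cursor creation).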
Every template method returns a template string. Most use format-string placeholders like {x} (substituted later by the backend); some use Jinja {{ variable }} syntax. For example:
def get_avg_function_template(self) -> str:
    return "AVG({x})"  # placeholder: {x} substituted later

def get_casting_to_numeric_expression_template(self) -> str:
    return "CAST({{ expression }} AS NUMERIC)"  # Jinja variable: rendered at template time
Each method's docstring documents which pattern it uses, the expected variables, and example implementations for common databases. See How Templates Work for details.
Add your driver to connectors/<name>/requirements.txt:
psycopg2-binary==2.9.9
Then rebuild the Docker image:
docker compose build
If your database driver requires system-level libraries (ODBC drivers, native clients, etc.), add the installation commands to connectors/<name>/Dockerfile.extra:
RUN apt-get update && apt-get install -y --no-install-recommends \
    unixodbc-dev \
    && apt-get clean && rm -rf /var/lib/apt/lists/*
Then regenerate the test Dockerfile:
python scripts/generate_test_dockerfile.py
The Dockerfile.extra contents are injected into both the test image and the deployable agent image. The create_connector.py script and the /setup-connection skill regenerate the test Dockerfile automatically, so you only need to run the command above if you edit Dockerfile.extra manually after initial setup.
Dockerfile.extra supports RUN, ENV, and ARG instructions. COPY is not supported because the agent image builds in a temporary directory.
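If you want to catch unsupported instructions before a build, a small pre-check along the following lines may help. This is only a sketch: find_unsupported_instructions is a hypothetical helper that is not part of the repo, and it handles just the common case of backslash line continuations.

```python
ALLOWED_INSTRUCTIONS = {"RUN", "ENV", "ARG"}


def find_unsupported_instructions(dockerfile_text: str) -> list:
    """Return instruction keywords that are not RUN, ENV, or ARG."""
    unsupported = []
    continuing = False  # True while inside a backslash-continued instruction
    for raw_line in dockerfile_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        if not continuing:
            keyword = line.split()[0].upper()
            if keyword not in ALLOWED_INSTRUCTIONS:
                unsupported.append(keyword)
        continuing = line.endswith("\\")
    return unsupported


sample = """\
ARG ODBC_VERSION=17
RUN apt-get update && apt-get install -y --no-install-recommends \\
    unixodbc-dev \\
    && apt-get clean
COPY driver.so /opt/driver.so
"""
print(find_unsupported_instructions(sample))  # ['COPY']
```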
Add your database credentials to connectors/<name>/credentials.json:
{
  "connect_args": {
    "host": "localhost",
    "port": 5432,
    "database": "mydb",
    "user": "myuser",
    "password": "mypassword"
  }
}
The keys in connect_args are whatever your create_connection() method expects via self.credentials:
def create_connection(self):
    import psycopg2
    return psycopg2.connect(
        host=self.credentials["host"],
        port=int(self.credentials["port"]),
        database=self.credentials["database"],
        user=self.credentials["user"],
        password=self.credentials["password"],
    )
This same JSON format is used for self-hosted credentials when deploying; just swap in production values.
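Since a missing key only surfaces when create_connection() runs, it can be handy to validate the file up front. The sketch below parses the JSON and checks for the keys used in the psycopg2 example; load_connect_args and REQUIRED_KEYS are hypothetical names, and the key list is an assumption that should match whatever your own create_connection() reads.

```python
import json

# Assumed key set: mirrors the psycopg2 example, adjust for your driver.
REQUIRED_KEYS = ("host", "port", "database", "user", "password")


def load_connect_args(raw_json: str, required_keys=REQUIRED_KEYS) -> dict:
    """Parse credentials JSON and return connect_args, failing fast on missing keys."""
    connect_args = json.loads(raw_json)["connect_args"]
    missing = [key for key in required_keys if key not in connect_args]
    if missing:
        raise KeyError(f"credentials.json is missing connect_args keys: {missing}")
    return connect_args


example = (
    '{"connect_args": {"host": "localhost", "port": 5432, '
    '"database": "mydb", "user": "myuser", "password": "mypassword"}}'
)
args = load_connect_args(example)
print(args["host"], int(args["port"]))  # localhost 5432
```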
docker compose build
Some database drivers include native libraries built for a specific architecture. If you hit errors loading .so files, rebuild with the correct platform:
docker compose build --build-arg TARGETPLATFORM=linux/amd64
Rebuild whenever you change requirements.txt or Dockerfile.extra (remember to regenerate the test Dockerfile first if you changed Dockerfile.extra).
CONNECTOR=<name> docker compose run --rm test -m connection
This runs two quick checks: connection creation and cursor creation. Fix any credential or networking issues before moving on.
If only one connector exists, you can omit CONNECTOR=:
docker compose run --rm test -m connection
Work through each section of connector.py incrementally. Implement the methods for one section, run its corresponding tests, fix any failures, then move on to the next section.
# Metadata collection
CONNECTOR=<name> docker compose run --rm test -m metadata
# Query language prerequisites (needed for metric monitors)
CONNECTOR=<name> docker compose run --rm test -m ql_prerequisites
# Query language metric templates
CONNECTOR=<name> docker compose run --rm test -m ql_metrics
# Custom SQL monitors
CONNECTOR=<name> docker compose run --rm test -m custom_monitors
# Functional validation (optional)
CONNECTOR=<name> docker compose run --rm test -m functional
Rebuild the Docker image (docker compose build) after changing connector.py or requirements.txt.
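The section-by-section loop can also be scripted. The following is a rough sketch, not something shipped in the repo: run_sections and build_test_command are hypothetical helpers that invoke the same docker compose commands listed above, in order, and stop at the first failing section.

```python
import os
import subprocess

# Markers taken from the commands above; functional is left out because it is optional.
SECTION_MARKERS = ["connection", "metadata", "ql_prerequisites", "ql_metrics", "custom_monitors"]


def build_test_command(marker: str) -> list:
    """Build the docker compose invocation for one test section."""
    return ["docker", "compose", "run", "--rm", "test", "-m", marker]


def run_sections(connector, markers=SECTION_MARKERS, runner=subprocess.run):
    """Run each section in order; return the first failing marker, or None."""
    env = {**os.environ, "CONNECTOR": connector}
    for marker in markers:
        completed = runner(build_test_command(marker), env=env)
        if completed.returncode != 0:
            return marker
    return None
```

Calling run_sections("postgres") from the repository root would then report which section to fix next. The runner parameter exists only so the loop can be exercised without Docker.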
Once all sections pass individually, run the full test suite with --export to generate the manifest and passing templates:
CONNECTOR=<name> docker compose run --rm test --export
Note: --export requires the full test suite (no -m filter).
After a full test run with --export, output/<name>/manifest.json is generated with:
- connection_type — unique identifier for this connector
- connection_name — connector directory name
- capabilities — which features your connector supports (metadata collection, query logs, custom SQL monitors, metric monitors, etc.)
- metrics — which metrics your connector supports, derived from template results and the metrics mapping
Passing templates are exported to output/<name>/templates/.
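A quick way to review an export is to summarize the manifest. The sketch below assumes that capabilities is a mapping of flag names to booleans and that metrics is a list; the source only names the four top-level fields, so the inner shapes, the sample values, and the summarize_manifest helper are all hypothetical.

```python
import json


def summarize_manifest(manifest: dict) -> str:
    """One-line summary of an exported manifest (inner field shapes are assumed)."""
    enabled = sorted(name for name, on in manifest.get("capabilities", {}).items() if on)
    metrics = manifest.get("metrics", [])
    return (
        f"{manifest['connection_name']} ({manifest['connection_type']}): "
        f"{len(enabled)} capabilities enabled, {len(metrics)} metrics supported"
    )


# Hypothetical manifest contents; the real file is written by --export.
sample = json.loads("""
{
  "connection_type": "custom-connector-example",
  "connection_name": "postgres",
  "capabilities": {"supports_metadata": true, "supports_custom_sql_monitor": false},
  "metrics": ["example_metric_a", "example_metric_b"]
}
""")
print(summarize_manifest(sample))
# postgres (custom-connector-example): 1 capabilities enabled, 2 metrics supported
```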
Once your connector passes tests and templates are exported, package everything into a custom agent image:
python scripts/generate_agent_image.py
This takes the public montecarlodata/agent:latest-generic image as a base and layers on your connector artifacts. The resulting custom agent image contains:
- Exported templates (output/<name>/templates/): the passing Jinja templates
- Manifest (output/<name>/manifest.json): capabilities and supported metrics
- Connector code (connector.py): your connection and execution logic
- Driver dependencies (requirements.txt, Dockerfile.extra): database drivers and system libraries
Credentials are NOT included in the image. Your credentials.json stays local and is never copied into the image. Production credentials are provided at deploy time via self-hosted credentials.
The generic agent is an egress-only agent that works across all supported platforms (Docker Compose, Kubernetes, EKS, AKS, GKE). See Generic Agent Platforms for deployment options.
Options:
| Flag | Default | Description |
|---|---|---|
| --version | latest | Agent base image version |
| --connector | all with output/ | Which connectors to include (repeatable) |
| --docker-platform | linux/amd64 | Docker platform for the image |
| --tag | custom-agent:{version}-generic | Output image tag |
| --mode | auto | auto, full, or hybrid (see Modes below) |
Include specific connectors:
python scripts/generate_agent_image.py --connector postgres --connector mysql
Modes:
| | Full (default) | Hybrid |
|---|---|---|
| Metadata & query logs | Collected by the agent | Pushed externally |
| Requires | supports_metadata == true | supports_custom_sql_monitor == true |
| Metric monitors | Optional (warning if prereqs incomplete) | Optional (warning if prereqs incomplete) |
| Classes to implement | All 5 | BaseConnector + CustomSQLMonitorTemplates (+ QueryLanguageTemplates for metric monitors) |
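The "Requires" row can be read as a simple lookup from mode to capability flag. Here is a sketch of that check, assuming the exported capabilities are available as a dict of booleans; check_mode is a hypothetical helper, and the auto mode is deliberately not modeled because its resolution rules are not described here.

```python
# Capability flag each mode requires, taken from the "Requires" row above.
MODE_REQUIREMENTS = {
    "full": "supports_metadata",
    "hybrid": "supports_custom_sql_monitor",
}


def check_mode(capabilities: dict, mode: str) -> bool:
    """Return True when the capability required by the given mode is enabled."""
    if mode not in MODE_REQUIREMENTS:
        raise ValueError(f"unknown mode: {mode!r}")
    return bool(capabilities.get(MODE_REQUIREMENTS[mode], False))


caps = {"supports_metadata": False, "supports_custom_sql_monitor": True}
print(check_mode(caps, "full"))    # False
print(check_mode(caps, "hybrid"))  # True
```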
Full mode (default), where the agent handles metadata collection and metric monitors:
python scripts/generate_agent_image.py
Hybrid mode, where metadata is pushed externally and the agent only needs metric monitor support:
python scripts/generate_agent_image.py --mode hybrid
Verify the image:
docker run --rm --entrypoint ls custom-agent:latest-generic /opt/custom-connectors/
Then push the agent image to your container registry and deploy. Your local connectors/<name>/credentials.json is already in the format needed for self-hosted credentials, so just swap in production values and configure them at deploy time.
When you're done, remove the Docker image and any stopped containers:
docker compose down --rmi local
Nothing is installed on your machine; everything runs inside the container.
custom-connector-setup/
  connectors/
    _base/                          # Provided (do not edit)
      connector.py                  # Canonical template with all base classes
      __init__.py                   # Exports the base classes
    <your-database>/                # Created by you (one directory per connector)
      connector.py                  # Your implementation (fill in stubs)
      credentials.json              # Database credentials (gitignored)
      manifest.json                 # {"connection_type": "custom-connector-xxx", "name": "..."}
      requirements.txt              # Database driver deps
      Dockerfile.extra              # System dependency instructions (optional)
  output/                           # Auto-generated by --export (gitignored)
    <your-database>/
      manifest.json                 # Test results and supported features
      templates/                    # Passing .j2 templates
  scripts/                          # Provided
    create_connector.py             # Scaffolding helper (stdlib only)
    generate_agent_image.py         # Builds deployable custom agent Docker image
    generate_test_dockerfile.py     # Regenerates root Dockerfile from Dockerfile.extra files
  tests/                            # Provided (do not edit)
    conftest.py                     # Test fixtures (TestConnector, Templates, QueryTestHelper)
    capabilities_plugin.py          # Pytest plugin that generates manifest.json
    test_connection.py              # Connection tests
    test_metadata_collection.py     # Metadata discovery tests
    test_custom_monitors.py         # Custom SQL monitor tests
    test_ql_prerequisites.py        # Prerequisite templates for metric monitors
    test_ql_metrics.py              # Metric-specific templates (AVG, STDDEV, LENGTH, regexp, etc.)
    test_functional_validation.py   # Functional validation tests (real-time metadata accuracy)
  .claude/
    skills/                         # Claude Code automation skills
      create-connector/SKILL.md
      setup-connection/SKILL.md
      implement-connector/SKILL.md
      build-agent-image/SKILL.md
  AGENTS.md                         # Instructions for AI coding agents
  pytest.toml                       # Pytest configuration and markers
  requirements.txt                  # Shared Python dependencies
  Dockerfile                        # Test runner image
  docker-compose.yml                # Docker Compose configuration
All customer-provided SQL is expressed as Jinja templates running in a sandboxed environment (jinja2.sandbox.ImmutableSandboxedEnvironment). No raw Python code is ingested by the backend — connection and execution logic stays in your deployment only.
Templates produce SQL fragments and come in three flavors:
Placeholder templates receive no Jinja variables. They output Python format-string placeholders like {x} that the backend substitutes later via .format(x=field_name). Because single braces are not Jinja syntax, the template passes through Jinja untouched and the rendered output is the format string itself.
def get_avg_function_template(self) -> str:
    return "AVG({x})"  # {x} is a literal, NOT a Jinja variable

def get_is_gt_expression_template(self) -> str:
    return "{x} > {y}"  # two placeholders
Parameterized templates receive named Jinja variables ({{ var }}) that the backend passes as keyword arguments at render time. Use these when the template needs actual values to produce correct SQL.
def get_casting_to_numeric_expression_template(self) -> str:
    return "CAST({{ expression }} AS NUMERIC)"

def add_from_clause_template(self) -> str:
    return "{{ select_clause }} FROM {{ from_expression }}"
Some templates are hybrid: they combine {x} placeholders with Jinja variables:
def get_in_past_days_expression_template(self) -> str:
    return "{x} >= CURRENT_DATE - INTERVAL '{{ days }} days'"
The third flavor takes no variables at all: the rendered output is always the same string.
def current_timestamp_func_template(self) -> str:
    return "CURRENT_TIMESTAMP()"
Boolean capability flags are also templates that render to "true" or "false":
def supports_literal_select_template(self) -> str:
    return "true"  # SELECT 1 works without FROM
Each method's docstring documents which pattern it uses and what variables it expects. Read the docstring before implementing.
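To see the two substitution stages side by side, the sketch below renders the example templates through jinja2's ImmutableSandboxedEnvironment and then applies .format() where placeholders remain. The render helper is a hypothetical stand-in for whatever the backend and test helper do internally, and the jinja2 package must be installed.

```python
from jinja2.sandbox import ImmutableSandboxedEnvironment

env = ImmutableSandboxedEnvironment()


def render(template_source: str, **variables) -> str:
    """Stage 1: Jinja rendering. Single-brace placeholders pass through unchanged."""
    return env.from_string(template_source).render(**variables)


placeholder = render("AVG({x})")
parameterized = render("CAST({{ expression }} AS NUMERIC)", expression="val")
hybrid = render("{x} >= CURRENT_DATE - INTERVAL '{{ days }} days'", days=7)

print(placeholder)                    # AVG({x})
print(placeholder.format(x="price"))  # AVG(price)  (stage 2: format-string substitution)
print(parameterized)                  # CAST(val AS NUMERIC)
print(hybrid.format(x="created_at"))  # created_at >= CURRENT_DATE - INTERVAL '7 days'
```

One consequence of the two stages: a literal brace needed in the final SQL of a placeholder template has to survive the later .format() call, which is worth keeping in mind when a dialect uses braces in its syntax.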
Tests use the ql fixture (a QueryTestHelper instance) that bridges your connector and templates:
import pytest

@pytest.mark.template(func="get_avg_function_template")
def test_avg(ql):
    data = [{"val": 10}, {"val": 20}, {"val": 30}]
    # Placeholder templates: render first, then .format() to substitute {x}
    avg_expr = ql.render(ql.templates.get_avg_function_template).format(x="val")
    result = ql.select_from_data_source(data, avg_expr)
    assert float(result) == pytest.approx(20.0)

@pytest.mark.template(func="get_casting_to_numeric_expression_template")
def test_cast_numeric(ql):
    data = [{"val": "42"}]
    # Parameterized templates: pass Jinja variables as keyword arguments
    cast_expr = ql.render(ql.templates.get_casting_to_numeric_expression_template, expression="val")
    result = ql.select_from_data_source(data, cast_expr)
    assert float(result) == pytest.approx(42.0)
The helper builds CTEs from Python dicts, renders templates, executes queries against your real database, and validates results.
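As an illustration of the "CTEs from Python dicts" idea, the sketch below builds a WITH clause from a list of dicts and runs an AVG over it using the built-in sqlite3 module. It is not the QueryTestHelper implementation: build_cte and the data_source name are hypothetical, and only numbers, strings, and None are handled.

```python
import sqlite3


def build_cte(rows: list, name: str = "data_source") -> str:
    """Turn a list of dicts into a WITH clause of UNION ALL'd literal SELECTs."""
    def literal(value):
        if value is None:
            return "NULL"
        if isinstance(value, str):
            return "'" + value.replace("'", "''") + "'"  # escape single quotes
        return str(value)

    selects = [
        "SELECT " + ", ".join(f"{literal(v)} AS {col}" for col, v in row.items())
        for row in rows
    ]
    return f"WITH {name} AS ({' UNION ALL '.join(selects)})"


data = [{"val": 10}, {"val": 20}, {"val": 30}]
avg_expr = "AVG({x})".format(x="val")  # placeholder template after .format()
query = f"{build_cte(data)} SELECT {avg_expr} FROM data_source"

conn = sqlite3.connect(":memory:")
result = conn.execute(query).fetchone()[0]
conn.close()
print(result)  # 20.0
```

Inlining the rows as literals means a test needs no pre-existing tables or write permissions; it only needs the ability to run a SELECT.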
The standard tests verify that your metadata templates return correct data types, but they don't verify that the data is real-time. Functional validation tests catch stale sources (e.g. statistics tables that only update when stats are collected) by making actual database changes and verifying your metadata queries detect them.
The tests create a test table, run metadata collection, mutate the table (insert rows, add columns), run collection again, and assert the changes are detected.
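The collect, mutate, collect again pattern can be sketched with the built-in sqlite3 module. This is an illustration of the flow only: collect_metadata is a hypothetical stand-in for your metadata templates, and the table and column names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def collect_metadata(table: str) -> dict:
    """Stand-in for metadata collection: live row count and column names."""
    row_count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    return {"row_count": row_count, "columns": columns}


conn.execute("CREATE TABLE functional_test (id INTEGER PRIMARY KEY, value TEXT)")
before = collect_metadata("functional_test")

# Mutate the table, then collect again.
conn.executemany("INSERT INTO functional_test (value) VALUES (?)", [("row_1",), ("row_2",)])
conn.execute("ALTER TABLE functional_test ADD COLUMN extra TEXT")
after = collect_metadata("functional_test")

# A stale metadata source would fail these checks.
assert after["row_count"] > before["row_count"]
assert "extra" in after["columns"] and "extra" not in before["columns"]
conn.close()
```

In this sketch the metadata comes from live queries, so the checks pass. A source backed by periodically refreshed statistics could return the same values before and after the mutation, which is the failure these tests are designed to surface.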
Add a FunctionalTestOperations class to your connector.py. All you need is a table identifier and Jinja templates for basic DDL/DML operations:
class FunctionalTestOperations:
    def get_test_table_identifier(self) -> tuple:
        return ("my_database", "my_schema", "pandora_functional_test")

    def create_test_table_template(self) -> str:
        return "CREATE TABLE {{ schema }}.{{ table }} (id SERIAL PRIMARY KEY, value TEXT)"

    def insert_rows_template(self) -> str:
        return "INSERT INTO {{ schema }}.{{ table }} (value) SELECT 'row_' || g FROM generate_series(1, {{ num_rows }}) g"

    def add_column_template(self) -> str:
        return "ALTER TABLE {{ schema }}.{{ table }} ADD COLUMN {{ column_name }} {{ column_type }}"

    def drop_column_template(self) -> str:
        return "ALTER TABLE {{ schema }}.{{ table }} DROP COLUMN {{ column_name }}"

    def drop_test_table_template(self) -> str:
        return "DROP TABLE IF EXISTS {{ schema }}.{{ table }}"

    def create_lineage_query_template(self) -> str:
        return "SELECT * FROM {{ schema }}.{{ table }} WHERE 1=0"
get_test_table_identifier() returns (database, schema, table), the single source of truth for the test table identity. The framework injects these as {{ database }}, {{ schema }}, and {{ table }} into every template, so the table name in the SQL always matches what the tests look for in metadata results.
| Test | What it validates |
|---|---|
| test_table_discovery_after_create | New table appears in metadata |
| test_table_discovery_after_drop | Dropped table disappears from metadata |
| test_volume_change_after_insert | row_count increases after insert |
| test_byte_count_change_after_insert | byte_count increases after insert |
| test_freshness_change_after_insert | last_update_time advances after insert |
| test_schema_change_after_add_column | New column appears in column metadata |
| test_schema_change_after_drop_column | Dropped column disappears from column metadata |
| test_query_log_capture | Executed query appears in query logs |
Tests auto-skip when stubs are not implemented or when the relevant feature (row_count, freshness, columns, query logs) is not supported by your connector.
CONNECTOR=<name> docker compose run --rm test -m functional
By default, generate_agent_image.py pulls the public montecarlodata/agent image from DockerHub as the base. If you need to test against a local or unreleased version of the agent, you can build the apollo-agent image locally and use --base-image to point at it:
# Clone and build the agent locally
git clone https://github.com/monte-carlo-data/apollo-agent.git
cd apollo-agent
docker build -t local-agent .
# Use the local build as the base for your custom image
cd /path/to/custom-connector-setup
python scripts/generate_agent_image.py --base-image local-agent
This is useful for debugging agent-side behavior or verifying your connector works with in-development agent changes before they're published.