1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -31,6 +31,7 @@ ipython_config.py

# Environments
.env
.envrc
.venv
env/
venv/
38 changes: 28 additions & 10 deletions README.md
@@ -40,11 +40,7 @@ Add this to your MCP Settings file:
"mcpServers": {
"databeak": {
"command": "uvx",
"args": [
"--from",
"git+https://github.com/jonpspri/databeak.git",
"databeak"
]
"args": ["databeak"]
}
}
}
@@ -58,14 +54,18 @@ specific configuration examples.

### HTTP Mode (Advanced)

For HTTP-based AI clients or custom deployments:
For HTTP-based AI clients or custom deployments with OIDC authentication:

```bash
# Run in HTTP mode
uv run databeak --transport http --host 0.0.0.0 --port 8000

# Access server at http://localhost:8000/mcp
# Health check at http://localhost:8000/health
uvx databeak --transport http --host 0.0.0.0 --port 8000

# With OIDC authentication
export DATABEAK_OIDC_CONFIG_URL="https://auth.example.com/.well-known/openid-configuration"
export DATABEAK_OIDC_CLIENT_ID="databeak-client"
export DATABEAK_OIDC_CLIENT_SECRET="your-secret"
export DATABEAK_OIDC_BASE_URL="https://databeak.example.com"
uvx databeak --transport http --host 0.0.0.0 --port 8000
```

### Quick Test
@@ -97,6 +97,8 @@ Once configured, ask your AI assistant:
Configure DataBeak behavior with environment variables (all use `DATABEAK_`
prefix):

### Core Configuration

| Variable | Default | Description |
| ------------------------------------- | --------- | ---------------------------------- |
| `DATABEAK_SESSION_TIMEOUT` | 3600 | Session timeout (seconds) |
@@ -106,6 +108,22 @@ prefix):
| `DATABEAK_URL_TIMEOUT_SECONDS` | 30 | URL download timeout |
| `DATABEAK_HEALTH_MEMORY_THRESHOLD_MB` | 2048 | Health monitoring memory threshold |

### Authentication (HTTP Mode Only)

For HTTP mode deployments, DataBeak supports OpenID Connect authentication. All
four variables must be set to enable OIDC (not applicable for stdio mode):

| Variable | Required | Description |
| ----------------------------- | -------- | -------------------------------- |
| `DATABEAK_OIDC_CONFIG_URL` | Yes | OIDC discovery configuration URL |
| `DATABEAK_OIDC_CLIENT_ID` | Yes | OAuth2 client ID |
| `DATABEAK_OIDC_CLIENT_SECRET` | Yes | OAuth2 client secret |
| `DATABEAK_OIDC_BASE_URL` | Yes | Application base URL for OAuth2 |

**Note**: OIDC authentication is only used in HTTP transport mode. If any OIDC
variable is set but not all four, DataBeak will log an error and authentication
will not be enabled.

See [settings.py](src/databeak/core/settings.py) for complete configuration
options.
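
Review note (not part of the diff): the README states that OIDC is enabled only when all four variables are set, and that a partial set is logged as an error. A minimal sketch of that all-or-nothing rule follows, so a deployment script can check its environment before starting the server. The helper name `oidc_status` and its return values are hypothetical; this is not DataBeak's own code.

```python
import os

# Variable names taken from the README table in this PR.
OIDC_VARS = (
    "DATABEAK_OIDC_CONFIG_URL",
    "DATABEAK_OIDC_CLIENT_ID",
    "DATABEAK_OIDC_CLIENT_SECRET",
    "DATABEAK_OIDC_BASE_URL",
)


def oidc_status(env=None):
    """Classify an environment as 'enabled', 'disabled', or 'partial'.

    Hypothetical helper: returns the status plus the list of variables
    that are unset or empty.
    """
    env = os.environ if env is None else env
    missing = [name for name in OIDC_VARS if not env.get(name)]
    if not missing:
        return "enabled", []
    if len(missing) == len(OIDC_VARS):
        return "disabled", missing
    # Some, but not all, variables are set: the case the README warns about.
    return "partial", missing
```

Calling `oidc_status()` with no argument inspects the current process environment. A `"partial"` result corresponds to the misconfiguration described in the note above.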

83 changes: 46 additions & 37 deletions docs/installation.md
@@ -21,8 +21,8 @@ and client configuration.
The fastest way to install and run DataBeak:

```bash
# Install and run directly from GitHub
uvx --from git+https://github.com/jonpspri/databeak.git databeak
# Install and run from PyPI
uvx databeak
```

### Using uv
@@ -47,8 +47,8 @@ uv run databeak
### Using pip

```bash
# Install directly from GitHub
pip install git+https://github.com/jonpspri/databeak.git
# Install from PyPI
pip install databeak

# Run the server
databeak
@@ -68,15 +68,10 @@ Settings):
"mcpServers": {
"databeak": {
"command": "uvx",
"args": [
"--from",
"git+https://github.com/jonpspri/databeak.git",
"databeak"
],
"args": ["databeak"],
"env": {
"DATABEAK_MAX_FILE_SIZE_MB": "2048",
"DATABEAK_SESSION_TIMEOUT": "7200",
"DATABEAK_CHUNK_SIZE": "20000"
"DATABEAK_MAX_DOWNLOAD_SIZE_MB": "200",
"DATABEAK_SESSION_TIMEOUT": "7200"
}
}
}
@@ -92,11 +87,7 @@ Edit `~/.continue/config.json`:
"mcpServers": {
"databeak": {
"command": "uvx",
"args": [
"--from",
"git+https://github.com/jonpspri/databeak.git",
"databeak"
]
"args": ["databeak"]
}
}
}
@@ -111,11 +102,7 @@ Add to VS Code settings (`settings.json`):
"cline.mcpServers": {
"databeak": {
"command": "uvx",
"args": [
"--from",
"git+https://github.com/jonpspri/databeak.git",
"databeak"
]
"args": ["databeak"]
}
}
}
@@ -130,11 +117,7 @@ Edit `~/.windsurf/mcp_servers.json`:
"mcpServers": {
"databeak": {
"command": "uvx",
"args": [
"--from",
"git+https://github.com/jonpspri/databeak.git",
"databeak"
]
"args": ["databeak"]
}
}
}
@@ -149,11 +132,7 @@ Edit `~/.config/zed/settings.json`:
"experimental.mcp_servers": {
"databeak": {
"command": "uvx",
"args": [
"--from",
"git+https://github.com/jonpspri/databeak.git",
"databeak"
]
"args": ["databeak"]
}
}
}
@@ -163,6 +142,8 @@ Edit `~/.config/zed/settings.json`:

Configure DataBeak behavior with these environment variables:

### Core Configuration

| Variable | Default | Description |
| --------------------------------------------- | ------- | ---------------------------------------- |
| `DATABEAK_MAX_FILE_SIZE_MB` | 1024 | Maximum file size in MB |
@@ -175,6 +156,36 @@ Configure DataBeak behavior with these environment variables:
| `DATABEAK_MAX_VALIDATION_VIOLATIONS` | 1000 | Max validation violations to report |
| `DATABEAK_MAX_ANOMALY_SAMPLE_SIZE` | 10000 | Max sample size for anomaly detection |

### HTTP Mode Configuration

For HTTP transport mode (`--transport http`), additional configuration options
are available:

#### OIDC Authentication (HTTP Mode Only)

OpenID Connect authentication for secure HTTP deployments. All four variables
must be set to enable OIDC:

| Variable | Required | Description |
| ----------------------------- | -------- | -------------------------------- |
| `DATABEAK_OIDC_CONFIG_URL` | Yes | OIDC discovery configuration URL |
| `DATABEAK_OIDC_CLIENT_ID` | Yes | OAuth2 client ID |
| `DATABEAK_OIDC_CLIENT_SECRET` | Yes | OAuth2 client secret |
| `DATABEAK_OIDC_BASE_URL` | Yes | Application base URL for OAuth2 |

**Example HTTP deployment with OIDC:**

```bash
export DATABEAK_OIDC_CONFIG_URL="https://auth.example.com/.well-known/openid-configuration"
export DATABEAK_OIDC_CLIENT_ID="databeak-client"
export DATABEAK_OIDC_CLIENT_SECRET="your-secret"
export DATABEAK_OIDC_BASE_URL="https://databeak.example.com"
uvx databeak --transport http --host 0.0.0.0 --port 8000
```

**Note**: OIDC authentication is only applicable for HTTP transport mode. Stdio
mode (default for MCP clients) does not use OIDC authentication.

## Verification

### Test the Installation
@@ -194,8 +205,7 @@ DATABEAK_LOG_LEVEL=DEBUG uv run databeak
npm install -g @modelcontextprotocol/inspector

# Test the server
mcp-inspector uvx --from \
git+https://github.com/jonpspri/databeak.git databeak
mcp-inspector uvx databeak
```

### Verify in Your AI Client
@@ -210,9 +220,8 @@ mcp-inspector uvx --from \

#### Server not starting

- Check Python version: `python --version` (must be 3.10+)
- Verify installation:
`uvx --from \ git+https://github.com/jonpspri/databeak.git databeak --version`
- Check Python version: `python --version` (must be 3.12+)
- Verify installation: `uvx databeak --version`
- Check logs with debug level

#### Client can't connect
6 changes: 6 additions & 0 deletions src/databeak/__init__.py
@@ -2,7 +2,13 @@

__author__ = "Jonathan Springer"

import logging

from ._version import __version__
from .core.settings import get_settings
from .server import main

logging.getLogger("databeak").setLevel(get_settings().log_level)
logging.getLogger("mcp").setLevel(get_settings().log_level)

__all__ = ["__version__", "main"]
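
Review note (not part of the diff): the two `setLevel` calls above pass the configured string straight to the standard library. `Logger.setLevel` accepts level names only in upper case, so unless the settings class normalizes the value elsewhere, setting `DATABEAK_LOG_LEVEL=debug` would raise `ValueError` when the package is imported. A minimal sketch of a normalizing wrapper follows. The function name `apply_log_level` is hypothetical and not part of this PR.

```python
import logging


def apply_log_level(level_name, logger_names=("databeak", "mcp")):
    """Set one level on several loggers, accepting any letter case.

    Hypothetical helper. An unknown level name still raises ValueError,
    which comes from the standard library's own validation.
    """
    normalized = level_name.strip().upper()
    for name in logger_names:
        logging.getLogger(name).setLevel(normalized)
    return normalized
```

Called as `apply_log_level(get_settings().log_level)`, this would keep the behaviour of the diff while tolerating lower-case input. A validator on the `log_level` field would be an alternative place for the same normalization.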
7 changes: 5 additions & 2 deletions src/databeak/core/settings.py
@@ -5,7 +5,7 @@
import threading

from pydantic import Field
from pydantic_settings import BaseSettings
from pydantic_settings import BaseSettings, SettingsConfigDict


class DatabeakSettings(BaseSettings):
@@ -32,6 +32,9 @@ class DatabeakSettings(BaseSettings):
- Various thresholds for quality checks and anomaly detection
"""

# Logging
log_level: str = Field(default="INFO", description="Logging level")

# Session management
session_timeout: int = Field(default=3600, description="Session timeout in seconds")
session_capacity_warning_threshold: float = Field(
@@ -110,7 +113,7 @@ class DatabeakSettings(BaseSettings):
default=100, description="Multiplier for converting ratios to percentages"
)

model_config = {"env_prefix": "DATABEAK_", "case_sensitive": False}
model_config = SettingsConfigDict(env_prefix="DATABEAK_")


_settings: DatabeakSettings | None = None
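
Review note (not part of the diff): the new `model_config` drops the explicit `case_sensitive` entry. To my understanding, pydantic-settings matches environment variable names case-insensitively by default, so this should not change behaviour, but that is worth confirming against the pinned version. The standard-library sketch below only illustrates the mapping a prefix implies, for example `log_level` to `DATABEAK_LOG_LEVEL`. It is not pydantic-settings' implementation, and `read_prefixed` is a hypothetical name.

```python
import os


def read_prefixed(field_names, prefix="DATABEAK_", env=None):
    """Collect raw string values for fields from prefixed env variables.

    Illustrative only: names are compared case-insensitively, and no type
    conversion or validation is performed.
    """
    env = os.environ if env is None else env
    lowered = {key.lower(): value for key, value in env.items()}
    values = {}
    for name in field_names:
        key = (prefix + name).lower()
        if key in lowered:
            values[name] = lowered[key]
    return values
```

For instance, `read_prefixed(["log_level"])` returns the raw value of `DATABEAK_LOG_LEVEL` if it is set, and an empty dictionary otherwise.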