Add HTTP Transport Support with Host/Port Configuration and Dockerfile #63
shemreader wants to merge 5 commits into acryldata:main
## Conversation
- Add `--host` and `--port` CLI options for remote deployment
- Create Dockerfile for containerization
- Support remote HTTP streaming transport for Kubernetes deployments
```python
    show_banner=False,
    host=host,
    port=port,
)
```
Bug: SSE binds to all interfaces by default
The new `--host` default of `0.0.0.0` is applied to `sse` (and `http`) runs, so selecting `sse` now exposes the server on all network interfaces by default. This is a behavior change from the typical localhost default and can unintentionally make a local dev server remotely reachable.
Additional Locations (1)
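A loopback-by-default CLI avoids this foot-gun: remote exposure becomes an explicit opt-in. A minimal sketch, using stdlib `argparse` (the PR's actual CLI framework may differ, and the option names are taken from the PR description):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="mcp-server-datahub")
    # Bind loopback by default; 0.0.0.0 must be passed explicitly for remote access.
    parser.add_argument("--host", default="127.0.0.1",
                        help="interface to bind; pass 0.0.0.0 explicitly for remote access")
    parser.add_argument("--port", type=int, default=8000,
                        help="listening port")
    return parser

args = build_parser().parse_args([])  # no flags -> safe defaults
print(f"{args.host}:{args.port}")  # 127.0.0.1:8000
```

With this default, a bare `mcp-server-datahub --transport sse` stays local-only, while Kubernetes deployments pass `--host 0.0.0.0` deliberately.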
alexsku left a comment:
Thank you for this contribution. However, this PR cannot be merged as-is due to a critical security issue.
🚨 Security Issue: Missing Authentication Violates MCP Protocol
This implementation exposes the DataHub MCP server over HTTP without authentication, which violates the MCP security specification. According to the MCP Security Best Practices:
- MCP servers implementing authorization MUST verify all inbound requests
- Token passthrough is explicitly forbidden
- HTTP-based MCP servers must follow OAuth 2.0 Security Best Practices (RFC 9700)
The Problem
- Binds to `0.0.0.0` by default (all network interfaces)
- No authentication: anyone who can reach the server can execute arbitrary DataHub queries
- All requests use the same `DataHubClient.from_env()` credentials regardless of who makes them
- Exposes sensitive metadata: schemas, lineage, PII-tagged data, usage patterns
This is unacceptable for a general-purpose tool like DataHub MCP server.
✅ The Right Approach: Library Pattern
Instead of adding HTTP transport to the CLI, we should expose DataHub tools as a library that users can register in their own FastMCP services with their own authentication.
Architecture
DataHub package (mcp-server-datahub):
- Provides a `register_datahub_tools()` function
- CLI remains simple (stdio only, local development)
- Published to PyPI as before

User's service (in their own codebase):
- Creates a FastMCP instance with authentication
- Calls `register_datahub_tools()` to register DataHub tools
- Deploys with their own security requirements
Implementation
1. Export registration function:

```python
# src/mcp_server_datahub/__init__.py
from mcp_server_datahub.mcp_server import register_datahub_tools

__all__ = ["register_datahub_tools"]
```

2. Refactor `register_all_tools()` to accept a FastMCP instance:
```python
# src/mcp_server_datahub/mcp_server.py
def register_datahub_tools(
    mcp_instance: FastMCP,
    datahub_client: DataHubClient,
    is_oss: bool = False,
) -> None:
    """Register DataHub MCP tools on a user-provided FastMCP instance.

    Args:
        mcp_instance: FastMCP instance to register tools on
        datahub_client: DataHub client for API access
        is_oss: Whether to use OSS-compatible tool descriptions
    """
    set_datahub_client(datahub_client)
    mcp_instance.tool(name="search", description=...)(async_background(search))
    mcp_instance.tool(name="get_lineage", description=...)(async_background(get_lineage))
    # ... register all tools on the provided instance
```

3. Users create their own authenticated service:
```python
# User's codebase: my_company_mcp/main.py
from fastmcp import FastMCP
from fastmcp.server.auth.providers.jwt import JWTVerifier
from datahub.sdk.main_client import DataHubClient
from mcp_server_datahub import register_datahub_tools

# User controls authentication
auth = JWTVerifier(
    jwks_uri="https://auth.mycompany.com/.well-known/jwks.json",
    issuer="https://auth.mycompany.com",
    audience="mycompany-mcp",
)
mcp = FastMCP(name="MyCompany MCP Server", auth=auth)

# Register DataHub tools
client = DataHubClient.from_env()
register_datahub_tools(mcp, client, is_oss=True)

# Run with authentication
if __name__ == "__main__":
    mcp.run(transport="http", host="0.0.0.0", port=8000)
```

4. Users deploy their own service:
```dockerfile
# User's Dockerfile
FROM python:3.10-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt  # Includes mcp-server-datahub
COPY my_company_mcp/ ./my_company_mcp/
CMD ["python", "-m", "my_company_mcp.main"]
```

Why This Is Better
- Security: Users implement authentication in their environment
- Flexibility: Users control deployment, middleware, and configuration
- Composability: Users can combine DataHub tools with their own tools
- Compliance: Follows MCP specification and FastMCP best practices
- Separation of concerns: DataHub provides tools, users handle deployment security
Required Changes
1. Refactor to library:
   - Change `register_all_tools()` → `register_datahub_tools(mcp_instance, client, is_oss)`
   - Export from `__init__.py`
   - Internal `__main__.py` calls the new function (no other changes needed)

2. Remove from this PR:
   - Drop the `--host` and `--port` CLI options (not needed; users write their own service)
   - Drop the Dockerfile (users create their own with authentication)

3. Documentation:
   - Add a "Library Usage" guide with authentication examples
   - Show JWT, OAuth, and other auth provider examples
   - Link to the FastMCP authentication docs
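The refactor in item 1 boils down to a decorator-registration pattern. A runnable stand-in illustrates it; `MiniMCP` here is a stub that only mimics FastMCP's decorator-style tool registration, not the real FastMCP API:

```python
from typing import Any, Callable, Dict

class MiniMCP:
    """Stub standing in for FastMCP: records registered tools by name."""

    def __init__(self) -> None:
        self.tools: Dict[str, Callable[..., Any]] = {}

    def tool(self, name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
            self.tools[name] = fn  # record the tool under its public name
            return fn
        return decorator

def register_datahub_tools(mcp_instance: MiniMCP) -> None:
    # In the real refactor these would wrap DataHub's search/lineage tools.
    mcp_instance.tool(name="search")(lambda query: f"results for {query}")
    mcp_instance.tool(name="get_lineage")(lambda urn: f"lineage for {urn}")

mcp = MiniMCP()
register_datahub_tools(mcp)
print(sorted(mcp.tools))  # ['get_lineage', 'search']
```

The key property is that the caller owns the server instance (and therefore its auth configuration); the library only attaches tools to whatever instance it is handed.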
Publishing
No package rename needed. `mcp-server-datahub` will work as both CLI and library:

```shell
# Install once
pip install mcp-server-datahub

# Use as CLI (local development, stdio)
mcp-server-datahub
```

```python
# Use as library (production, HTTP with auth)
from mcp_server_datahub import register_datahub_tools
```
hi @alexsku, my 2c... if the concern is about HTTP, whether to use it is a policy decision for each organization. Btw, many MCP gateway solutions today handle HTTP access for the user via a JWT token, which eliminates the need for MCP-level authentication, since not all MCP servers support the standard you are demanding (this one doesn't either). Your suggestion makes a lot of sense, but it sits on top of supporting remote MCP, which this PR by @shemreader is about.
Summary
This PR adds HTTP transport support for remote MCP server deployments, enabling the DataHub MCP server to run as a standalone HTTP service accessible over the network.
Changes
HTTP Transport with Configurable Host/Port (main.py)
- Added `--host` option (default: `0.0.0.0`) to bind the server to a specific network interface
- Added `--port` option (default: `8000`) to configure the listening port
- HTTP transport runs in stateless mode for better scalability
Dockerfile for Containerized Deployment
- Added a multi-stage Dockerfile using the `python:3.10-slim` base image
- Uses `uv` for fast, reproducible dependency installation
- Exposes port 8000 by default
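A multi-stage Dockerfile matching that description might look like the following. This is a sketch, not the PR's actual file; the `uv` base-image path and the two-stage layout are assumptions, while the base image, `uv sync --frozen --no-dev`, port, and CMD flags come from the PR summary:

```dockerfile
# Sketch only: stage 1 resolves dependencies with uv, stage 2 runs the server.
FROM python:3.10-slim AS builder
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
WORKDIR /app
COPY . .
RUN uv sync --frozen --no-dev

FROM python:3.10-slim
WORKDIR /app
COPY --from=builder /app /app
ENV PATH="/app/.venv/bin:$PATH"
EXPOSE 8000
CMD ["mcp-server-datahub", "--transport", "http", "--host", "0.0.0.0", "--port", "8000"]
```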
Note
Adds configurable host/port for HTTP/SSE transports and introduces a Dockerfile to run the server over HTTP on port 8000.

- Updates `src/mcp_server_datahub/__main__.py` with `--host` and `--port` options; passes them to `mcp.run()` for the `http` and `sse` transports.
- Enables stateless mode (`stateless_http=True`) for the `http` transport.
- … if `--host`/`--port` are provided with the `stdio` transport.
- Adds a `Dockerfile` using `python:3.10-slim` and `uv` for deps; copies project files and runs `uv sync --frozen --no-dev`.
- Exposes port `8000` and sets the CMD to run `mcp-server-datahub` with `--transport http --host 0.0.0.0 --port 8000`.

Written by Cursor Bugbot for commit d813bb6.