Add HTTP Transport Support with Host/Port Configuration and Dockerfile#63

Open
shemreader wants to merge 5 commits into acryldata:main from shemreader:main

@shemreader shemreader commented Dec 8, 2025

Summary
This PR adds HTTP transport support for remote MCP server deployments, enabling the DataHub MCP server to run as a standalone HTTP service accessible over the network.

Changes

  1. HTTP Transport with Configurable Host/Port (main.py)
    Added --host option (default: 0.0.0.0) to bind the server to a specific network interface
    Added --port option (default: 8000) to configure the listening port
    HTTP transport runs in stateless mode for better scalability

  2. Dockerfile for Containerized Deployment
    Added multi-stage Dockerfile using python:3.10-slim base image
    Uses uv for fast, reproducible dependency installation
    Exposes port 8000 by default


Note

Adds configurable host/port for HTTP/SSE transports and introduces a Dockerfile to run the server over HTTP on port 8000.

  • Server/runtime:
    • Extend src/mcp_server_datahub/__main__.py with --host and --port options; pass to mcp.run() for http and sse transports.
    • Enable stateless HTTP mode (stateless_http=True) for http transport.
    • Log a warning if --host/--port are provided with stdio transport.
  • Containerization:
    • Add Dockerfile using python:3.10-slim and uv for deps; copy project files, uv sync --frozen --no-dev.
    • Expose 8000 and set CMD to run mcp-server-datahub with --transport http --host 0.0.0.0 --port 8000.

Written by Cursor Bugbot for commit d813bb6. This will update automatically on new commits.
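
The option handling summarised in the note can be sketched roughly as follows. This is an illustration built on the standard library's argparse, not the PR's actual code: the helper names `build_parser` and `build_run_kwargs`, the `stdio` default transport, and the idea of collecting everything into one keyword dictionary are assumptions. Only the option names, the defaults (0.0.0.0 and 8000), the stdio warning, and `stateless_http=True` come from the description.

```python
import argparse
import logging

logger = logging.getLogger("mcp_server_datahub.sketch")

# Transports that actually listen on a socket.
NETWORK_TRANSPORTS = {"http", "sse"}


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI parser mirroring the options described in the PR."""
    parser = argparse.ArgumentParser(prog="mcp-server-datahub")
    parser.add_argument("--transport", choices=["stdio", "sse", "http"], default="stdio")
    # default=None lets us distinguish "not provided" from "provided".
    parser.add_argument("--host", default=None)
    parser.add_argument("--port", type=int, default=None)
    return parser


def build_run_kwargs(transport, host=None, port=None):
    """Return the keyword arguments a server's run() call might receive."""
    kwargs = {"transport": transport}
    if transport in NETWORK_TRANSPORTS:
        kwargs["host"] = host if host is not None else "0.0.0.0"
        kwargs["port"] = port if port is not None else 8000
        if transport == "http":
            # Stateless mode is enabled for the http transport only.
            kwargs["stateless_http"] = True
    elif host is not None or port is not None:
        logger.warning("--host/--port are ignored with the stdio transport")
    return kwargs
```

With this sketch, `build_parser().parse_args(["--transport", "http", "--port", "9000"])` followed by `build_run_kwargs(...)` yields a dictionary with the default host and the overridden port, while a stdio run drops both values and logs the warning.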

Your Name and others added 4 commits November 29, 2025 16:33
- Add --host and --port CLI options for remote deployment
- Create Dockerfile for containerization
- Support remote HTTP streaming transport for Kubernetes deployments

@cursor cursor bot left a comment

show_banner=False,
host=host,
port=port,
)

Bug: SSE binds to all interfaces by default

The new --host default of 0.0.0.0 is applied to sse (and http) runs, so selecting sse now exposes the server on all network interfaces by default. This is a behavior change from typical localhost defaults and can unintentionally make a local dev server remotely reachable.
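
One possible mitigation, sketched here under assumptions, is to default to a loopback address and to classify whatever host the user supplies so that a wildcard bind can trigger a warning or require an explicit opt-in. The constant `DEFAULT_HOST` and the helper `classify_bind_host` are hypothetical names, not part of the PR; the sketch relies only on the standard library's `ipaddress` module.

```python
import ipaddress

# Assumed safer default: only reachable from the local machine.
DEFAULT_HOST = "127.0.0.1"


def classify_bind_host(host: str) -> str:
    """Return 'loopback', 'all-interfaces', or 'specific' for a bind address."""
    if host == "localhost":
        return "loopback"
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Other hostnames cannot be classified without DNS resolution.
        return "specific"
    if addr.is_loopback:
        return "loopback"
    if addr.is_unspecified:
        # 0.0.0.0 and :: mean "listen on every interface".
        return "all-interfaces"
    return "specific"
```

A CLI could call this before starting an `sse` or `http` run and log a warning whenever the result is `"all-interfaces"`.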

Contributor

@alexsku alexsku left a comment

Thank you for this contribution. However, this PR cannot be merged as-is due to a critical security issue.

🚨 Security Issue: Missing Authentication Violates MCP Protocol

This implementation exposes the DataHub MCP server over HTTP without authentication, which violates the MCP security specification. According to the MCP Security Best Practices:

The Problem

  1. Binds to 0.0.0.0 by default (all network interfaces)
  2. No authentication - anyone who can reach the server can execute arbitrary DataHub queries
  3. All requests use the same DataHubClient.from_env() credentials regardless of who makes them
  4. Exposes sensitive metadata: schemas, lineage, PII-tagged data, usage patterns

This is unacceptable for a general-purpose tool like DataHub MCP server.
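
Point 3 can be illustrated with a toy model. Everything here is a hypothetical stand-in (`SharedClient`, `handle_request`, the audit log); it does not use the real DataHubClient. It only shows that when a single client is built once from environment credentials, the backend sees the same identity for every caller.

```python
from dataclasses import dataclass, field


@dataclass
class SharedClient:
    """Stand-in for a client created once from environment credentials."""

    token_owner: str
    audit_log: list = field(default_factory=list)

    def query(self, text: str) -> str:
        # The backend only ever sees the identity baked into the client.
        self.audit_log.append((self.token_owner, text))
        return f"results for {text!r} as {self.token_owner}"


def handle_request(client: SharedClient, caller: str, text: str) -> str:
    # An unauthenticated server has no trustworthy notion of `caller`;
    # the argument is accepted here only to show that it has no effect.
    return client.query(text)
```

Calling `handle_request` for two different callers leaves two audit entries that both name the service account, which is the situation the review objects to.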

✅ The Right Approach: Library Pattern

Instead of adding HTTP transport to the CLI, we should expose DataHub tools as a library that users can register in their own FastMCP services with their own authentication.

Architecture

DataHub package (mcp-server-datahub):

  • Provides register_datahub_tools() function
  • CLI remains simple (stdio only, local development)
  • Published to PyPI as before

User's service (in their own codebase):

  • Creates FastMCP instance with authentication
  • Calls register_datahub_tools() to register DataHub tools
  • Deploys with their own security requirements

Implementation

1. Export registration function:

# src/mcp_server_datahub/__init__.py
from mcp_server_datahub.mcp_server import register_datahub_tools

__all__ = ["register_datahub_tools"]

2. Refactor register_all_tools() to accept FastMCP instance:

# src/mcp_server_datahub/mcp_server.py
from datahub.sdk.main_client import DataHubClient
from fastmcp import FastMCP


def register_datahub_tools(
    mcp_instance: FastMCP,
    datahub_client: DataHubClient,
    is_oss: bool = False,
) -> None:
    """Register DataHub MCP tools on a user-provided FastMCP instance.
    
    Args:
        mcp_instance: FastMCP instance to register tools on
        datahub_client: DataHub client for API access
        is_oss: Whether to use OSS-compatible tool descriptions
    """
    set_datahub_client(datahub_client)
    
    mcp_instance.tool(name="search", description=...)(async_background(search))
    mcp_instance.tool(name="get_lineage", description=...)(async_background(get_lineage))
    # ... register all tools on provided instance
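
To make the registration pattern concrete without depending on FastMCP or DataHub, here is a self-contained sketch. `ToolRegistry` is a hypothetical stand-in that only mimics the `tool(name=..., description=...)` decorator-factory shape used above, and `register_example_tools` with its dictionary backend is invented for illustration. Unlike the proposed function, which stores the client through `set_datahub_client`, this sketch binds the backend through closures.

```python
from typing import Callable, Dict


class ToolRegistry:
    """Hypothetical stand-in exposing a FastMCP-like `tool(...)` decorator factory."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.tools: Dict[str, Callable] = {}

    def tool(self, name: str, description: str = "") -> Callable[[Callable], Callable]:
        def decorator(fn: Callable) -> Callable:
            if name in self.tools:
                raise ValueError(f"tool {name!r} is already registered")
            self.tools[name] = fn
            return fn

        return decorator


def register_example_tools(registry: ToolRegistry, backend: Dict[str, list]) -> None:
    """Register tools on a caller-provided registry instead of a module-level one."""

    def search(query: str) -> list:
        return [urn for urn in backend if query in urn]

    def get_lineage(urn: str) -> list:
        return backend.get(urn, [])

    registry.tool(name="search", description="Search entities")(search)
    registry.tool(name="get_lineage", description="Upstream entities")(get_lineage)
```

Because the caller owns the registry, it can add its own tools before or after calling `register_example_tools`, which is the composability the review describes.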

3. Users create their own authenticated service:

# User's codebase: my_company_mcp/main.py
from fastmcp import FastMCP
from fastmcp.server.auth.providers.jwt import JWTVerifier
from datahub.sdk.main_client import DataHubClient
from mcp_server_datahub import register_datahub_tools

# User controls authentication
auth = JWTVerifier(
    jwks_uri="https://auth.mycompany.com/.well-known/jwks.json",
    issuer="https://auth.mycompany.com",
    audience="mycompany-mcp"
)

mcp = FastMCP(name="MyCompany MCP Server", auth=auth)

# Register DataHub tools
client = DataHubClient.from_env()
register_datahub_tools(mcp, client, is_oss=True)

# Run with authentication
if __name__ == "__main__":
    mcp.run(transport="http", host="0.0.0.0", port=8000)
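
For readers unfamiliar with what the `issuer` and `audience` settings imply, the following sketch shows the kind of claim checks a JWT verifier typically applies once a token's signature has been validated. It is not FastMCP's implementation: `claims_acceptable` is a hypothetical helper, it operates on already-decoded claims, and it deliberately omits signature verification against the JWKS, so it must not be used as an authentication mechanism on its own.

```python
import time
from typing import Any, Mapping, Optional


def claims_acceptable(
    claims: Mapping[str, Any],
    issuer: str,
    audience: str,
    now: Optional[float] = None,
) -> bool:
    """Check iss, aud, and exp on already-decoded, signature-verified claims."""
    now = time.time() if now is None else now
    if claims.get("iss") != issuer:
        return False
    aud = claims.get("aud")
    # The aud claim may be a single string or a list of strings.
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if audience not in audiences:
        return False
    exp = claims.get("exp")
    # Reject tokens without an expiry or whose expiry has passed.
    return isinstance(exp, (int, float)) and exp > now
```

A token issued for a different audience, or one that has expired, is rejected even when its issuer matches.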

4. Users deploy their own service:

# User's Dockerfile
FROM python:3.10-slim
WORKDIR /app

COPY requirements.txt .
RUN pip install -r requirements.txt  # Includes mcp-server-datahub

COPY my_company_mcp/ ./my_company_mcp/
CMD ["python", "-m", "my_company_mcp.main"]

Why This Is Better

  • Security: Users implement authentication in their environment
  • Flexibility: Users control deployment, middleware, and configuration
  • Composability: Users can combine DataHub tools with their own tools
  • Compliance: Follows MCP specification and FastMCP best practices
  • Separation of concerns: DataHub provides tools, users handle deployment security

Required Changes

  1. Refactor to library:

    • Change register_all_tools() → register_datahub_tools(mcp_instance, client, is_oss)
    • Export from __init__.py
    • Internal __main__.py calls the new function (no other changes needed)
  2. Remove from this PR:

    • Drop --host and --port CLI options (not needed - users write their own service)
    • Drop Dockerfile (users create their own with authentication)
  3. Documentation:

    • Add "Library Usage" guide with authentication examples
    • Show JWT, OAuth, and other auth provider examples
    • Link to FastMCP authentication docs

Publishing

No package rename needed. mcp-server-datahub will work as both CLI and library:

# Install once
pip install mcp-server-datahub

# Use as CLI (local development, stdio)
mcp-server-datahub

# Use as library (production, HTTP with auth)
from mcp_server_datahub import register_datahub_tools

@elad-bar

Hi @alexsku,

My 2c: if the concern is about HTTP, whether to use it is a policy decision for each organization.
If the concern is about holding the parameter in an env var, that already happens today, just locally and without remote MCP support.

By the way, many MCP gateway solutions today control HTTP access per user with a JWT token, which removes the need for MCP-level authentication, since not all MCP servers support the standard you are asking for (this one doesn't either).
They create a pod per user per technology, with access rights segregated by token or IP.
So it is not critical: there is a workaround that is fairly standard and approved by the cybersecurity departments of many large organizations.

Your suggestion makes a lot of sense, but it comes on top of supporting remote MCP, which is what this PR by @shemreader is about.
