reachscan audit: subprocess execution and credential access reachable from dbt_compile and dbt_model_analyzer_agent #648

@vinmay

Hi dbt team,

I've been building reachscan, a static analysis CLI for MCP servers, and ran it across 50 repos as part of a public audit. Here are the reachability findings for dbt-mcp.

Scanned: 103 files, 3 Python MCP entry points detected.

Reachable findings (12 total)

From dbt_compile (dbt_compile.py:18):

| Finding | File:Line | Capability | Reachability path |
|---|---|---|---|
| subprocess.run() | dbt_compile.py:34 | EXECUTE | dbt_compile → direct call |
| dotenv.load_dotenv() | (env setup) | SECRETS | dbt_compile → direct call |
| os.getenv("DBT_PROJECT_LOCATION") | (env setup) | SECRETS | dbt_compile → direct call |
| os.getenv("DBT_EXECUTABLE") | (env setup) | SECRETS | dbt_compile → direct call |

From dbt_model_analyzer_agent (dbt_model_analyzer.py:34):

| Finding | File:Line | Capability | Reachability path |
|---|---|---|---|
| subprocess.run() | dbt_model_analyzer.py:147 | EXECUTE | dbt_model_analyzer_agent → gather_dbt_project_info |
| subprocess.run() | dbt_model_analyzer.py:161 | EXECUTE | dbt_model_analyzer_agent → gather_dbt_project_info |
| subprocess.run() | dbt_model_analyzer.py:175 | EXECUTE | dbt_model_analyzer_agent → gather_dbt_project_info |
| dotenv.load_dotenv() | (env setup) | SECRETS | dbt_model_analyzer_agent → direct call |
| os.getenv("DBT_PROJECT_LOCATION") | (env setup) | SECRETS | dbt_model_analyzer_agent → direct call |
| os.getenv("DBT_EXECUTABLE") | (env setup) | SECRETS | dbt_model_analyzer_agent → direct call |
| open(..., 'r') | dbt_model_analyzer.py:135 | READ | dbt_model_analyzer_agent → gather_dbt_project_info |

48 additional findings are unreachable — buried in module-level code or utility paths not called from any MCP entry point.
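
For context on what "reachable" means in this report, here is a minimal sketch of call-graph reachability. It is an illustration only, not reachscan's actual implementation: the edges are hand-written from the tables above, and `hypothetical_unused_helper` and `hypothetical_sink` are invented names standing in for an unreachable finding.

```python
from collections import deque

# Caller -> callees. The first three entries mirror the tables above;
# the last entry is invented to show what "unreachable" looks like.
CALL_GRAPH = {
    "dbt_compile": ["subprocess.run", "dotenv.load_dotenv", "os.getenv"],
    "dbt_model_analyzer_agent": [
        "gather_dbt_project_info",
        "dotenv.load_dotenv",
        "os.getenv",
    ],
    "gather_dbt_project_info": ["subprocess.run", "open"],
    "hypothetical_unused_helper": ["hypothetical_sink"],
}


def reachable_from(entry_points, graph):
    """Breadth-first search: every name reachable from the given entry points."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        caller = queue.popleft()
        for callee in graph.get(caller, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


reachable = reachable_from(["dbt_compile", "dbt_model_analyzer_agent"], CALL_GRAPH)
print("subprocess.run" in reachable)     # True
print("hypothetical_sink" in reachable)  # False
```

A finding is reported as reachable only if some path of calls connects it to an MCP entry point; everything else lands in the unreachable bucket.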

Observations

  1. DBT_EXECUTABLE subprocess execution: the subprocess.run() calls at dbt_model_analyzer.py:147/161/175 appear to invoke the dbt CLI binary. If DBT_EXECUTABLE can be set through the environment or influenced by an MCP call, anyone who controls that value could point it at an arbitrary binary. Worth confirming the executable path is validated or pinned.

  2. Credential access scope: dotenv.load_dotenv() loads whatever is in the .env file at startup. The variables read (DBT_PROJECT_LOCATION, DBT_EXECUTABLE) look config-related rather than secret, but if the .env also contains DB credentials or cloud keys, those are loaded into the process environment, where any code reachable from an MCP tool call can read them.

  3. File reads: open(..., 'r') at dbt_model_analyzer.py:135, called via gather_dbt_project_info, reads project files. Is the path parameter validated against the expected project directory?
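
To make the three questions above concrete, here is a rough sketch of the kind of checks I have in mind. None of this is taken from dbt-mcp: the function names, the allowlist, and the default `keep` tuple are all assumptions for illustration, and the set of variables dbt actually needs depends on the project's profile.

```python
import os
from pathlib import Path
from typing import Dict, Iterable


def pinned_executable(candidate: str, allowed: Iterable[str]) -> str:
    """Return the resolved executable path only if it is on an operator-defined allowlist."""
    resolved = Path(candidate).resolve()
    if resolved not in {Path(p).resolve() for p in allowed}:
        raise ValueError(f"executable not allowed: {candidate!r}")
    return str(resolved)


def resolve_inside(project_root: str, user_path: str) -> Path:
    """Resolve user_path under project_root and reject anything that escapes it."""
    root = Path(project_root).resolve()
    target = (root / user_path).resolve()
    try:
        target.relative_to(root)  # raises ValueError when target is outside root
    except ValueError:
        raise ValueError(f"path escapes project directory: {user_path!r}") from None
    return target


def minimal_env(keep: Iterable[str] = ("PATH", "HOME", "DBT_PROJECT_LOCATION")) -> Dict[str, str]:
    """Build a reduced environment for subprocess.run(env=...); the keep list is an assumption."""
    return {name: os.environ[name] for name in keep if name in os.environ}
```

Usage would look roughly like `subprocess.run([pinned_executable(exe, allowed), "compile"], env=minimal_env())`, with `resolve_inside(project_dir, requested_path)` applied before any `open()` call. Resolving before comparing means `..` segments and symlinks are normalized first, so a value such as `../../etc/passwd` is rejected rather than silently followed.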

These capabilities are expected for a dbt MCP integration — just flagging them so users can make an informed decision about their threat model before running this in an automated pipeline.

Full audit write-up and methodology: https://medium.com/@vinmayN/i-scanned-50-mcp-servers-to-see-what-they-can-actually-do-46144659ceca

Is any of this a false positive, or does the design already account for these paths? Happy to share the raw JSON scan output.
