Commit c9bf758
Teradata lineage and dependency graph analysis (#302)
* Added graph_queryDependenciesAgent MCP tool for Teradata object dependency analysis
Introduce a new MCP tool that provides comprehensive object dependency analysis
for Teradata databases with support for wildcards, CSV patterns, and bidirectional
dependency traversal.
Key Features:
- Analyses upstream dependencies (what an object depends on) and downstream
dependencies (what depends on the object)
- Supports single objects, wildcard patterns (%), and CSV pattern lists
- Configurable traversal depth for both upstream (max_depth_up) and downstream
(max_depth_down) analysis (0-10 levels)
- Server-side filtering with exclude_objects and include_containers parameters
- Returns dependency graph as nodes and edges for visualisation
- Multiple output formats: 'detailed', 'summary', 'edges_only'
Use Cases:
- Impact analysis: Determine blast radius before dropping/changing objects
- Data lineage tracing: Track upstream data sources
- Dependency discovery: Understand object relationships
- Pre-deployment validation: Assess impacts before changes
- Documentation: Map database object dependencies
Parameters:
- object_name (required): Object pattern(s) - supports wildcards and CSV
Examples: 'DB.Table', '%WBC%.%', 'DB1.T1,DB2.T2'
- max_depth_up (default: 3): Upstream traversal depth (0-10)
- max_depth_down (default: 3): Downstream traversal depth (0-10)
- exclude_objects (default: ''): CSV patterns to exclude from analysis
- include_containers (default: ''): Whitelist of schemas/databases
- edge_repository (default: 'DEV_01_ODEX_STD_0_V.ODEXRepository'):
ODEX repository table
- return_format (default: 'detailed'): Output format
Technical Implementation:
- Leverages ODEX repository for dependency metadata
- Uses STRTOK_SPLIT_TO_TABLE for server-side CSV parsing
- Automatic whitespace trimming of patterns
- Returns formatted response with dependency graph and metadata
- Performance optimised with proper exclusion patterns (20-50% reduction)
Example Usage:
graph_queryDependenciesAgent(
object_name="%WBC%.%,%StGeo%.%",
max_depth_up=5,
exclude_objects="PRD_%,TST_%"
)
BREAKING CHANGE: None - new feature addition
* Replaced QueryDependenciesAgent; added findRootObjects, graph_detectCycles, graph_connectedComponents and graph_bfsLevels
Replaced QueryDependenciesAgent with QueryDependenciesAgentBatch (better performance).
Added findRootObjects to find the source objects from which to start analysing downstream graphs.
Added graph_detectCycles to identify circular references.
Added graph_connectedComponents to identify connected components (groups of closely related objects).
Added graph_bfsLevels, which uses a breadth-first search, for use in object migration wave planning.
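As a rough illustration of the wave-planning idea, the sketch below assigns each object a level equal to its breadth-first distance from the nearest root. The function name `bfs_levels`, the edge-list input and the sample object names are assumptions for illustration only; the real tool reads edges from the repository table and its signature may differ.

```python
from collections import defaultdict, deque


def bfs_levels(edges: list[tuple[str, str]], roots: list[str]) -> dict[str, int]:
    """Return each reachable object's BFS distance from the nearest root.

    Hypothetical helper: edges are (source, target) pairs.
    """
    downstream: dict[str, list[str]] = defaultdict(list)
    for src, tgt in edges:
        downstream[src].append(tgt)

    levels = {root: 0 for root in roots}
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        for child in downstream[node]:
            if child not in levels:  # the first visit gives the shortest distance
                levels[child] = levels[node] + 1
                queue.append(child)
    return levels


# Made-up sample edges: Src feeds Stage and Mart, Stage also feeds Mart.
edges = [("DB.Src", "DB.Stage"), ("DB.Stage", "DB.Mart"), ("DB.Src", "DB.Mart")]
levels = bfs_levels(edges, ["DB.Src"])
```

Objects that share a level could be grouped into the same migration wave. Note that BFS records the shortest distance, so DB.Mart lands on level 1 here even though it also depends on DB.Stage.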
* feat: add graph analysis tools - analyseDatabase, bfsLevels, connectedComponents, detectCycles, findRootObjects, edgeContract
- Replaced monolithic queryDependenciesAgent with modular graph tools
- Added _graph_utils shared utility module
- Removed graph_prompts.yml and legacy documentation
- Updated app.py and profiles.yml for graph tool registration
* refactor(graph): compliance pass, contract v1.1, helper consolidation
BREAKING CHANGES
- graph_queryDependenciesAgent renamed to graph_traceLineage (file,
function, constant, tool name string). Update any callers accordingly.
- graph_detectCycles: strategy and max_edges_for_cte parameters removed.
- graph_detectCycles, graph_connectedComponents: object_dependency_table
renamed to edge_repository; excl_patterns renamed to exclude_objects.
- graph_edgeContractDDL: generated DDL column names corrected from
SrcContainer/SrcObject/SrcKind to Src_Container_Name/Src_Object_Name/
Src_Kind (and Tgt equivalents). Previously generated tables were
incompatible with the tool SQL. Contract version bumped to 1.1.
PROGRESSIVE DISCLOSURE COMPLIANCE
- graph_tools.py: graph_analyseDatabase and graph_edgeContractDDL were
missing from GRAPH_TOOLS. All 7 tools now registered in workflow order:
edgeContractDDL → findRootObjects → bfsLevels → traceLineage →
detectCycles → connectedComponents → analyseDatabase.
- GRAPH_EDGE_CONTRACT_DDL_TOOL descriptor added to graph_edge_contract.py
(was absent entirely; tool was unregisterable in static mode).
TERMINOLOGY
- Remove all ODEX references from __init__.py and _graph_utils.py per
standing instruction. Replaced with generic terms (dependency graph,
object dependency graph).
LOGGING
- Replace all f-string logger calls with %s style throughout
graph_findRootObjects.py (5 calls), graph_bfsLevels.py (1 call), and
graph_edge_contract.py (2 calls, including logger.warning).
- Remove stray print() from graph_findRootObjects.py; replaced with
logger.debug.
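For readers unfamiliar with the %s logging style mentioned above, here is a minimal sketch of the pattern. The logger name, the list-collecting handler and the message texts are illustrative assumptions, not code from the graph modules.

```python
import logging

logger = logging.getLogger("graph_example")  # hypothetical logger name


class ListHandler(logging.Handler):
    """Collects records in memory so the deferred formatting can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.records: list[logging.LogRecord] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


handler = ListHandler()
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)

object_name, edge_count = "DB.Table", 12

# Eager style being replaced: logger.debug(f"Found {edge_count} edges for {object_name}")
# Lazy %s style: the template and arguments are stored separately and only
# combined when a handler actually formats the record.
logger.debug("Found %s edges for %s", edge_count, object_name)
logger.warning("No root objects found for %s", object_name)
```

With this style the record keeps the raw template in `record.msg` and the values in `record.args`, which also makes log aggregation by template easier.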
PARAMETER CHANGES
- edge_repository: runtime validation added to all 6 tools that accept it.
Empty string now returns an early error with the AI-Native Data Product
convention hint ({ProductName}_Semantic.lineage_graph).
- graph_bfsLevels, graph_traceLineage, graph_detectCycles,
graph_connectedComponents: stale cross-references to
graph_queryDependenciesAgent updated to graph_traceLineage throughout
docstrings and descriptors.
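The edge_repository runtime check described above might look roughly like the sketch below. The function name `validate_edge_repository`, the shape of the error payload and the sample value `Sales_Semantic.lineage_graph` are assumptions for illustration; only the empty-string rule and the convention hint come from this commit.

```python
from typing import Optional


def validate_edge_repository(edge_repository: str) -> Optional[dict[str, str]]:
    """Return an error payload if edge_repository is empty, else None.

    Hypothetical helper; the real tools may shape their response differently.
    """
    if not edge_repository or not edge_repository.strip():
        return {
            "status": "error",
            "message": (
                "edge_repository is required. Under the AI-Native Data Product "
                "convention it is {ProductName}_Semantic.lineage_graph."
            ),
        }
    return None


# A tool would return early when the check produces an error.
error = validate_edge_repository("")
ok = validate_edge_repository("Sales_Semantic.lineage_graph")  # made-up product name
```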
GRAPH EDGE CONTRACT v1.1
- Column names corrected throughout: DDL, sample DML, view template,
COMMENT ON COLUMN, canonical contract text, file header.
- Optional enrichment columns added: Edge_Relationship VARCHAR(50),
Transformation_Type VARCHAR(50). Ignored by graph analysis tools;
present in {ProductName}_Semantic.lineage_graph for visualisation
clients. ADDITIONAL COLUMNS section updated accordingly.
- Src_Kind/Tgt_Kind COMPRESS lists expanded to cover both single-letter
codes (T, V, P...) and full-word values (Table, View, Job...) to match
lineage_graph output.
- Sample DML updated: basic examples use 6-column form; new ETL-job
example demonstrates source→job→target two-leg pattern using all 8
columns.
- View template updated: optional columns included as nullable
CAST(NULL AS VARCHAR(50)) placeholders with mapping guidance.
- AI-Native Data Product convention documented in file header, contract
text, docstring, descriptor, and all edge_repository error messages.
HELPER CONSOLIDATION (phase 1 — safe mechanical changes only)
- _graph_utils.py: add parse_csv_patterns() and build_like_or().
- Remove 7 local copies of parse_csv/_parse_csv_patterns (graph_
analyseDatabase, graph_bfsLevels, graph_detectCycles, graph_connected
Components, graph_traceLineage, graph_findRootObjects ×2); replace
with shared import.
- Remove 3 local copies of _build_like_or/_build_like_clauses (graph_
analyseDatabase, graph_detectCycles, graph_connectedComponents);
replace with shared import.
- Deferred to phase 2: _UnionFind consolidation (recursion bug in
graph_detectCycles.find()), _build_excl_* parameterisation.
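A minimal sketch of what the two shared helpers could look like follows. The signatures, the return types and the use of `?` bind placeholders are assumptions; the actual functions in _graph_utils.py may differ, for example by inlining the patterns into the SQL text.

```python
def parse_csv_patterns(csv: str) -> list[str]:
    """Split a CSV pattern string, trimming whitespace and dropping empty items."""
    return [part.strip() for part in csv.split(",") if part.strip()]


def build_like_or(column: str, patterns: list[str]) -> tuple[str, list[str]]:
    """Build "(col LIKE ? OR col LIKE ?)" plus the matching bind values.

    Returns an empty clause when there are no patterns so callers can skip it.
    """
    if not patterns:
        return "", []
    clause = " OR ".join(f"{column} LIKE ?" for _ in patterns)
    return f"({clause})", list(patterns)


patterns = parse_csv_patterns(" PRD_% , TST_% ,")
clause, binds = build_like_or("Src_Container_Name", patterns)
```

Keeping a single copy of each helper means whitespace trimming and empty-item handling behave the same way in every tool that imports them.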
* fix(graph): remove unmatched brace in graph_connectedComponents tool descriptor
GRAPH_CONNECTED_COMPONENTS_TOOL had a duplicate closing brace at line 481
in the parameters dict, causing a SyntaxError at import time. beartype's
import hook surfaced the error during package load, which caused the entire
graph package to fail silently — all seven graph tools were unregistered
with no server-side warning.
Removed the spurious closing brace at line 481.
Root cause: raw dict tool descriptors have no structural validation at
definition time. A future refactor to dataclass-based ToolDescriptor would
catch this class of error at module load rather than requiring manual
import tracing.
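One possible shape for such a dataclass-based descriptor is sketched below. The class fields and validation rules are hypothetical and not part of this commit. A stray brace would still be a SyntaxError in any form; what a dataclass adds is a loud failure at construction time when a field is missing, misnamed or malformed.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class ToolDescriptor:
    """Hypothetical descriptor; the field names are illustrative only."""

    name: str
    description: str
    parameters: dict[str, dict[str, Any]] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Runs when the module-level descriptor is constructed, i.e. at import.
        if not self.name.startswith("graph_"):
            raise ValueError(f"tool name must start with 'graph_': {self.name!r}")
        for param, spec in self.parameters.items():
            if "type" not in spec:
                raise ValueError(f"parameter {param!r} is missing a 'type'")


descriptor = ToolDescriptor(
    name="graph_connectedComponents",
    description="Group objects into connected components.",
    parameters={"edge_repository": {"type": "string"}},
)
```

An unknown keyword such as `paramters=` would raise a TypeError immediately, whereas a raw dict would accept the misspelt key without complaint.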
* fix: resolve ruff and mypy CI failures in graph tools
- Rename camelCase graph module files to snake_case (N999)
- Update all imports in __init__.py and graph_tools.py to match new names
- Lowercase WHITE/GREY/BLACK constants used as local variables (N806)
- Replace set comprehensions with set() calls (C416)
- Rename unused loop variable comp_root to _comp_root (B007)
- Remove trailing whitespace from SQL strings (W291)
- Type _parent dict as dict[str, str] in UnionFind classes (mypy no-any-return)
- Change stack type annotation from object to Iterator[str] (mypy call-overload)
- Annotate type_counts and db_counts dicts explicitly (mypy var-annotated)
- Annotate upstream/downstream_level and nearest_root_val as Optional (mypy assignment)
- Rename members to cycle_members to avoid conflicting type inference (mypy assignment)
- Annotate rows as list[dict[str, Any]] for sort key compatibility (mypy arg-type)
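To show how several of these annotations fit together, here is a small three-colour DFS sketch with lowercase colour locals, an explicitly typed iterator stack and a `cycle_members` list. It is an assumed, simplified stand-in and not the code in graph_detectCycles; the function name `find_cycle` and the adjacency-dict input are illustrative.

```python
from collections.abc import Iterator
from typing import Optional


def find_cycle(graph: dict[str, list[str]]) -> Optional[list[str]]:
    """Return the members of the first cycle found by DFS, or None if acyclic."""
    white, grey, black = 0, 1, 2  # lowercase locals, in line with the N806 fix
    colour: dict[str, int] = {node: white for node in graph}

    for start in graph:
        if colour[start] != white:
            continue
        colour[start] = grey
        path: list[str] = [start]
        stack: list[Iterator[str]] = [iter(graph[start])]
        while stack:
            child = next(stack[-1], None)
            if child is None:
                # All children explored: the node at the top of the path is done.
                colour[path.pop()] = black
                stack.pop()
            elif colour.get(child, white) == white:
                colour[child] = grey
                path.append(child)
                stack.append(iter(graph.get(child, [])))
            elif colour[child] == grey:
                # Back edge to a node still on the path closes a cycle.
                cycle_members = path[path.index(child):]
                return cycle_members
    return None


cyclic = find_cycle({"A": ["B"], "B": ["C"], "C": ["A"], "D": ["A"]})
acyclic = find_cycle({"A": ["B"], "B": []})
```

Using an explicit stack of iterators instead of recursion avoids hitting Python's recursion limit on deep dependency chains.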
* style: apply ruff formatting to graph module and app.py
---------
Co-authored-by: earthshiner <paul.dancer@gmail.com>
Co-authored-by: Paul Dancer <paul.dancer@teradata.com>
1 parent a214bcf commit c9bf758
16 files changed
Lines changed: 5916 additions & 12 deletions
File tree
- src/teradata_mcp_server
- config
- tools
- graph
- utils