Highlights — New SQL-like Filter Syntax
The search and get_lineage tools now accept a human-readable, SQL-like filter string instead of the previous nested JSON dict. This makes filters dramatically easier for LLM agents (and humans) to write correctly.
Since MCP tools are discovered dynamically by LLM agents at the start of every session, this is not a breaking change — agents will automatically pick up the new filter parameter and its syntax documentation.
Before (0.5.x):
search(query="*", filters={"entity_type": ["DATASET"]})
search(query="*", filters={"and": [{"platform": ["snowflake"]}, {"env": ["PROD"]}]})
After (0.5.3):
search(query="*", filter="entity_type = dataset")
search(query="*", filter="platform = snowflake AND env = PROD")
The new syntax supports simple equality, IN lists, boolean logic (AND, OR, NOT, parentheses), comparisons (>, >=, <, <=), and existence checks (IS NULL, IS NOT NULL).
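For callers that still hold filters in the old dict shape, the before/after pairs above suggest a mechanical translation. The helper below is a minimal sketch of that idea, not part of this release: `dict_to_filter_string` is a hypothetical name, the "or" key and the exact spelling of IN lists (`field IN (a, b)`) are assumptions, and values are passed through unchanged (the release example writes `dataset` in lower case, and case handling is not stated here).

```python
def dict_to_filter_string(filters, nested=False):
    """Translate a simple old-style filter dict into a filter string.

    Hypothetical helper: handles {"field": [values]} and
    {"and": [...]} / {"or": [...]} (the "or" key is an assumption).
    """
    if len(filters) != 1:
        raise ValueError("expected exactly one key per filter dict")
    key, value = next(iter(filters.items()))

    if key in ("and", "or"):
        # Join the translated children with the boolean keyword.
        joined = f" {key.upper()} ".join(
            dict_to_filter_string(item, nested=True) for item in value
        )
        # Parenthesize nested groups so precedence stays explicit.
        return f"({joined})" if nested else joined

    values = [str(v) for v in value]
    if len(values) == 1:
        return f"{key} = {values[0]}"
    # Assumed IN-list spelling; check the tool's syntax reference.
    return f"{key} IN ({', '.join(values)})"


old = {"and": [{"platform": ["snowflake"]}, {"env": ["PROD"]}]}
print(dict_to_filter_string(old))
```

The resulting string would then be passed as the `filter` argument in place of the old `filters` dict.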
Added
- search_filter_parser: Full SQL-like filter parser with tokenizer and recursive-descent parser. Compiles human-readable filter strings into DataHub SDK Filter objects. Includes comprehensive FILTER_DOCS injected into tool descriptions so LLM agents always have the syntax reference.
- Modular tool architecture: Tools are now organized into dedicated modules under tools/ (search.py, entities.py, lineage.py, dataset_queries.py, assertions.py) instead of being defined inline in mcp_server.py.
- graphql_helpers: Extracted shared GraphQL execution logic, token budgeting, and response processing into a dedicated module.
- tool_context: New module for tool-level context management.
- view_preference: Configurable view preference system (UseDefaultView, NoView, CustomView) for controlling which DataHub view is applied during search.
- tools/assertions.py: New tool module for data quality assertion checks.
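To illustrate the tokenizer plus recursive-descent approach named for search_filter_parser, here is a minimal, self-contained sketch. It is not the module's actual code: all names (`tokenize`, `Parser`, `parse_filter`) are hypothetical, the output is a plain nested tuple rather than a DataHub SDK Filter object, the IN-list spelling is assumed, and quoted values are omitted for brevity.

```python
import re

# Operators and punctuation first, then bare words (field names, values, keywords).
TOKEN_RE = re.compile(r">=|<=|[()=<>,]|[^\s()=<>,]+")


def tokenize(text):
    return TOKEN_RE.findall(text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def at(self, keyword):
        token = self.peek()
        return token is not None and token.upper() == keyword

    def take(self, expected=None):
        token = self.peek()
        if token is None or (expected is not None and token.upper() != expected):
            raise ValueError(f"expected {expected or 'a token'}, got {token!r}")
        self.pos += 1
        return token

    # Precedence, lowest to highest: OR, AND, NOT, primary.
    def parse_or(self):
        node = self.parse_and()
        while self.at("OR"):
            self.take()
            node = ("or", node, self.parse_and())
        return node

    def parse_and(self):
        node = self.parse_not()
        while self.at("AND"):
            self.take()
            node = ("and", node, self.parse_not())
        return node

    def parse_not(self):
        if self.at("NOT"):
            self.take()
            return ("not", self.parse_not())
        return self.parse_primary()

    def parse_primary(self):
        if self.peek() == "(":
            self.take()
            node = self.parse_or()
            self.take(")")
            return node
        field = self.take()
        op = self.take().upper()
        if op == "IS":
            negated = self.at("NOT")
            if negated:
                self.take()
            self.take("NULL")
            return ("is_not_null" if negated else "is_null", field)
        if op == "IN":
            self.take("(")
            values = [self.take()]
            while self.peek() == ",":
                self.take()
                values.append(self.take())
            self.take(")")
            return ("in", field, values)
        if op in {"=", ">", ">=", "<", "<="}:
            return (op, field, self.take())
        raise ValueError(f"unknown operator {op!r}")


def parse_filter(text):
    parser = Parser(tokenize(text))
    node = parser.parse_or()
    if parser.peek() is not None:
        raise ValueError(f"unexpected trailing token {parser.peek()!r}")
    return node


print(parse_filter("platform = snowflake AND env = PROD"))
```

Each grammar level gets its own method, so operator precedence falls out of the call order: `parse_or` calls `parse_and`, which calls `parse_not`, which calls `parse_primary`.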
Changed
- search tool: The filters parameter (JSON dict) is replaced by filter (string). See the Highlights section above.
- get_lineage tool: Also uses the new string-based filter parameter for filtering lineage results.
- mcp_server.py: Significantly slimmed down: tool implementations moved to dedicated modules, GraphQL helpers extracted, filter parsing extracted.
- Smoke check safety: smoke_check.py now refuses to run against non-localhost DataHub instances to prevent accidental mutation of production data.
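The smoke check guard described above could look roughly like the following. This is only a sketch of the idea, not the code in smoke_check.py: the function names, the set of hosts treated as local, and the error message are all assumptions.

```python
from urllib.parse import urlparse

# Assumed set of hostnames that count as "localhost".
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def is_local_url(url):
    """Return True only when the URL's host is a known local address."""
    host = urlparse(url).hostname  # None when the URL has no scheme/netloc
    return host is not None and host in LOCAL_HOSTS


def require_local(url):
    """Abort before any mutation if the target is not a local instance."""
    if not is_local_url(url):
        raise SystemExit(f"Refusing to run against non-localhost instance: {url}")


require_local("http://localhost:8080")
print("ok: target is local")
```

Note that a bare `localhost:8080` without a scheme is rejected here, because `urlparse` does not extract a hostname from it; failing closed seems the safer default for a check meant to protect production data.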
Removed
- test_custom_filter_conversion.py: Removed obsolete test for the old dict-based filter format, replaced by search_filter_parser.
Full Changelog: v0.5.2...v0.5.3