Genie tool uses MCP Adapters under the hood#256
Genie tool uses MCP Adapters under the hood#256nisha2003 wants to merge 53 commits intomcp-migrationfrom
Conversation
src/databricks_ai_bridge/genie.py
Outdated
| if not tools: | ||
| raise ValueError(f"No tools found in Genie MCP server for space {space_id}") | ||
|
|
||
| query_tools = [tool for tool in tools if "query" in tool.name.lower()] |
There was a problem hiding this comment.
can we do something like this instead since we know it's only 1 tool?
# Use next() with a generator:
self._query_tool_name = next(
(tool.name for tool in tools if tool.name.startswith("query_space_")),
None
)
self._poll_tool_name = next(
(tool.name for tool in tools if tool.name.startswith("poll_response_")),
None
)
| ] | ||
|
|
||
| [tool.uv.sources] | ||
| databricks-mcp = { path = "databricks_mcp/", editable = true } |
There was a problem hiding this comment.
what is this for? i think for dev, we can manually do local installs of databricks-mcp instead of this
does this always try to install from local instead of pypi? looks like it's only for local work, and on pypi, this won't apply -- if that's right, we should do this for all of our subpackage definitions to rely on the local installs when possible for dev
There was a problem hiding this comment.
added for the other subpackages!
bbqiu
left a comment
There was a problem hiding this comment.
overall looks great! left a few comments
6864e57 to
f8a361f
Compare
src/databricks_ai_bridge/genie.py
Outdated
| text_attachments = content.get("textAttachments", []) | ||
|
|
||
| if query_attachments: | ||
| first_query = query_attachments[0] |
There was a problem hiding this comment.
just curious here, do we always return only one query attachment?
There was a problem hiding this comment.
Addressed in comment below.
|
|
||
| if query_attachments: | ||
| first_query = query_attachments[0] | ||
| query_str = first_query.get("query", "") |
There was a problem hiding this comment.
we are adding the query_str here even if the statement response failed. Is that ok? Is that what we do currently?
There was a problem hiding this comment.
Yep, I think the old code did this (snippet below of the old poll_query_results). I imagine this is so that even if a SQL query fails there's some use in sending it back for debugging?
if state == "SUCCEEDED":
result = _parse_query_result(resp, self.truncate_results, self.return_pandas)
return GenieResponse(result, query_str, description, returned_conversation_id)
elif state in ["RUNNING", "PENDING"]:
logging.debug("Waiting for query result...")
time.sleep(ITERATION_FREQUENCY)
else:
return GenieResponse(
f"No query result: {resp['state']}",
query_str,
description,
returned_conversation_id,
)
src/databricks_ai_bridge/genie.py
Outdated
| @@ -206,154 +349,148 @@ def create_message(self, conversation_id, content): | |||
|
|
|||
| @mlflow.trace() | |||
| def poll_for_result(self, conversation_id, message_id): | |||
There was a problem hiding this comment.
If we are keeping start_conversation/create_message and just adding a deprecated, should we keep original poll_for_result, mark as deprecated and then create a new poll for ask_question
There was a problem hiding this comment.
I think the old poll_for_result is used by ask_question. I guess it's internally used but the pattern of using start_conversation --> create_message --> poll_for_result is deprecated. Since on the user side it shouldn't be used I can mark poll_for_result to deprecated as well.
I feel like we shouldn't need a new poll for ask_question right? Let me know if I'm missing smth.
There was a problem hiding this comment.
Added an internal poller that is called by ask_question and poll_for_result.
| statement_response | ||
| and statement_response.get("status", {}).get("state") == "SUCCEEDED" | ||
| ): | ||
| result = _parse_query_result(statement_response, truncate_results, return_pandas) |
There was a problem hiding this comment.
There are some cases where we return both query_attachments and text_attachments. Previously we were returning all attachments to the user. Now we will only be returnign query_attachements. Is this a conscious decision?
If not maybe its better to also return text_attachments here if query_attachments exist. Text attachments are usually what genie summarizes and basically is an answer to the user.
There was a problem hiding this comment.
Hm okay -- I did this because the old code had the following logic. It's also I took only one query attachment. If we want to change this to send all the attachments we probably need to update GenieResponse. Is this okay? I imagine it is since we'll just feed it to the agent + will have more information but let me know what you think.
if current_status == "COMPLETED": attachment = next( (r for r in resp["attachments"] if "query" in r), None ) if attachment: query_obj = attachment["query"] description = query_obj.get("description", "") query_str = query_obj.get("query", "") attachment_id = attachment["attachment_id"] return poll_query_results( attachment_id, query_str, description, returned_conversation_id, ) if current_status == "COMPLETED": text_content = next(r for r in resp["attachments"] if "text" in r)[ "text" ]["content"] return GenieResponse( result=text_content, conversation_id=returned_conversation_id, )
Migrates the Genie class to use MCP (Model Context Protocol) adapters, replacing direct REST API calls in
poll_for_resultandask_question.Updated unit tests.
Manual test here: https://e2-spog.staging.cloud.databricks.com/editor/notebooks/2260046326096166?o=6051921418418893