Add Structured Output Support #36
Conversation
toshihikoyanase
left a comment
Thank you for working on the structured output support. The PR is not ready for review, but I have an early comment on the class names.
pyproject.toml
Outdated
| "plotly>=6.0.1", | ||
| "torch>=2.7.0", | ||
| "bottle>=0.13.4", | ||
| "fastmcp>=2.12.2", |
[Question] Do we need to migrate from the official mcp library to fastmcp to support structured output? mcp==1.14.0 seems to support the structured output feature.
Thank you for pointing it out. I’ve updated the MCP version to mcp[cli]>=1.10.0 and removed fastmcp.
Thank you for your update. I've confirmed that the mcp package has supported structured output since 1.10.0 (see https://pypi.org/project/mcp/1.10.0/ and https://github.com/modelcontextprotocol/python-sdk/releases/tag/v1.10.0). Since the current implementation uses Pydantic, we may need to apply the fix modelcontextprotocol/python-sdk#1099, which was included in mcp==1.11.0. I'll check the behavior and report the results soon.
[Notes] Ah, the change does not contain any field aliases. I believe that modelcontextprotocol/python-sdk#1099 is not required for optuna mcp.
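To make the distinction concrete, here is a minimal sketch (hypothetical models, not code from this PR): python-sdk#1099 only matters when a response model declares Pydantic field aliases, which the models in this PR do not.

```python
from pydantic import BaseModel, Field


# Plain field names, as in this PR's response models: the structured content
# and the generated output schema agree without the alias fix.
class PlainResponse(BaseModel):
    study_name: str


# A model with a field alias (hypothetical) is the case python-sdk#1099
# addresses: the structured content must be serialized with the alias so that
# it matches the declared output schema.
class AliasedResponse(BaseModel):
    study_name: str = Field(alias="studyName")
```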
Design Choices
According to the official Python SDK documentation, defining return types as Pydantic models is supported. This is also demonstrated in the example code, so using this approach seems reasonable. An alternative method mentioned there is using dataclasses. Regarding the change in return values, since optuna-mcp is still in an alpha version, such changes are acceptable. I will make sure to verify that the existing examples continue to work as expected.
According to the 2025-06-18 specification, image content is distinguished from structured content, so the current signature of the image-returning tools is kept as-is.
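As a rough sketch of these two choices (assumed tool names and fields, not the exact code in this PR), a Pydantic return type gives a tool structured output, while image-producing tools keep returning Image content:

```python
from mcp.server.fastmcp import FastMCP, Image
from pydantic import BaseModel, Field

mcp = FastMCP("optuna-mcp-sketch")


class StudyResponse(BaseModel):
    study_name: str = Field(description="The name of the study.")


# Structured output: the Pydantic return type becomes the tool's output schema.
@mcp.tool(structured_output=True)
def create_study(study_name: str) -> StudyResponse:
    return StudyResponse(study_name=study_name)


# Image content stays separate from structured content per the 2025-06-18 spec,
# so plotting tools keep returning Image rather than a structured payload.
@mcp.tool()
def plot_dummy() -> Image:
    return Image(data=b"...png bytes...", format="png")
```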
Thank you for the discussions. Everything should now be updated. PTAL.
I’ve also tested this PR locally with some existing examples using Copilot (Optimizing the 2D-Sphere function, Optuna dashboard, Cookie Recipe, and Matplotlib Configuration). It would be greatly appreciated if the PR could also be verified in a different environment.
Thank you for your update and confirmation using Copilot.
I also checked the behavior using Claude Desktop with some examples, such as sphere2d and the cookie recipe, and they worked as expected.
I'd like to discuss error handling methods and the types of return values. Please take a look.
optuna_mcp/server.py
Outdated
```python
        return CallToolResult(
            content=[TextContent(type="text", text="No storage specified.")], isError=True
        )
```
Instead of using CallToolResult to return error information to MCP clients, we may be able to use the MCP SDK's error-handling feature.
For instance, the following PR adds an implementation that catches mcp.McpError and returns it as a response:
modelcontextprotocol/python-sdk#62
Based on this approach, the code will be like this:
```diff
-        return CallToolResult(
-            content=[TextContent(type="text", text="No storage specified.")], isError=True
-        )
+        raise McpError(ErrorData(code=INTERNAL_ERROR, message="No storage specified."))
```
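For context, a self-contained sketch of this error path (the tool name, signature, and storage handling are assumptions; only the imports and the raise mirror the suggestion above):

```python
from mcp.server.fastmcp import FastMCP
from mcp.shared.exceptions import McpError
from mcp.types import INTERNAL_ERROR, ErrorData

mcp = FastMCP("optuna-mcp-sketch")

_storage: str | None = None  # assumed module-level storage handle


@mcp.tool()
def get_all_study_names() -> list[str]:
    # Raising McpError lets the SDK report the failure as a JSON-RPC error
    # instead of encoding it into the tool's result and output schema.
    if _storage is None:
        raise McpError(ErrorData(code=INTERNAL_ERROR, message="No storage specified."))
    return []
```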
The type of the return values will be simplified from list[StudyResponse] | CallToolResult to list[StudyResponse].
This will significantly reduce the output schema returned to the client.
This PR (569 lines): get_all_study_name_output_schema_call_tool_result.json
McpError (66 lines): get_all_study_name_output_schema_mcp_error.json
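In other words, a sketch of the two signatures under discussion (the tool name and parameters are simplified assumptions):

```python
from mcp.types import CallToolResult
from pydantic import BaseModel


class StudyResponse(BaseModel):
    study_name: str


# Union return type: CallToolResult leaks into the generated output schema,
# which is what inflates it to hundreds of lines.
def get_all_study_names_union() -> list[StudyResponse] | CallToolResult:
    ...


# Exception-based error path: the output schema only needs to describe
# list[StudyResponse].
def get_all_study_names_simple() -> list[StudyResponse]:
    ...
```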
Thank you for the detailed investigation. I have some concerns regarding the intended usage of McpError. Based on my understanding, McpError is primarily designed for protocol-level errors (as suggested here), and may not be appropriate in this context. Since using CallToolResult for errors makes the return type ambiguous, I started to think that simply raising a RuntimeError would be sufficient.
Thank you for your suggestion. I checked the current implementation of the Python SDK regarding the RuntimeError handling.
https://github.com/modelcontextprotocol/python-sdk/blob/71889d7387f070cd872cab7c9aa3d1ff1fa5a5d2/src/mcp/server/lowlevel/server.py#L687-L704
It seems to handle exceptions other than McpError by automatically returning an ErrorData response as follows:
```python
response = types.ErrorData(code=0, message=str(err), data=None)
await message.respond(response)
```
In this case, error code 0 is used. However, the JSON-RPC 2.0 specification does not assign code 0; it is considered an unassigned value.
When catching a McpError, the ErrorData object contained within it is returned directly. For example:
```python
err = McpError(ErrorData(code=INTERNAL_ERROR, message="No storage specified."))
response = err.error
await message.respond(response)
```
So, McpError allows servers to return specific error codes, making it a more client-friendly implementation.
Regarding the documentation in your message, it seems to mention JSON-RPC Error Codes, and InternalError (-32603) appears to be a protocol-specified internal error.
The following server implementation also uses the McpError exception:
https://dev.to/yigit-konur/error-handling-in-mcp-typescript-sdk-2ol7#server-error-handling
```typescript
throw new McpError(
  ErrorCode.InternalError,
  `Error validating elicitation response: ${error}`,
);
```
Thank you very much for the clarification. I understand now and will proceed with the necessary revisions.
When revising the tests, I noticed FastMCP wraps all errors in ToolError (ref), so in Optuna-MCP usage, there seems to be no practical difference between a McpError and a regular exception.
Still, assuming FastMCP may expose error codes in the future, I’ve implemented errors with McpError. I’d appreciate your thoughts. PTAL.
> When revising the tests, I noticed FastMCP wraps all errors in ToolError (ref), so in Optuna-MCP usage, there seems to be no practical difference between a McpError and a regular exception.
Thank you for your detailed investigation. I didn't realize that FastMCP included such additional error handling.
> Still, assuming FastMCP may expose error codes in the future, I’ve implemented errors with McpError.
I agree. While FastMCP's error handling may currently have some potential issues (e.g., modelcontextprotocol/python-sdk#698), I believe these will be addressed in future improvements. We can revisit the error handling design after some updates to the official SDK.
toshihikoyanase
left a comment
Thank you for your update. I found that the current PR has several items that likely require design discussion.
I don't think we need to resolve them in this PR, so could you add your feedback to the comments, please?
optuna_mcp/server.py
Outdated
```python
class StudyResponse(BaseModel):
    study_name: str
    sampler_name: (
        typing.Literal["TPESampler", "NSGAIISampler", "RandomSampler", "GPSampler"] | None
```
[Notes] A nit, but supported samplers are listed in the definition of set_sampler as well. We may define SamplerName=typing.Literal["TPESampler", "NSGAIISampler", "RandomSampler", "GPSampler"] to prevent the inconsistency.
But we can work on it in a follow-up PR.
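A sketch of that follow-up idea (the alias name SamplerName is the reviewer's suggestion, not code in this PR):

```python
import typing

from pydantic import BaseModel, Field

# Single source of truth for the supported samplers, reusable by both the
# response model and the set_sampler tool signature.
SamplerName = typing.Literal["TPESampler", "NSGAIISampler", "RandomSampler", "GPSampler"]


class StudyResponse(BaseModel):
    study_name: str
    sampler_name: SamplerName | None = Field(
        default=None, description="The name of the sampler used in the study."
    )
```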
optuna_mcp/server.py
Outdated
```python
    sampler_name: (
        typing.Literal["TPESampler", "NSGAIISampler", "RandomSampler", "GPSampler"] | None
    ) = Field(default=None, description="The name of the sampler used in the study.")
    directions: list[str] | None = Field(
```
The type of the directions attribute is not consistent with the argument of create_study.
Suggested change:
```diff
-    directions: list[str] | None = Field(
+    directions: list[typing.Literal["minimize", "maximize"]] | None = Field(
```
```diff
     )


-@mcp.tool()
+@mcp.tool(structured_output=True)
```
[Notes] Since metric names correspond to the study's directions, we might be able to include the metric names in StudyResponse as follows:
```
StudyResponse:
    ...
    metric_names: list[str] | None = Field(default=None, description="...")
```
But this new approach may need further discussion, so we can work on it in a follow-up PR.
| return f"metric_names set to {json.dumps(metric_names)}" | ||
|
|
||
| @mcp.tool() | ||
| @mcp.tool(structured_output=True) |
ditto.
optuna_mcp/server.py
Outdated
```diff
     )


-@mcp.tool()
+@mcp.tool(structured_output=True)
```
An alternative approach is to return the list of TrialResponse.
So, how about keeping the structured_output=False for future updates?
Suggested change:
```diff
-@mcp.tool(structured_output=True)
+@mcp.tool()
```
The current implementation of MCP sets the default value of structured_output to None, which automatically chooses the return type (either structured or unstructured). In the case of get_trials, structured output is returned if structured_output is not specified.
To preserve the current behavior of main for future updates, I explicitly set structured_output=False. This also simplifies the test cases, as mentioned in this discussion.
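A minimal sketch of that distinction (the get_trials bodies are stubbed and the tool names are illustrative; only the decorator arguments are the point):

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("optuna-mcp-sketch")


# structured_output=None (the default): the SDK infers from the return
# annotation, so this tool would emit structured content automatically.
@mcp.tool()
def get_trials_inferred(study_name: str) -> str:
    return "Trials: ..."


# structured_output=False: explicitly keep the unstructured text output,
# preserving the behavior of main until a follow-up PR revisits get_trials.
@mcp.tool(structured_output=False)
def get_trials(study_name: str) -> str:
    return "Trials: ..."
```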
```python
            TrialResponse(
                trial_number=trial.number,
                params=trial.params,
                values=trial.values,
                user_attrs=trial.user_attrs,
                system_attrs=trial.system_attrs,
            )
```
Hmm, I found that the proposed best_trial returns more information, such as user_attrs and system_attrs, than the previous implementation, while the proposed best_trials returns less information, such as distributions, datetime_start, datetime_end, and intermediate_values.
This PR did not introduce this gap between best_trial and best_trials, and we can address this issue in a follow-up PR.
Thank you for pointing that out. At the moment, I don’t have a strong preference regarding what should be returned in best_trial and best_trials. This issue will be addressed in a follow-up PR.
```diff
     return Image(data=plotly.io.to_image(fig), format="png")


-@mcp.tool()
+@mcp.tool(structured_output=True)
```
Alternatively, we can define the response type for the optuna dashboard, such as:
```python
class OptunaDashboardResponse(BaseModel):
    url: str = Field(description="The URL of the Optuna dashboard.")
```
But this lacks information on whether the server has been newly started or not, so we can discuss this approach in a separate PR.
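One way the missing piece could look (the newly_started field is purely hypothetical, sketched only to illustrate the open question):

```python
from pydantic import BaseModel, Field


class OptunaDashboardResponse(BaseModel):
    url: str = Field(description="The URL of the Optuna dashboard.")
    # Hypothetical field addressing the concern above: whether this call
    # launched a new dashboard process or reused an already-running one.
    newly_started: bool = Field(
        description="Whether the dashboard server was newly started by this call."
    )
```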
Defining a response type for the Optuna dashboard sounds like a good idea. This will also be addressed in a follow-up PR.
```python
    assert len(result) == 2
    assert isinstance(result, Sequence)
    assert isinstance(result[0], list)
```
[Notes] These lines for the type checking are necessary but verbose. Perhaps we could explore ways to simplify the test cases in the future.
I totally agree with your point. But to keep this PR as focused and simple as possible, this issue will be addressed in a follow-up PR.
Thanks! That makes sense.
tests/test_server.py
Outdated
```python
    for user_attrs in (user_attrs_from_text, user_attrs_from_dict):
        r = json.loads(":".join(user_attrs.split(":")[1:]).strip())
        assert isinstance(r, list)
        assert r == metric_names
```
Hmm, when I saw these lines for result parsing, I felt the output data was not really structured. Not a strong opinion, but we might include the metric names in StudyResponse, as I mentioned above.
tests/test_server.py
Outdated
```python
    lines_from_text = result[0][0].text.strip().split("\n")
    lines_from_dict = result[1]["result"].strip().split("\n")

    for lines in (lines_from_text, lines_from_dict):
        assert len(lines) == 3
        assert lines[0] == ("Trials: ")
        assert lines[1].startswith(",number")
        assert lines[2].startswith("0,0")
```
Enabling structured_output in mcp.tool might make the output more complicated than on the current main branch.
Thank you for all the comments and suggestions. Some points have been left as they are and will be discussed in follow-up PRs, while others have been updated in this PR. PTAL.
toshihikoyanase
left a comment
Thank you for your update. I confirmed that the structured output worked as expected. We found some unresolved issues during the review process, so we'll work on them in follow-up PRs. LGTM!
Motivation
This PR introduces structured output support to enhance the safety and clarity of communication between the MCP server and clients.
Description of the Changes
- Defined Pydantic models StudyInfo and TrialInfo, and applied them as structured return types when the output conforms to these formats.
- Updated pyproject.toml to enable structured output.