Conversation
|
@kyrtapz , can we please wait for the first MR to be merged and try to get it done ASAP? its little confusing to review this. |
Hey @FrostGod!, there is no intent of getting this one in before #1 merges. I am not sure why this PR is confusing but you can see I am the author of the last commit.
That is a good idea I think @arkadeepsen was looking into it, Aurko can you confirm? |
I have created all the issues that we are working on and assigned them to the respective individuals who are working on them. |
cool, it was confusing in the sense, what all functionalities are in progress..., and how does the overall picture look like, what all layers is the MCP suppose to do, |
|
@pperiyasamy will review the PR. |
pperiyasamy
left a comment
There was a problem hiding this comment.
Tried out sosreport tooling from claude, it works great. some comments.
| } | ||
|
|
||
| // listPlugins returns a list of enabled plugins with their command counts | ||
| func listPlugins(sosreportPath string) (types.ListPluginsResult, error) { |
There was a problem hiding this comment.
nice plugin aggregation for the collected logs from sos report's manifest.json file !
what about the remaining logs collected outside of sos_commands directory.
For example need to analyze ovs-vswitchd and ovsdb-server.log logs from /var/log/openvswitch directory.
There was a problem hiding this comment.
I don't see a rason to do that as they are already there as part of sos_commands: sos_commands/openvswitch/:
journalctl_--no-pager_--unit_openvswitch
journalctl_--no-pager_--unit_openvswitch-ipsec
journalctl_--no-pager_--unit_openvswitch-nonetwork
journalctl_--no-pager_--unit_ovs-configuration
journalctl_--no-pager_--unit_ovsdb-server
journalctl_--no-pager_--unit_ovs-vswitchd
pkg/sosreport/mcp/commands.go
Outdated
|
|
||
| var result types.SearchCommandsResult | ||
| if maxResults == 0 { | ||
| maxResults = 100 |
There was a problem hiding this comment.
any specific reason for restricting it with maxResults ?
There was a problem hiding this comment.
no, I want to have a reasonable limit so AI doesn't eat too much tokens, there is still an option to modify it if needed.
| // Remove the prefix if exists | ||
| logPath = strings.TrimPrefix(logPath, "host/") | ||
|
|
||
| fullPath := filepath.Join(sosreportPath, logPath) |
There was a problem hiding this comment.
we may also have rotated log files in .gz format, shouldn't that taken care of ?
There was a problem hiding this comment.
Great point, I will try to look into that.
|
|
||
| // Report contains the sosreport details | ||
| type Report struct { | ||
| Plugins map[string]PluginDetail `json:"plugins"` |
There was a problem hiding this comment.
is it ok to ignore other parameters from the sosreport ?
There was a problem hiding this comment.
Yes, they will be skipped.
| }) | ||
| } | ||
|
|
||
| result.TotalCommands = totalCommands |
There was a problem hiding this comment.
just for my understanding: what is the reason for exposing CommandCount and TotalCommands via MCP tooling ? is that really needed ?
There was a problem hiding this comment.
I don't think it hurts, it can help the model to narrow down the search criteria if there are too many results.
arkadeepsen
left a comment
There was a problem hiding this comment.
Following are some comments from my first pass of the PR. I'll do at least one more pass.
cmd/ovnk-mcp-server/main.go
Outdated
| // Always add sosreport tools (they work offline with sosreport path parameter) | ||
| sosreportServer, err := sosreportmcp.NewMCPServer() | ||
| if err != nil { | ||
| log.Fatalf("Failed to create sosreport MCP server: %v", err) | ||
| } | ||
| log.Println("Adding sosreport tools to OVN-K MCP server") | ||
| sosreportServer.AddTools(ovnkMcpServer) | ||
|
|
There was a problem hiding this comment.
I think we should add the sosreport MCP server to the offline mode. When the MCP server is deployed in-cluster, we would still be adding all the sosreport tools, which ideally shouldn't happen
There was a problem hiding this comment.
I am ok to make it offline-mode only initially but I can imagine a scenario where a singular MCP would be used for analyzing the sosreport and interacting with a cluster.
There was a problem hiding this comment.
I think this can be achieved by supporting another value (say both) for the mode flag. When this mode is chosen, both live-cluster and offline mode tools are added to the MCP server.
pkg/sosreport/mcp/logs.go
Outdated
| file, err := os.Open(fullPath) | ||
| if err != nil { | ||
| return "", err | ||
| } | ||
|
|
||
| matches, err := readWithLimit(file, searchPattern, maxResults-totalMatches) | ||
| if err != nil { | ||
| file.Close() | ||
| return "", err | ||
| } | ||
|
|
||
| if len(matches) > 0 { | ||
| result.WriteString(fmt.Sprintf("\n=== %s ===\n", logPath)) | ||
| result.WriteString(matches) | ||
| totalMatches += len(strings.Split(matches, "\n")) | ||
|
|
||
| if totalMatches >= maxResults { | ||
| result.WriteString(fmt.Sprintf("\n... (search truncated at %d results)\n", maxResults)) | ||
| file.Close() | ||
| return result.String(), nil | ||
| } | ||
| } | ||
| file.Close() |
There was a problem hiding this comment.
Would it make sense to create a separate function (or even a local function) so that the file.Close() can be added to defer? This would help in better maintainability of the code for future changes.
pkg/sosreport/mcp/mcp.go
Outdated
| - sosreport_path (required): Path to extracted sosreport directory | ||
| - filepath (required): Relative filepath from sos-list-commands or sos-search-commands | ||
| - pattern (optional): Regex pattern to filter output lines | ||
| - max_lines (optional): Maximum lines to return (default: 1000) |
There was a problem hiding this comment.
Default max lines is defined as 100 here: 44e29f0#diff-5560bb735dc6caa00cb0a83c05dd9e118b4ac6923d4e15a62919d96c3749ddcfR11.
pkg/sosreport/mcp/mcp.go
Outdated
| "plugins": [ | ||
| {"name": "crio", "command_count": 15}, | ||
| {"name": "openvswitch", "command_count": 187}, | ||
| {"name": "neworkmanager", "command_count": 2} |
There was a problem hiding this comment.
| {"name": "neworkmanager", "command_count": 2} | |
| {"name": "networkmanager", "command_count": 2} |
nit
pkg/sosreport/mcp/mcp.go
Outdated
| func NewMCPServer() (*MCPServer, error) { | ||
| return &MCPServer{}, nil | ||
| } |
There was a problem hiding this comment.
Do we need the error as a return type since only nil is being returned here? Do you feel we might need this later on?
pkg/sosreport/mcp/logs.go
Outdated
| } | ||
|
|
||
| if maxResults == 0 { | ||
| maxResults = 100 |
There was a problem hiding this comment.
We can create const for default max results here.
| } | ||
|
|
||
| // AddTools registers sosreport tools with the MCP server | ||
| func (s *MCPServer) AddTools(server *mcp.Server) { |
There was a problem hiding this comment.
We should add tool(s) for interacting with the artifacts from var/lib/ovn-ic/etc directory where OVN DBs are located along with the libovsdb log files.
There was a problem hiding this comment.
var/lib/ovn-ic/etc is openshift specific and to interact with the DBs we would have to introduce a DB client.
While I think this might be useful in the future I don't think it should be done as part of this PR, especially considering we might get significant overlap with must-gather mcp.
ovs logs are already available through journal commands.
| return "", err | ||
| } | ||
|
|
||
| fullPath := filepath.Join(sosreportPath, relativeFilepath) |
There was a problem hiding this comment.
The relative path should be validated so that it doesn't contain patterns like ../, ../../, etc. for traversing file path outside of the sosreport path.
| } | ||
|
|
||
| for _, cmd := range plugin.Commands { | ||
| result.Commands = append(result.Commands, types.CommandSummary{ |
There was a problem hiding this comment.
every sosreport tools return the output with content and structuredContent which are the same contents. is this specific design? for example
{
"content": [
{
"type": "text",
"text": "{\"output\":\" cookie=0x304, duration=13656.724s, table=0, n_packets=0, n_bytes=0, idle_age=13656, priority=700,icmp,in_port=2,nw_dst=10.0.0.4,icmp_type=3,icmp_code=4 actions=drop\\n
...
}
],
"structuredContent": {
"output": " cookie=0x304, duration=13656.724s, table=0, n_packets=0, n_bytes=0, idle_age=13656, priority=700,icmp,in_port=2,nw_dst=10.0.0.4,icmp_type=3,icmp_code=4 actions=drop\n
...
}
There was a problem hiding this comment.
Great observation!
Is this specific to the sosreport though? I don't think I am doing anything different and I would just brush it as mcp-sdk quirk.
There was a problem hiding this comment.
right, it's not specific to sosreport. It should be mcp-sdk designation^^
| } | ||
| } | ||
| if totalMatches == 0 { | ||
| return fmt.Sprintf("No matches found for pattern: %s\n", pattern), nil |
There was a problem hiding this comment.
do we support case-insensitive fuzzy search?
There was a problem hiding this comment.
The search is case sensitive but the model can easily make it insensitive with a proper regex filter.
1a9e505 to
988eff5
Compare
arkadeepsen
left a comment
There was a problem hiding this comment.
Apart from the below comments mostly looks good.
Should we add a section in the README file to document how to run the MCP server in offline mode?
pkg/sosreport/mcp/manifest_test.go
Outdated
| func TestListPlugins(t *testing.T) { | ||
| tests := []struct { | ||
| name string | ||
| path string | ||
| wantError bool | ||
| wantPluginCount int | ||
| wantTotalCmds int | ||
| }{ | ||
| { | ||
| name: "valid sosreport", | ||
| path: sosreportTestData, | ||
| wantError: false, | ||
| wantPluginCount: 3, // openvswitch, networking, container_log | ||
| wantTotalCmds: 3, // 2 from openvswitch, 1 from networking, 0 from container_log | ||
| }, | ||
| { | ||
| name: "non-existent path", | ||
| path: "testdata/non-existent", | ||
| wantError: true, | ||
| }, | ||
| } | ||
|
|
||
| for _, tt := range tests { | ||
| t.Run(tt.name, func(t *testing.T) { | ||
| result, err := listPlugins(tt.path) | ||
| if tt.wantError { | ||
| if err == nil { | ||
| t.Errorf("listPlugins() expected error but got nil") | ||
| } | ||
| return | ||
| } | ||
| if err != nil { | ||
| t.Errorf("listPlugins() unexpected error = %v", err) | ||
| return | ||
| } | ||
|
|
||
| if len(result.Plugins) != tt.wantPluginCount { | ||
| t.Errorf("listPlugins() got %d plugins, want %d", len(result.Plugins), tt.wantPluginCount) | ||
| } | ||
|
|
||
| if result.TotalCommands != tt.wantTotalCmds { | ||
| t.Errorf("listPlugins() got %d total commands, want %d", result.TotalCommands, tt.wantTotalCmds) | ||
| } | ||
|
|
||
| // Verify plugin names | ||
| pluginNames := make(map[string]bool) | ||
| for _, p := range result.Plugins { | ||
| pluginNames[p.Name] = true | ||
| } | ||
| expectedPlugins := []string{"openvswitch", "networking", "container_log"} | ||
| for _, expected := range expectedPlugins { | ||
| if !pluginNames[expected] { | ||
| t.Errorf("listPlugins() missing expected plugin %q", expected) | ||
| } | ||
| } | ||
| }) | ||
| } | ||
| } | ||
|
|
||
| func TestListCommands(t *testing.T) { | ||
| tests := []struct { | ||
| name string | ||
| path string | ||
| pluginName string | ||
| wantError bool | ||
| errorMsg string | ||
| wantCmdCount int | ||
| }{ | ||
| { | ||
| name: "openvswitch plugin", | ||
| path: sosreportTestData, | ||
| pluginName: "openvswitch", | ||
| wantError: false, | ||
| wantCmdCount: 2, | ||
| }, | ||
| { | ||
| name: "networking plugin", | ||
| path: sosreportTestData, | ||
| pluginName: "networking", | ||
| wantError: false, | ||
| wantCmdCount: 1, | ||
| }, | ||
| { | ||
| name: "container_log plugin with no commands", | ||
| path: sosreportTestData, | ||
| pluginName: "container_log", | ||
| wantError: false, | ||
| wantCmdCount: 0, | ||
| }, | ||
| { | ||
| name: "non-existent plugin", | ||
| path: sosreportTestData, | ||
| pluginName: "non-existent", | ||
| wantError: true, | ||
| errorMsg: "plugin \"non-existent\" not found", | ||
| }, | ||
| } | ||
|
|
||
| for _, tt := range tests { | ||
| t.Run(tt.name, func(t *testing.T) { | ||
| result, err := listCommands(tt.path, tt.pluginName) | ||
| if tt.wantError { | ||
| if err == nil { | ||
| t.Errorf("listCommands() expected error but got nil") | ||
| } | ||
| return | ||
| } | ||
| if err != nil { | ||
| t.Errorf("listCommands() unexpected error = %v", err) | ||
| return | ||
| } | ||
|
|
||
| if result.Plugin != tt.pluginName { | ||
| t.Errorf("listCommands() got plugin %q, want %q", result.Plugin, tt.pluginName) | ||
| } | ||
|
|
||
| if result.CommandCount != tt.wantCmdCount { | ||
| t.Errorf("listCommands() got %d commands, want %d", result.CommandCount, tt.wantCmdCount) | ||
| } | ||
|
|
||
| if len(result.Commands) != tt.wantCmdCount { | ||
| t.Errorf("listCommands() got %d command entries, want %d", len(result.Commands), tt.wantCmdCount) | ||
| } | ||
|
|
||
| // For plugins with commands, verify structure | ||
| if tt.wantCmdCount > 0 { | ||
| for _, cmd := range result.Commands { | ||
| if cmd.Exec == "" { | ||
| t.Errorf("listCommands() command has empty Exec field") | ||
| } | ||
| if cmd.Filepath == "" { | ||
| t.Errorf("listCommands() command has empty Filepath field") | ||
| } | ||
| } | ||
| } | ||
| }) | ||
| } | ||
| } | ||
|
|
||
| func TestSearchCommands(t *testing.T) { | ||
| tests := []struct { | ||
| name string | ||
| path string | ||
| pattern string | ||
| maxResults int | ||
| wantError bool | ||
| wantMinMatches int | ||
| wantMaxMatches int | ||
| }{ | ||
| { | ||
| name: "search for ovs commands", | ||
| path: sosreportTestData, | ||
| pattern: "ovs", | ||
| maxResults: 100, | ||
| wantError: false, | ||
| wantMinMatches: 2, // At least the 2 ovs commands | ||
| wantMaxMatches: 100, | ||
| }, | ||
| { | ||
| name: "search for ip command", | ||
| path: sosreportTestData, | ||
| pattern: "ip.*show", | ||
| maxResults: 100, | ||
| wantError: false, | ||
| wantMinMatches: 1, // ip addr show | ||
| wantMaxMatches: 100, | ||
| }, | ||
| { | ||
| name: "search with no matches", | ||
| path: sosreportTestData, | ||
| pattern: "nonexistent-pattern-xyz", | ||
| maxResults: 100, | ||
| wantError: false, | ||
| wantMinMatches: 0, | ||
| wantMaxMatches: 0, | ||
| }, | ||
| { | ||
| name: "invalid regex pattern", | ||
| path: sosreportTestData, | ||
| pattern: "[invalid(", | ||
| maxResults: 100, | ||
| wantError: true, | ||
| }, | ||
| } | ||
|
|
||
| for _, tt := range tests { | ||
| t.Run(tt.name, func(t *testing.T) { | ||
| result, err := searchCommands(tt.path, tt.pattern, tt.maxResults) | ||
| if tt.wantError { | ||
| if err == nil { | ||
| t.Errorf("searchCommands() expected error but got nil") | ||
| } | ||
| return | ||
| } | ||
| if err != nil { | ||
| t.Errorf("searchCommands() unexpected error = %v", err) | ||
| return | ||
| } | ||
|
|
||
| if result.Total < tt.wantMinMatches { | ||
| t.Errorf("searchCommands() got %d matches, want at least %d", result.Total, tt.wantMinMatches) | ||
| } | ||
|
|
||
| if result.Total > tt.wantMaxMatches { | ||
| t.Errorf("searchCommands() got %d matches, want at most %d", result.Total, tt.wantMaxMatches) | ||
| } | ||
|
|
||
| if len(result.Matches) != result.Total { | ||
| t.Errorf("searchCommands() matches slice length %d doesn't match Total %d", len(result.Matches), result.Total) | ||
| } | ||
|
|
||
| // Verify match structure | ||
| for _, match := range result.Matches { | ||
| if match.Plugin == "" { | ||
| t.Errorf("searchCommands() match has empty Plugin field") | ||
| } | ||
| if match.Exec == "" { | ||
| t.Errorf("searchCommands() match has empty Exec field") | ||
| } | ||
| if match.Filepath == "" { | ||
| t.Errorf("searchCommands() match has empty Filepath field") | ||
| } | ||
| } | ||
| }) | ||
| } | ||
| } |
There was a problem hiding this comment.
These tests can be moved to commands_test.go file.
pkg/sosreport/mcp/logs.go
Outdated
| return "", err | ||
| } | ||
|
|
||
| if maxResults == 0 { |
There was a problem hiding this comment.
Should we validate maxResults so that negative values are not passed? The behavior of this function can differ from readWithLimit function because of how the limits are being checked:
There was a problem hiding this comment.
Modified the check to default to 100 if <= 0
pkg/sosreport/mcp/commands.go
Outdated
| } | ||
| } | ||
|
|
||
| if maxLines == 0 { |
There was a problem hiding this comment.
Should maxLines in getCommandOutput function be validated for negative values? Related to https://github.com/ovn-kubernetes/ovn-kubernetes-mcp/pull/2/files#r2571904444
There was a problem hiding this comment.
I don't think the model will try to use negative values and I modified the defaulting, we should be good.
Implements MCP tools for analyzing sosreports: - sos-list-plugins: Lists enabled sosreport plugins with their command counts - sos-list-commands: Returns all commands executed by a specific plugin - sos-search-commands: Searches for commands matching a pattern across all plugins - sos-get-command: Retrieves command output from sosreport with optional regex filtering - sos-search-pod-logs: Searches Kubernetes pod container logs with pattern matching Assisted-By: Claude Sonnet 4.5 Signed-off-by: Patryk Diak <pdiak@redhat.com>
988eff5 to
cbfa248
Compare
Implements MCP tools for analyzing sosreports:
Assisted-By: Claude Sonnet 4.5
Signed-off-by: Patryk Diak pdiak@redhat.com
Example using claude code when asked about ovs drops:
Based on #1