
CORENET-7091: Add enhancement proposal to productize ovn-kubernetes MCP tools #2002

Open

arkadeepsen wants to merge 1 commit into openshift:master from arkadeepsen:ovnk-mcp

Conversation

@arkadeepsen (Member)

No description provided.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 8, 2026
@openshift-ci-robot

openshift-ci-robot commented May 8, 2026

@arkadeepsen: This pull request references CORENET-7091 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.


OVN-Kubernetes operators and support engineers often need Northbound and Southbound database views (`ovn-nbctl`, `ovn-sbctl`, traces, logical flows) while investigating connectivity and routing. These tools are already implemented in ovn-kubernetes-mcp, but OpenShift users benefit from consuming them via a **single MCP server** that shares authentication, tool governance, and documentation with the rest of the platform troubleshooting surface.

The primary motivation for landing these tools in upstream kubernetes-mcp-server is **productization via downstream sync into openshift-mcp-server**. By first integrating the OVN toolset upstream, OpenShift can ship and support the same upstream code through the established downstream pipeline.

Instead of productization, can we say that keeping all OpenShift-related MCP servers in a single repository is the main motivation? Or we can keep both.

Member Author

Added a line stating that the ovnk tools can be consumed from the same ocp mcp server.

kms --> Sync
```

**Downstream.** openshift-mcp-server consumes kubernetes-mcp-server changes through its normal fork sync or vendor workflow (exact mechanics follow that repository’s documented process).
Don't we want to add any more implementation details, such as which exact tools will be added and what purpose they may serve?

Member Author

I have added that all the tools under the ovn and ovs packages will be added to the OCP MCP server, along with some more details about how to go about the implementation. I didn't want to add specific details of the local PoC I did, as that might not be the only way of implementing the integration.

- Add an `ovn-kubernetes` toolset to kubernetes-mcp-server that reuses the existing OVN MCP tool implementations from ovn-kubernetes-mcp, rather than re-implementing equivalent functionality.
- Enable kubernetes-mcp-server to execute OVN tool commands in-cluster using its existing pod-exec capabilities, with only minor upstream refactoring required in the imported OVN tools.
- Import the OVN and OVS layers from ovn-kubernetes-mcp incrementally (starting with core OVN/OVS troubleshooting tools), expanding coverage as dependencies and eval coverage mature.
- Make the toolset available to OpenShift users through openshift-mcp-server via downstream sync from kubernetes-mcp-server.
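The first two goals above can be sketched as a small Go pattern: a toolset registry in which the host server registers tools whose handlers are thin wrappers over imported package functions. This is a minimal illustration only; every name here (`Tool`, `RegisterOVNKubernetesToolset`, `nbShow`) is a hypothetical stand-in, not the actual API of either repository:

```go
package main

import "fmt"

// ToolHandler is a hypothetical stand-in for the handler signature the host
// server expects from a registered MCP tool.
type ToolHandler func(args map[string]string) (string, error)

// Tool is an illustrative tool descriptor: name, description, and handler.
type Tool struct {
	Name        string
	Description string
	Handler     ToolHandler
}

// nbShow stands in for imported handler logic (e.g. from ovn-kubernetes-mcp's
// OVN package): validate inputs, then build the command to run.
func nbShow(args map[string]string) (string, error) {
	pod := args["pod"]
	if pod == "" {
		return "", fmt.Errorf("pod argument is required")
	}
	return fmt.Sprintf("would exec in %s: ovn-nbctl show", pod), nil
}

// RegisterOVNKubernetesToolset wires imported handlers into the host server's
// registry under a single "ovn-kubernetes" toolset name, exposing only the
// supported subset rather than re-implementing the logic.
func RegisterOVNKubernetesToolset(registry map[string][]Tool) {
	registry["ovn-kubernetes"] = []Tool{
		{Name: "ovn_nb_show", Description: "Inspect the OVN Northbound database", Handler: nbShow},
	}
}

func main() {
	registry := map[string][]Tool{}
	RegisterOVNKubernetesToolset(registry)
	for _, t := range registry["ovn-kubernetes"] {
		out, err := t.Handler(map[string]string{"pod": "ovnkube-node-abc"})
		fmt.Println(t.Name, out, err)
	}
}
```

The point of the sketch is the split: the imported package owns validation and command construction, while the host owns registration and exposure.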

Is having an automated sync mechanism between ovn-mcp-server, kubernetes-mcp-server and openshift-mcp-server also a goal of this feature?

Member Author

The current plan is to import the packages from the ovn-kubernetes-mcp repo. Thus, whenever we need the latest changes in kubernetes-mcp-server, the go.mod and go.sum files can be updated to refer to the latest changes from the ovn-kubernetes-mcp repo. Regarding the automation: since kubernetes-mcp-server is in a separate upstream repo where we are not maintainers, I'm not sure whether adding the automatic sync process as part of this EP would be appropriate. We can figure that part out, if needed, in the future. For now, we'll just bump the import as we do for the k8s bump in the different repos.
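Concretely, under that plan a sync would likely be an ordinary Go dependency bump in kubernetes-mcp-server's go.mod. The module path below is inferred from the repository URL and the version is a placeholder, not a real release:

```
require github.com/ovn-kubernetes/ovn-kubernetes-mcp v0.x.y // placeholder version; bump like any other dependency update
```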


Given that must-gather is downstream specific, bringing it into the kubernetes-mcp-server would not be a problem, right?

@arkadeepsen (Member Author), May 13, 2026

There's already an existing downstream effort for must-gather. It differs from how it's been implemented in ovn-kubernetes-mcp repo. If we want to integrate the networking bits from the must-gather tool, we'll have to do it in the openshift-mcp-server directly, as kubernetes-mcp-server won't have must-gather related tools.


okay, we can skip using must-gather tool from ovn-kubernetes-mcp and use the existing one. We can try to directly add networking bits to kubernetes-mcp-server to imitate behaviour in ovn-kubernetes-mcp. Can we consider this one of the goals?

Member Author

It will not work on kubernetes-mcp-server as the must-gather implementation is in downstream openshift-mcp-server.

@arkadeepsen force-pushed the ovnk-mcp branch 2 times, most recently from aaccb39 to bbf81a5 on May 12, 2026 15:54

- Add an `ovn-kubernetes` toolset to kubernetes-mcp-server that reuses the existing OVN MCP tool implementations from ovn-kubernetes-mcp, rather than re-implementing equivalent functionality.
- Enable kubernetes-mcp-server to execute OVN/OVS tool commands in-cluster using its existing pod-exec capabilities, with only minor refactoring required in **ovn-kubernetes-mcp** and **kubernetes-mcp-server** to integrate that pod-exec path cleanly.
- Import the full OVN and OVS handler set from ovn-kubernetes-mcp (`pkg/ovn/mcp` and `pkg/ovs/mcp`) into the `ovn-kubernetes` toolset, while other upstream packages stay excluded per Non-Goals.

As per today's discussion, we should mention the kernel and sosreport tools, which are helpful for exploring a node's kernel resources.

Member Author

This is already changed.

- Full parity in the first iteration with every tool category shipped by the standalone ovn-kubernetes-mcp binary (for example kernel diagnostics, optional images such as pwru/tcpdump, must-gather, sosreport) where those require separate dependencies, images, or workflows.
- New Kubernetes or OpenShift APIs, CRDs, operators, or cluster-side agents solely for this feature.
- Replacing existing CLI-based troubleshooting; MCP tools are an additional interface.
- Importing ovn-kubernetes-mcp tools under `kernel` and `network-tools` packages in the first iteration, since those tools depend on a node debugging capability (for example a node-debug tool) that is not currently available in kubernetes-mcp-server.

We probably need to remove this.

Member Author

This is already changed.


### Non-Goals

- Full parity in the first iteration with every tool category shipped by the standalone ovn-kubernetes-mcp binary (for example kernel diagnostics, optional images such as pwru/tcpdump, must-gather, sosreport) where those require separate dependencies, images, or workflows.

Since ovn-kubernetes-mcp is an upstream repo, we can't expect all current and future tools to be applicable to an OpenShift environment.
Given that we plan to import the packages from ovn-kubernetes-mcp repo, how should we control access to tools that may not be supported?

Member Author

We're only going to call the handlers of the tools which are supported. The import is for the packages where these handlers are defined. Unsupported handlers should not be used.

- https://redhat.atlassian.net/browse/CORENET-7091
see-also:
- https://github.com/ovn-kubernetes/ovn-kubernetes-mcp
- https://github.com/containers/kubernetes-mcp-server

NIT: do we need kubernetes-mcp-server and openshift-mcp-server here?

Member Author

Yes. Since the implementation of the EP will impact all of these repos, we need all of them to be included here.

### User Stories

- As a cluster administrator or platform engineer, I want OVN-Kubernetes MCP troubleshooting tools in the same MCP server I already use for Kubernetes resources, so that I do not have to deploy, operate, or manage authentication for a second MCP server dedicated only to OVN-Kubernetes.
- As a support engineer, I want MCP clients to expose the full ovn-kubernetes-mcp troubleshooting surface that kubernetes-mcp-server imports—NB/SB inspection and related `ovn-*` workflows (including `get`, `lflow-list`, `trace` where those tools apply), OVS bridge and OpenFlow helpers, and **`kernel`** / **`network-tools`** host and capture tooling—so that assisted troubleshooting matches how other cluster operations are automated without switching servers or credentials mid-incident.

Suggested change
- As a support engineer, I want MCP clients to expose the full ovn-kubernetes-mcp troubleshooting surface that kubernetes-mcp-server imports—NB/SB inspection and related `ovn-*` workflows (including `get`, `lflow-list`, `trace` where those tools apply), OVS bridge and OpenFlow helpers, and **`kernel`** / **`network-tools`** host and capture tooling—so that assisted troubleshooting matches how other cluster operations are automated without switching servers or credentials mid-incident.
- As a support engineer, I want MCP clients to expose the full ovn-kubernetes-mcp troubleshooting surface that kubernetes-mcp-server imports (NB/SB inspection and related `ovn-*` workflows (including `get`, `lflow-list`, `trace` where those tools apply), OVS bridge and OpenFlow helpers, and **`kernel`** / **`network-tools`** host and capture tooling) so that assisted troubleshooting matches how other cluster operations are automated without switching servers or credentials mid-incident.

Member Author

There's already a bracket between the dashes.


**Importing upstream tools into kubernetes-mcp-server.** The OVN troubleshooting MCP tools already exist in ovn-kubernetes-mcp. The integration approach for kubernetes-mcp-server is to add an `ovn-kubernetes` toolset that reuses those implementations as imported packages and exposes them through kubernetes-mcp-server’s tool registration.

**Command execution strategy.** OVN/OVS tools run commands inside OVN-Kubernetes pods via kubernetes-mcp-server’s pod exec. **`kernel`** and **`network-tools`** handlers use the node-level execution contract wired up in the same integration (for example debug pod or node-targeted exec, as the upstream packages require). Imported libraries should delegate all cluster I/O to kubernetes-mcp-server rather than opening separate Kubernetes client connections. Expect **refactoring in ovn-kubernetes-mcp and kubernetes-mcp-server** so each category uses a clear, single host-supplied execution path per invocation.

I understand that kubernetes-mcp-server is building its own node-debug method to allow host access using the kubectl/oc CLI. However, in ovn-kubernetes-mcp we use a different method to do node debug for kernel and other network tools. I wonder how we can use tools from ovn-kubernetes-mcp while using the utility from kubernetes-mcp-server, considering it's downstream of ovn-kubernetes-mcp.

Member Author

The same way we'll use pod-exec from kubernetes-mcp-server for the OVN/OVS tools. The function definition should be similar, that is, the argument list and the return type of the node-debug function should be the same in both ovn-kubernetes-mcp and kubernetes-mcp-server, since it will be called by the kernel and network-tools handlers.


Shall we mention this explicitly in the document? From what I understand, the current kubernetes-mcp-server does not have any node-debug capability so far, so if that needs to be implemented it is worth calling out in this section.


**Command execution strategy.** OVN/OVS tools run commands inside OVN-Kubernetes pods via kubernetes-mcp-server’s pod exec. **`kernel`** and **`network-tools`** handlers use the node-level execution contract wired up in the same integration (for example debug pod or node-targeted exec, as the upstream packages require). Imported libraries should delegate all cluster I/O to kubernetes-mcp-server rather than opening separate Kubernetes client connections. Expect **refactoring in ovn-kubernetes-mcp and kubernetes-mcp-server** so each category uses a clear, single host-supplied execution path per invocation.

**Scope.** All troubleshooting tools under ovn-kubernetes-mcp **`ovn`**, **`ovs`**, **`kernel`**, and **`network-tools`** belong to this effort (NB/SB inspection, logical flows, OVN trace, OVS bridge and OpenFlow helpers, kernel-oriented diagnostics, and **`network-tools`**-style capture where applicable). Other ovn-kubernetes-mcp surfaces—must-gather, sosreport, and similar—remain out of scope unless separately agreed; see Non-Goals.

Suggested change
**Scope.** All troubleshooting tools under ovn-kubernetes-mcp **`ovn`**, **`ovs`**, **`kernel`**, and **`network-tools`** belong to this effort (NB/SB inspection, logical flows, OVN trace, OVS bridge and OpenFlow helpers, kernel-oriented diagnostics, and **`network-tools`**-style capture where applicable). Other ovn-kubernetes-mcp surfaces—must-gather, sosreport, and similar—remain out of scope unless separately agreed; see Non-Goals.
**Scope.** All troubleshooting tools under ovn-kubernetes-mcp **`ovn`**, **`ovs`**, **`kernel`**, and **`network-tools`** belong to this effort (NB/SB inspection, logical flows, OVN trace, OVS bridge and OpenFlow helpers, kernel-oriented diagnostics, and **`network-tools`**-style capture where applicable). Other ovn-kubernetes-mcp surfaces (must-gather, sosreport, and similar) remain out of scope unless separately agreed; see Non-Goals.

Member Author

Done

@openshift-ci (Contributor)

openshift-ci Bot commented May 13, 2026

@arkadeepsen: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@taanyas left a comment

lgtm


## Open Questions

- How to structure mcpchecker suites or task labels so OVN/OVS, **`kernel`**, and **`network-tools`** coverage stays maintainable under kubernetes-mcp-server’s pass-rate gates, given differing cluster prerequisites?

For the mcpchecker structure — since kernel and network-tools require privileged node access which may not be available in all CI environments, would it make sense to have separate suites for OVN/OVS and kernel/network-tools so their pass rates are tracked independently?

Member Author

I am more inclined towards creating a separate suite for each layer of OVN-K MCP server tools. That is, for each of OVN, OVS, kernel, and network-tools, we'll have separate eval suites. But we can take a call when working on the evals for the tools.

@openshift-ci (Contributor)

openshift-ci Bot commented May 13, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: taanyas
Once this PR has been reviewed and has the lgtm label, please assign abhat for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mattedallo left a comment

lgtm

I added some non-blocking comments.


OVN-Kubernetes operators and support engineers often need Northbound and Southbound database views (`ovn-nbctl`, `ovn-sbctl`, traces, logical flows), host-oriented diagnostics, and packet or kernel-level capture workflows while investigating connectivity and routing. These tools are already implemented in ovn-kubernetes-mcp, but OpenShift users benefit from consuming them via a **single MCP server** that shares authentication, tool governance, and documentation with the rest of the platform troubleshooting surface.

The primary motivation for landing these tools in upstream kubernetes-mcp-server is **productization via downstream sync into openshift-mcp-server**. By first integrating the OVN toolset upstream, OpenShift can ship and support the same upstream code through the established downstream pipeline. This also lets OpenShift customers consume the OVN-Kubernetes tools from the same MCP server as the rest of the platform troubleshooting surface, openshift-mcp-server, after downstream sync.

Nit: maybe we can expand a bit on what cost we are saving by exploiting the existing openshift-mcp-server productization pipeline. That would strengthen the motivation for integrating versus keeping it separate.


None. This work adds MCP tools only and does not extend the OpenShift or Kubernetes API surface.

### Topology Considerations

Minor note : the topology section seems written with the local binary deployment model in mind. It might be worth a brief mention that the same considerations apply for in-cluster deployments, or a note that the OVN-K tools inherit whatever cluster-access model kubernetes-mcp-server provides.



**Split of work:** kubernetes-mcp-server decides how each capability is exposed to MCP users (tool names and parameters). ovn-kubernetes-mcp keeps handler logic that validates inputs, builds command lines, and defines execution contracts; kubernetes-mcp-server integrates by calling those libraries and supplying pod exec, node-level debugging, or other supported cluster operations against the target cluster.

```mermaid

On the diagram few things tripped me up:

  • The main call relationship (kubernetes-mcp-server's tool handler calling ovn-kubernetes-mcp's imported handler logic) isn't shown, and that's the core of the integration.
  • "delegated_in_cluster_execution" sits inside the ovn-kubernetes-mcp box, but the actual execution will happen in kubernetes-mcp-server's client AFAIU. ovn-kubernetes-mcp defines the contract/interface; kubernetes-mcp-server implements it.
  • The box only shows "OVN_OVS" but kernel and network-tools are also in scope, with a different execution path (node-debug vs pod-exec).
  • The two subgraphs connected by a dotted arrow could be read as two separate services communicating at runtime, when in practice ovn-kubernetes-mcp will be compiled into kubernetes-mcp-server as an imported Go package.

Would something like this be more accurate? Let me know your thoughts

```mermaid
flowchart TB
    subgraph kms [kubernetes-mcp-server process]
      ToolHandler["Tool handler\n(defines MCP tool name, schema)"]
      subgraph ovnkLib ["ovn-kubernetes-mcp (imported Go package)"]
        HandlerLogic["Handler logic\n(validates inputs, builds commands)"]
      end
      subgraph executor [kubernetes-mcp-server K8s client]
        PodExec["PodExec\n(OVN/OVS tools)"]
        NodeDebug["NodeDebug\n(kernel / network-tools)"]
      end
      ToolHandler -->|"calls imported package"| HandlerLogic
      HandlerLogic -->|"calls injected executor"| PodExec
      HandlerLogic -->|"calls injected executor"| NodeDebug
      PodExec -->|"exec in ovnkube pod"| Cluster["Cluster"]
      NodeDebug -->|"privileged debug pod on node"| Cluster
    end
```
