This repository defines a set of Model Context Protocol (MCP) tools for debugging OpenShift nodes and collecting logs. The tools are implemented in pkg/sdkserver/tools.go and can be registered with any MCP server built using the mcp-go SDK.
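package main

import (
	"github.com/harche/crio-mcp-server/pkg/sdkserver"
	"github.com/mark3labs/mcp-go/server"
)

func main() {
	// Create an MCP server and register every tool from this repository on it.
	s := server.NewMCPServer("demo", "1.0.0")
	sdkserver.RegisterTools(s)
}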
Runs oc debug on a specified node and executes arbitrary shell commands inside the temporary debug pod.
Arguments:
node_name (string, required) – node to debug
commands (array of string) – commands executed in the pod (defaults to journalctl --no-pager -u crio)
collect_files (bool) – when true, files listed in paths are returned as a tarball resource
paths (array of string) – file or directory paths to copy from the host
When collect_files is enabled, the tool returns an application/tar+gzip archive containing the specified paths.
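The way the commands argument and its default could map onto an oc debug command line can be sketched as follows. This is a minimal illustration, not the code in tools.go: the helper name buildDebugArgs, the use of chroot /host, and joining commands with && are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultDebugCommand mirrors the documented default for the commands argument.
const defaultDebugCommand = "journalctl --no-pager -u crio"

// buildDebugArgs is a hypothetical helper (not taken from tools.go) that
// assembles the arguments for an "oc debug" invocation against one node.
func buildDebugArgs(nodeName string, commands []string) []string {
	if len(commands) == 0 {
		commands = []string{defaultDebugCommand}
	}
	// Assumption: join with "&&" so a failing command stops the sequence.
	script := strings.Join(commands, " && ")
	// Assumption: chroot into /host so the node's own binaries and files are used.
	return []string{"debug", "node/" + nodeName, "--", "chroot", "/host", "sh", "-c", script}
}

func main() {
	args := buildDebugArgs("worker-0", nil)
	fmt.Println("oc", strings.Join(args, " "))
}
```

Passing the script as a single argument after sh -c avoids quoting problems when a command contains spaces or pipes.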
Streams systemd journal and container runtime logs from a node using oc adm node-logs.
Arguments:
node_name (string, required) – target node
since (string) – RFC3339 timestamp or relative value accepted by journalctl
compress (bool) – if true, return logs as a gzip tarball instead of inline text. The compressed data is returned as a blob resource named node-logs-<node>.txt.gz.
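As a rough sketch of what the compress option involves, the snippet below gzips a log payload and derives the resource name described above. It assumes plain gzip over the log text, as the .txt.gz name suggests; the helper names gzipLogs and gunzip are hypothetical and the repository's implementation may differ.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// gzipLogs is a hypothetical helper: it compresses log text and returns the
// documented resource name together with the compressed bytes.
func gzipLogs(node string, logs []byte) (string, []byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(logs); err != nil {
		return "", nil, err
	}
	// Close flushes the remaining data and writes the gzip footer.
	if err := zw.Close(); err != nil {
		return "", nil, err
	}
	return fmt.Sprintf("node-logs-%s.txt.gz", node), buf.Bytes(), nil
}

// gunzip reverses gzipLogs; a client would do this after fetching the blob.
func gunzip(data []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	name, blob, err := gzipLogs("worker-0", []byte("crio: example log line\n"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s: %d compressed bytes\n", name, len(blob))
}
```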
Runs go tool pprof with the supplied arguments to inspect CPU or memory profiles. Refer to go tool pprof -h for the full set of options.
Arguments:
args (array of string, required) – command-line arguments passed directly to go tool pprof
Runs oc adm must-gather to capture cluster information. Create a temporary directory and pass it using dest_dir to keep all gathered data in one location. See oc adm must-gather -h for the full set of options.
oc adm must-gather can collect almost every artifact that engineers or support need in a single run:
it exports the full YAML for all cluster-scoped and namespaced resources (Deployments, CRDs, Nodes, ClusterOperators, etc.);
it captures pod and container logs, as well as systemd journal slices from each node, to trace runtime crashes or OOMs;
it grabs API-server and OAuth audit logs for security or compliance forensics;
it collects kernel, cgroup, and other node sysinfo, plus tuned and kubelet configs, for performance tuning;
it optionally runs add-on scripts such as gather_network_logs, which archives iptables/OVN flows and CNI pod logs, or gather_profiling_node, which fetches 30-second CPU and heap pprof dumps from both kubelet and CRI-O for hotspot analysis;
and, through plug-in images, it can extend to operator-specific data such as storage states or virtualization metrics.
The result is one reproducible tarball containing configuration, logs, network traces, performance profiles, and security audits for thorough offline debugging.
Arguments:
dest_dir (string) – local directory where the must-gather output is stored
extra_args (array of string) – additional flags forwarded to oc adm must-gather
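The advice to create a temporary directory and pass it as dest_dir can be sketched like this. The helper buildMustGatherArgs is hypothetical and only illustrates how dest_dir and extra_args might translate into a command line; it assumes dest_dir maps to the --dest-dir flag of oc adm must-gather and is not the repository's implementation.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// buildMustGatherArgs is a hypothetical helper that turns the tool arguments
// into an "oc adm must-gather" argument list.
func buildMustGatherArgs(destDir string, extraArgs []string) []string {
	args := []string{"adm", "must-gather"}
	if destDir != "" {
		// Assumption: dest_dir is forwarded as the --dest-dir flag.
		args = append(args, "--dest-dir="+destDir)
	}
	return append(args, extraArgs...)
}

func main() {
	// A fresh temporary directory keeps all gathered data in one location.
	dir, err := os.MkdirTemp("", "must-gather-")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot create temp dir:", err)
		os.Exit(1)
	}
	// Everything after "--" selects the gather script to run.
	args := buildMustGatherArgs(dir, []string{"--", "gather_network_logs"})
	fmt.Println("oc", strings.Join(args, " "))
}
```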
These helpers can be integrated into a custom MCP server or used directly with the mcp-go SDK.
Runs sosreport inside a debug pod using toolbox. This captures detailed diagnostics from a node. Provide a Red Hat case ID if available.
Arguments:
node_name (string, required) – node from which to gather the report
case_id (string) – optional support case identifier passed to sosreport
Runs crictl inside a debug pod to interact directly with the node's container runtime. Use the -h flag on any subcommand for help.
crictl is the lightweight command-line client from the cri-tools project that speaks the Kubernetes Container Runtime Interface (CRI) directly. Because it talks to the node’s container runtime (CRI-O, containerd, etc.) over the local /var/run/.sock, it works even when the kubelet or the API server is unhealthy. Common commands include crictl ps and crictl pods to list running containers or sandboxes, crictl inspect and crictl inspectp for JSON-formatted metadata, crictl logs to read container stdout, crictl exec for a shell, crictl images and crictl pull to manage images, crictl stats for live CPU and memory usage, and crictl runp, create, and start to launch test sandboxes.
Because it bypasses the Kubernetes control-plane layers, crictl is the first tool engineers reach for when debugging low-level runtime or cgroup issues on an OpenShift node.
Arguments:
node_name (string, required) – node on which to run the command
args (array of string) – arguments forwarded to crictl (defaults to ps)
Drops a debug pod onto a node and walks its unified cgroup-v2 hierarchy. By default it lists memory.current for every pod under /sys/fs/cgroup/kubepods.slice, but you can supply custom commands to inspect other files.
Cgroup files are the ground truth for how the Linux kernel enforces every pod’s CPU, memory, I/O, and PID limits. Reading cpu.max, memory.max, io.stat, pids.max, or pressure-stall metrics straight from /sys/fs/cgroup/kubepods.slice/... lets you verify that the values the kubelet intended actually reached the kernel; spot runaway memory or CPU throttling even when metrics-server is down; correlate CRI-O OOM kills with misconfigured requests; and confirm that topology-aware features like CPU Manager wrote the right cpuset.cpus mask.
Arguments:
node_name (string, required) – node whose cgroupfs should be inspected
commands (array of string) – optional shell commands to run inside the debug pod
Runs the gather_network_logs must-gather addon to capture iptables and OVN flows along with CNI pod logs.
Arguments:
dest_dir (string) – directory where the network logs are stored
Collects 30-second CPU and heap profiles from kubelet and CRI-O using the gather_profiling_node script.
Arguments:
dest_dir (string) – directory where the profiling output is written
Fetches recent Kubernetes events from all namespaces via oc get events -A.
Runs oc adm top nodes to gather CPU and memory usage for each node.
Executes a PromQL query against the cluster Prometheus service via oc get --raw.
Arguments:
query (string, required) – the PromQL expression to run
Retrieves logs from a specific pod, similar to oc logs.
Arguments:
namespace (string, required) – namespace of the pod
pod_name (string, required) – pod to read logs from
container (string) – optional container within the pod
since (string) – optional duration (e.g. 5m) to limit logs
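The arguments above correspond closely to oc logs flags. As a hedged sketch, the hypothetical helper buildLogsArgs below validates the since value as a Go duration before forwarding it; whether the actual tool shells out to oc logs or validates input this way is an assumption.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// buildLogsArgs is a hypothetical helper that maps the tool arguments onto
// an "oc logs" argument list and rejects malformed since values early.
func buildLogsArgs(namespace, pod, container, since string) ([]string, error) {
	if namespace == "" || pod == "" {
		return nil, fmt.Errorf("namespace and pod_name are required")
	}
	args := []string{"logs", "-n", namespace, pod}
	if container != "" {
		args = append(args, "-c", container)
	}
	if since != "" {
		// Durations such as "5m" or "1h30m" parse cleanly with the standard library.
		if _, err := time.ParseDuration(since); err != nil {
			return nil, fmt.Errorf("invalid since %q: %w", since, err)
		}
		args = append(args, "--since="+since)
	}
	return args, nil
}

func main() {
	args, err := buildLogsArgs("openshift-etcd", "etcd-master-0", "etcd", "5m")
	if err != nil {
		panic(err)
	}
	fmt.Println("oc", strings.Join(args, " "))
}
```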
Uses oc debug to print kubelet and CRI-O configuration files from the node.
Arguments:
node_name (string, required) – node to inspect
Queries the Red Hat Knowledge Base using the Case Management API.
Arguments:
query (string, required) – search keywords
rows (number) – number of results to return (default 20)
offline_token (string, required) – offline access token for authentication
Retrieves CVE details from the Red Hat Security Data API.
Arguments:
cve_id (string, required) – identifier such as CVE-2025-1234
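Validating cve_id before making a network request avoids pointless lookups. The sketch below checks the identifier format and builds a lookup URL. The endpoint in securityDataBase is an assumption about how the Security Data API is addressed, and cveURL is a hypothetical helper, not the repository's code.

```go
package main

import (
	"fmt"
	"regexp"
)

// cvePattern matches identifiers of the form CVE-YYYY-NNNN, where the
// sequence number has four or more digits.
var cvePattern = regexp.MustCompile(`^CVE-\d{4}-\d{4,}$`)

// securityDataBase is an ASSUMED base URL for per-CVE JSON documents.
const securityDataBase = "https://access.redhat.com/hydra/rest/securitydata/cve/"

// cveURL validates the identifier and returns the URL to fetch.
func cveURL(id string) (string, error) {
	if !cvePattern.MatchString(id) {
		return "", fmt.Errorf("invalid CVE identifier %q", id)
	}
	return securityDataBase + id + ".json", nil
}

func main() {
	u, err := cveURL("CVE-2025-1234")
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
}
```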