Commit 254f920

JAORMX and claude committed
Add MCPGroup CRD proposal for Kubernetes operator
This proposal introduces MCPGroup support to the Kubernetes operator, enabling Virtual MCP Server and logical grouping of MCPServer resources.

Key design decisions:
- Explicit groupRef field in MCPServer spec (follows K8s naming conventions)
- Simple MCPGroup CRD with minimal spec (description) and status tracking
- Namespace-scoped groups for security/isolation
- No webhooks - controller-based validation for simplicity
- Optional group membership (unlike CLI where groups are required)

Design rationale:
- MCPGroup as first-class construct (not just labels) enables meta-mcp and virtual MCP to discover and aggregate backend servers
- Provides seamless CLI-to-Kubernetes transition with consistent API
- Explicit group lifecycle management and validation
- Foundation for growing ecosystem of ToolHive constructs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
1 parent 71e2934 commit 254f920

1 file changed: 320 additions & 0 deletions
@@ -0,0 +1,320 @@
# MCPGroup CRD for Kubernetes Operator

## Problem Statement

The CLI supports runtime groups for organizing MCP servers, but the Kubernetes operator has no equivalent. The Virtual MCP Server feature (PR #2106) requires groups to discover and aggregate backend servers. Without groups, Kubernetes users cannot use Virtual MCP or organize their servers logically.

## Goals

- Add MCPGroup support to Kubernetes matching CLI runtime group behavior
- Enable Virtual MCP Server to discover servers in a group
- Maintain API consistency between CLI and Kubernetes
- Keep implementation simple and predictable

## Non-Goals

- Registry groups (CLI-only feature)
- Cross-namespace groups
- Multi-group membership per server
- Client configuration management (not applicable in Kubernetes)

## Design

### Design Decision: MCPGroup CRD vs Labels/Annotations

**Question:** Could we use labels/annotations on MCPServer instead of creating an MCPGroup CRD?

**Answer:** We need MCPGroup as a first-class construct for several reasons:

1. **Meta-MCP and Virtual MCP requirements**: These features need to aggregate multiple MCP servers. They need a way to:
   - Discover which servers belong to a group
   - Reference groups in their configuration
   - Watch for group membership changes

2. **Seamless CLI-to-Kubernetes transition**: The CLI has an explicit Group concept that workloads belong to. Users migrating from CLI to Kubernetes expect the same mental model and API patterns.

3. **Growing ecosystem of constructs**: As we build more features on top of ToolHive (meta-mcp, virtual MCP, future aggregation patterns), we need a consistent way to represent server collections.

4. **Group as an explicit concept**: Labels are meant for flexible, ad-hoc categorization. Groups are a core organizational concept in ToolHive's architecture, deserving explicit representation.

While labels could technically provide grouping, they lack:
- Discoverability (no list of available groups without scanning all servers)
- A place for group-level metadata or status
- Explicit lifecycle management
- Ability to validate references before use

**Conclusion:** MCPGroup CRD provides the foundation for meta-mcp, virtual MCP, and future aggregation features while maintaining consistency with CLI semantics.

### MCPGroup CRD

Simple CRD for grouping servers:

```yaml
apiVersion: mcp.toolhive.stacklok.io/v1alpha1
kind: MCPGroup
metadata:
  name: engineering-team
  namespace: default
spec:
  # Optional human-readable description
  description: "Engineering team MCP servers"

status:
  # Number of servers in this group
  serverCount: 3

  # List of server names for quick reference
  servers:
    - github-server
    - jira-server
    - slack-server

  phase: Ready
  conditions:
    - type: Ready
      status: "True"
      lastTransitionTime: "2025-10-15T10:30:00Z"
```

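The `status.servers` and `status.serverCount` fields above are derived entirely from the MCPServers that reference the group. A minimal sketch of that derivation follows, assuming a trimmed-down `Server` type and a `ComputeGroupStatus` helper that are hypothetical and stand in for the real API types and controller code:

```go
package main

import (
	"fmt"
	"sort"
)

// Server is a hypothetical, trimmed-down stand-in for an MCPServer:
// only the name and the proposed spec.groupRef are modeled.
type Server struct {
	Name     string
	GroupRef string // empty means the server is standalone
}

// GroupStatus mirrors the status.servers and status.serverCount fields.
type GroupStatus struct {
	Servers     []string
	ServerCount int
}

// ComputeGroupStatus collects the servers whose GroupRef equals group.
// Names are sorted so repeated reconciles produce identical status.
func ComputeGroupStatus(group string, servers []Server) GroupStatus {
	names := []string{}
	for _, s := range servers {
		if s.GroupRef == group {
			names = append(names, s.Name)
		}
	}
	sort.Strings(names)
	return GroupStatus{Servers: names, ServerCount: len(names)}
}

func main() {
	servers := []Server{
		{Name: "slack-server", GroupRef: "engineering-team"},
		{Name: "standalone-server"},
		{Name: "github-server", GroupRef: "engineering-team"},
		{Name: "jira-server", GroupRef: "engineering-team"},
	}
	st := ComputeGroupStatus("engineering-team", servers)
	fmt.Println(st.ServerCount, st.Servers)
}
```

Sorting the names is a design choice of this sketch, not something the proposal mandates; it avoids spurious status updates when the list order changes between reconciles.
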
### MCPServer Spec Addition

Add explicit group field to MCPServer:

```yaml
apiVersion: mcp.toolhive.stacklok.io/v1alpha1
kind: MCPServer
metadata:
  name: github-server
  namespace: default
spec:
  # Existing fields...
  image: ghcr.io/stackloklabs/github-server:latest

  # New: explicit group membership
  groupRef: engineering-team
```

**Rationale for explicit groupRef field:**
- Matches CLI behavior (workload has a `Group` field), keeping the API consistent with the CLI
- Follows Kubernetes naming conventions for references (`groupRef` instead of `group`)
- Simple and predictable
- Easy to query: `list MCPServers where spec.groupRef = X`
- Unambiguous membership: each server names at most one group

105+
### API Consistency
106+
107+
CLI runtime groups store membership on the workload:
108+
```go
109+
type Workload struct {
110+
Name string
111+
Group string // Explicit group membership
112+
}
113+
```
114+
115+
Kubernetes should match this pattern:
116+
```go
117+
type MCPServerSpec struct {
118+
// Existing fields...
119+
120+
// GroupRef is the name of the MCPGroup this server belongs to
121+
// +optional
122+
GroupRef string `json:"groupRef,omitempty"`
123+
}
124+
```
125+
126+
### Controller Behavior
127+
128+
**MCPGroup Controller:**
129+
- Watches MCPGroup and MCPServer resources
130+
- Updates `status.servers` list when servers join/leave group
131+
- Updates `status.serverCount`
132+
- Validates referenced group exists when MCPServer is created
133+
134+
**MCPServer Controller:**
135+
- Existing reconciliation logic
136+
- Validates `spec.groupRef` references an existing MCPGroup (if specified)
137+
- Adds condition if group reference is invalid
138+
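The groupRef validation described above can be sketched as a pure function that maps a reference to a status condition. This is an illustration under stated assumptions: `Condition` is a simplified stand-in for `metav1.Condition`, and the condition type `GroupRefValidated` and the reasons `GroupFound` and `GroupNotFound` are hypothetical names that the proposal does not define.

```go
package main

import "fmt"

// Condition is a simplified, hypothetical stand-in for metav1.Condition.
type Condition struct {
	Type    string
	Status  string // "True" or "False"
	Reason  string
	Message string
}

// ValidateGroupRef decides which condition to record for a server.
// existing holds the MCPGroup names present in the server's namespace.
// ok is false when groupRef is empty, since group membership is optional.
func ValidateGroupRef(groupRef string, existing map[string]bool) (cond Condition, ok bool) {
	if groupRef == "" {
		return Condition{}, false
	}
	cond = Condition{Type: "GroupRefValidated"} // hypothetical condition type
	if existing[groupRef] {
		cond.Status = "True"
		cond.Reason = "GroupFound"
		cond.Message = fmt.Sprintf("MCPGroup %q exists", groupRef)
	} else {
		cond.Status = "False"
		cond.Reason = "GroupNotFound"
		cond.Message = fmt.Sprintf("MCPGroup %q not found in namespace", groupRef)
	}
	return cond, true
}

func main() {
	groups := map[string]bool{"engineering-team": true}
	for _, ref := range []string{"engineering-team", "deleted-team", ""} {
		cond, ok := ValidateGroupRef(ref, groups)
		fmt.Printf("groupRef=%q ok=%v status=%q reason=%q\n", ref, ok, cond.Status, cond.Reason)
	}
}
```

Because the check runs during reconciliation rather than in a webhook, the same function would also cover a group that is deleted after the server was created: the next reconcile simply flips the condition to `False`.
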
### Discovery API

Virtual MCP (and other features) can discover servers in a group:

```go
// List all MCPServers in a group.
// Relies on field selector support for spec.groupRef (Phase 1, step 4).
servers, err := clientset.McpV1alpha1().MCPServers(namespace).List(ctx, metav1.ListOptions{
	FieldSelector: "spec.groupRef=engineering-team",
})
if err != nil {
	return fmt.Errorf("listing servers in group: %w", err)
}
```

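Server-side field selectors on custom resources generally work only for `metadata.name` and `metadata.namespace` unless the CRD explicitly declares additional selectable fields, which is presumably what Phase 1, step 4 covers. Until that is in place, a consumer could answer the same query from an in-memory index keyed by namespace and group. The sketch below shows one possible shape of such an index; the `Server` and `GroupIndex` types are hypothetical and not part of the proposal.

```go
package main

import (
	"fmt"
	"sort"
)

// Server is a hypothetical, trimmed-down view of an MCPServer.
type Server struct {
	Namespace string
	Name      string
	GroupRef  string
}

// GroupIndex maps "namespace/groupRef" to the sorted names of member servers.
type GroupIndex map[string][]string

// indexKey builds the lookup key; Kubernetes names cannot contain "/",
// so the separator cannot collide with a namespace or group name.
func indexKey(namespace, group string) string {
	return namespace + "/" + group
}

// BuildGroupIndex indexes servers by namespace and group.
// Standalone servers (empty GroupRef) are skipped.
func BuildGroupIndex(servers []Server) GroupIndex {
	idx := GroupIndex{}
	for _, s := range servers {
		if s.GroupRef == "" {
			continue
		}
		key := indexKey(s.Namespace, s.GroupRef)
		idx[key] = append(idx[key], s.Name)
	}
	for key := range idx {
		sort.Strings(idx[key])
	}
	return idx
}

// ServersInGroup returns the members of a group within one namespace.
func (idx GroupIndex) ServersInGroup(namespace, group string) []string {
	return idx[indexKey(namespace, group)]
}

func main() {
	idx := BuildGroupIndex([]Server{
		{Namespace: "default", Name: "jira-server", GroupRef: "engineering-team"},
		{Namespace: "default", Name: "github-server", GroupRef: "engineering-team"},
		{Namespace: "staging", Name: "github-server", GroupRef: "engineering-team"},
		{Namespace: "default", Name: "standalone-server"},
	})
	fmt.Println(idx.ServersInGroup("default", "engineering-team"))
	fmt.Println(idx.ServersInGroup("staging", "engineering-team"))
}
```

Including the namespace in the key keeps groups namespace-scoped: two groups with the same name in different namespaces never share members.
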
## Implementation

### Phase 1: Core CRD
1. Add `GroupRef` field to MCPServer spec
2. Create MCPGroup CRD types
3. Implement MCPGroup controller
4. Add field selector support for group queries
5. Update CRD manifests and documentation

### Phase 2: Integration
1. Virtual MCP integration with groups
2. kubectl plugin support

## Examples

### Standalone MCPServer (No Group)

MCPServers can run without belonging to a group:

```yaml
apiVersion: mcp.toolhive.stacklok.io/v1alpha1
kind: MCPServer
metadata:
  name: standalone-server
  namespace: default
spec:
  image: ghcr.io/stackloklabs/filesystem:latest
  # No groupRef - server runs independently
```

180+
### MCPServer with Group Membership
181+
182+
```yaml
183+
# Create group
184+
apiVersion: mcp.toolhive.stacklok.io/v1alpha1
185+
kind: MCPGroup
186+
metadata:
187+
name: engineering-team
188+
namespace: default
189+
spec:
190+
description: "Engineering team servers"
191+
---
192+
# Create servers in group
193+
apiVersion: mcp.toolhive.stacklok.io/v1alpha1
194+
kind: MCPServer
195+
metadata:
196+
name: github-server
197+
spec:
198+
image: ghcr.io/stackloklabs/github:latest
199+
groupRef: engineering-team
200+
---
201+
apiVersion: mcp.toolhive.stacklok.io/v1alpha1
202+
kind: MCPServer
203+
metadata:
204+
name: jira-server
205+
spec:
206+
image: ghcr.io/company/jira:latest
207+
groupRef: engineering-team
208+
```
209+
210+
### Virtual MCP Usage
211+
212+
```yaml
213+
# Virtual MCP references the group
214+
# NOTE: This is an example of future MCPVirtualServer API (not yet implemented)
215+
apiVersion: mcp.toolhive.stacklok.io/v1alpha1
216+
kind: MCPVirtualServer
217+
metadata:
218+
name: engineering-virtual
219+
spec:
220+
# References existing group
221+
groupRef: engineering-team
222+
223+
# Virtual MCP configuration
224+
aggregation:
225+
conflictResolution: prefix
226+
```
227+
228+
### Querying Servers in Group
229+
230+
```bash
231+
# List all servers in a group
232+
kubectl get mcpservers -n default --field-selector spec.groupRef=engineering-team
233+
234+
# Check group status
235+
kubectl get mcpgroup engineering-team -o jsonpath='{.status.servers}'
236+
```
237+
238+
## Migration from CLI
239+
240+
CLI groups and Kubernetes groups are separate concepts:
241+
- **CLI groups**: Local runtime groups (`.toolhive/` directory)
242+
- **K8s groups**: Namespace-scoped groups (etcd)
243+
244+
**Key differences from CLI:**
245+
- In CLI: All servers must belong to a group (defaults to "default" group if not specified)
246+
- In K8s: Servers can optionally belong to a group (`spec.groupRef` is optional)
247+
248+
No automatic migration - users manually create MCPGroup resources and set `spec.groupRef` on MCPServers.
249+
250+
## Type Definitions
251+
252+
```go
253+
// MCPGroupSpec defines the desired state of MCPGroup
254+
type MCPGroupSpec struct {
255+
// Description provides human-readable context
256+
// +optional
257+
Description string `json:"description,omitempty"`
258+
}
259+
260+
// MCPGroupStatus defines observed state
261+
type MCPGroupStatus struct {
262+
// Phase indicates current state
263+
// +optional
264+
Phase MCPGroupPhase `json:"phase,omitempty"`
265+
266+
// Servers lists server names in this group
267+
// +optional
268+
Servers []string `json:"servers,omitempty"`
269+
270+
// ServerCount is the number of servers
271+
// +optional
272+
ServerCount int `json:"serverCount"`
273+
274+
// Conditions represent observations
275+
// +optional
276+
Conditions []metav1.Condition `json:"conditions,omitempty"`
277+
}
278+
279+
type MCPGroupPhase string
280+
281+
const (
282+
MCPGroupPhaseReady MCPGroupPhase = "Ready"
283+
)
284+
285+
// Add to MCPServerSpec
286+
type MCPServerSpec struct {
287+
// Existing fields...
288+
289+
// GroupRef is the MCPGroup this server belongs to
290+
// Must reference an existing MCPGroup in the same namespace
291+
// +optional
292+
GroupRef string `json:"groupRef,omitempty"`
293+
}
294+
```
295+
296+
## Open Questions
297+
298+
1. **Should groupRef be immutable after creation?**
299+
- Recommendation: Allow changes, easier for user corrections
300+
301+
2. **What happens if group is deleted?**
302+
- Recommendation: Servers continue running, `spec.groupRef` becomes dangling reference
303+
- Controller will log errors and add conditions to affected MCPServer resources
304+
305+
3. **Should we validate group exists on MCPServer create?**
306+
- Recommendation: Yes, via controller reconciliation
307+
- Controller validates groupRef and adds status conditions if invalid
308+
- No webhook needed - keep implementation simple
309+
## Future Enhancements

- Group-level policies and authorization
- Cross-namespace groups (with security review)
- Group quotas and resource limits

## Testing

- **Unit**: Group validation, status updates
- **Integration (envtest)**: Controller reconciliation, field selectors
- **E2E (Chainsaw)**: Complete group lifecycle, Virtual MCP integration
