Add KEP for kubectl debug #1204

verb · 2019-08-06T19:11:48Z

Adds a provisional KEP proposing a new kubectl debug command based on #277. KEP is not yet fully described, but feedback is welcome.

verb · 2019-08-15T16:04:01Z

I published an example plugin at https://github.com/verb/kubectl-debug

verb · 2019-09-26T10:59:04Z

A demo of the plugin is available at http://bit.ly/kubectl-debug-demo

soltysh · 2019-10-14T11:18:47Z

keps/sig-cli/20190805-pod-troubleshooting.md

+
+Options:
+  -a, --attach=true: Automatically attach to container once created.
+  -c, --container='': Container name. If omitted one will be chosen.


"If omitted, the first container in the pod will be chosen" is the default in other commands, e.g. kubectl attach.

soltysh · 2019-10-14T11:19:41Z

keps/sig-cli/20190805-pod-troubleshooting.md

+  -a, --attach=true: Automatically attach to container once created.
+  -c, --container='': Container name. If omitted one will be chosen.
+  -i, --stdin=true: Pass stdin to the container
+  -m, --image='busybox': Container image to use for debug container.


-m might be confusing; I'd drop the short flag for now. Also, I'm inclined not to default this to anything and instead require the user to always pass it.

soltysh · 2019-10-14T11:20:18Z

keps/sig-cli/20190805-pod-troubleshooting.md

+#### Operations
+
+Alice runs a service "neato" that consists of a statically compiled Go binary
+running in a minimal container image. One of the its pods is suddenly having


One of ~~the~~ its pods

Fixed, thanks.

soltysh · 2019-10-14T11:21:19Z

keps/sig-cli/20190805-pod-troubleshooting.md

+running in a minimal container image. One of the its pods is suddenly having
+trouble connecting to an internal service. Being in operations, Alice wants to
+be able to inspect the running pod without restarting it, but she doesn't
+necessarily need to enter the container itself. She wants to:


That argument falls short, I'd prefer to use the same argument you used at the beginning, that she doesn't have the ability to enter the container b/c it's a scratch image.

The scenario I'm envisioning is attaching a debugger to capture state of a hard-to-reproduce bug when it occurs in production. Restarting the container will destroy the state.

soltysh · 2019-10-14T11:24:08Z

keps/sig-cli/20190805-pod-troubleshooting.md

+
+### Implementation Details/Notes/Constraints
+
+1.  There are no guaranteed resources for ad-hoc troubleshooting. If


That is a rather undesirable effect, since it may cause application downtime. I think we'll need to be explicit about that in the command's help.

I'm even inclined to move this to risks section.

Moved to risks and added that we'll be able to mitigate this once support for resizing pods has been added.

soltysh · 2019-10-14T11:28:22Z

keps/sig-cli/20190805-pod-troubleshooting.md

+  kubectl debug mypod --image=debian
+
+Options:
+  -a, --attach=true: Automatically attach to container once created.


Have you considered always attaching and removing the container when the session ends? There's an InterruptParent handler in https://github.com/kubernetes/kubernetes/blob/3ec5fe500d7a56be6ff6c15f916987eaa48c2e94/staging/src/k8s.io/kubectl/pkg/cmd/exec/exec.go#L136

Oh, I didn't know about InterruptParent, that seems useful. We haven't agreed on the semantics of deleting/recreating ephemeral containers yet, but it's important to me that we support multiple sequential commands in some manner. For example:

    $ kubectl debug $pod ls /tmp
    $ kubectl debug $pod rm /tmp/bigfile

Choosing multiple ephemeral container names would be unfortunate.

Is InterruptParent invoked on the stream being interrupted, or only on signals received by kubectl?

> Is InterruptParent invoked on the stream being interrupted, or only on signals received by kubectl?

Signal.

I have another question wrt attaching, is this to satisfy the use case where you can't use kubectl attach but debug with ephemeral container will work, is that correct?

soltysh · 2019-10-14T11:35:24Z

keps/sig-cli/20190805-pod-troubleshooting.md

+### Container Namespace Targeting
+
+In order to see processes running in other containers in the pod, [Process
+Namespace Sharing] should be enabled for the pod. In cases where process


This should be called out both in the command help and in the constraints section; without it, kubectl debug won't be able to do that much.

I'll make sure it's called out in command help (verb/kubectl-debug@64404a9, 1a39bf5) but I'm not sure it's a constraint since we will support targeting container namespaces. Based on feedback, here's how I think it will be used:

| shareProcessNamespace | Container targeted? | See other processes? | Use case |
| --- | --- | --- | --- |
| true | - | yes | application troubleshooting |
| false | yes | yes | application troubleshooting |
| false | no | no | network troubleshooting |

Concrete example for network troubleshooting: kubedns wants to use distroless containers but wants to be able to run an ephemeral container to be able to dig against localhost.
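The table above can be read as a small decision rule. The following is a minimal sketch of that rule, not code from the KEP or from kubectl; the names `DebugView` and `debug_view` are hypothetical and exist only for illustration.

```python
from typing import NamedTuple


class DebugView(NamedTuple):
    """What a debug container can see, per the table in this thread."""
    sees_other_processes: bool
    use_case: str


def debug_view(share_process_namespace: bool,
               container_targeted: bool = False) -> DebugView:
    """Encode the three rows of the table (hypothetical helper).

    Other processes are visible when the pod shares its process
    namespace, or when the debug container targets another
    container's namespace. Otherwise only pod-level resources such
    as the network namespace remain useful.
    """
    if share_process_namespace or container_targeted:
        return DebugView(True, "application troubleshooting")
    return DebugView(False, "network troubleshooting")


if __name__ == "__main__":
    for share, targeted in [(True, False), (False, True), (False, False)]:
        print(share, targeted, debug_view(share, targeted))
```

The last row corresponds to the kubedns case: no process visibility is needed to run dig against localhost, because containers in a pod share the network namespace.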

verb · 2019-10-29T14:27:35Z

@aylei you have a lot of expertise in this area, would you be interested in joining the effort to port your kubectl-debug plugin to kubectl proper?

aylei · 2019-10-29T15:15:32Z

> @aylei you have a lot of expertise in this area, would you be interested in joining the effort to port your kubectl-debug plugin to kubectl proper?

Yes, of course. I'm willing to assist in advancing the design and code implementation. However, I don't have much experience with the community process. Could you please tell me how I can help now?

verb · 2019-11-01T10:32:48Z

@aylei Great! The first step is to iterate on the proposal in this PR until the approvers at the top agree that we've fleshed out enough of the details that the proposal is "implementable". I'd say we should start with:

1. Please review the current state of the proposal in this PR and let me know your thoughts.
2. You can prepare a PR to expand the proposal and add yourself as an author.

verb · 2019-11-01T11:12:06Z

@soltysh I pushed a commit which attempts to capture the new scope we discussed. Please have a look at e995c36 and let me know if you agree.

(edited to update ref because I forgot to update the toc)

aylei

The rest LGTM, I will try to expand the todo parts if you're not working on them now. 😊

aylei · 2020-01-02T08:58:02Z

keps/sig-cli/20190805-kubectl-debug.md

+```
+Examples:
+  # Create a copy of 'mypod' with the debugging image for container 'app'
+  kubectl debug mypod --copy-to=mypod-debug --image=myapp-image:debug --container=myapp


Can we override the command of the container 'myapp' via -- COMMAND?

Yes, good point. Added.
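To make the `--copy-to` example above more concrete, here is a rough sketch of the transformation it implies on a pod manifest. This is an illustration under stated assumptions, not the kubectl implementation: the function name `copy_pod_for_debug` is hypothetical, and dropping labels and replacing `command` while clearing `args` are choices made for this sketch only.

```python
import copy
from typing import List, Optional


def copy_pod_for_debug(pod: dict, copy_name: str, container: str, image: str,
                       command: Optional[List[str]] = None,
                       share_processes: bool = True) -> dict:
    """Return a modified copy of a pod manifest for debugging (sketch)."""
    debug_pod = copy.deepcopy(pod)

    # Keep only the namespace and give the copy a new name. Dropping
    # labels is an assumption of this sketch.
    metadata = {"name": copy_name}
    if "namespace" in pod.get("metadata", {}):
        metadata["namespace"] = pod["metadata"]["namespace"]
    debug_pod["metadata"] = metadata
    debug_pod.pop("status", None)  # server-populated, not valid on create

    for c in debug_pod["spec"]["containers"]:
        if c["name"] == container:
            c["image"] = image
            if command is not None:
                # Assumption: "-- COMMAND" replaces the entrypoint, so
                # the original args are cleared rather than appended.
                c["command"] = list(command)
                c.pop("args", None)
            break
    else:
        raise ValueError(f"container {container!r} not found in pod")

    if share_processes:
        debug_pod["spec"]["shareProcessNamespace"] = True
    return debug_pod


if __name__ == "__main__":
    original = {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "mypod", "namespace": "default",
                     "labels": {"app": "myapp"}},
        "spec": {"containers": [{"name": "myapp", "image": "myapp-image:1.0",
                                 "args": ["--serve"]}]},
    }
    print(copy_pod_for_debug(original, "mypod-debug", "myapp",
                             "myapp-image:debug", command=["sh"]))
```

The original manifest is left untouched, which mirrors the intent of debugging a copy rather than the running pod.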

soltysh · 2020-01-09T12:20:15Z

keps/sig-cli/20190805-kubectl-debug.md

+
+Examples:
+  # Get network address configuration from pod mypod
+  kubectl debug mypod ip addr list


Given the image is required, this doesn't make sense; it's rather kubectl exec ip addr list. Unless this is to run a command in the ephemeral container because the main container is distroless and we can't exec into it, in which case you'll probably want to add --attach=false too.

Good point, I neglected to update the example when making the image required. I would like to be able to set a default container image for debug, but I don't want to get hung up on it for v1.

--attach must be true because that's where the output of ip addr list will go. Unless we were to create the Ephemeral Container and do an exec rather than setting Container.Args to args. There would be some advantages to using exec this way, actually.

Rather than getting hung up on this, I've removed this example to focus on the interactive debugging use case for the first version.

soltysh · 2020-01-09T12:27:24Z

keps/sig-cli/20190805-kubectl-debug.md

+                       the copy to receive traffic from a service or a replicaset
+                       to kill other pods.
+  --share-processes=true: When used with `--copy-to`, enable process namespace
+                          sharing in the pod copy.


I'd add --node-name to be able to place the pod on a specific node, additionally this will be necessary in the node debug mode.

Other options include --as-user --as-root, which would allow you to run as specific user or as root. --to-namespace would be also complementary to --copy-to.

Generally, we can have several other modifiers for the debug pod, including liveness and readiness probes (you might not care about those), as well as init containers, which might not be needed for debug.

Good point about node-name, I'll add that. Or perhaps --same-node?

It would be useful to allow arbitrary edits prior to create. What do you think about an --edit that opens an editor similar to kubectl edit?

Re: --node-name for node debug mode, I expected the node name to be a positional parameter like kubectl debug node/$node_name, but I haven't yet investigated the conventions for specifying resources on the command line.

Right, but node-name would be only for node debug. What if you want to land your pod on a specific node for other reasons?

soltysh · 2020-01-09T12:35:20Z

keps/sig-cli/20190805-kubectl-debug.md

+privileged pod constrained to this particular node and run in the host
+namespaces.
+
+TODO: how?


Just create a pod, for a specific node, with HostNetwork, HostPid, HostPath set accordingly.

Also, we don't need to have this rock solid, defined. We'll iterate as we go.

Ok, added a brief description.
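The suggestion above (a pod pinned to a node, with the host namespaces and a host path) might look roughly like the manifest built below. This is a sketch only: the helper name `node_debug_pod`, the pod and container names, and the `/host` mount path are all hypothetical. The field names (`nodeName`, `hostNetwork`, `hostPID`, `hostPath`, `securityContext.privileged`) are standard Pod API fields.

```python
import json


def node_debug_pod(node_name: str, image: str) -> dict:
    """Build a Pod manifest for debugging a node (illustrative sketch)."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": f"node-debug-{node_name}"},  # hypothetical naming
        "spec": {
            # Setting nodeName places the pod on this node directly,
            # bypassing the scheduler.
            "nodeName": node_name,
            # Run in the host's network and PID namespaces.
            "hostNetwork": True,
            "hostPID": True,
            "restartPolicy": "Never",
            "containers": [{
                "name": "debugger",  # hypothetical container name
                "image": image,
                "stdin": True,
                "tty": True,
                "securityContext": {"privileged": True},
                # Assumption: expose the node's root filesystem at /host.
                "volumeMounts": [{"name": "host-root", "mountPath": "/host"}],
            }],
            "volumes": [{"name": "host-root", "hostPath": {"path": "/"}}],
        },
    }


if __name__ == "__main__":
    print(json.dumps(node_debug_pod("node-1", "debian"), indent=2))
```

Such a pod is highly privileged, so which of these settings are actually needed is something the proposal would iterate on, as noted above.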

soltysh · 2020-01-09T12:36:45Z

keps/sig-cli/20190805-kubectl-debug.md

+
+TODO: how?
+
+TODO: Some CLI options only make sense for pods, and some only make sense for 


I'm not strongly convinced, but apiserver did something in kubernetes/kubernetes#64517 that I was planning on implementing in kubectl, eventually. But let's start simple.

SG, I'll remove for now.

soltysh · 2020-01-09T12:37:45Z

keps/sig-cli/20190805-kubectl-debug.md

+troubleshooting session might resemble:
+
+```
+% kubectl debug -it -m debian neato-5thn0 -- bash


What's -m here? Also, --image is missing.

-m was the short code for --image once upon a time. 😄

Updated here and elsewhere.

soltysh

I left a few more comments here and there, mostly nits. Let's get them shaped and merge this asap.

verb

@soltysh Thanks for the review. I addressed the comments and also updated the status to "implementable" so we can begin implementing for 1.18.

soltysh · 2020-01-15T11:41:40Z

This lgtm, if nobody has any objections I'll tag it after today's SIG-CLI.

verb · 2020-01-15T17:37:20Z

Added the test plan we discussed at today's sig-cli.

soltysh

/lgtm
/approve

k8s-ci-robot · 2020-01-15T19:10:58Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: soltysh, verb

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~keps/sig-cli/OWNERS~~ [soltysh]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Add KEP for Pod Troubleshooting in kubectl

a921f8a

k8s-ci-robot requested review from pwittrock and seans3 August 6, 2019 19:12

soltysh requested changes Oct 14, 2019

corneliusweig mentioned this pull request Oct 23, 2019

Proposal: rename 'debug' to 'debug-pod' kubernetes-sigs/krew-index#289

Closed

verb mentioned this pull request Oct 25, 2019

Ephemeral Containers #277

Closed

23 tasks

Address reviewer feedback

1a39bf5

Update kubectl debug KEP with new scope

e995c36

verb force-pushed the cli-debug branch from a65b991 to e995c36 Compare November 1, 2019 11:13

verb changed the title ~~Add KEP for Pod Troubleshooting in kubectl~~ Add KEP for kubectl debug Nov 6, 2019

verb changed the title ~~Add KEP for kubectl debug~~ Add KEP for kubectl debug Nov 6, 2019

Update kubectl debug KEP with more pod troubleshooting details

b6e6d4e

verb force-pushed the cli-debug branch from 98475f8 to b6e6d4e Compare December 6, 2019 14:29

aylei reviewed Jan 2, 2020

BenTheElder mentioned this pull request Jan 2, 2020

Ephemeral containers using kubectl-debug does not seem to work kubernetes-sigs/kind#1210

Closed

soltysh mentioned this pull request Jan 9, 2020

KEP 1441 - kubectl debug #1441

Closed

4 tasks

soltysh self-assigned this Jan 9, 2020

soltysh reviewed Jan 9, 2020

soltysh approved these changes Jan 9, 2020

Apply suggestions from code review

91443ae

Co-Authored-By: Aylei <[email protected]>

Update kubectl debug KEP and mark implementable

0472ad6

- Expand command arguments in examples - Add description of how node debug works - Mark implementable

verb commented Jan 9, 2020

kubectl debug KEP: add test plan

af65698

soltysh approved these changes Jan 15, 2020

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 15, 2020

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 15, 2020

k8s-ci-robot merged commit c283365 into kubernetes:master Jan 15, 2020

k8s-ci-robot added this to the v1.18 milestone Jan 15, 2020

verb deleted the cli-debug branch July 23, 2021 14:26

