|
| 1 | +--- |
| 2 | +title: Support for CSI volume resizing |
| 3 | +authors: |
| 4 | + - "@gnufied" |
| 5 | +owning-sig: sig-storage |
| 6 | +participating-sigs: |
| 7 | + - sig-storage |
| 8 | +reviewers: |
| 9 | + - "@saad-ali" |
| 10 | + - "@jsafrane" |
| 11 | +approvers: |
| 12 | + - "@saad-ali" |
| 13 | + - "@childsb" |
| 14 | +creation-date: 2019-01-29 |
| 15 | +last-updated: 2019-01-29 |
| 16 | +status: implementable |
| 17 | +see-also: |
| 18 | + - [Kubernetes Volume expansion](https://github.com/kubernetes/enhancements/issues/284) |
| 19 | + - [Online resizing design](https://github.com/kubernetes/enhancements/pull/737) |
| 20 | +replaces: |
| 21 | +superseded-by: |
| 22 | +--- |
| 23 | +
|
| 24 | +# Support for CSI volume resizing |
| 25 | +
|
| 26 | +## Table of Contents |
| 27 | +
|
| 31 | + * [Support for CSI volume resizing](#support-for-csi-volume-resizing) |
| 32 | + * [Table of Contents](#table-of-contents) |
| 34 | + * [Summary](#summary) |
| 35 | + * [Motivation](#motivation) |
| 36 | + * [Goals](#goals) |
| 37 | + * [Non-Goals](#non-goals) |
| 38 | + * [Proposal](#proposal) |
| 39 | + * [External resize controller](#external-resize-controller) |
| 40 | + * [Expansion on Kubelet](#expansion-on-kubelet) |
| 41 | + * [Offline volume resizing on kubelet:](#offline-volume-resizing-on-kubelet) |
| 42 | + * [Online volume resizing on kubelet:](#online-volume-resizing-on-kubelet) |
| 43 | + * [Risks and Mitigations](#risks-and-mitigations) |
| 44 | + * [Test Plan](#test-plan) |
| 45 | + * [Graduation Criteria](#graduation-criteria) |
| 46 | + * [Implementation History](#implementation-history) |
| 47 | +
|
| 48 | +
|
| 49 | +## Summary |
| 50 | +
|
| 51 | +To bring CSI volumes to feature parity with in-tree volumes, we need to implement support for resizing CSI volumes. |
| 52 | +
|
| 53 | +## Motivation |
| 54 | +
|
| 55 | +Volume resizing support was recently added to the CSI spec. This proposal implements the feature for Kubernetes. |
| 56 | +Any CSI volume plugin that implements the necessary parts of the CSI spec will become resizable. |
| 57 | +
|
| 58 | +### Goals |
| 59 | +
|
| 60 | +Enable expansion of CSI volumes used by `PersistentVolumeClaim`s when the CSI plugin supports volume expansion as a plugin capability. |
| 61 | +
|
| 62 | +### Non-Goals |
| 63 | +
|
| 64 | +The expansion capability of a CSI plugin will not be validated via a CSI RPC call when a user edits the PVC (i.e. the existing resize admission controller will not make a CSI RPC call). |
| 65 | +The responsibility of |
| 66 | +actually enabling expansion for certain storage classes still falls on the Kubernetes admin. |
| 67 | +
|
| 68 | +## Proposal |
| 69 | +
|
| 70 | +The design of CSI volume resizing consists of two parts: an external resize controller and expansion on kubelet. |
| 71 | +
|
| 72 | +
|
| 73 | +### External resize controller |
| 74 | +
|
| 75 | +To support resizing of CSI volumes, an external resize controller will monitor all PVCs. If a PVC meets the following criteria for resizing, it will be added to the |
| 76 | +controller's workqueue: |
| 77 | +
|
| 78 | +- The driver name discovered from the PVC matches the name of the driver currently known to the external resize controller (obtained by querying driver info via a CSI RPC call). |
| 79 | +- The PVC has been updated, and comparing the old and new PVC objects shows that the user has requested more space. |
| 80 | +
|
| 81 | +Once a PVC is picked from the workqueue, the controller will also compare the requested PVC size with the actual size of the volume in the `PersistentVolume` |
| 82 | +object. Once the PVC passes all these checks, the controller will make a CSI `ControllerExpandVolume` call, provided the CSI plugin implements the `ControllerExpandVolume` |
| 83 | +RPC call. |
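
The checks above can be sketched roughly as follows. This is only an illustrative model, not the controller's actual code: the `PersistentVolume` and `PersistentVolumeClaim` dataclasses, their field names, and the helper functions `should_enqueue` and `needs_controller_expand` are all hypothetical.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class PersistentVolume:
    csi_driver: str      # hypothetical field: driver name recorded on the PV
    capacity_bytes: int  # models pv.Spec.Capacity


@dataclass
class PersistentVolumeClaim:
    requested_bytes: int  # models pvc.spec.resources.requests.storage


def should_enqueue(old_pvc, new_pvc, pv, controller_driver):
    """Queue the PVC only if the driver matches and more space was requested."""
    driver_matches = pv.csi_driver == controller_driver
    wants_more_space = new_pvc.requested_bytes > old_pvc.requested_bytes
    return driver_matches and wants_more_space


def needs_controller_expand(pvc, pv):
    """Second check, made after the PVC is picked from the workqueue."""
    return pvc.requested_bytes > pv.capacity_bytes


pv = PersistentVolume(csi_driver="example.csi.driver", capacity_bytes=10 * GIB)
old = PersistentVolumeClaim(requested_bytes=10 * GIB)
new = PersistentVolumeClaim(requested_bytes=20 * GIB)
```

With these sample objects, `should_enqueue(old, new, pv, "example.csi.driver")` and `needs_controller_expand(new, pv)` are both true, so the sketch would proceed to the `ControllerExpandVolume` call.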
| 84 | +
|
| 85 | +If the `ControllerExpandVolume` call is successful and the plugin implements `NodeExpandVolume`: |
| 86 | +- if `ControllerExpandVolumeResponse` returns `true` in `node_expansion_required`, then the `FileSystemResizePending` condition will be added to the PVC and the `NodeExpandVolume` operation will be queued on kubelet. The volume size reported by the PV will also be updated to the new value. |
| 87 | +- if `ControllerExpandVolumeResponse` returns `false` in `node_expansion_required`, then the volume resize operation will be marked as finished and both `pvc.Status.Capacity` and `pv.Spec.Capacity` will report the updated value. |
| 88 | +
|
| 89 | +If the plugin does not implement `NodeExpandVolume`, then after successful completion of the `ControllerExpandVolume` RPC call the volume resize operation will be marked as finished and both `pvc.Status.Capacity` and `pv.Spec.Capacity` will report the updated value. |
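
The outcomes of a successful `ControllerExpandVolume` call described above can be summarised in a small sketch. The `ResizeState` type and the `after_controller_expand` function are hypothetical names used only for illustration; the real controller updates API objects rather than returning a value.

```python
from dataclasses import dataclass, field
from typing import List

GIB = 1024 ** 3


@dataclass
class ResizeState:
    pv_capacity: int          # models pv.Spec.Capacity
    pvc_status_capacity: int  # models pvc.Status.Capacity
    conditions: List[str] = field(default_factory=list)
    finished: bool = False


def after_controller_expand(state, new_size, plugin_has_node_expand,
                            node_expansion_required):
    """Return the state after a successful ControllerExpandVolume call."""
    if plugin_has_node_expand and node_expansion_required:
        # The PV reports the new size; the PVC status is left for kubelet
        # to update once NodeExpandVolume has succeeded.
        return ResizeState(
            pv_capacity=new_size,
            pvc_status_capacity=state.pvc_status_capacity,
            conditions=["FileSystemResizePending"],
            finished=False,
        )
    # No node-side work is needed: the resize is finished.
    return ResizeState(new_size, new_size, [], True)


before = ResizeState(pv_capacity=10 * GIB, pvc_status_capacity=10 * GIB,
                     conditions=["Resizing"])
pending = after_controller_expand(before, 20 * GIB, True, True)
done = after_controller_expand(before, 20 * GIB, False, False)
```

Here `pending` models the case where kubelet still has work to do, and `done` models a resize that completes entirely in the controller.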
| 90 | +
|
| 91 | +If the `ControllerExpandVolume` call fails: |
| 92 | +- The PVC will retain the `Resizing` condition and appropriate events will be added to the PVC. |
| 93 | +- The controller will retry the resize operation with exponential backoff, on the assumption that the failure corrects itself. |
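
As a rough illustration of retrying with exponential backoff, consider the sketch below. The base delay, multiplication factor, cap, and bounded attempt count are assumptions made for the example; this proposal does not specify them, and `backoff_delays` and `retry_with_backoff` are hypothetical helpers.

```python
def backoff_delays(base_seconds=1.0, factor=2.0, cap_seconds=300.0, attempts=5):
    """Return the wait before each retry: base, base*factor, ... capped."""
    delays = []
    delay = base_seconds
    for _ in range(attempts):
        delays.append(min(delay, cap_seconds))
        delay *= factor
    return delays


def retry_with_backoff(operation, delays, sleep):
    """Call operation until it returns True, sleeping between attempts.

    Makes at most len(delays) + 1 attempts and returns the final result.
    """
    for delay in delays:
        if operation():
            return True
        sleep(delay)
    return operation()


# Simulated resize call that fails twice and then succeeds.
results = iter([False, False, True])
slept = []
succeeded = retry_with_backoff(lambda: next(results),
                               backoff_delays(1.0, 2.0, 10.0, 5),
                               slept.append)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; in the example the recorded waits are 1 and 2 seconds before the third attempt succeeds.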
| 94 | +
|
| 95 | +A general mechanism for recovering from resize failure will be implemented via: https://github.com/kubernetes/kubernetes/issues/73036 |
| 96 | +
|
| 97 | +### Expansion on Kubelet |
| 98 | +
|
| 99 | +A CSI volume may require expansion on the node to finish volume resizing. In some cases, the entire resizing operation can happen on the node and the |
| 100 | +plugin may choose not to implement the `ControllerExpandVolume` CSI RPC call at all. |
| 101 | +
|
| 102 | +Currently Kubernetes supports two modes of performing volume resize on kubelet. We will describe each mode here. For more information, please refer to the [original volume resize proposal](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/storage/grow-volume-size.md). |
| 103 | +
|
| 104 | +
|
| 105 | +#### Offline volume resizing on kubelet: |
| 106 | +
|
| 107 | +This is the default mode. In this mode, `NodeExpandVolume` will only be called when the volume is being mounted on the node. In other words, the pod that was using the volume must be re-created for expansion on the node to happen. |
| 108 | +
|
| 109 | +When a pod that is using the PVC is started, kubelet will compare `pvc.spec.resources.requests.storage` and `pvc.Status.Capacity`. It also compares the PVC's size with `pv.Spec.Capacity`; if the PV reports the same size as the PVC's spec but the PVC's status still reports a smaller value, kubelet determines that |
| 110 | +a volume expansion is pending on the node. At this point, if the plugin implements the `NodeExpandVolume` RPC call, kubelet will call it and: |
| 111 | +
|
| 112 | +If `NodeExpandVolume` is successful: |
| 113 | +- It will update `pvc.Status.Capacity` with the latest value and remove all resizing-related conditions from the PVC. |
| 114 | +
|
| 115 | +If `NodeExpandVolume` fails: |
| 116 | +- It will add an event to both the PVC and the Pod about the failed resize, and the resize operation will be retried. This will prevent the pod from starting up. |
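
The size comparison kubelet uses to decide whether node expansion is pending can be sketched as below. The function name `node_expansion_pending` and its plain-integer arguments are hypothetical; the sketch follows the rule as worded above (PV size equal to the PVC's requested size, PVC status still smaller) and ignores everything else kubelet checks.

```python
GIB = 1024 ** 3


def node_expansion_pending(pvc_requested, pvc_status_capacity, pv_capacity):
    """True when the PV already reports the requested size but the PVC
    status still reports a smaller value."""
    pv_matches_request = pv_capacity == pvc_requested
    status_is_behind = pvc_status_capacity < pvc_requested
    return pv_matches_request and status_is_behind


# The controller has expanded the PV to 20 GiB; the PVC status still says 10 GiB.
pending = node_expansion_pending(20 * GIB, 10 * GIB, 20 * GIB)

# After a successful NodeExpandVolume the status catches up.
settled = node_expansion_pending(20 * GIB, 20 * GIB, 20 * GIB)
```

In this example `pending` is true and `settled` is false, matching the before and after states of a node-side resize.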
| 117 | +
|
| 118 | +
|
| 119 | +#### Online volume resizing on kubelet: |
| 120 | +
|
| 121 | +More details about online resizing can be found in [Online resizing design](https://github.com/kubernetes/enhancements/pull/737), but essentially, if the |
| 122 | +`ExpandInUsePersistentVolumes` feature is enabled, kubelet will periodically poll all PVCs that are being used on the node, compare `pvc.spec.resources.requests.storage` and `pvc.Status.Capacity` (and also `pv.Spec.Capacity`), and make a similar determination about whether node expansion is required for the volume. |
| 123 | +
|
| 124 | +In this mode, `NodeExpandVolume` can be called while the pod is running and the volume is in use. If, using the aforementioned check, kubelet determines that |
| 125 | +volume expansion is needed on the node and the plugin implements the `NodeExpandVolume` RPC call, kubelet will call it (provided the volume has already been node staged and published on the node) and: |
| 126 | +
|
| 127 | +If `NodeExpandVolume` is successful: |
| 128 | +- It will update `pvc.Status.Capacity` with the latest value and remove all resizing-related conditions from the PVC. |
| 129 | +
|
| 130 | +If `NodeExpandVolume` fails: |
| 131 | +- It will add an event to both the PVC and the Pod about the failed resize, and the resize operation will be retried. |
| 132 | +
|
| 133 | +### Risks and Mitigations |
| 134 | +
|
| 135 | +Before this feature goes GA, we need to handle recovery from resize failures: https://github.com/kubernetes/kubernetes/issues/73036. |
| 136 | +
|
| 137 | +## Test Plan |
| 138 | +
|
| 139 | +* Unit tests for external resize controller. |
| 140 | +* Add e2e tests in Kubernetes that use the csi-mock driver for volume resizing. |
| 141 | + - (positive) Given a plugin that supports both control-plane and node-side resize, the CSI volume should be resizable and the resize should complete successfully. |
| 142 | + - (positive) Given a plugin that only requires control-plane resize, the CSI volume should be resizable and the resize should complete successfully. |
| 143 | + - (positive) Given a plugin that only requires node-side resize, the CSI volume should be resizable and the resize should complete successfully. |
| 144 | + - (positive) Given a plugin that supports online resizing, the CSI volume should be resizable and the online resize operation should complete successfully. |
| 145 | + - (negative) If control-plane resize fails, the PVC should have appropriate events. |
| 146 | + - (negative) If node-side resize fails, both the pod and the PVC should have appropriate events. |
| 147 | +
|
| 148 | +## Graduation Criteria |
| 149 | +
|
| 150 | +Once implemented, CSI volumes should be resizable, in line with the current in-tree implementation of volume resizing. |
| 151 | +
|
| 152 | +- *Alpha* : Initial support for CSI volume resizing. Released code will include an external CSI volume resize controller and changes to Kubelet. Implementation will have unit tests and csi-mock driver e2e tests. |
| 153 | +- *Beta* : More robust support for CSI volume resizing, including recovery from resize failures. Add e2e tests that use real drivers (`gce-pd` and `ebs` at minimum). Add metrics for volume resize operations. |
| 154 | +- *GA* : CSI resizing in general will go GA only after the existing [Volume expansion](https://github.com/kubernetes/enhancements/issues/284) feature goes GA. Online resizing of CSI volumes depends on the [Online resizing](https://github.com/kubernetes/enhancements/pull/737) feature and will be available as a GA feature only when that feature goes GA. |
| 155 | +
|
| 156 | +Hopefully the content previously contained in [umbrella issues][] will be tracked in the `Graduation Criteria` section. |
| 157 | +
|
| 158 | +[umbrella issues]: https://github.com/kubernetes/kubernetes/issues/62096 |
| 159 | +
|
| 160 | +## Implementation History |
| 161 | +
|
| 162 | +Major milestones in the life cycle of a KEP should be tracked in `Implementation History`. |
| 163 | +Major milestones might include |
| 164 | +
|
| 165 | +- the `Summary` and `Motivation` sections being merged signaling SIG acceptance |
| 166 | +- the `Proposal` section being merged signaling agreement on a proposed design |
| 167 | +- the date implementation started |
| 168 | +- the first Kubernetes release where an initial version of the KEP was available |
| 169 | +- the version of Kubernetes where the KEP graduated to general availability |
| 170 | +- when the KEP was retired or superseded |