Commit 92992a8

committed
updating sysctls for 3.11
1 parent 922cc9a commit 92992a8

1 file changed

+63
-44
lines changed

admin_guide/sysctls.adoc

Lines changed: 63 additions & 44 deletions
Original file line number | Diff line number | Diff line change
@@ -14,18 +14,18 @@ toc::[]
1414

1515
Sysctl settings are exposed via Kubernetes, allowing users to modify certain
1616
kernel parameters at runtime for namespaces within a container. Only sysctls
17-
that are namespaced can be set independently on pods; if a sysctl is not
18-
namespaced (called _node-level_), it cannot be set within {product-title}.
19-
Moreover, only those sysctls considered _safe_ are whitelisted by default; other
20-
_unsafe_ sysctls can be manually enabled on the node to be available to the
17+
that are namespaced can be set independently on pods. If a sysctl is not
18+
namespaced, called _node-level_, it cannot be set within {product-title}.
19+
Moreover, only those sysctls considered _safe_ are whitelisted by default; you
20+
can manually enable other _unsafe_ sysctls on the node to be available to the
2121
user.
2222

2323
[[undersatnding-sysctls]]
24-
== Understanding Sysctls
24+
== Understanding sysctls
2525

2626
In Linux, the sysctl interface allows an administrator to modify kernel
2727
parameters at runtime. Parameters are available via the *_/proc/sys/_* virtual
28-
process file system. The parameters cover various subsystems such as:
28+
process file system. The parameters cover various subsystems, such as:
2929

3030
- kernel (common prefix: *_kernel._*)
3131
- networking (common prefix: *_net._*)
@@ -40,10 +40,10 @@ $ sudo sysctl -a
4040
----
4141

4242
[[namespaced-vs-node-level-sysctls]]
43-
== Namespaced Versus Node-Level Sysctls
43+
== Namespaced versus node-level sysctls
4444

4545
A number of sysctls are _namespaced_ in today’s Linux kernels. This means that
46-
they can be set independently for each pod on a node. Being namespaced is a
46+
you can set them independently for each pod on a node. Being namespaced is a
4747
requirement for sysctls to be accessible in a pod context within Kubernetes.
4848

4949
The following sysctls are known to be namespaced:
@@ -56,63 +56,64 @@ The following sysctls are known to be namespaced:
5656

5757
Sysctls that are not namespaced are called _node-level_ and must be set
5858
manually by the cluster administrator, either by means of the underlying Linux
59-
distribution of the nodes (e.g., via *_/etc/sysctls.conf_*) or using a DaemonSet
60-
with privileged containers.
59+
distribution of the nodes, such as by modifying the *_/etc/sysctl.conf_* file,
60+
or by using a DaemonSet with privileged containers.
6161

6262
[NOTE]
6363
====
6464
Consider marking nodes with special sysctls as tainted. Only schedule pods onto
6565
them that need those sysctl settings. Use the
66-
link:http://kubernetes.io/docs/user-guide/kubectl/kubectl_taint/[Kubernetes _taints and toleration_ feature] to implement this.
66+
xref:../admin_guide/scheduling/taints_tolerations.adoc#admin-guide-taints[taints
67+
and toleration feature] to mark the nodes.
6768
====
6869

6970
[[safe-vs-unsafe-sysclts]]
70-
== Safe Versus Unsafe Sysctls
71+
== Safe versus unsafe sysctls
7172

7273
Sysctls are grouped into _safe_ and _unsafe_ sysctls. In addition to proper
7374
namespacing, a safe sysctl must be properly isolated between pods on the same
74-
node. This means that setting a safe sysctl for one pod:
75+
node. This means that setting a safe sysctl for one pod must not:
7576

76-
- must not have any influence on any other pod on the node,
77-
- must not allow to harm the node's health, and
78-
- must not allow to gain CPU or memory resources outside of the resource limits of
79-
a pod.
77+
- Influence any other pod on the node
78+
- Harm the node's health
79+
- Gain CPU or memory resources outside of the resource limits of a pod
8080

8181
By far, most of the namespaced sysctls are not necessarily considered safe.
8282

83-
For {product-title} 3.3.1, the following sysctls are supported (whitelisted) in
84-
the safe set:
83+
Currently, {product-title} supports, or whitelists, the following sysctls
84+
in the safe set:
8585

8686
- *_kernel.shm_rmid_forced_*
8787
- *_net.ipv4.ip_local_port_range_*
88+
- *_net.ipv4.tcp_syncookies_*
8889

89-
This list will be extended in future versions when the kubelet supports better
90+
This list might be extended in future versions when the kubelet supports better
9091
isolation mechanisms.
9192

9293
All safe sysctls are enabled by default. All unsafe sysctls are disabled by
93-
default and must be allowed manually by the cluster administrator on a per-node
94-
basis. Pods with disabled unsafe sysctls will be scheduled, but will fail to
94+
default, and the cluster administrator must manually enable them on a per-node
95+
basis. Pods with disabled unsafe sysctls will be scheduled but will fail to
9596
launch.
9697

98+
[[enabling-unsafe-sysctls]]
99+
== Enabling unsafe sysctls
100+
101+
The cluster administrator can allow certain unsafe sysctls for very special
102+
situations such as high-performance or real-time application tuning.
103+
104+
If you want to use unsafe sysctls, cluster administrators must enable them
105+
individually on nodes. They can enable only namespaced sysctls.
106+
97107
[WARNING]
98108
====
99109
Due to their nature of being unsafe, the use of unsafe sysctls is
100110
at-your-own-risk and can lead to severe problems like wrong behavior of
101111
containers, resource shortage, or complete breakage of a node.
102112
====
103113

104-
[[enabling-unsafe-sysctls]]
105-
== Enabling Unsafe Sysctls
106-
107-
With the warning above in mind, the cluster administrator can allow certain
108-
unsafe sysctls for very special situations, e.g., high-performance or real-time
109-
application tuning.
110-
111-
If you want to use unsafe sysctls, cluster administrators must enable them
112-
individually on nodes. Only namespaced sysctls can be enabled this way.
113-
114-
. Specify the unsafe sysctls to use as the value of the `kubeletArguments`\ parameter in the appropriate xref:../admin_guide/manage_nodes.adoc#modifying-nodes[node configuration map]
115-
file, as described in xref:../admin_guide/manage_nodes.adoc#configuring-node-resources[Configuring Node Resources]:
114+
. Use the `*kubeletArguments*` field in the *_/etc/origin/node/node-config.yaml_*
115+
file, as described in
116+
xref:../admin_guide/manage_nodes.adoc#configuring-node-resources[Configuring Node Resources], to set the desired unsafe sysctls:
116117
+
117118
----
118119
kubeletArguments:
@@ -134,31 +135,49 @@ ifdef::openshift-origin[]
134135
endif::[]
135136

136137
[[setting-sysctls-for-a-pod]]
137-
== Setting Sysctls for a Pod
138+
== Setting sysctls for a pod
139+
140+
Sysctls are set on pods using the pod's `securityContext`. The `securityContext`
141+
applies to all containers in the same pod.
142+
143+
The following example uses the pod `securityContext` to set a safe sysctl
144+
`kernel.shm_rmid_forced` and two unsafe sysctls, `net.ipv4.route.min_pmtu` and
145+
`kernel.msgmax`. There is no distinction between _safe_ and _unsafe_ sysctls in
146+
the specification.
138147

139-
Sysctls are set on pods using annotations. They apply to all containers in the
140-
same pod.
148+
[WARNING]
149+
====
150+
To avoid destabilizing your operating system, modify sysctl parameters only
151+
after you understand their effects.
152+
====
141153

142-
Here is an example, with different annotations for safe and unsafe sysctls:
154+
Modify the YAML file that defines the pod and add the `securityContext` spec, as
155+
shown in the following example:
143156

157+
[source,yaml]
144158
----
145159
apiVersion: v1
146160
kind: Pod
147161
metadata:
148162
name: sysctl-example
149-
annotations:
150-
security.alpha.kubernetes.io/sysctls: kernel.shm_rmid_forced=1
151-
security.alpha.kubernetes.io/unsafe-sysctls: net.ipv4.route.min_pmtu=1000,kernel.msgmax=1 2 3
152163
spec:
164+
securityContext:
165+
sysctls:
166+
- name: kernel.shm_rmid_forced
167+
value: "0"
168+
- name: net.ipv4.route.min_pmtu
169+
value: "552"
170+
- name: kernel.msgmax
171+
value: "65536"
153172
...
154173
----
155174

156175
[NOTE]
157176
====
158177
A pod with the unsafe sysctls specified above will fail to launch on any node
159-
that has not enabled those two unsafe sysctls explicitly. As with node-level
160-
sysctls, use the
161-
link:http://kubernetes.io/docs/user-guide/kubectl/kubectl_taint[taints and
178+
where the admin has not explicitly enabled those two unsafe sysctls. As with
179+
node-level sysctls, use the
180+
xref:../admin_guide/scheduling/taints_tolerations.adoc#admin-guide-taints[taints and
162181
toleration feature] or
163182
xref:../admin_guide/manage_nodes.adoc#updating-labels-on-nodes[labels on nodes]
164183
to schedule those pods onto the right nodes.
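
The revised text notes that the pod specification does not distinguish between safe and unsafe sysctls, so it can be useful to check a pod definition against the safe set before scheduling it. The following is a minimal sketch, assuming the safe set is exactly the three sysctls listed in this diff and that the pod definition has already been parsed into a Python dictionary. The `split_sysctls` helper and the `SAFE_SYSCTLS` constant are hypothetical names for illustration and are not part of Kubernetes or {product-title}.

```python
# Safe set as listed in the diff above (an assumption; other versions may differ).
SAFE_SYSCTLS = {
    "kernel.shm_rmid_forced",
    "net.ipv4.ip_local_port_range",
    "net.ipv4.tcp_syncookies",
}


def split_sysctls(pod):
    """Split the sysctl names in spec.securityContext.sysctls into (safe, unsafe)."""
    entries = pod.get("spec", {}).get("securityContext", {}).get("sysctls", [])
    safe, unsafe = [], []
    for entry in entries:
        name = entry["name"]
        # Anything outside the safe set must be enabled per node by an administrator.
        (safe if name in SAFE_SYSCTLS else unsafe).append(name)
    return safe, unsafe


# Dictionary form of the sysctl-example pod from the diff.
example_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "sysctl-example"},
    "spec": {
        "securityContext": {
            "sysctls": [
                {"name": "kernel.shm_rmid_forced", "value": "0"},
                {"name": "net.ipv4.route.min_pmtu", "value": "552"},
                {"name": "kernel.msgmax", "value": "65536"},
            ]
        }
    },
}

safe, unsafe = split_sysctls(example_pod)
print("safe:", safe)
print("unsafe (must be enabled on the node):", unsafe)
```

For the example pod, the two names reported as unsafe are the ones that an administrator would have to enable through `kubeletArguments` on the target node.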
