
Allow custom tolerations with operator #1617

Closed
cosandr opened this issue Dec 8, 2023 · 13 comments · Fixed by #1686
Labels: enhancement (New feature or request), operator (Device operator related issue)

Comments


cosandr commented Dec 8, 2023

Hi,

I'd like to deploy the gpu plugin with custom tolerations, i.e.

tolerations:
  - operator: "Exists"
    key: node-role.kubernetes.io/control-plane
    effect: NoSchedule

but this doesn't appear to be possible. The CRD supports node selectors but not tolerations.

My workaround is to deploy the daemonset with kustomize and patch it afterwards to add my tolerations, but I'd like to switch to the operator if possible. I admit I haven't tried patching the daemonset deployed by the operator; I assumed that's a bad idea and that the operator would eventually overwrite it.
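For what it's worth, a minimal sketch of that kustomize workaround, assuming the upstream GPU plugin daemonset (named intel-gpu-plugin here) is used as the base; the base path and names are illustrative:

# kustomization.yaml (illustrative)
resources:
  - ../gpu_plugin/base
patches:
  - target:
      kind: DaemonSet
      name: intel-gpu-plugin
    patch: |
      - op: add
        path: /spec/template/spec/tolerations
        value:
          - operator: "Exists"
            key: node-role.kubernetes.io/control-plane
            effect: NoSchedule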

tkatila (Contributor) commented Dec 11, 2023

Hi @cosandr and thanks for the issue!

For some plugins we support annotations in the CR; this would be similar and definitely doable.
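A rough sketch of how that could look in a GpuDevicePlugin CR; the tolerations field here is only a proposal, and the node selector label is illustrative:

apiVersion: deviceplugin.intel.com/v1
kind: GpuDevicePlugin
metadata:
  name: gpudeviceplugin-sample
spec:
  nodeSelector:
    intel.feature.node.kubernetes.io/gpu: "true"   # illustrative NFD label
  # proposed field, not in the CRD at the time of writing
  tolerations:
    - key: node-role.kubernetes.io/control-plane
      operator: "Exists"
      effect: NoSchedule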

tkatila added the enhancement (New feature or request) and operator (Device operator related issue) labels on Dec 11, 2023
eero-t (Contributor) commented Dec 11, 2023

@cosandr Are you running GPU workloads on a control plane node in production, or is this just for being able to test things with a single-node setup?

As that seems like quite an uncommon practice, do you have any other use case where tolerations would be useful?

cosandr (Author) commented Dec 11, 2023

> @cosandr Are you running GPU workloads on a control plane node in production, or is this just for being able to test things with a single-node setup?
>
> As that seems like quite an uncommon practice, do you have any other use case where tolerations would be useful?

The example is not from production, no. I would say it's relatively common to taint specialized nodes (for example, the nvidia.com/gpu:NoSchedule taint is added by some cloud providers by default), and it's conceivable that someone would want to do a similar thing with Intel accelerators as well.
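For illustration, tainting a GPU node by hand would look like this (the node name and taint key are arbitrary examples):

kubectl taint nodes gpu-node-1 gpu.intel.com/gpu=present:NoSchedule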

eero-t (Contributor) commented Dec 11, 2023

Ok, that's a really good use-case, and NFD actually supports tainting nodes with specific devices.

I think we would want to support such an option in the operator too:

  • Adding an NFD rule to taint nodes with a given device type
  • Adding a toleration for that taint to the corresponding device plugin deployment

@tkatila, any comments?

EDIT: I think it should be a per-plugin option, as some nodes might have multiple device types, and multiple different taints per node could be awkward.

mythi (Contributor) commented Dec 11, 2023

> Ok, that's a really good use-case, and NFD actually supports tainting nodes with specific devices.

#1571 discussed this area too. Perhaps we need to think through the cases.

tkatila (Contributor) commented Dec 11, 2023

> • Adding an NFD rule to taint nodes with a given device type

I don't understand this. Can you clarify?

> • Adding a toleration for that taint to the corresponding device plugin deployment
>   EDIT: I think it should be a per-plugin option, as some nodes might have multiple device types, and multiple different taints per node could be awkward.

Yup, making it per CR seems like a good solution to me.

eero-t (Contributor) commented Dec 11, 2023

It's an experimental NFD feature: https://nfd.sigs.k8s.io/usage/customization-guide#node-tainting

EDIT: Because it's still experimental, requires NFD to run with an enabling flag, and the NFD worker would also need the toleration, it may be better to start just by supporting user-specified tolerations in the operator (and add NFD node tainting once NFD has that support enabled by default).
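A rough sketch of what such an NFD rule could look like, assuming the experimental tainting feature is enabled in nfd-master; the rule name, taint key, and PCI match values are illustrative:

apiVersion: nfd.k8s-sigs.io/v1alpha1
kind: NodeFeatureRule
metadata:
  name: intel-gpu-taint
spec:
  rules:
    - name: "taint nodes that have an Intel GPU"
      taints:
        - key: "gpu.intel.com/gpu"
          value: "present"
          effect: NoSchedule
      matchFeatures:
        - feature: pci.device
          matchExpressions:
            vendor: {op: In, value: ["8086"]}
            class: {op: In, value: ["0300"]}   # display controllers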

tkatila (Contributor) commented Dec 11, 2023

> It's an experimental NFD feature: https://nfd.sigs.k8s.io/usage/customization-guide#node-tainting

The NFD worker will then need a toleration for that taint too, though...

We can try them out, document the usage, and maybe create examples. But I would keep them as optional/advanced scenarios.

mythi (Contributor) commented Dec 12, 2023

> I admit I haven't tried patching the daemonset deployed by the operator; I assumed that's a bad idea and that the operator would eventually overwrite it.

@cosandr the operator takes the daemonset "base" in at compile time (see deployments/daemonsets.go). If you edit the base daemonset and build your custom operator for testing purposes, it should work without the risk of the actual plugin deployment getting overwritten.

winromulus commented

We also have other taints we put on certain nodes to restrict scheduling for specific workloads. Adding tolerations is a must since the device is present on those nodes.

tkatila (Contributor) commented Feb 16, 2024

@cosandr & @winromulus, a question or concern about this request: if the node has a taint and the plugin has a toleration, the workloads would also require the same toleration. Compared to just having the resource request, that feels bad from a user experience point of view.

Is this something you'd be fine with?

winromulus commented

@tkatila this is actually very much intended. If you need a node to run only certain workloads, you apply taints and give the workloads tolerations.
I'll give a practical example: if I have an Intel-GPU-only node and don't want any other kind of workload to be scheduled on that node, I apply the taint to the node and have the plugin and the workload carry tolerations. (This cannot be achieved with node affinity or alternatives, because those do not prevent daemonsets or other workloads from being scheduled to that node.)
If you check the NVIDIA operator, it tolerates ANY taint, specifically for this reason. The plugin should start on any node where devices are found, regardless of taints, and the workloads can set their own tolerations to target that node.
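For reference, the tolerate-everything pattern described above is just a toleration with the Exists operator and no key, which matches any taint:

tolerations:
  - operator: "Exists"   # empty key + Exists tolerates all taints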

tkatila (Contributor) commented Feb 16, 2024

Thanks @winromulus

So to summarize: run the GPU plugin on all nodes with GPU hardware, regardless of taints. Workloads request the GPU resource and have toleration(s) for the tainted node.

I will look into adding tolerations support to the operator.
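To illustrate the summary, a workload pod on such a tainted node could look roughly like this; the taint key is the illustrative one used earlier, while gpu.intel.com/i915 is the resource the GPU plugin advertises:

apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload
spec:
  containers:
    - name: app
      image: my-gpu-app:latest   # illustrative image
      resources:
        limits:
          gpu.intel.com/i915: 1
  tolerations:
    - key: "gpu.intel.com/gpu"
      operator: "Exists"
      effect: NoSchedule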
