
Make the device plugin (dp) aware of the central scheduler's scheduling and device allocation results via a lightweight mechanism #2194

Open
ferris-cx opened this issue Sep 4, 2024 · 2 comments
Labels
kind/proposal Create a report to help us improve

Comments

@ferris-cx

Both the scheduling decision and the device assignment are made in the scheduler plugin, and the result is recorded in the Pod annotations. The device plugin (dp) is not aware of this scheduling result: when the Pod is created, kubelet calls the dp's Allocate method, and since kubelet's native code knows nothing about the scheduler's decision, kubelet picks device IDs with its own allocation algorithm. The device IDs the scheduler already assigned and wrote to the annotations cannot be looked up either, because Allocate does not receive the Pod name.

For these reasons, a node lock scheme can be considered: check whether the Pod requests device resources, and if so, write a node lock in the Bind phase so that only one such Pod holds the lock on a node at a time. On the dp side, query the Pod that is Pending on the current node, parse the device ID list assigned by the central scheduler from the Pod annotations, do any further processing, and finally build the AllocateResponse. In the GPU scheduling scenario a server has at most eight GPU cards, so the number of such Pods per server is small and the performance cost of these extra lookups on Pod creation is small. This option is therefore worth considering.
The general code idea (a sketch of both sides follows the list):

  1. On the central scheduler side, once scheduling succeeds and the assignment result is available, call in the Bind phase:
    current, err := kubeClient.CoreV1().Pods(args.PodNamespace).Get(context.Background(), args.PodName, metav1.GetOptions{}) // get the current Pod object
    LockNode(node, current) // add the node lock
  2. In the device plugin's Allocate(ctx context.Context, reqs *kubeletdevicepluginv1beta1.AllocateRequest) (*kubeletdevicepluginv1beta1.AllocateResponse, error) method:
    current, err := util.GetPendingPod(nodename) // traverse all Pods on the current node and find the Pending Pod
  3. Resolve the assigned device IDs from current.Annotations.
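A minimal sketch of the scheduler-side node lock (step 1), assuming the lock is stored as an annotation on the Node object. The annotation key and the LockNode signature here are illustrative assumptions, not existing koordinator APIs:

```go
package nodelock

import (
	"context"
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// lockAnnotation is a hypothetical key; a real implementation would define its own.
const lockAnnotation = "scheduling.example.com/node-lock"

// LockNode marks the node as locked so that only one device-requesting Pod
// is in the Bind phase on this node at a time.
func LockNode(ctx context.Context, client kubernetes.Interface, nodeName string, pod *corev1.Pod) error {
	node, err := client.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
	if err != nil {
		return err
	}
	if holder, held := node.Annotations[lockAnnotation]; held {
		return fmt.Errorf("node %s is already locked by %s", nodeName, holder)
	}
	if node.Annotations == nil {
		node.Annotations = map[string]string{}
	}
	// Record the holder and a timestamp so a stale lock can later be detected and released.
	node.Annotations[lockAnnotation] = fmt.Sprintf("%s/%s,%s", pod.Namespace, pod.Name, time.Now().Format(time.RFC3339))
	_, err = client.CoreV1().Nodes().Update(ctx, node, metav1.UpdateOptions{})
	return err
}
```

A real implementation would retry on update conflicts and release the lock after Allocate finishes or after a timeout. And a sketch of the dp side (steps 2 and 3), assuming the scheduler wrote the assigned device IDs into a Pod annotation; the annotation key, the GetPendingPod helper, and the NVIDIA_VISIBLE_DEVICES wiring are assumptions for illustration:

```go
package dp

import (
	"context"
	"fmt"
	"strings"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	pluginapi "k8s.io/kubelet/pkg/apis/deviceplugin/v1beta1"
)

// assignedDevicesAnnotation is a hypothetical key written by the scheduler plugin.
const assignedDevicesAnnotation = "scheduling.example.com/assigned-device-ids"

type server struct {
	kubeClient kubernetes.Interface
	nodeName   string
}

// GetPendingPod returns the Pod on nodeName that is still Pending and carries a
// device assignment; the node lock guarantees there is at most one such Pod.
func GetPendingPod(ctx context.Context, client kubernetes.Interface, nodeName string) (*corev1.Pod, error) {
	pods, err := client.CoreV1().Pods("").List(ctx, metav1.ListOptions{
		FieldSelector: "spec.nodeName=" + nodeName,
	})
	if err != nil {
		return nil, err
	}
	for i := range pods.Items {
		p := &pods.Items[i]
		if p.Status.Phase == corev1.PodPending && p.Annotations[assignedDevicesAnnotation] != "" {
			return p, nil
		}
	}
	return nil, fmt.Errorf("no pending pod with an assigned device on node %s", nodeName)
}

// Allocate returns the device IDs the central scheduler assigned instead of the
// IDs kubelet picked with its own allocation algorithm.
func (s *server) Allocate(ctx context.Context, reqs *pluginapi.AllocateRequest) (*pluginapi.AllocateResponse, error) {
	current, err := GetPendingPod(ctx, s.kubeClient, s.nodeName)
	if err != nil {
		return nil, err
	}
	deviceIDs := strings.Split(current.Annotations[assignedDevicesAnnotation], ",")

	resp := &pluginapi.AllocateResponse{}
	for range reqs.ContainerRequests {
		resp.ContainerResponses = append(resp.ContainerResponses, &pluginapi.ContainerAllocateResponse{
			// Expose the scheduler-assigned GPUs to the container runtime.
			Envs: map[string]string{"NVIDIA_VISIBLE_DEVICES": strings.Join(deviceIDs, ",")},
		})
	}
	return resp, nil
}
```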

Please consider this solution and, if feasible, implement it.

@ferris-cx ferris-cx added the kind/proposal Create a report to help us improve label Sep 4, 2024
@ZiMengSheng
Contributor

Will the scheduling Binding Cycle wait or fail if it can't get the lock?

@ferris-cx
Author

Will the scheduling Binding Cycle wait or fail if it can't get the lock?

It means the binding fails.
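For context, a minimal sketch of what "binding fails" could look like in the Bind plugin, assuming the LockNode helper sketched above (same package) and the upstream scheduler framework types from k8s.io/kubernetes/pkg/scheduler/framework; this is illustrative, not the actual koordinator plugin code:

```go
import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/kubernetes/pkg/scheduler/framework"
)

type Plugin struct {
	kubeClient kubernetes.Interface
}

// Bind acquires the node lock before binding; if the lock is already held by
// another Pod, this binding cycle fails and the Pod goes back through scheduling.
func (p *Plugin) Bind(ctx context.Context, state *framework.CycleState, pod *corev1.Pod, nodeName string) *framework.Status {
	if err := LockNode(ctx, p.kubeClient, nodeName, pod); err != nil {
		// Do not wait for the lock: an error status fails the binding cycle.
		return framework.AsStatus(err)
	}
	// ... write the device assignment into the Pod annotations and bind the Pod ...
	return nil
}
```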
