[Cleanup] Refactor multikueue adapter tests #2869

Conversation

mszadkow
Contributor

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

The tests contain too much repetitive code.

Which issue(s) this PR fixes:

Relates #2552

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

@k8s-ci-robot k8s-ci-robot added the release-note-none Denotes a PR that doesn't merit a release note. label Aug 21, 2024
@k8s-ci-robot
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Aug 21, 2024
@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Aug 21, 2024

netlify bot commented Aug 21, 2024

Deploy Preview for kubernetes-sigs-kueue canceled.

🔨 Latest commit: b70db3b
🔍 Latest deploy log: https://app.netlify.com/sites/kubernetes-sigs-kueue/deploys/66c74e4686c557000858cf4d

@mszadkow
Contributor Author

/ok-to-test

@k8s-ci-robot k8s-ci-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Aug 21, 2024
@mszadkow mszadkow force-pushed the cleanup/refactor-multikueue-adapter-tests branch from ef45d3e to 52666e4 Compare August 21, 2024 09:58
@mszadkow
Contributor Author

/retest

@mszadkow mszadkow marked this pull request as ready for review August 21, 2024 10:40
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 21, 2024
@mszadkow
Contributor Author

@alculquicondor @tenzen-y here is the test refactor, please take a look

})
}

func finishWorkloadTestBody(wlLookupKey types.NamespacedName, finishJobReason string) {
Contributor

Suggested change
-func finishWorkloadTestBody(wlLookupKey types.NamespacedName, finishJobReason string) {
+func waitForWorkloadToFinishAndRemoteWorkloadToBeDeleted(wlLookupKey types.NamespacedName, finishJobReason string) {

Why is only the remote workload deleted, by the way?

Contributor Author

I'm not sure about that; I thought that was enough.
For the workload to be deleted in the manager, we would have to update the status.

Contributor

@mimowo @trasc could you follow up on this when you get back from vacation and add a comment into the function for the rationale?
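
For readers following this thread, the shared-helper pattern being renamed here can be sketched roughly as follows. This is a simplified, self-contained illustration and not the code from the PR: the `workload` and `cluster` types and both functions are hypothetical in-memory stand-ins for the real Ginkgo test body and Kubernetes clients. It assumes the behavior the suggested name describes, namely that the manager's workload ends up finished with the given reason while only the worker (remote) copies are removed.

```go
package main

import "fmt"

// workload is a hypothetical, minimal stand-in for a Kueue Workload object.
type workload struct {
	finished bool
	reason   string
}

// cluster is a hypothetical in-memory stand-in for one cluster's objects,
// keyed by "namespace/name".
type cluster map[string]*workload

// finishWorkload simulates the controllers: it marks the manager's workload
// as finished and deletes the copies from every worker. The manager's own
// workload is left in place.
func finishWorkload(manager cluster, workers []cluster, key, reason string) {
	if wl, ok := manager[key]; ok {
		wl.finished = true
		wl.reason = reason
	}
	for _, w := range workers {
		delete(w, key)
	}
}

// checkFinishedAndRemoteDeleted is the shared test body. Each job-type test
// calls it instead of repeating the same assertions.
func checkFinishedAndRemoteDeleted(manager cluster, workers []cluster, key, reason string) error {
	wl, ok := manager[key]
	if !ok {
		return fmt.Errorf("workload %q not found on manager", key)
	}
	if !wl.finished || wl.reason != reason {
		return fmt.Errorf("workload %q not finished with reason %q", key, reason)
	}
	for i, w := range workers {
		if _, exists := w[key]; exists {
			return fmt.Errorf("workload %q still present on worker %d", key, i)
		}
	}
	return nil
}

func main() {
	// Two hypothetical job types reuse the same helper.
	for _, key := range []string{"ns/job", "ns/jobset"} {
		manager := cluster{key: &workload{}}
		workers := []cluster{{key: &workload{}}, {}}
		finishWorkload(manager, workers, key, "JobFinished")
		fmt.Println(key, checkFinishedAndRemoteDeleted(manager, workers, key, "JobFinished"))
	}
}
```

A descriptive helper name matters most in this shape, because the call site is the only place a reader sees what each test asserts.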

@@ -1846,3 +1487,70 @@ var _ = ginkgo.Describe("Multikueue no GC", ginkgo.Ordered, ginkgo.ContinueOnFai
})
})
})

func jobAdmissionTestBody(acName string, wlLookupKey types.NamespacedName, admission *utiltesting.AdmissionWrapper) {
Contributor

Suggested change
-func jobAdmissionTestBody(acName string, wlLookupKey types.NamespacedName, admission *utiltesting.AdmissionWrapper) {
+func admitWorkloadAndCheckWorkerCopies(acName string, wlLookupKey types.NamespacedName, admission *utiltesting.AdmissionWrapper) {

@alculquicondor
Contributor

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 22, 2024
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: b0337b5ada71a8a7c693382fa780647ed7d9a3c7

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, mszadkow

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 22, 2024
@k8s-ci-robot k8s-ci-robot merged commit 6001753 into kubernetes-sigs:main Aug 22, 2024
16 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v0.9 milestone Aug 22, 2024
Member

My assumption was that we would reduce the e2e test cases across all KFJob integrations, because the MK adapters for the KFJobs are consolidated in https://github.com/kubernetes-sigs/kueue/blob/6001753df1c59455686cbce7143298d78b1c7cae/pkg/controller/jobs/kubeflow/kubeflowjob/kubeflowjob_multikueue_adapter.go.

So, I would recommend that we have MK e2e for KFJobs only for PyTorchJob.

This allows us to reduce e2e testing duration.

@alculquicondor Any objections to having MK e2e testing only for PyTorchJob?

Contributor

As long as PyTorchJob is representative enough (it has more than one role, for example), I'm fine with that.

Member

We can use the PyTorchJob with the "Master" and "Worker" role patterns.

@mszadkow Could you take this?

Contributor Author

@tenzen-y sure, that sounds OK to me.
One e2e for KFJob using PyTorchJob, plus another e2e for MPIJob since it's handled by a different operator. Is that OK?

Member

@mszadkow Yes, that is my expectation :)
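
The selection rule agreed on in this thread (one e2e per shared adapter, using a job kind with more than one role) can be sketched as below. This is only an illustration under stated assumptions: the `jobKind` type, the `adapter` labels, and `pickRepresentatives` are hypothetical and do not exist in Kueue, and the TFJob entry is included purely as an example of a second kind sharing the same adapter.

```go
package main

import (
	"fmt"
	"sort"
)

// jobKind is a hypothetical description of a job integration.
type jobKind struct {
	name    string
	adapter string   // illustrative label for the MultiKueue adapter handling this kind
	roles   []string // replica roles used by the e2e job
}

// pickRepresentatives returns, for each adapter, the first listed kind that
// has more than one role. Kinds sharing an adapter are covered by that one
// representative instead of getting their own e2e test.
func pickRepresentatives(kinds []jobKind) map[string]string {
	reps := make(map[string]string)
	for _, k := range kinds {
		if _, done := reps[k.adapter]; done {
			continue
		}
		if len(k.roles) > 1 {
			reps[k.adapter] = k.name
		}
	}
	return reps
}

func main() {
	kinds := []jobKind{
		{name: "PyTorchJob", adapter: "kubeflowjob", roles: []string{"Master", "Worker"}},
		// Illustrative entry: another kind assumed to share the same adapter.
		{name: "TFJob", adapter: "kubeflowjob", roles: []string{"Chief", "Worker"}},
		// MPIJob is handled by a different operator, so it keeps its own e2e.
		{name: "MPIJob", adapter: "mpijob", roles: []string{"Launcher", "Worker"}},
	}

	reps := pickRepresentatives(kinds)

	// Sort adapter labels so the printed order is stable.
	adapters := make([]string, 0, len(reps))
	for a := range reps {
		adapters = append(adapters, a)
	}
	sort.Strings(adapters)
	for _, a := range adapters {
		fmt.Printf("%s: %s\n", a, reps[a])
	}
}
```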

@mszadkow mszadkow deleted the cleanup/refactor-multikueue-adapter-tests branch August 23, 2024 06:40
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
4 participants