Job slicing does not respect host limit, leading to unnecessary job failures #15589

Open
4 of 11 tasks
jangel97 opened this issue Oct 18, 2024 · 1 comment

@jangel97

Please confirm the following

  • I agree to follow this project's code of conduct.
  • I have checked the current issues for duplicates.
  • I understand that AWX is open source software provided for free and that I might not receive a timely response.
  • I am NOT reporting a (potential) security vulnerability. (These should be emailed to [email protected] instead.)

Bug Summary

I have a job template that runs against approximately 50 hosts. To speed up execution, I've set Job Slicing to 5, which effectively runs 5 parallel jobs against 10 hosts each.

However, when I run this job template with a limit set to target only 2 hosts, the job still creates 5 slices. As a result:

  • 1 slice succeeds (processing the 2 limited hosts).
  • 4 slices fail because they have no hosts to run against.

This behavior causes the overall job to report failures, and since I have alerting configured, I receive unnecessary failure notifications.
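
To illustrate how the empty slices can arise, here is a minimal simulation. It assumes hosts are split across slices by a simple stride, with the limit applied afterwards inside each slice. That partitioning scheme is an assumption made for illustration, not something confirmed in this report, and the host names and both helper functions are hypothetical.

```python
from typing import Dict, List, Set


def partition_hosts(hosts: List[str], slice_count: int) -> List[List[str]]:
    """Split hosts into slice_count groups by stride (assumed scheme)."""
    return [hosts[offset::slice_count] for offset in range(slice_count)]


def simulate_sliced_run(
    hosts: List[str], slice_count: int, limit: Set[str]
) -> Dict[int, List[str]]:
    """Map each 1-based slice number to the hosts left after applying the limit."""
    slices = partition_hosts(hosts, slice_count)
    return {
        number: [host for host in members if host in limit]
        for number, members in enumerate(slices, start=1)
    }


inventory = [f"host{i:02d}" for i in range(1, 51)]  # 50 hypothetical hosts

# Two limited hosts that happen to land in the same slice under this scheme.
result = simulate_sliced_run(inventory, 5, {"host01", "host06"})
empty_slices = [number for number, members in result.items() if not members]

print(result[1])     # hosts slice 1 would run against
print(empty_slices)  # slices left with nothing to run against
```

Under these assumptions, slice 1 keeps both limited hosts and slices 2 through 5 end up empty, which matches the 1 success and 4 failures described above.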

My AAP deployment runs on VMs.

AWX version

AAP 2.4

Select the relevant components

  • UI
  • UI (tech preview)
  • API
  • Docs
  • Collection
  • CLI
  • Other

Installation method

N/A

Modifications

no

Ansible version

No response

Operating system

RHEL 8

Web browser

No response

Steps to reproduce

  1. Create a job template with Job Slicing set to 5.
  2. Set a limit to target a small number of hosts (e.g., 2 hosts).
  3. Launch the job template.
  4. Observe that multiple slices fail due to having no hosts assigned.
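
For reproducing this from a script instead of the UI, the sketch below builds a launch request with a limit. It assumes the AWX REST API v2 launch endpoint (`/api/v2/job_templates/<id>/launch/`) and OAuth2 bearer-token authentication. The base URL, template ID, token, and host names are hypothetical placeholders, and the limit is only expected to be honored if the template is configured to prompt for a limit on launch. The request is built but deliberately not sent.

```python
import json
import urllib.request


def build_launch_request(
    base_url: str, template_id: int, token: str, limit: str
) -> urllib.request.Request:
    """Build (but do not send) a POST that launches a job template with a limit."""
    url = f"{base_url.rstrip('/')}/api/v2/job_templates/{template_id}/launch/"
    body = json.dumps({"limit": limit}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical values: replace with your own controller URL, template ID, and token.
request = build_launch_request(
    "https://awx.example.com", 42, "REDACTED_TOKEN", "host01,host06"
)

# To actually launch the job, send it with: urllib.request.urlopen(request)
print(request.full_url)
```

With Job Slicing set to 5 on the template, a launch like this is what produces the failing slices described in step 4.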

Expected results

When a limit is set that results in fewer hosts than the number of slices, AWX should adjust the number of slices accordingly. In this case, it should:

  • Create only 1 slice to handle the 2 hosts.
  • Avoid creating additional slices that have no hosts to process.
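
One possible rule that produces this result is sketched below. It keeps the per-slice host target implied by the template (inventory size divided by the configured slice count) and creates only as many slices as the limited host set needs. This is a sketch of the requested behavior under that assumption, not existing AWX logic; the function name and its parameters are hypothetical.

```python
import math


def effective_slice_count(
    requested_slices: int, inventory_size: int, matched_hosts: int
) -> int:
    """Suggest how many slices to create once the limit has been applied.

    Always returns at least 1, so a single job still runs and can report
    that no hosts matched (a design choice made for this sketch).
    """
    if requested_slices < 1:
        raise ValueError("requested_slices must be at least 1")
    if inventory_size < 0 or matched_hosts < 0:
        raise ValueError("host counts must not be negative")

    # Hosts each slice would handle on an unrestricted run, e.g. 50 / 5 = 10.
    hosts_per_slice = max(1, math.ceil(inventory_size / requested_slices))

    # Slices needed to cover the limited hosts at that per-slice size.
    needed = math.ceil(matched_hosts / hosts_per_slice)

    # Never exceed the configured slice count, never drop below one job.
    return max(1, min(requested_slices, needed))


print(effective_slice_count(5, 50, 2))   # the case from this report
print(effective_slice_count(5, 50, 50))  # unrestricted run keeps all slices
```

For the case in this report (5 slices, 50 hosts, 2 hosts after the limit) this gives 1 slice, while an unrestricted run still gets all 5. A simpler alternative would be to cap the slice count at the number of matching hosts, which would yield 2 slices here instead of 1.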

Actual results

  • AWX creates the maximum number of slices specified (5 in this case), regardless of the number of hosts after applying the limit.
  • Slices without any hosts assigned fail, impacting the overall job status.

Additional information

Perhaps this is fixed in AAP 2.5?

@jangel97
Author

I am willing to discuss implementation details and open a PR for this, if a fix makes sense.
