
Edge case around timeout #35

Open
macobo opened this issue Jul 8, 2021 · 3 comments
Comments

@macobo
Contributor

macobo commented Jul 8, 2021

Thanks again for creating this action! 🚀 We've been using it at PostHog for our own Helm chart to great success.

However, we've been having difficulty with flaky tests due to timeouts. Since the stack is quite involved, it can sometimes take more than 10 minutes to boot on k3s on GitHub Actions. We found that whenever our Helm chart took longer than that, this action timed out at the 10-minute mark with the following error:

Error: Action failure: rollout of deploy/posthog-web
The process '/usr/local/bin/kubectl' failed with exit code 1

error: deployment "posthog-web" exceeded its progress deadline

The underlying cause is this:

[Deployments] .spec.progressDeadlineSeconds is an optional field that specifies the number of seconds you want to wait for your Deployment to progress before the system reports back that the Deployment has failed progressing - surfaced as a condition with Type=Progressing, Status=False, and Reason=ProgressDeadlineExceeded in the status of the resource. The Deployment controller will keep retrying the Deployment. This defaults to 600.

From https://kubernetes.io/docs/concepts/workloads/controllers/deployment/

In practice this means that since this action uses kubectl rollout status under the hood, it fails as soon as the Deployment exceeds its progress deadline, which defaults to 600 seconds.
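
For illustration, the two separate limits at play look roughly like this (the flags and values here are illustrative, not necessarily the exact invocation this action uses):

# Client-side wait: how long kubectl itself keeps watching the rollout
# (--timeout=0 means wait indefinitely).
kubectl rollout status deploy/posthog-web --timeout=15m
# Server-side deadline: even with a generous --timeout, the command exits 1
# as soon as the Deployment reports ProgressDeadlineExceeded, i.e. after
# spec.progressDeadlineSeconds (600s by default) without observed progress.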

I've patched this in our chart by explicitly setting spec.progressDeadlineSeconds (PostHog/charts-clickhouse#40) but thought it's worth raising here as well. Is this something this action should handle implicitly?
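
For reference, a minimal one-off equivalent of that chart change looks something like this (the resource name and value are illustrative; the real change lives in the linked PR):

# Raise the Deployment's own progress deadline so a slow k3s boot on CI
# doesn't trip ProgressDeadlineExceeded before the action's timeout.
kubectl patch deployment posthog-web -p '{"spec": {"progressDeadlineSeconds": 1200}}'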

cc @consideRatio

@welcome

welcome bot commented Jul 8, 2021

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! 🤗

If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively.
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! 👋

Welcome to the Jupyter community! 🎉

@macobo macobo changed the title Edge case around --timeout Edge case around timeout Jul 8, 2021
@consideRatio
Member

@macobo thanks for your kind feedback and sharing your insights about this - today I learned about this for the first time!

Hmm hmmm I'm thinking:

  • it would be reasonable for this action to fail if a Deployment is configured with spec.progressDeadlineSeconds and ends up timing out
  • it would be unreasonable for this action to make any change to the k8s resources, and it could make sense for whoever manages the resources to ensure progressDeadlineSeconds is set high enough that this won't occur.
  • it would add significant complexity for this action to always wait a given time regardless of whether progressDeadlineSeconds becomes the limiting factor, and it doesn't feel obvious that this would be desired
  • it could make sense to add a note someplace in the documentation about this, or to just have this issue findable as it now is via search engines and GitHub search

What do you think @macobo? Do you have a suggestion on a change to pursue in this repo?

@macobo
Contributor Author

macobo commented Jul 8, 2021

Agreed with everything you're saying. I think this issue itself might be good enough as an MVP, though a note in the README.md under gotchas wouldn't hurt. :)
