Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor how we launch spot instances #366

Merged
merged 5 commits into from
Nov 21, 2023
Merged

Conversation

nchammas
Copy link
Owner

This PR refactors how we launch spot instances to migrate away from RequestSpotInstances, which is deprecated.

Instead, we now use the same method -- RunInstances -- whether we are launching on-demand or spot instances. This simplifies the launch code.

This PR also deprecates the --ec2-spot-request-duration option, which does not apply to one-time spot instances (as opposed to persistent spot instances). The distinction between one-time vs. persistent did not exist in the RequestSpotInstances API. However, one-time spot instances map most closely to Flintrock's existing handling of spot instances.

In the future, as previously mentioned here, it would be good to somehow delegate Flintrock's handling of all the various EC2 options down to boto3, instead of exposing individual options and having to track changes to the EC2 API. That wouldn't have spared us this migration from RequestSpotInstances to RunInstances, but it would let users get more of what EC2 can do out of the box without requiring Flintrock to explicitly enable it. But that's a future PR.

I tested this PR by running the full test suite, including acceptance tests, as well as manual tests of launching spot instance clusters and add spot nodes to existing clusters.

Related to #315.
Fixes #364.

@nchammas
Copy link
Owner Author

cc @mblackgeo - If you're still using Flintrock, I'm curious to know if you agree with my summary of why we don't need --ec2-spot-request-duration anymore. (Asking since you added this feature in #315.)

@nchammas
Copy link
Owner Author

cc @maxpoulain

@mblackgeo
Copy link
Contributor

cc @mblackgeo - If you're still using Flintrock, I'm curious to know if you agree with my summary of why we don't need --ec2-spot-request-duration anymore. (Asking since you added this feature in #315.)

Hey thanks for looping me in! Yes we indeed still use Flintrock and with this change to one-time requests it makes sense that the spot request duration is no longer needed. I assume that a Ctrl + C when launching instances with Flintrock will cancel the boto/ec2 request?

@nchammas
Copy link
Owner Author

That is correct, Flintrock will cancel the request and terminate any instances that may have launched as part of the same operation. If you find a case where that doesn't happen, that's a bug.

@nchammas nchammas merged commit 0a7821b into master Nov 21, 2023
9 checks passed
@nchammas nchammas deleted the spot-instance-refactor branch November 21, 2023 14:16
@maxpoulain
Copy link

Hi !

Thanks for the cc !

I just have a question and I haven't found any information in boto documentation..

As ec2.create_instances is going to be used for spot instances creation and ec2.create_instances returns a list of instances. What's happen if there is not enough available instances for example ? Is ec2.create_instances going to return provided instances even if it's not the number of instances wanted ? or ec2.create_instances is going to wait until all instances are available ? or ec2.create_instances is just going to fail ?

@nchammas
Copy link
Owner Author

Good question. It will fail relatively quickly with a message like this:

$ flintrock launch test
Launching 1 master and 1 slave...
An error occurred (InsufficientInstanceCapacity) when calling the RunInstances operation (reached max 
  retries: 4): There is no Spot capacity available that matches your request.
No cluster test in region us-east-1.

I believe this is due to how the underlying RunInstances API works, which seems cleaner than the deprecated RequestSpotInstances and also eliminates the need for a spot request duration.

Does this match your expectations, or were you hoping for different behavior?

@maxpoulain
Copy link

Yes, it meets our expectations perfectly, it's better/cleaner than using RequestSpotInstances !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Spot instance tags issue
3 participants