Fleet autoscaler with "List" policy throws an error if configured with a fleet with no replicas #3943

Open
geopaul-nm opened this issue Aug 12, 2024 · 4 comments
Labels
kind/bug These are bugs.

Comments

@geopaul-nm

What happened:

If we set up a fleet without defining "replicas" and set up an autoscaler with the "List" policy, the autoscaler throws the following error.

Error calculating desired fleet size on FleetAutoscaler simple-game-server-autoscaler. Error: cannot apply ListPolicy as List key MyCustomList does not exist in the Fleet Status

What you expected to happen:

The fleet autoscaler should be able to determine the desired number of replicas even if there are no replicas present.

How to reproduce it (as minimally and precisely as possible):

Set up a fleet with no replicas defined, and a list-based autoscaler for the fleet.
Example:

---
apiVersion: agones.dev/v1
kind: Fleet
metadata:
  name: simple-game-server
spec:
  template:
    spec:
      lists:
        MyCustomList:
          capacity: 50
      ports:
        - name: default
          containerPort: 7654
      template:
        spec:
          containers:
            - name: simple-game-server
              image: us-docker.pkg.dev/agones-images/examples/simple-game-server:0.34
              resources:
                requests:
                  memory: 64Mi
                  cpu: 20m
                limits:
                  memory: 64Mi
                  cpu: 20m
---
apiVersion: autoscaling.agones.dev/v1
kind: FleetAutoscaler
metadata:
  name: simple-game-server-autoscaler
spec:
  fleetName: simple-game-server
  policy:
    type: List  # List based autoscaling.
    list:
      key: MyCustomList
      bufferSize: 10%
      minCapacity: 50
      maxCapacity: 800

Below is the describe output of the fleet autoscaler

Name:         simple-game-server-autoscaler
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  autoscaling.agones.dev/v1
Kind:         FleetAutoscaler
Metadata:
  Creation Timestamp:  2024-08-12T14:35:58Z
  Generation:          1
  Resource Version:    28009
  UID:                 91ff6589-e3c4-4456-b5b4-e31767a8ac03
Spec:
  Fleet Name:  simple-game-server
  Policy:
    List:
      Buffer Size:   10%
      Key:           MyCustomList
      Max Capacity:  800
      Min Capacity:  50
    Type:            List
  Sync:
    Fixed Interval:
      Seconds:  30
    Type:       FixedInterval
Events:
  Type     Reason           Age                 From                        Message
  ----     ------           ----                ----                        -------
  Warning  FleetAutoscaler  12s (x25 over 76s)  fleetautoscaler-controller  Error calculating desired fleet size on FleetAutoscaler simple-game-server-autoscaler. Error: cannot apply ListPolicy as List key MyCustomList does not exist in the Fleet Status

Anything else we need to know?:
This is crucial for Agones to be used with "Flux CD" (https://fluxcd.io/) and probably other GitOps-based tools.

Environment:

  • Agones version: 1.42
  • Kubernetes version (use kubectl version): 1.27.1
  • Cloud provider or hardware configuration: Minikube
  • Install method (yaml/helm): helm
  • Troubleshooting guide log(s):
  • Others:
@geopaul-nm added the kind/bug label Aug 12, 2024
@markmandel
Member

@igooch this seems like it's in your wheelhouse?

I'm wondering whether, if there isn't a replica count, no counter and/or list values get set in the status?

@igooch
Collaborator

igooch commented Aug 20, 2024

> @igooch this seems like it's in your wheelhouse?
>
> I'm wondering whether, if there isn't a replica count, no counter and/or list values get set in the status?

Oh yep, when implementing this we made the assumption that the Fleet can't scale to 0 replicas, so it also can't scale from 0 replicas.

Currently the list / counter fleet status goes from game server status -> game server set aggregation -> fleet status aggregation. So, no game servers, no aggregation. We could possibly add a line somewhere in the game server set controller to create empty lists / counters with 0 capacity if there are no game servers on that game server set:

    // Aggregates all Counters and Lists only for GameServer all states (except IsBeingDeleted)
    if runtime.FeatureEnabled(runtime.FeatureCountsAndLists) {
        status.Counters = aggregateCounters(status.Counters, gs.Status.Counters, gs.Status.State)
        status.Lists = aggregateLists(status.Lists, gs.Status.Lists, gs.Status.State)
    }
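
To make the failure mode described above concrete, here is a minimal, self-contained sketch of why an empty set of game servers ends in the reported error. The types and the `aggregate` and `lookup` functions are simplified stand-ins invented for illustration, not the actual Agones implementation; only the error text is taken from the report.

```go
package main

import "fmt"

// ListStatus is a simplified, hypothetical stand-in for one game server's list status.
type ListStatus struct {
	Capacity int64
	Values   []string
}

// AggregatedListStatus is a simplified, hypothetical stand-in for the aggregated status.
type AggregatedListStatus struct {
	Count    int64
	Capacity int64
}

// aggregate sums list statuses by key across all game servers.
// With zero game servers the loop never runs, so no key is ever created.
func aggregate(servers []map[string]ListStatus) map[string]AggregatedListStatus {
	out := map[string]AggregatedListStatus{}
	for _, lists := range servers {
		for key, l := range lists {
			agg := out[key]
			agg.Count += int64(len(l.Values))
			agg.Capacity += l.Capacity
			out[key] = agg
		}
	}
	return out
}

// lookup mimics (as an assumption) the autoscaler's check that the key exists in the status.
func lookup(status map[string]AggregatedListStatus, key string) (AggregatedListStatus, error) {
	agg, ok := status[key]
	if !ok {
		return AggregatedListStatus{}, fmt.Errorf(
			"cannot apply ListPolicy as List key %s does not exist in the Fleet Status", key)
	}
	return agg, nil
}

func main() {
	// A fleet with no replicas: nothing to aggregate, so the key is missing.
	status := aggregate(nil)
	if _, err := lookup(status, "MyCustomList"); err != nil {
		fmt.Println(err)
	}
}
```

The point of the sketch is only that the key's presence in the status depends on at least one game server existing, which matches the "no game servers, no aggregation" observation.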

@igooch
Collaborator

igooch commented Aug 21, 2024

@ashutosji can you take a look and see if this would just require a change to the game server set controller, or if it would also require a change to the autoscaler as well?

@kamaljeeti
Contributor

kamaljeeti commented Sep 25, 2024

Hi @igooch,
I think the list of GameServers will be empty when the Fleet has no replicas. We could check whether the list of GameServers is empty and the FeatureCountsAndLists feature is enabled, and in that case initialize empty lists or counters with zero capacity.
Something like this:

    if len(list) == 0 {
        if runtime.FeatureEnabled(runtime.FeatureCountsAndLists) {
            status.Counters = make(map[string]agonesv1.AggregatedCounterStatus)
            status.Lists = make(map[string]agonesv1.AggregatedListStatus)

            status.Lists["MyCustomList"] = agonesv1.AggregatedListStatus{
                Count:    0,
                Capacity: 0,
            }
        }
        return status
    }

How can we get the list of keys from the Fleet spec? Does the above approach seem reasonable?
WDYT?
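
One possible answer to the question about keys, sketched under assumptions: rather than hardcoding "MyCustomList", the empty status could be seeded by iterating over the lists declared in the game server template spec. The types and the `emptyListStatus` helper below are hypothetical stand-ins for illustration; where the spec's list map actually lives in the Agones types would need to be confirmed against the codebase.

```go
package main

import (
	"fmt"
	"sort"
)

// ListSpec is a hypothetical stand-in for a list as declared in the game server template.
type ListSpec struct {
	Capacity int64
	Values   []string
}

// AggregatedListStatus is a simplified, hypothetical stand-in for the aggregated status.
type AggregatedListStatus struct {
	Count    int64
	Capacity int64
}

// emptyListStatus creates one zero-valued aggregated entry per key declared in the spec,
// so that a lookup by key succeeds even when no game servers exist.
func emptyListStatus(specLists map[string]ListSpec) map[string]AggregatedListStatus {
	out := make(map[string]AggregatedListStatus, len(specLists))
	for key := range specLists {
		out[key] = AggregatedListStatus{Count: 0, Capacity: 0}
	}
	return out
}

func main() {
	// Mirrors the Fleet from the report: one list with capacity 50 per game server.
	spec := map[string]ListSpec{"MyCustomList": {Capacity: 50}}
	status := emptyListStatus(spec)

	// Sort keys so the output order is deterministic.
	keys := make([]string, 0, len(status))
	for k := range status {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s: count=%d capacity=%d\n", k, status[k].Count, status[k].Capacity)
	}
}
```

Whether the autoscaler then computes a sensible replica count from a zero aggregated capacity is a separate question, as raised earlier in the thread.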
