This repository has been archived by the owner on Dec 15, 2022. It is now read-only.

GKE cluster becomes unresponsive when provider installed #39

Open
unixdaddy opened this issue Jan 28, 2022 · 3 comments
Labels
bug Something isn't working

Comments

@unixdaddy

What happened?

From a Slack discussion:

Deployed a VM instance using Crossplane provider-jet-gcp:v0.2.0-preview on GCP, with GKE as my Crossplane management cluster.
Even on GKE 1.23.1, the 400+ CRDs made API server responses pretty slow.

After creating a GKE 1.23.1 cluster and installing the jet provider, the system is totally unresponsive for a while (time for a 🍵).

kubectl crossplane install provider crossplane/provider-jet-gcp:v0.2.0-preview

After a while the system becomes responsive; however, commands against Crossplane resources can be slow:

time kubectl get crossplane
I0126 17:13:21.538682    4976 request.go:665] Waited for 1.193141986s due to client-side throttling, not priority and fairness, request: GET:https://XX.XX.XX.XX/apis/notebooks.gcp.jet.crossplane.io/v1alpha1?timeout=32s
real    1m25.191s
user    0m21.987s
sys     0m2.499s

@muvaf has explained why:

If you use the full name of the kind, you'll get immediate results, like
kubectl get dbinstance.rds.aws.jet.crossplane.io
For categories, like
kubectl get managed or kubectl get crossplane
there isn't much one can do, because every category query has to go through every API, and the number of APIs grows with that many CRDs.
In short, I don't expect category queries to get faster soon, but we'll see how the discovery client can be improved upstream, and you can use full names until then 🙂
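
As a small sketch of the full-name workaround described above: the helper below builds the `<kind>.<api-group>` string that kubectl accepts as a resource argument. `fq_resource` is a hypothetical name used for illustration only; it is not part of kubectl or Crossplane.

```shell
# Hypothetical helper: build the fully qualified resource name
# "<kind>.<api-group>" from a kind and an API group.
fq_resource() {
  kind=$1
  group=$2
  # Print the kind in lowercase, matching the form used in the thread.
  printf '%s.%s\n' "$(printf '%s' "$kind" | tr '[:upper:]' '[:lower:]')" "$group"
}
```

For example, `kubectl get "$(fq_resource DBInstance rds.aws.jet.crossplane.io)"` should be equivalent to the full-name query quoted above, which targets a single API instead of walking every group in a category.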

The question is how the initial unresponsiveness during the install of the 438 CRDs can be mitigated, resolved, or accepted.
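
To check how many CRDs a provider has registered, one option is to count the CRD names whose API group ends with the provider's suffix. This is a minimal sketch: `count_crds_in_group` is a hypothetical helper name, and the suffix `gcp.jet.crossplane.io` is inferred from the API group in the throttling message above.

```shell
# Hypothetical helper: read CRD names (one per line) on stdin and count
# those whose name ends with the given API group suffix.
count_crds_in_group() {
  awk -v suffix="$1" '
    # Compare the tail of each line with the suffix as a plain string,
    # so dots in the group name are not treated as regex wildcards.
    length($0) >= length(suffix) &&
      substr($0, length($0) - length(suffix) + 1) == suffix { n++ }
    END { print n + 0 }
  '
}
```

Assuming the cluster is reachable, `kubectl get crds -o name | count_crds_in_group gcp.jet.crossplane.io` would print the number of CRDs installed under that group.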

How can we reproduce it?

  • create/switch to gcloud project
$ gcloud projects create crossplane-lab
$ gcloud config set project crossplane-lab
  • set your gcloud environment
$ gcloud config set compute/region us-west2 
$ gcloud config set compute/zone us-west2-a 
$ gcloud config configurations list 
  • identify the GKE version from RAPID Channel (1.23 is not available in all regions)
gcloud container get-server-config --format "yaml(channels)" --zone us-west2-a
Fetching server config for us-west2-a
channels:
- channel: RAPID
  defaultVersion: 1.22.3-gke.1500
  validVersions:
  - 1.23.1-gke.500 <-- latest
  - 1.22.4-gke.1501
  - 1.22.3-gke.1500
  - 1.22.3-gke.700
  - 1.21.6-gke.1500
  - 1.21.5-gke.1802
  • create a new 4 vCPU, 8 GB memory GKE 1.23.1 cluster in us-west2-a
gcloud container clusters create crossplane --machine-type=custom-4-8192 --num-nodes=1 --disk-size=30GB --cluster-version=latest --release-channel=rapid
  • add the Crossplane Helm stable repo and install Crossplane (install output shown below)
NAME: crossplane
LAST DEPLOYED: XXXXX
NAMESPACE: crossplane-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Release: crossplane

Chart Name: crossplane
Chart Description: Crossplane is an open source Kubernetes add-on that enables platform teams to assemble infrastructure from multiple vendors, and expose higher level self-service APIs for application teams to consume.
Chart Version: 1.6.2
Chart Application Version: 1.6.2

Kube Version: v1.23.1-gke.500

system is responsive

  • install kubectl crossplane plugin
  • install GCP jet provider version 0.2.0 preview
    the system becomes unresponsive (time for 🍵), and pods restart a few times until things settle down
kubectl crossplane install provider crossplane/provider-jet-gcp:v0.2.0-preview

kubectl get all -n kube-system
Unable to connect to the server: net/http: TLS handshake timeout
Unable to connect to the server: net/http: TLS handshake timeout
Unable to connect to the server: net/http: TLS handshake timeout
Unable to connect to the server: net/http: TLS handshake timeout
Unable to connect to the server: net/http: TLS handshake timeout
Unable to connect to the server: net/http: TLS handshake timeout


time kubectl get pods -n crossplane-system
NAME                                                        READY   STATUS    RESTARTS        AGE
crossplane-cfb4cb44f-8qqlf                                  1/1     Running   3 (6m8s ago)    19m
crossplane-provider-jet-gcp-b5eb4524318a-7dd4b6fb44-pw8tk   1/1     Running   3 (5m54s ago)   9m31s
crossplane-rbac-manager-d9d6c54c7-75lbn                     1/1     Running   1 (8m33s ago)   19m

real    1m0.098s
user    0m0.059s
sys     0m0.027s

time kubectl get pods -n crossplane-system
The connection to the server XX.XX.XX.XX was refused - did you specify the right host or port?

real    0m37.761s
user    0m0.052s
sys     0m0.019s

After a while the system becomes responsive, but calls to Crossplane resources can be slow unless the full name of the kind is used, as explained above.
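
While waiting for the API server to settle, a small polling loop can replace manual retries. This is a sketch rather than a fix for the underlying problem: `wait_for_apiserver` is a hypothetical helper, and the attempt count and delay in the usage note are arbitrary assumptions.

```shell
# Hypothetical helper: run a probe command until it succeeds or the
# attempt budget runs out.
# Arguments: max attempts, seconds between attempts, then the probe command.
wait_for_apiserver() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # The probe's own output is discarded; only its exit status matters.
    if "$@" >/dev/null 2>&1; then
      echo "API server responded on attempt $attempt"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  echo "API server still unresponsive after $max_attempts attempts" >&2
  return 1
}
```

A possible invocation is `wait_for_apiserver 60 10 kubectl get --raw /readyz --request-timeout=5s`, which probes the API server's readiness endpoint every 10 seconds for up to about 10 minutes.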

What environment did it happen in?

Crossplane version: 1.6.2
Provider version: provider-jet-gcp:v0.2.0-preview
Cloud provider: GCP
Kubernetes version: GKE 1.23.1-gke.500

@unixdaddy added the bug label on Jan 28, 2022
@muvaf
Member

muvaf commented Jan 28, 2022

FWIW, regional clusters seem to work fine.

@hprotzek

hprotzek commented Feb 24, 2022

I've just installed 0.2.0-preview on a regional GKE 1.22.6 cluster and had to remove it again, as our helm/helmfile deployments all ran into timeouts.

Maybe it would be good to have an include list of the CRDs you want to use, instead of installing all 400 by default.

@muvaf
Member

muvaf commented Mar 28, 2022

Hi @hprotzek, crossplane/crossplane#2918 is where we document these issues; feel free to take a look. crossplane/crossplane#2869 is the issue about selective installation, where you can give a thumbs up and/or provide more details.

For the cluster becoming unresponsive, I'd suggest opening a ticket with the GKE support team.
