
⚠️ types tightening #384

Open · wants to merge 3 commits into main

Conversation

grokspawn (Contributor)

Solves #381

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 6, 2024
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 14, 2024
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 26, 2024

codecov bot commented Sep 26, 2024

Codecov Report

Attention: Patch coverage is 91.42857% with 3 lines in your changes missing coverage. Please review.

Project coverage is 35.36%. Comparing base (e57db1f) to head (f772ca5).
Report is 1 commit behind head on main.

Files with missing lines                                  Patch %   Lines
...rnal/controllers/core/clustercatalog_controller.go     92.30%    1 Missing and 1 partial ⚠️
api/core/v1alpha1/zz_generated.deepcopy.go                 0.00%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #384      +/-   ##
==========================================
- Coverage   36.05%   35.36%   -0.69%     
==========================================
  Files          14       14              
  Lines         796      803       +7     
==========================================
- Hits          287      284       -3     
- Misses        463      472       +9     
- Partials       46       47       +1     


@@ -164,6 +164,9 @@ type CatalogSource struct {
}

// ResolvedCatalogSource is a discriminated union of resolution information for a Catalog.
// ResolvedCatalogSource contains the information about a sourced Catalog
// +union
// +kubebuilder:validation:XValidation:rule="self.type != 'Image' || has(self.image)",message="sourceType 'Image' requires image field"
Collaborator

I think we may want something more like:

Suggested change
// +kubebuilder:validation:XValidation:rule="self.type != 'Image' || has(self.image)",message="sourceType 'Image' requires image field"
// +kubebuilder:validation:XValidation:rule="self.type == 'Image' && has(self.image)",message="sourceType 'Image' requires image field"

This ensures that we return "True" for the validation only if the type is Image and the image field is set.

IIUC, the current validation of self.type != 'Image' || has(self.image) returns "True" in scenarios that would be invalid:

  • type is set to Foo (not currently possible based on enums, but for sake of ensuring we have a solid CEL validation assume we can) and the image field is set
  • type is set to Image and the image field is unset

Contributor

I ran into some issues when using self.type == 'Image' && has(self.image), mostly when I tried validating with an unsupported image type and when I tried adding another supported sourceType to test extensibility.

The +kubebuilder:validation:XValidation tag requires all its validation checks to be true, so we were failing on the self.type == 'Image' part of the check even when the source type was something else. I added the self.type != 'Image' so we wouldn't be doing that check on non-image sources.

I've added another validation to ensure we don't have the image field on non-image sources, to cover the case you highlighted - thanks for pointing that out! We'll still fail on unsupported sources because of the enum check, but this won't throw the "type 'Image' requires image field" error when the source type isn't 'Image'.
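To make the disagreement concrete, here is how each candidate rule evaluates across the four (type, image) combinations; the non-Image type is hypothetical, since the enum currently only allows 'Image':

    type     image    self.type != 'Image' || has(self.image)    self.type == 'Image' && has(self.image)
    Image    set      true  (pass)                                true  (pass)
    Image    unset    false (fail, as desired)                    false (fail, as desired)
    Other    set      true  (pass; needs a companion rule)        false (fail)
    Other    unset    true  (pass)                                false (fail; rejects valid non-image sources)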

Collaborator

I ran into some issues when using self.type == 'Image' && has(self.image), mostly when I tried validating with an unsupported image type and when I tried adding another supported sourceType to test extensibility

I think this is something that I would expect based on our allowed enum values.

the +kubebuilder:validation:XValidation tag requires all its validation checks to be true, so we were failing on the self.type=='Image' part of the check even when the source type was something else. I added the self.type != 'Image' so we wouldn't be doing that check on non-image sources.

I'm a bit confused here. Are you saying that the XValidation tag doesn't evaluate the expression holistically but instead evaluates each individual expression within a chain? This sounds off to me - do you have any documentation or references you've found that state that is the case?

If I'm not mistaken, what seems more likely here is that the validations are failing because we only allow the Image value as part of our enum checks, so we don't even hit the CEL validations.

Holistically I would expect the following test cases to fail, but not hit our CEL validations until more than the Image value is allowed:

  • Attempting to set type to something other than Image
  • Attempting to set type to something other than Image AND setting image field

For ^ I would avoid checking for specific error messages for now and just ensure that we encounter an error.

One that I would expect us to fail from the CEL validation:

  • Attempting to set type to Image but nil image field

One that I would expect to succeed:

  • Attempting to set type to Image and set image field

I do think we should be able to satisfy all these cases with a single CEL expression in the current state of our API. Future iterations of the API that add new source types will likely need to update the CEL validations, so I would avoid trying to accommodate future source type values in the CEL introduced as part of this PR.

@ankitathomas (Contributor) Sep 26, 2024

I think this is something that I would expect based on our allowed enum values.

I'd added another allowed value to our enums and regenerated the manifests to test this case:
// +kubebuilder:validation:Enum:="Image";"someothertype"

Are you saying that the XValidation tag doesn't evaluate the expression holistically but instead evaluates each individual expression within a chain?

It seems that way from https://book.kubebuilder.io/reference/markers/crd-validation

// +kubebuilder:validation:XValidation
    marks a field as requiring a value for which a given expression evaluates to true.
    This marker may be repeated to specify multiple expressions, all of which must evaluate to true.

I verified this with the test I mentioned earlier. A catalog with the type "someothertype" would fail, with or without the image field, with:
The ClusterCatalog "testcatalogothertype" is invalid: spec.source: Invalid value: "object": source type 'Image' requires image field

@joelanford (Member) Sep 26, 2024

When we add another sourceType in the future, we will have to modify this expression, e.g.:

(self.type == 'Image' && has(self.image)) || (self.type == 'Foo' && has(self.foo))

But since there's just one enum value now, I don't think we need to worry about making the CEL expression future proof.

Collaborator

// +kubebuilder:validation:Enum:="Image";"someothertype"

@ankitathomas I don't think this is a valid validation tag. I think that it is supposed to be:

// +kubebuilder:validation:Enum:="Image,SomeOtherType"

@ankitathomas (Contributor) Sep 26, 2024

// +kubebuilder:validation:Enum:="Image,SomeOtherType"

That seems incorrect; the manifest generated that way treats the quoted value as a single string, so it considers the one valid value to be "Image,SomeOtherType" rather than "Image" and "SomeOtherType":

The ClusterCatalog "operatorhubio2" is invalid: spec.source.type: Unsupported value: "SomeOtherType": supported values: "Image,someothertype"

I also tried // +kubebuilder:validation:Enum:="Image","someothertype"; that fails to generate the manifest with Error: not all generators ran successfully.

// +kubebuilder:validation:Enum:="Image";"someothertype" works correctly though: if the CEL validation is removed, a catalog manifest with "someothertype" applies without any issues.
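For reference, controller-gen treats semicolons as the list separator for the Enum marker, so the working form above should generate an enum fragment along these lines in the CRD (a sketch, not the exact generated manifest):

    properties:
      type:
        type: string
        enum:
          - Image
          - someothertype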

@ankitathomas (Contributor) Sep 26, 2024

(self.type == 'Image' && has(self.image)) || (self.type == 'Foo' && has(self.foo))

I think we won't be able to tailor our validation errors if we make the CEL validation one long validation for all types. We'd also be throwing the CEL validation error whenever we encounter an unsupported type:

apiVersion: olm.operatorframework.io/v1alpha1
kind: ClusterCatalog
metadata:
  name: operatorhubio2
spec:
  source:
    type: unsupported

The ClusterCatalog "operatorhubio2" is invalid: 
* spec.source.type: Unsupported value: "unsupported": supported values: "Image", "Foo"
* spec: Invalid value: "object": no such key: image evaluating rule: cannot specify PollInterval while using digest-based image
* spec.source: Invalid value: "object": source type 'Image' requires image field

Not sure why it's throwing an error with the PollInterval though.

This expression set preserves a little more info about validation errors:

// +kubebuilder:validation:XValidation:rule="self.type != 'Image' || has(self.image)",message="source type 'Image' requires image field"
// +kubebuilder:validation:XValidation:rule="self.type == 'Image' || !has(self.image)",message="image field must only be set for source type 'Image'"
// +kubebuilder:validation:XValidation:rule="self.type != 'Foo' || has(self.foo)",message="source type 'Foo' requires foo field"
// +kubebuilder:validation:XValidation:rule="self.type == 'Foo' || !has(self.foo)",message="foo field must only be set for source type 'Foo'"

An unsupported type gives the following error set:

The ClusterCatalog "operatorhubio2" is invalid: 
* spec.source.type: Unsupported value: "unsupported": supported values: "Image", "Foo"
* spec: Invalid value: "object": no such key: image evaluating rule: cannot specify PollInterval while using digest-based image

A catalog with an invalid field gives something like:

apiVersion: olm.operatorframework.io/v1alpha1
kind: ClusterCatalog
metadata:
  name: operatorhubio2
spec:
  source:
    type: Foo
    image:
      ref: ""

The ClusterCatalog "operatorhubio2" is invalid:
* spec.source: Invalid value: "object": image field must only be set for source type 'Image'
* spec.source: Invalid value: "object": source type 'Foo' requires foo field

We can combine the validations for each type so we don't end up too verbose:

// +kubebuilder:validation:XValidation:rule="(self.type != 'Image' || has(self.image)) && (self.type == 'Image' || !has(self.image))",message="source type 'Image' requires image field"
// +kubebuilder:validation:XValidation:rule="(self.type != 'Foo' && !has(self.foo)) || (self.type == 'Foo' && has(self.foo))",message="source type 'Foo' requires foo field"
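For illustration, the paired rules would sit as type-level markers on the union struct, since self in a CEL rule refers to the object the marker is attached to. A sketch, with the 'Foo' source still hypothetical and the field/type names assumed from the diff above:

    // +union
    // +kubebuilder:validation:XValidation:rule="self.type != 'Image' || has(self.image)",message="source type 'Image' requires image field"
    // +kubebuilder:validation:XValidation:rule="self.type == 'Image' || !has(self.image)",message="image field must only be set for source type 'Image'"
    type ResolvedCatalogSource struct {
        // +unionDiscriminator
        // +kubebuilder:validation:Enum:="Image"
        Type SourceType `json:"type"`

        // Image is populated only when type is 'Image'.
        // +optional
        Image *ResolvedImageSource `json:"image,omitempty"`
    }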

Member

Ah I didn't realize we could have multiple validation rules like that, but I guess it makes sense. That's nicer because we can tailor the message for each one.

If we want to consolidate to a single rule, we can use messageExpression instead of message in order to use CEL to build the message. But I'm not sure that would give us the flexibility we'd need to craft a helpful validation message.
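A consolidated rule with messageExpression might look like the following sketch (illustrative only, not what this PR proposes); messageExpression must itself be a CEL expression that evaluates to a string:

    // +kubebuilder:validation:XValidation:rule="(self.type == 'Image') == has(self.image)",messageExpression="'source type ' + self.type + ' and the image field must be set together'"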

@ankitathomas ankitathomas force-pushed the more-types-updates branch 4 times, most recently from 0d394ca to 1d2538c Compare September 26, 2024 14:57
@ankitathomas (Contributor)

A summary of changes:

  • capitalized catalog type values ('image' -> 'Image')
  • removed omitempty from the spec.Priority field so printcols prints a default value of 0
  • removed the status.ObservedGeneration field
  • added CEL validation rules (plus tests) to spec.CatalogSource and status.ResolvedCatalogSource, and removed the other kubebuilder validation for the same
  • updated status.ResolvedCatalogSource.ref to reference the resolved image digest instead of tracking the value of spec.ref that resolving/unpacking was performed on; removed status.ResolvedCatalogSource.resolvedRef
    • updated the check to stop unpack retries for unchanged digest-based images; tag-based images are retried if a polling interval is set
  • removed status.ResolvedCatalogSource.LastUnpacked, switching to status.lastUnpacked alone to track when the bundle was last unpacked successfully

@LalatenduMohanty LalatenduMohanty removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 26, 2024
@everettraven everettraven marked this pull request as ready for review September 26, 2024 16:06
@everettraven everettraven requested a review from a team as a code owner September 26, 2024 16:06
…lidation for spec.source and status.resolvedSource

Signed-off-by: Ankita Thomas <[email protected]>
  status.ResolvedSource = result.ResolvedSource
  status.ContentURL = contentURL
- status.ObservedGeneration = generation
- status.LastUnpacked = unpackedAt
+ status.LastUnpacked = result.LastTransitionTime
meta.SetStatusCondition(&status.Conditions, metav1.Condition{
Member

Should we be setting observedGeneration inside the condition struct?
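For context, metav1.Condition already carries a per-condition ObservedGeneration field, so the suggestion would look roughly like this (a sketch; the Type and Reason values are illustrative):

    meta.SetStatusCondition(&status.Conditions, metav1.Condition{
        Type:               "Unpacked",              // illustrative condition type
        Status:             metav1.ConditionTrue,
        Reason:             "UnpackSuccessful",      // illustrative reason
        Message:            result.Message,
        ObservedGeneration: catalog.GetGeneration(), // generation tracked per condition
    })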

return false
}
if _, isDigestBased := imgRef.(reference.Canonical); isDigestBased &&
catalog.Spec.Source.Image.Ref != catalog.Status.ResolvedSource.Image.Ref {
Member

Do we have a bug issue that captures that we are reading from our own status? IMO, we should instead be storing something in our cache that includes its digest, and we compare with that.

Contributor

Would it be easier to drop the spec.image vs. resolved image comparison here and re-implement it later when we optimize our caching?

Member

I think we're better off keeping it vs removing it. But something I think we should come back to and improve.

Contributor

I couldn't find an issue for this specifically; created #420 to track it.

}
// time to unpack
return true
nextPoll := catalog.Status.ResolvedSource.Image.LastSuccessfulPollAttempt.Add(catalog.Spec.Source.Image.PollInterval.Duration)
Member

Another place that we are reading from status. We can probably fold this into the other bug issue I mentioned. Another thing that we should store/lookup from our cache.

Contributor

Is this something we can fold into our reconcile intervals? If we set the requeuing interval to the polling interval, would we effectively get the same behaviour?

Member

We're using nextPoll to set the RequeueAfter already I think. The problem is that we don't want to poll every time we reconcile and we also don't want to reset our timer and wait too long. So we need reconciles (R) and polls (P) that look like the following:

     poll1          now                      poll2
-----RP-------------R---------R--------------RP
0    1    2    3    4    5    6    7    8    9

If our interval is 8 and we reconcile and poll at time 1, we don't want to poll again until time 9, even if we reconcile in the meantime. So if we reconcile again at time 4, we need to calculate a RequeueAfter based on the current time, the last successful poll time and the interval.

In this case, we want:

(now - poll1) + requeueAfter = interval
requeueAfter = interval - now + poll1
requeueAfter = poll1 + interval - now
requeueAfter = 1 + 8 - 4
requeueAfter = 5

If we reconcile again at time 6, that's too early again, so we need to recalculate requeueAfter (this time, it would be 3) so that we're always timing for another reconcile to happen at the poll interval.

This particular case of reading from status isn't that bad because the worst case scenario is that we fail to update status after a successful poll attempt. If that happens, we'll end up re-polling the image registry every single reconcile until status update succeeds.

If we read from an in-memory map, our lastSuccessfulPoll map would not survive restarts, so we would end up re-polling every single ClusterCatalog at cluster startup. This concern is partially why we're using jitter to slowly spread the poll times of different ClusterCatalogs around, even if they have the same interval.
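A minimal Go sketch of the RequeueAfter calculation described above (the function and variable names are illustrative, not the actual controller code):

    // requeueAfter returns how long to wait before the next poll, given the
    // time of the last successful poll and the configured poll interval.
    // A zero return means the poll is due (or overdue) on this reconcile.
    func requeueAfter(now, lastSuccessfulPoll time.Time, interval time.Duration) time.Duration {
        next := lastSuccessfulPoll.Add(interval)
        if remaining := next.Sub(now); remaining > 0 {
            return remaining // reconciled too early: wait out the rest of the interval
        }
        return 0
    }

With the timeline above (poll1 at 1, interval 8, reconcile at 4), this returns 5, matching the arithmetic in the comment.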

Contributor

That makes sense; I was curious because it seemed like we don't do anything apart from setting finalizers on catalogs that aren't ready for polling. Once we move to an async reconcile, we'd want to reconcile more frequently than our polling interval.

I've created #420 to track this and the previous bug you pointed out.

Message: fmt.Sprintf("unpacked %q successfully", canonicalRef),
State: StateUnpacked,
Message: fmt.Sprintf("unpacked %q successfully", canonicalRef),
LastTransitionTime: lastUnpacked,
Member

From the perspective of this library, this is timestamp for when we unpacked what we have. So I'd suggest UnpackTime for the name of this field.

Suggested change
LastTransitionTime: lastUnpacked,
UnpackTime: lastUnpacked,

@@ -381,8 +381,8 @@ func TestImageRegistry(t *testing.T) {

  rs, err := imgReg.Unpack(ctx, tt.catalog)
  if !tt.wantErr {
-   require.NoError(t, err)
-   assert.Equal(t, fmt.Sprintf("%s@sha256:%s", imgName.Context().Name(), digest.Hex), rs.ResolvedSource.Image.ResolvedRef)
+   assert.NoError(t, err)
Member

This probably needs to stay as require; otherwise, dereferencing fields on rs will panic.
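A short sketch of the difference, reusing the test code above: require.NoError aborts the test function immediately on failure, while assert.NoError records the failure and keeps executing, so a later dereference of rs would panic on a nil result.

    rs, err := imgReg.Unpack(ctx, tt.catalog)
    require.NoError(t, err) // stops the test here if err != nil, so rs is safe to use below
    assert.Equal(t, fmt.Sprintf("%s@sha256:%s", imgName.Context().Name(), digest.Hex), rs.ResolvedSource.Image.ResolvedRef)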

@@ -54,6 +56,9 @@ type Result struct {
// Message is contextual information about the progress of unpacking the
// catalog content.
Message string

// LastTransitionTime is the timestamp when the transition to the current State happened
LastTransitionTime metav1.Time
Member

I think this should be a regular timestamp. We can convert to metav1.Time when we're in the kubernetes object domain. This library isn't in that domain.

Suggested change
LastTransitionTime metav1.Time
UnpackTime time.Time
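Conversion at the Kubernetes boundary is then a one-liner, since metav1.Time just wraps time.Time (a sketch; the surrounding field names are assumed):

    // In the controller, when copying the unpacker result into the CR status:
    status.LastUnpacked = metav1.NewTime(result.UnpackTime)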
