Spike/remote verification #42

Merged
merged 22 commits into from
Oct 7, 2024
revoltez
Contributor

@revoltez revoltez commented Aug 29, 2024

Overview

This PR introduces image signing, verification, and policy enforcement, as well as remote verification of tpod applications. It also includes various organizational improvements to the codebase in an attempt to fix #35.

Key Features:

  1. Image Signing & Verification:

    • Add Image Signing: Implemented support for signing images in the deploy & upload commands. Signatures are stored locally (so they can be verified locally without fetching them from the registry), and they can also be uploaded with the --upload-signatures flag.
    • Verification of Images: Added image verification for pods, for a remote image in a registry, and for a remote application.
    • Remote Application Verification: Anyone can query an application endpoint to retrieve information about all the images used in the running application and verify them using the verify command. The verify command accepts either an image name or a URL.
  2. Control over Verification Settings:

    • Added VerificationDetails field for images: The publisher can specify the following details, which are later used for image verification:

      • Issuer: certificate issuer
      • Identity: certificate Identity
    • Added VerificationSettings field: has the following options:

      • ForcePolicy: When set to true, a policy is created per image in the application namespace and each supplied image must match its policy; otherwise deployment fails. The policy is configured from the VerificationDetails field for keyless verification.
      • PublicVerifiability: When set to true, the application can be remotely verified by anyone.
      • VerificationHost: Manually sets the verification host that is used to access the application for remote verification.
  3. Various Codebase Enhancements: Re-organized and decoupled scripts for better maintainability and a smoother development flow.
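
To make the settings above concrete, here is a minimal Go sketch of how the two fields could be modelled and checked before deployment. The field names come from this description; the `Image` type, the `validateImages` helper, and the example issuer and identity values are assumptions for illustration and are not taken from the actual code in this PR.

```go
package main

import "fmt"

// VerificationDetails holds the per-image values named in this PR description.
type VerificationDetails struct {
	Issuer   string // certificate issuer
	Identity string // certificate identity
}

// VerificationSettings holds the application-level options named in this PR description.
type VerificationSettings struct {
	ForcePolicy         bool
	PublicVerifiability bool
	VerificationHost    string
}

// Image is a hypothetical stand-in for a pod image entry.
type Image struct {
	URL                 string
	VerificationDetails *VerificationDetails
}

// validateImages is a hypothetical pre-deployment check: when ForcePolicy is
// set, every image needs both an issuer and an identity, because a keyless
// policy cannot be built without them.
func validateImages(settings VerificationSettings, images []Image) error {
	if !settings.ForcePolicy {
		return nil
	}
	for _, img := range images {
		d := img.VerificationDetails
		if d == nil || d.Issuer == "" || d.Identity == "" {
			return fmt.Errorf("image %q: ForcePolicy requires both Issuer and Identity", img.URL)
		}
	}
	return nil
}

func main() {
	settings := VerificationSettings{ForcePolicy: true, PublicVerifiability: true}
	images := []Image{
		{
			URL: "registry.example/app:latest", // placeholder image reference
			VerificationDetails: &VerificationDetails{
				Issuer:   "https://github.com/login/oauth", // example value only
				Identity: "dev@example.com",                // example value only
			},
		},
		{URL: "registry.example/sidecar:latest"}, // no details: rejected under ForcePolicy
	}
	if err := validateImages(settings, images); err != nil {
		fmt.Println("validation failed:", err)
	}
}
```

Failing early on the client side is only one possible design; the policy controller would reject an unmatched image at deployment time regardless.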

How it works

  • Image signing & verification: When the developer specifies the --sign-images flag, the command (deploy/upload) loops through the pod images and signs them by calling cosign methods. A certificate request is sent to the sigstore Fulcio certificate authority, the signature is saved locally and, when --upload-signatures is specified, in the registry, and a new entry for the operation is added to the sigstore Rekor transparency log. Verification works the same way: given the image information (URL, certificate identity & issuer), the commands call cosign's verify methods to get the signature (a local location can be specified with --signature-path) and verify both the transparency log entry and the signature.

  • Forcing image policy: Added the sigstore policy controller chart, which watches for newly created namespaces with the policy.sigstore.dev/include label. Any image in such a namespace is forced to match at least one image policy, which is why we create a policy per image from the image verification details. After the policy is created, the controller automatically verifies the image URLs in the pods and makes sure they originate from the specified identity & issuer; otherwise deployment fails.

Remote application verification

  • Problem

    • We need to make sure the images used in the application are signed by the identity the deployer claims to have specified.
    • We cannot trust the application to relay that information.
    • Only tpodserver can be trusted to retrieve insights about the running pods.
    • Verifiers only have an endpoint. They have no information about the internals of the application, such as the namespace name, which is needed to know where to retrieve the app info from.
    • The verification needs to be closely coupled with the ingress endpoint (as specified in the issue).
    • We need a place to store all image details (issuer, identity, signatures, attestations, etc.).
    • We need to infer the namespace the request is heading to without requiring it from the user.
  • Solution:

    • We already have the keda interceptor, which routes requests based on the Host header.
    • Added another host to route application verification requests, with an automatically created subdomain when possible.
      • During pod deployment, if the deployer does not manually set VerificationHost in VerificationSettings, the command fetches the first HTTP host and creates a subdomain from it that is used to retrieve application information.
      • Ex: if the application is accessible via app.apocryph.cloud, another subdomain will be added automatically like this: app.apocryph.tpodinfo.cloud
    • During pod deployment, store all image verification details in annotations (labels cannot be used due to their size restrictions).
    • Create another httpSo with min & max replicas set to 1 that points to a newly created service, which in turn points to a new pod, tpod-proxy, which has the namespace name injected as an environment variable.
    • tpod-proxy simply reads the namespace name from the environment variable, forwards the request to tpodserver (calling it via its FQDN) to retrieve the app information, and forwards the response back to the caller.
  • What cannot work

    • Using an external service to route requests: keda requires an endpoint, and external services don't create endpoints. Even if we create a ClusterIP service with no selectors and then manually create the Endpoints/EndpointSlice pointing to tpodserver, keda still requires, as a condition before routing, that it manages the pod's number of replicas; since the pod is in another namespace, the condition will never be met.
  • Since the namespace enforces that images have matching policies, and since we cannot trust the deployer to create the tpod-proxy, the proxy must be created by the official comrade-coop identity. This is why a global image policy is created, with the template values for the identity and the issuer set during deployment of the charts.

  • A cool feature of this approach is that, because the proxy is created alongside the app within the same namespace, the provider is at least not doing "completely" free work and gets a small execution fee for retrieving the info.

Tests

In both tests, you must specify the identity and issuer for the images.

  • Added an integration test for the new commands in /test/integration/sigstore/run-test
  • Added the e2e test in /test/e2e/minikube/run-attestation-test.sh (requires a deployed apocryph cluster, which you can set up with /test/e2e/minikube/run-test.sh)

Libraries

  • cosign: Chosen over sigstore-go because the latter is too low level and cosign will depend on sigstore-go in the future. The methods we use are the cobra commands called directly, without any modification (except the args, of course), to avoid unnecessary tech debt.

revoltez and others added 22 commits August 20, 2024 09:23
Couldn't call the command directly because it depends on a global
variable rootOptions which is necessary for timeout, but that is
actually good since we now can have more control over the options
Used Cosign instead of the low level sigstore-go because Cosign will
eventually depend on it and it's not going anywhere.
- Added the verify command
- Added the verify flag for deploy
- Integration test updated
…egistry

Add upload-signatures flag in case user wants to upload to the registry
remove manual deployment with hardcoded values
…at the client

For consistency with similar features.
...This might still run into some issues when removing and re-adding containers in a pod.
@bojidar-bg
Contributor

bojidar-bg commented Oct 7, 2024

Added a few changes after reviewing the code. Looking good!

(Note: changes are untested. Hopefully I didn't break anything.. too bad! 🥲)

@bojidar-bg bojidar-bg merged commit 0541bac into master Oct 7, 2024
1 check passed
Development

Successfully merging this pull request may close these issues.

Spike: Remote verification of application
2 participants