[feature] consistently support object stores #5656

Closed
Bobgy opened this issue May 17, 2021 · 5 comments
Labels
area/backend kind/feature lifecycle/stale The issue / pull request is stale, any activities remove this label.

Comments

@Bobgy
Contributor

Bobgy commented May 17, 2021

Feature Area

/area backend

What feature would you like to see?

Consistently support different object stores (MinIO, S3, GCS, Azure Blob, ...) in KFP, whether for UI preview/visualization, metrics, or pipeline tasks.

Proposal: unify object store access on the Go CDK: https://gocloud.dev/.

Changes:

  1. In https://bit.ly/kfp-v2-compatible mode, KFP pipeline tasks will use Go CDK to upload/download artifacts in pipeline containers.
  2. Implement support for different artifact stores in the KFP API server, and make it deployable standalone in user namespaces.
  3. Remove artifact API implementation in KFP UI.
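
To illustrate what a single, URL-driven code path might look like, here is a minimal sketch that splits an artifact URI into the pieces a client such as Go CDK would need: a bucket-level URL (Go CDK opens buckets from URLs like `s3://...`, `gs://...`, or `azblob://...`) and an object key. The `ArtifactRef` type and `parseArtifactURI` function are hypothetical names invented for this example; they are not part of KFP or Go CDK, and the sketch uses only the Go standard library.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// ArtifactRef is a hypothetical type for this sketch (not a KFP or Go CDK type).
type ArtifactRef struct {
	Scheme string // object store scheme, e.g. "s3", "gs", "azblob"
	Bucket string // bucket or container name
	Key    string // object key inside the bucket
}

// BucketURL returns the bucket-level URL, the form a URL-based
// bucket opener would take as input.
func (r ArtifactRef) BucketURL() string {
	return r.Scheme + "://" + r.Bucket
}

// parseArtifactURI splits "scheme://bucket/key" into its parts and
// rejects URIs that lack a scheme, a bucket, or a key.
func parseArtifactURI(uri string) (ArtifactRef, error) {
	u, err := url.Parse(uri)
	if err != nil {
		return ArtifactRef{}, fmt.Errorf("parse %q: %w", uri, err)
	}
	if u.Scheme == "" || u.Host == "" {
		return ArtifactRef{}, fmt.Errorf("artifact URI %q needs a scheme and a bucket", uri)
	}
	key := strings.TrimPrefix(u.Path, "/")
	if key == "" {
		return ArtifactRef{}, fmt.Errorf("artifact URI %q has no object key", uri)
	}
	return ArtifactRef{Scheme: u.Scheme, Bucket: u.Host, Key: key}, nil
}

func main() {
	uris := []string{
		"s3://my-test-bucket-4134451/hello.txt",
		"gs://some-bucket/run-1/metrics.json", // illustrative bucket name
	}
	for _, uri := range uris {
		ref, err := parseArtifactURI(uri)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("open %s, then read key %q\n", ref.BucketURL(), ref.Key)
	}
}
```

With this shape, the UI-facing API, metrics, and pipeline tasks could all hand the same two values to one client library instead of each keeping per-provider code.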

What is the use case or pain point?

  1. Reduce tech debt

The KFP UI uses the MinIO JS client, a GCS client, etc. to access some object stores.
The KFP API server uses the MinIO Go client and can access only MinIO.
Argo Workflows also supports several types of object stores natively (I'm not sure which client libraries it uses).

These differing implementations make it hard to achieve the same level of support for different object stores across KFP features, and they duplicate the effort of maintaining object access.

For example, KFP v1 metrics are supported only with MinIO, because the KFP API server supports only MinIO.

  2. Consistently support object store features

e.g. IRSA for S3 (see #3405 (comment)), or Workload Identity for GCS (which needed an Argo upgrade).

Is there a workaround currently?

No


Love this idea? Give it a 👍. We prioritize fulfilling features with the most 👍.

@karlschriek

Sorry for the slow response on this, but I can confirm that go-cloud supports IAM Roles for Service Accounts in order to access objects on S3.

Support was added with this PR: google/go-cloud#2773

I've also tested it independently. The IAM role needs to be attached to a ServiceAccount via an annotation as follows:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: go-cloud-test
  namespace: mynamespace
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/go-cloud-test

Any Pod started with this ServiceAccount then inherits the role. (go-cloud relies on the AWS SDK for Go, which exchanges the Pod's projected service account token for temporary credentials for the attached role.) As long as the role arn:aws:iam::123456789012:role/go-cloud-test has the necessary policies and trust relationships attached to it, the following pseudo-deployment should work out of the box:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: go-cloud-test
  namespace: mynamespace
  labels:
    app: go-cloud-test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: go-cloud-test
  template:
    metadata:
      labels:
        app: go-cloud-test
    spec:
      serviceAccountName: go-cloud-test
      containers:
      - name: go-cloud-test
        image: my/testimage
        workingDir: /go/src/go-cloud/samples/gocdk-blob # see https://github.com/google/go-cloud/tree/v0.23.0/samples/gocdk-blob
        command: ["/bin/sh","-c"]
        args: ["go run main.go download s3://my-test-bucket-4134451 hello.txt > foo.txt"]
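
If the download fails with a credentials error, one quick check is whether the Pod actually received the IRSA settings. The sketch below assumes the standard EKS behaviour, where the Pod identity webhook injects the `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` environment variables into Pods that use an annotated ServiceAccount. The `missingIRSAEnv` helper is a hypothetical name for this example and is not part of go-cloud or the AWS SDK.

```go
package main

import (
	"fmt"
	"os"
)

// Environment variables that (by assumption, per standard EKS IRSA setup)
// are injected into Pods running under an annotated ServiceAccount.
const (
	envRoleARN   = "AWS_ROLE_ARN"
	envTokenFile = "AWS_WEB_IDENTITY_TOKEN_FILE"
)

// missingIRSAEnv returns the names of the IRSA variables that are unset
// or empty. The lookup function is injectable so the check can be tested
// without touching the real process environment.
func missingIRSAEnv(lookup func(string) (string, bool)) []string {
	var missing []string
	for _, name := range []string{envRoleARN, envTokenFile} {
		if v, ok := lookup(name); !ok || v == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	if missing := missingIRSAEnv(os.LookupEnv); len(missing) > 0 {
		fmt.Println("IRSA does not appear to be configured; missing:", missing)
		return
	}
	fmt.Println("IRSA environment present; role:", os.Getenv(envRoleARN))
}
```

Running this inside the container (for example in place of the download command) would show whether the ServiceAccount annotation took effect before any S3 call is attempted.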

@stale

stale bot commented Mar 3, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the lifecycle/stale The issue / pull request is stale, any activities remove this label. label Mar 3, 2022
@davidspek
Contributor

/lifecycle freeze

@stale stale bot removed the lifecycle/stale The issue / pull request is stale, any activities remove this label. label Apr 26, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the lifecycle/stale The issue / pull request is stale, any activities remove this label. label Jun 22, 2024

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.
