Finalize artifact storage strategy #169

Closed
jlebon opened this issue Apr 4, 2019 · 16 comments

Comments

@jlebon
Member

jlebon commented Apr 4, 2019

In the last community meeting, we determined that one blocker for some of the Cincinnati and stream tooling work was the aggressive pruning from the developer pipeline. Right now, it's outputting into the CentOS CI artifact server, but we only keep a few of the latest builds due to a technicality. It's also very slow to download from.

More broadly, though, we should figure out how we want to host FCOS artifacts for the preview release. We need to chat with Fedora releng to see, e.g., whether there are cloud bucket accounts FCOS could use.

Once we have access to a bucket, we can rework the pipeline to output to that bucket so we can start building history and unlock the next work items.

Do note that for the OSTree content itself, we'll eventually want to output those into Fedora's unified OSTree repo, backed (fronted?) by CloudFront. This is another integration point we'll want to discuss with releng as well, though for starters we could just publish the OSTree repo together with the other artifacts?

@arithx
Contributor

arithx commented Apr 4, 2019

Here's some assorted topics relating to artifact storage / release process from a recent discussion I had w/ @bgilbert

  • Making bucket/object(s) public as part of the release process. The build pipeline could upload to the bucket with either the bucket or the individual objects being non-public; the release process itself would then take on changing bucket/object permissions.
  • Maybe have a non-listable bucket that has public objects; this allows HEAD & GET operations against all objects while not allowing directory listing. Thus all uploaded objects remain usable if the URL is known, but the only publicly listed objects are the latest published ones. (NOTE: because naming is predictable, we should still upload these images with non-public permissions and mark them as public during the release process.)
  • Either use CDNs on the buckets or use the S3 IPv6 website endpoint if pointing directly to the bucket (this allows us flexibility later if we want to change bucket locations / services)
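
To make the "non-listable bucket with public objects" idea concrete, here is a minimal sketch of an S3 bucket policy that grants anonymous `s3:GetObject` (which covers GET and HEAD on objects) without granting `s3:ListBucket`. The bucket name and the `released/` prefix are placeholders, and scoping by prefix is only one possible way to separate released from unreleased objects; the thread does not settle on a mechanism, and per-object ACLs flipped at release time would be another.

```python
import json


def public_read_policy(bucket: str, prefix: str = "") -> dict:
    """Build a bucket policy: anyone may fetch objects, nobody may list them."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadNoList",
                "Effect": "Allow",
                "Principal": "*",
                # s3:GetObject covers GET and HEAD on individual objects.
                # s3:ListBucket is deliberately not granted, so anonymous
                # listing of the bucket stays denied.
                "Action": "s3:GetObject",
                # Only keys under `prefix` become publicly readable.
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            }
        ],
    }


if __name__ == "__main__":
    # "example-fcos-builds" and "released/" are hypothetical names.
    print(json.dumps(public_read_policy("example-fcos-builds", "released/"), indent=2))
```

With a policy like this, an object outside the prefix stays private until the release process copies or moves it under the prefix, which matches the concern about predictable names.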

@dustymabe
Member

One question I have: even if we choose to use cloud storage (i.e. S3 or equivalent), can we make the implementation not specific to cloud storage features? A couple of reasons for this: 1) If we ever need to pivot in the future to a simple web server setup, could we do it? 2) If I want to mimic the production workflow for a test, it would be nice if it didn't require access to cloud credentials.
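
One way to keep the pipeline independent of cloud-specific features is to have it talk to a small storage interface, with a local-directory backend for tests and simple web server setups. The sketch below is only illustrative: `ArtifactStore`, `LocalDirStore`, and `publish` are hypothetical names, not part of any existing tool, and an S3-backed class would need to implement the same two methods.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Protocol


class ArtifactStore(Protocol):
    """What the pipeline needs from storage, and nothing more."""

    def put(self, local_path: Path, key: str) -> None: ...

    def url_for(self, key: str) -> str: ...


class LocalDirStore:
    """Copies artifacts into a directory that any plain web server can serve."""

    def __init__(self, root: Path, base_url: str) -> None:
        self.root = root
        self.base_url = base_url.rstrip("/")

    def put(self, local_path: Path, key: str) -> None:
        dest = self.root / key
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(local_path, dest)

    def url_for(self, key: str) -> str:
        return f"{self.base_url}/{key}"


def publish(store: ArtifactStore, artifact: Path, key: str) -> str:
    """Upload one artifact and return the URL clients should use."""
    store.put(artifact, key)
    return store.url_for(key)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "image.qcow2"
        src.write_bytes(b"not a real image")
        store = LocalDirStore(Path(tmp) / "www", "http://localhost:8000")
        print(publish(store, src, "build-1/image.qcow2"))
```

Because `publish` only depends on the interface, a test run needs no cloud credentials, and a later move between a bucket and a web server would be confined to the backend class.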

@lucab
Contributor

lucab commented Apr 5, 2019

> Maybe have a non-listable bucket that has public objects

That's fine for consumers, as long as we have a builds.json listing them (like we currently do).

@jlebon
Member Author

jlebon commented Apr 17, 2019

We had a meeting with members of Fedora releng this morning about this. There is consensus on getting an S3 bucket set up. Filed https://pagure.io/fedora-infrastructure/issue/7719.

@bgilbert bgilbert added the meeting topics for meetings label Apr 17, 2019
@jlebon
Member Author

jlebon commented Apr 17, 2019

Was going to file another ticket about getting a subdomain for this bucket, but maybe we should bikeshed here first before that. Suggestions so far are:

  • builds.coreos.fedoraproject.org
  • artifacts.coreos.fedoraproject.org
  • images.coreos.fedoraproject.org
  • release.coreos.fedoraproject.org

Any others?

One thing that might affect bikeshedding this: remember that hosts will upgrade from the unified OSTree repo (at ostree.fedoraproject.org), not from the bucket directly; the bucket would mostly serve new installs.

I don't have strong opinions on this. I'd probably just go with release.coreos.fedoraproject.org? Matches CL and doesn't feel too implementation oriented.

@arithx
Contributor

arithx commented Apr 17, 2019

I'd personally lean towards release. or builds.

@bgilbert
Contributor

If we're going to put development builds in this bucket, we shouldn't call it release.

@jlebon
Member Author

jlebon commented Apr 17, 2019

Just to write this down: one thing we were also discussing was using a redirector, since it might give us more control over the release process and the URLs we expose to clients. The downside, of course, is that it's another piece in the critical path that needs to maintain uptime, which I think just depends on how stable the cluster we run it on is.

@bgilbert bgilbert removed the meeting topics for meetings label Apr 17, 2019
@puiterwijk

@jlebon we have a reasonably stable proxies cluster (we use that for metalink). Alternatively, we could try to run something in Lambda.

@jlebon
Member Author

jlebon commented Apr 17, 2019

> we have a reasonably stable proxies cluster (we use that for metalink)

Hmm, are those clusters running on OCP?

> Alternatively, we could try to run something in Lambda.

Yeah, though I'd be worried about our release pipeline being too AWS-specific at that point.

Overall though, I think there's agreement on a fedoraproject.org subdomain to front the bucket even if we do end up using a redirector, right?

> If we're going to put development builds in this bucket, we shouldn't call it release.

Fair enough. Thoughts on just builds. ?

@puiterwijk

@jlebon no, they don't run OCP, they run plain podman and apache, so if we can get a container that serves HTTP over some port, that would work.
On Lambda: I think the redirector is likely going to be small enough, and the redirector-specific bits small enough, that the Lambda-specific code could be made very minimal? Just the parsing of request vars and encoding of response is specific I'd say.
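
To illustrate the point about keeping the platform-specific part small, here is a rough sketch of a redirector whose core is a pure function, with a thin adapter for a plain HTTP container. Everything named here (`BUCKET_BASE`, `redirect`, `Handler`, `serve`, port 8080) is an assumption made for the example; a Lambda adapter would wrap the same `redirect` function with its own request parsing and response encoding.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict, Tuple

# Placeholder for wherever the bucket is actually reachable.
BUCKET_BASE = "https://example-bucket.invalid"


def redirect(path: str, base: str = BUCKET_BASE) -> Tuple[int, Dict[str, str]]:
    """Platform-independent core: map a request path to (status, headers)."""
    # Ignore any query string, then split the path into segments.
    parts = [p for p in path.split("?", 1)[0].split("/") if p]
    # Refuse empty paths and anything containing "." or ".." segments.
    if not parts or any(p in (".", "..") for p in parts):
        return 404, {}
    return 302, {"Location": f"{base}/{'/'.join(parts)}"}


class Handler(BaseHTTPRequestHandler):
    """Thin adapter: only request parsing and response encoding live here."""

    def do_GET(self) -> None:
        status, headers = redirect(self.path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Run the redirector as a plain HTTP service (blocks until interrupted)."""
    HTTPServer(("", port), Handler).serve_forever()
```

Calling `serve()` inside a container would give something the proxies could front over HTTP, while the redirect rules themselves stay testable without any server or cloud account.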

@jlebon
Member Author

jlebon commented Apr 18, 2019

Ticket for builds.coreos.fedoraproject.org: https://pagure.io/fedora-infrastructure/issue/7722

@puiterwijk

> Ticket for builds.coreos.fedoraproject.org: https://pagure.io/fedora-infrastructure/issue/7722

And resolved.

@bgilbert
Contributor

Details of bucket layout are in #189.

The current plan is not to have directory/bucket listings enabled for our release bucket, nor predictable artifact URLs. Instead, artifact URLs will be provided in the stream metadata JSON (#98), which will always contain references to the current recommended images for each platform. The download page will also fetch the stream metadata and use it to provide the correct links.

Disabling listings solves several problems:

  • It avoids users assuming that the most recent release is always the best one to install. That's generally true, but might not be true if there's a regression in the latest image.
  • It discourages users from installing old releases, since they're less discoverable. We'll still provide a way to find them, but will encourage users to obtain artifacts via stream metadata instead (directly or indirectly).
  • It discourages users from coming to rely on directory structure, filenames, or other details of the uploaded artifact set.
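
As a rough illustration of what "obtain artifacts via stream metadata" could look like for a client, the sketch below resolves a platform name to a download URL from a JSON document. The document layout, the `artifacts` key, the URLs, and the `artifact_url` helper are all made up for the example; the actual stream metadata schema is the subject of #98 and may look quite different.

```python
import json
from typing import Dict

# Hypothetical stream metadata. The real schema is tracked in #98;
# this layout and these URLs exist only for the sake of the example.
SAMPLE = """
{
  "stream": "testing",
  "artifacts": {
    "aws": "https://builds.example.invalid/opaque-1/image-aws.vmdk",
    "qemu": "https://builds.example.invalid/opaque-2/image-qemu.qcow2"
  }
}
"""


def artifact_url(stream_json: str, platform: str) -> str:
    """Return the currently recommended image URL for one platform."""
    doc = json.loads(stream_json)
    artifacts: Dict[str, str] = doc["artifacts"]
    try:
        # The client never constructs a URL itself; it only follows
        # whatever location the metadata advertises.
        return artifacts[platform]
    except KeyError:
        raise LookupError(f"no recommended image for platform {platform!r}") from None


if __name__ == "__main__":
    print(artifact_url(SAMPLE, "qemu"))
```

Since the client treats the URL as opaque, the bucket layout and filenames can change between releases without breaking anyone who follows the metadata.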

@jlebon
Member Author

jlebon commented Jun 6, 2019

Seems like we can close this now in favour of the other implementation-oriented tickets we already have?

@jlebon
Member Author

jlebon commented Jun 13, 2019

Closing as per previous comment.

@jlebon jlebon closed this as completed Jun 13, 2019