artifacts: sort out remote bucket layout for multi-arch artifacts #463

Closed
lucab opened this issue Apr 3, 2019 · 18 comments

@lucab
Contributor

lucab commented Apr 3, 2019

Right now all artifacts produced by coreos-assembler are implicitly for a single architecture, and we currently only publish x86_64 outputs.

Eventually we are going to produce and publish artifacts for multiple architectures, available over HTTPS from some remote bucket/URL. To my knowledge, we currently don't expose the basearch label anywhere in the URL.

This ticket is to figure out the layout of the final multi-arch remote bucket, as consumers will need some way to pick up the relevant images for their arch.

@lucab
Contributor Author

lucab commented Apr 3, 2019

/cc @jlebon @dustymabe as they may have some opinions on this already.

For reference, on Container Linux the basearch was part of the "board name", which was exposed as a separate directory hierarchy. See https://alpha.release.core-os.net/arm64-usr/ vs https://alpha.release.core-os.net/amd64-usr/.

@jlebon
Member

jlebon commented Apr 3, 2019

One thing that makes this tricky is the build ids. Right now it's basically determined by the OSTree version of the commit + an "image generation version". If we instead made it external to the process (i.e. we stopped using automatic-version-prefix), then we could run builds on all arches in parallel. (I mean, in theory we could count on the individual runs to converge on the same next version, but I think we want something stronger here.)

Then we can pretty easily gather the builds from all the arches and present them however we want in the bucket.

If we want the same builds/$buildid directory to hold the artifacts for all the arches, then we need to add the arch to the filenames themselves. Actually, that's something we could do now regardless I think.
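
To make the "arch in the filename" idea concrete, here is a minimal sketch. The naming scheme (basearch placed just before the extension) and both function names are assumptions made for illustration; the thread does not settle on a format.

```python
def artifact_name(name, buildid, platform, basearch, ext):
    """Build an artifact filename that carries the basearch.

    The position of the basearch in the name is a hypothetical choice.
    """
    return f"{name}-{buildid}-{platform}.{basearch}.{ext}"


def shared_build_dir_listing(name, buildid, arches, platforms):
    """List artifact paths when all arches share one builds/<buildid>/ directory.

    `platforms` maps a platform label (e.g. "qemu") to its file extension.
    """
    return [
        f"builds/{buildid}/{artifact_name(name, buildid, platform, arch, ext)}"
        for arch in arches
        for platform, ext in platforms.items()
    ]
```

With names built this way, artifacts for different arches can sit side by side in one directory without colliding.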

@dustymabe
Member

> One thing that makes this tricky is the build ids. Right now it's basically determined by the OSTree version of the commit + an "image generation version". If we instead made it external to the process (i.e. we stopped using automatic-version-prefix), then we could run builds on all arches in parallel. (I mean, in theory we could count on the individual runs to converge on the same next version, but I think we want something stronger here.)

I think this is a good point. Depending on how we set this up, we could have a higher-level first step in our build process that lays the foundation for the per-architecture builds, which then run in parallel, i.e.:

  • run high level "we want a build to happen, please generate some metadata" step
  • each arch then picks up that metadata and runs a build and populates artifacts in arch specific locations
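
The two steps above could be sketched roughly as follows. This is only an illustration: the `build-request.json` file name, its keys, and both function names are hypothetical, and the actual image build is replaced by writing a stub `meta.json`.

```python
import json
from pathlib import Path


def request_build(workdir, buildid, arches):
    """Step 1: record the externally chosen build ID and the target arches."""
    request_path = Path(workdir) / "build-request.json"  # hypothetical file name
    request_path.write_text(json.dumps({"buildid": buildid, "arches": arches}))
    return request_path


def run_arch_build(request_path, basearch):
    """Step 2: one arch picks up the shared metadata and writes to its own directory."""
    request_path = Path(request_path)
    request = json.loads(request_path.read_text())
    if basearch not in request["arches"]:
        raise ValueError(f"{basearch} was not requested for build {request['buildid']}")
    outdir = request_path.parent / "builds" / request["buildid"] / basearch
    outdir.mkdir(parents=True, exist_ok=True)
    # A real build would produce images here; this sketch only records metadata.
    meta = {"buildid": request["buildid"], "basearch": basearch}
    (outdir / "meta.json").write_text(json.dumps(meta))
    return outdir
```

Because every arch reads the same request file, the build ID no longer depends on each run independently converging on the same next version.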

> Then we can pretty easily gather the builds from all the arches and present them however we want in the bucket.
>
> If we want the same builds/$buildid directory to hold the artifacts for all the arches, then we need to add the arch to the filenames themselves. Actually, that's something we could do now regardless I think.

Agreed.

@cgwalters
Member

Strongly related to this is the concept of having a central orchestrator schedule builds on distinct machines (like Koji does). I don't think cosa should be an orchestrator (that's https://github.com/coreos/fedora-coreos-pipeline ) - but it should enable and define baseline mechanics.

Specifically I think the pipeline should schedule a cosa build on each arch, giving it a "scratch" space in storage (S3, ideally with temporary IAM creds that only allow writing to a transient space; or the pipeline exposes an API for uploading content). Then we have a cosa build-merge s3://fcos/tmp/build-XYZ/x86_64 s3://fcos/tmp/build-XYZ/ppc64le s3://fcos/tmp/build-XYZ/aarch64 that aggregates those into the single final build.
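
As a rough local analogue of the proposed `cosa build-merge` step (a command that is only suggested above, not an existing one), the sketch below merges per-arch scratch directories into one final build directory. It works on local paths rather than S3 URLs, and it assumes each scratch directory contains a `meta.json` with a `buildid` key; the function name is hypothetical.

```python
import json
import shutil
from pathlib import Path


def merge_arch_builds(scratch_dirs, final_root):
    """Merge per-arch scratch directories into <final_root>/<buildid>/<arch>/.

    `scratch_dirs` maps a basearch to the local directory holding its output.
    Nothing is copied unless every arch reports the same build ID.
    """
    buildids = set()
    for arch, src in scratch_dirs.items():
        # Assumption: each arch build wrote a meta.json with a "buildid" key.
        meta = json.loads((Path(src) / "meta.json").read_text())
        buildids.add(meta["buildid"])
    if len(buildids) != 1:
        raise ValueError(f"arches disagree on build ID: {sorted(buildids)}")
    dest = Path(final_root) / buildids.pop()
    for arch, src in scratch_dirs.items():
        shutil.copytree(src, dest / arch)
    return dest
```

Validating every scratch directory before copying means a missing or mismatched arch aborts the merge without leaving a partial final build behind.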

@jlebon
Member

jlebon commented May 13, 2019

Some related discussions in #159 around build layout.

> Specifically I think the pipeline should schedule a cosa build on each arch, giving it a "scratch" space in storage

Hmm, I might be missing something, though wouldn't all the nodes be connected to the same OCP cluster? So then we should be able to share scratch space and merge locally, right? E.g. that way we don't upload lots of images if one of the arches failed.

Actually, that's probably something we need to discuss as well as part of this (re. multi-arch nodes). AFAIK CentOS CI only has x86_64 which means we'd have to emulate other arches... Not sure offhand if Fedora infra's OpenShift cluster is multi-arch.

@dustymabe
Member

> Hmm, I might be missing something, though wouldn't all the nodes be connected to the same OCP cluster? So then we should be able to share scratch space and merge locally, right? E.g. that way we don't upload lots of images if one of the arches failed.

Yes, that would be nice.

> Actually, that's probably something we need to discuss as well as part of this (re. multi-arch nodes). AFAIK CentOS CI only has x86_64 which means we'd have to emulate other arches... Not sure offhand if Fedora infra's OpenShift cluster is multi-arch.

I've brought that up with Brian Stinson several times. I believe they have the hardware; they just don't have everything wired up, or don't have a new enough version of OpenShift to support it.

@dustymabe
Member

dustymabe commented May 31, 2019

We have hashed some of this out over in coreos/fedora-coreos-tracker#189.

@jlebon
Member

jlebon commented Jun 26, 2019

So right now the FCOS pipeline is not strictly following the layout in coreos/fedora-coreos-tracker#189: builds are directly under builds/ instead of e.g. builds/{id}/x86_64/.

I'd like to fix this, even if we ship preview for x86_64 only. Though one question here is whether we want to make this structure native to cosa or something we do in post. I would much prefer the former, but it is also a significant change. Would probably want to modify the builds.json spec as well to be something like (translate to JSON in your head):

builds:
  - id: 30.1234
    archs:
      - x86_64
      - ppc64le
      - ...

And a top-level schema-version key so we can gate the behaviour. Could make it opt-in by just sticking to the same layout for existing workdirs and only doing this for new workdirs or those explicitly migrated. (And e.g. cosa buildprep s3://... would know which it's dealing with as well through the schema-version).
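
A reader gated on `schema-version` might look like the sketch below. It assumes that the existing (legacy) `builds.json` has no `schema-version` key and stores `builds` as a plain list of ID strings, and that the new format uses major version 1; the function name and the `default_arch` parameter are hypothetical.

```python
import json


def list_builds(builds_json_text, default_arch="x86_64"):
    """Return (build ID, list of arches) pairs for either builds.json layout."""
    data = json.loads(builds_json_text)
    if "schema-version" not in data:
        # Assumption: legacy files hold a plain list of build ID strings,
        # all implicitly built for a single arch.
        return [(buildid, [default_arch]) for buildid in data["builds"]]
    major = data["schema-version"].split(".")[0]
    if major != "1":
        raise ValueError(f"unsupported schema-version: {data['schema-version']}")
    return [(build["id"], list(build["archs"])) for build in data["builds"]]
```

Callers then see the same shape regardless of whether the workdir has been migrated, which is what makes the change opt-in.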

@cgwalters
Member

> Though one question here is whether we want to make this structure native to cosa or something we do in post.

Mmm...it'd kind of have to be a "mode" right? I wouldn't want that for local builds I think. Or, at least cosa run would have to learn to look in x86_64/latest?

@dustymabe
Member

> I'd like to fix this, even if we ship preview for x86_64 only. Though one question here is whether we want to make this structure native to cosa or something we do in post. I would much prefer the former, but it is also a significant change. Would probably want to modify the builds.json spec as well to be something like

+1 - I think I'd rather have COSA able to handle this so we get used to the structure everywhere.

One thing to keep in mind is that there will most likely still be some sort of post step because we'll have to merge the outputs of multiple builds (one for each arch).

@jlebon
Member

jlebon commented Jun 26, 2019

> I wouldn't want that for local builds I think.

Hmm, though if we don't change the default behaviour for new workdirs, we'll have to carry this forever, vs. hopefully one day being able to drop support for it entirely.

> Or, at least cosa run would have to learn to look in x86_64/latest?

Right, the assumption here is that all cosa commands would be aware of the schema. Would have to sprinkle some helper functions.
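
One such helper could resolve a build directory for either layout, so that individual commands never hard-code the path. This is a sketch only: both function names are hypothetical, and it assumes that a missing `schema-version` key in the parsed `builds.json` means the legacy layout with no arch subdirectory.

```python
from pathlib import Path


def has_arch_dirs(builds_json):
    """Assumption: only schema-versioned workdirs use per-arch subdirectories."""
    return "schema-version" in builds_json


def build_dir(workdir, builds_json, basearch, buildid="latest"):
    """Return the directory holding one build's artifacts for `basearch`."""
    base = Path(workdir) / "builds" / buildid
    return base / basearch if has_arch_dirs(builds_json) else base
```

A command such as `cosa run` could call this with the parsed `builds.json` and the host's basearch instead of assuming `builds/latest` directly.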

@sinnykumari
Contributor

> I'd like to fix this, even if we ship preview for x86_64 only. Though one question here is whether we want to make this structure native to cosa or something we do in post. I would much prefer the former, but it is also a significant change. Would probably want to modify the builds.json spec as well to be something like (translate to JSON in your head):

+1 on doing this in cosa.
Also, we may want to add an arch field to the artifact names produced by cosa, to easily identify the target arch of the media by looking at the artifact name.

jlebon added a commit to jlebon/coreos-assembler that referenced this issue Jun 28, 2019
Still a lot of prep patches to split out and some more testing to do,
but it works pretty well so far.

```
$ find builds
builds
builds/builds.json
builds/30.1
builds/30.1/x86_64
builds/30.1/x86_64/coreos-assembler-config.tar.gz
builds/30.1/x86_64/coreos-assembler-config-git.json
builds/30.1/x86_64/fedora-coreos-30.1-qemu.qcow2
builds/30.1/x86_64/ostree-commit-object
builds/30.1/x86_64/meta.json
builds/30.1/x86_64/manifest-lock.generated.json
builds/30.1/x86_64/commitmeta.json
builds/30.1/x86_64/ostree-commit.tar
builds/30.1/x86_64/fedora-coreos-30.1-installer-kernel
builds/30.1/x86_64/fedora-coreos-30.1-installer-initramfs.img
builds/30.1/x86_64/fedora-coreos-30.1-installer.iso
builds/30.1/x86_64/fedora-coreos-30.1-metal.raw
builds/30.1/x86_64/fedora-coreos-30.1-openstack.qcow2
builds/latest

$ cat builds/builds.json
{
    "schema-version": "1.0.0",
    "builds": [
        {
            "id": "30.1",
            "archs": [
                "x86_64"
            ]
        }
    ],
    "timestamp": "2019-06-28T20:50:54Z"
}
```

For more context, see:
coreos#463 (comment)

The key thing to note here is that this only affects new workdirs
currently. I'll also be writing a separate migration tool so that we can
adapt S3 buckets to the new layout.

Adapting to this will definitely be a bit painful, but I think it's
worth it overall. Making this native to cosa means that we don't need a
translation layer between our S3 bucket and cosa, we can parallelize
things more easily in the future, and e.g. we can transparently use
"bulk" commands like `cosa buildupload` and `cosa compress`.
@jlebon
Member

jlebon commented Jun 28, 2019

WIP in #580!

> Also, we may want to add an arch field to the artifact names produced by cosa, to easily identify the target arch of the media by looking at the artifact name.

Yeah, I think that makes sense. Will try to address that together as part of #580.

@jlebon
Member

jlebon commented Jun 28, 2019

So if we add the basearch to artifacts names, an alternative approach here is to simply keep it all together as mentioned higher up. I think that would work, but would make the file hierarchy messier (except the devel case), and would still require some kind of breaking change to meta.json (either that, or have multiple meta.jsons per arch). Seems like if we're gonna make breaking changes, we might as well go the whole way?

@arithx
Contributor

arithx commented Jun 28, 2019

Of the two options I think I'd prefer having the separate directories.

@jlebon
Member

jlebon commented Jun 28, 2019

> Of the two options I think I'd prefer having the separate directories.

To be clear, that's my preference as well. 😄
I realized afterwards that "go the whole way" could be misinterpreted.

jlebon added a commit to jlebon/coreos-assembler that referenced this issue Jul 12, 2019
For FCOS, we need to be able to drive versioning from outside of
`cosa build` (see [1]). This will also be used in the very short-term to
manually version the first few FCOS preview releases before making it
automated.

[1] coreos#463 (comment)
jlebon added a commit that referenced this issue Jul 12, 2019
For FCOS, we need to be able to drive versioning from outside of
`cosa build` (see [1]). This will also be used in the very short-term to
manually version the first few FCOS preview releases before making it
automated.

[1] #463 (comment)
@jlebon
Member

jlebon commented Jul 17, 2019

I'm going to close this, as the original ask of this ticket is now fixed by #580 and coreos/fedora-coreos-tracker#208. There's some related work around versioning and orchestration that will still need resolving (see #463 (comment)), but let's track that in a separate issue to make it easier to follow.

@jlebon jlebon closed this as completed Jul 17, 2019
@jlebon
Member

jlebon commented Jul 30, 2019

I split out the thread we had here about orchestration into #673.
