Explore alternate build systems #304

Open
5 tasks
davidchisnall opened this issue Oct 3, 2024 · 9 comments

Comments

@davidchisnall
Collaborator

We originally built CHERIoT RTOS with CMake. This had a number of problems because CMake is not designed to allow users to build abstractions. We replaced it with xmake prior to open sourcing, and xmake is mostly fine.

Requirements

We need to be able to build four kinds of things:

  • Special things (mostly the loader)
  • Shared libraries
  • Compartments
  • Firmware images

Ideally, we also want to be able to add other composable chunks, which can be made dependencies of compartments or provided separately, including:

  • Threads (for example, it would be nice if the TCP/IP compartment could describe the worker thread for TCP processing as a dependency)
  • Pre-shared objects

Shared libraries and compartments have a custom link step, as do firmware images.

Firmware images need to be able to walk their dependency graph and build a linker script containing all of the libraries and compartments.
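A minimal sketch of that dependency walk, in build-system-neutral Python. The `Target` type, the `kind` strings, and `collect_link_inputs` are hypothetical names for illustration, not part of the SDK or of any build system discussed here.

```python
from dataclasses import dataclass, field


@dataclass
class Target:
    """Hypothetical build target: a library, a compartment, or a firmware image."""
    name: str
    kind: str
    deps: list = field(default_factory=list)


def collect_link_inputs(firmware, targets):
    """Walk the dependency graph of `firmware` and return the names of all
    reachable libraries and compartments, each list sorted for stable output."""
    seen = set()
    stack = list(firmware.deps)
    libraries, compartments = [], []
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        target = targets[name]  # KeyError here means an unknown dependency
        if target.kind == "library":
            libraries.append(name)
        elif target.kind == "compartment":
            compartments.append(name)
        else:
            raise ValueError(f"{name}: cannot link a target of kind {target.kind!r}")
        stack.extend(target.deps)
    return sorted(libraries), sorted(compartments)


# The hello-world example, expressed with the hypothetical types above.
targets = {
    "freestanding": Target("freestanding", "library"),
    "debug": Target("debug", "library"),
    "hello": Target("hello", "compartment", ["freestanding", "debug"]),
}
hello_world = Target("hello_world", "firmware", ["hello"])
libraries, compartments = collect_link_inputs(hello_world, targets)
```

The two sorted lists are what a linker-script generator would iterate over. Sorting keeps the generated script stable across runs.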

Board description files (JSON, kind-of JSON5, could be YAML) need to be parsed to generate compiler flags, inputs to the firmware linker script, and so on.

The build system must be able to generate compile_commands.json files for IDE integration. It's 2024, this is no longer a 'nice to have' optional extra.

For secure build environments, we must be able to do builds from a read-only mount of the source tree. In-tree builds are an antipattern.

We should be able to build abstractions that allow very concise definitions of projects. The current xmake build system is fairly good here. Going through the simplest hello-world example line by line:

set_project("CHERIoT Hello World")

This is slightly redundant because we need to name firmware images and so the overall project name is not very meaningful.

sdkdir = "../../sdk"
includes(sdkdir)
set_toolchains("cheriot-clang")

This is quite ugly. We are forced to hard-code the path of the SDK (more on this later) and then, because of xmake's scoping, also specify the toolchain here, even though the SDK files already specified it.

option("board")
    set_default("sail")

This is necessary because we don't advertise the board option from the SDK, we make each firmware select it. This is probably the right choice. Some projects may want to build firmware for different boards (the build2 proposed solution here is to have different build configs for each), others may not allow the board to be an option because they're specialised for a single SoC / board.

compartment("hello")
    -- memcpy
    add_deps("freestanding", "debug")
    add_files("hello.cc")

Again, this is concise. We are defining a compartment called hello, it depends on the freestanding and debug libraries (could be compartments). It has a single source file, hello.cc. This will be compiled with the default options for a compartment and linked as a compartment.

firmware("hello_world")
    add_deps("hello")

We're starting to define a firmware image called hello_world, it depends on the hello compartment.

    on_load(function(target)

Now we see some of the less-nice bits of xmake. You can't extend the xmake declarative syntax, you can only add properties in the imperative world, so we now need to write a Lua function that runs when this target is loaded.

        target:values_set("board", "$(board)")

In declarative syntax, this would be a simple set_board("$(board)"), but that's a small issue.

        target:values_set("threads", {
            {
                compartment = "hello",
                priority = 1,
                entry_point = "say_hello",
                stack_size = 0x200,
                trusted_stack_frames = 1
            }
        }, {expand = false})

This defines a single thread, its entry point, and so on. It would be nice to have this be a type that could be automatically type checked (the build2 prototype does this), but this is largely fine.

The really ugly thing here is the { expand = false }. This is not properly documented in the xmake docs, but is essential to avoid xmake flattening the object that you've put here.
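As an illustration of what a type-checked thread description could look like, here is a small Python sketch. The field names mirror the xmake table above. The `Thread` class and its validation rules are assumptions for illustration and are not taken from the build2 prototype.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thread:
    """Hypothetical typed thread description, mirroring the xmake table."""
    compartment: str
    entry_point: str
    priority: int
    stack_size: int
    trusted_stack_frames: int

    def __post_init__(self):
        # String fields must be non-empty strings.
        for name in ("compartment", "entry_point"):
            value = getattr(self, name)
            if not isinstance(value, str) or not value:
                raise TypeError(f"{name} must be a non-empty string")
        # Numeric fields must be non-negative integers (bool is rejected).
        for name in ("priority", "stack_size", "trusted_stack_frames"):
            value = getattr(self, name)
            if isinstance(value, bool) or not isinstance(value, int):
                raise TypeError(f"{name} must be an integer, got {value!r}")
            if value < 0:
                raise ValueError(f"{name} must not be negative")


say_hello = Thread(
    compartment="hello",
    priority=1,
    entry_point="say_hello",
    stack_size=0x200,
    trusted_stack_frames=1,
)
```

A misspelled field name or a string where a number is expected fails when the description is constructed, rather than being flattened or silently ignored.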

Problems with xmake

Although xmake works, we have encountered a lot of problems over time.

Poor defaults

A build system should loudly complain if things are wrong.

If you add a nonexistent file (for example, add it with a .c extension instead of .cc, or fail to commit it to git), xmake does not complain. It simply builds and, if the link step works, the build succeeds. You probably didn't need the file anyway. We had a test that the FreeRTOS compat headers all compiled in C mode that was spuriously passing for months as a result of this (I added the xmake bits, forgot to add the .c file, and CI passed).
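The behaviour we would expect instead can be sketched in a few lines of Python. `missing_sources` and `require_sources` are hypothetical helper names. The point is only that an explicitly listed file that does not exist should stop the build.

```python
from pathlib import Path


def missing_sources(source_root, files):
    """Return the explicitly listed source files that do not exist."""
    root = Path(source_root)
    return [name for name in files if not (root / name).is_file()]


def require_sources(source_root, files):
    """Fail loudly if any explicitly listed source file is missing."""
    missing = missing_sources(source_root, files)
    if missing:
        raise FileNotFoundError(
            "source files listed in the build description do not exist: "
            + ", ".join(missing)
        )
```

Calling `require_sources` before scheduling any compile step turns a typo such as `hello.c` for `hello.cc` into an immediate error.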

By default, xmake hid compiler warnings (this was fixed in v2.8.7). This is the worst possible default. The argument from the maintainer was that 'most users' didn't need to see them. I believe that most users of a build system are developers and they absolutely need to see warnings. The second most common users are people building packages, and they should see warnings because they may help debug problems. The third category are people who are building things from source but are not developers, and these people are a rounding error and probably don't care either way.

If you add a compiler flag, xmake will drop it if it isn't supported. It doesn't try passing it to the compiler to see if it's supported, it has some internal model of the compiler (which may or may not be relevant for a cross-compile toolchain) and ignores it. You must add {force = true} (which was undocumented the first time I needed it, I'm not sure if it's documented now) to make xmake do the thing that you explicitly told it to do.

Similarly, if you pass a positional argument with any of the flag-adding functions, even with {force = true}, and xmake doesn’t know about the argument, then it will deduplicate it. Things like -Xclang or -mllvm, which need to be passed multiple times, are silently removed. There is no documented way of avoiding that, but there is an undocumented way, which is to pass the pair of flags as a Lua array and add expand = false in the last argument.
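To make the failure mode concrete, the following Python sketch contrasts per-string deduplication with deduplication that keeps a prefix flag and its argument together. This is a model of the behaviour described above, not xmake's actual implementation. `-opt-a` and `-opt-b` are placeholder option names.

```python
# Flags that take the following argument as their operand.
PAIRED = {"-Xclang", "-mllvm"}


def dedupe_naive(flags):
    """Deduplicate flag strings individually, keeping first occurrences."""
    seen, out = set(), []
    for flag in flags:
        if flag not in seen:
            seen.add(flag)
            out.append(flag)
    return out


def dedupe_grouped(flags):
    """Deduplicate, treating a paired flag and its operand as one unit."""
    groups, i = [], 0
    while i < len(flags):
        if flags[i] in PAIRED and i + 1 < len(flags):
            groups.append((flags[i], flags[i + 1]))
            i += 2
        else:
            groups.append((flags[i],))
            i += 1
    seen, out = set(), []
    for group in groups:
        if group not in seen:
            seen.add(group)
            out.extend(group)
    return out


flags = ["-mllvm", "-opt-a", "-mllvm", "-opt-b", "-O2", "-O2"]
naive = dedupe_naive(flags)
grouped = dedupe_grouped(flags)
```

With the naive version the second `-mllvm` disappears, so `-opt-b` reaches the compiler driver without its prefix. The grouped version keeps both pairs and still removes the repeated `-O2`.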

Everything in xmake is a string. The type of a build rule is a string. If you make a typo, it will silently make the target use the default rule.
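A sketch of the alternative, in Python: looking up a rule by name fails with an error, and a suggestion, instead of falling back to a default. The rule names and the `lookup_rule` helper are hypothetical.

```python
import difflib

# Hypothetical registry mapping rule names to their implementations.
RULES = {
    "library": "link as shared library",
    "compartment": "link as compartment",
    "firmware": "link as firmware image",
}


def lookup_rule(name, rules=RULES):
    """Return the rule called `name`, or raise with a spelling suggestion."""
    try:
        return rules[name]
    except KeyError:
        close = difflib.get_close_matches(name, list(rules), n=1)
        hint = f"; did you mean '{close[0]}'?" if close else ""
        raise KeyError(f"unknown rule '{name}'{hint}") from None
```

With this shape, a typo such as `firmwrae` stops the build with a message that names the likely intended rule.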

Insecure searching

Unless you explicitly specify a target path, xmake looks up the directory tree to find the xmake.lua furthest from you. This may be across a mount point and controlled by a different user. This is an incredibly insecure default.
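For contrast, a more conservative search might look like the Python sketch below. It assumes a POSIX system (it uses `os.getuid`). The function name and stopping rules are illustrative choices, not something any of the tools discussed here implement. It returns the nearest project file and refuses to cross a mount point or enter a directory owned by another user.

```python
import os
from pathlib import Path


def find_project_file(start, filename):
    """Search upwards from `start` for the nearest `filename`.

    The search stops, returning None, at the first directory that is owned
    by a different user, and does not continue above a mount point.
    """
    current = Path(start).resolve()
    uid = os.getuid()
    while True:
        if current.stat().st_uid != uid:
            return None  # never trust directories owned by someone else
        candidate = current / filename
        if candidate.is_file():
            return candidate  # nearest match wins, not the furthest
        if os.path.ismount(current) or current.parent == current:
            return None  # do not walk across a mount boundary
        current = current.parent
```

Preferring the nearest file and stopping at ownership or mount boundaries means a build file planted higher up the tree by another user is never picked up implicitly.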

Poor scoping

Targets in xmake must have globally unique names. Some things are local to the file, others are not. For example, you must set the toolchain in the top-level file, you can't provide it in the SDK file.

Extensions are second class

Internally, xmake defines a lot of different target types. These have access to a load of infrastructure that is not exposed. Anything that is implemented externally lacks access to this infrastructure and so cannot work as well as things built into the build system.

Opaque dependency tracking and poorly specified execution order

We have several custom rules. It's not clear how to tell xmake that a given file is a dependency of them. Sometimes, the only way to get a safe rebuild is rm -rf .xmake build.

xmake also provides its own caching, which is not invalidated when the toolchain is updated. This breaks reproducible builds and introduces bugs when a new toolchain feature is required.
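One common way to avoid this class of bug is to make the toolchain part of the cache key. The Python sketch below hashes the compiler binary's contents together with the flags and the source. Hashing the whole binary is an assumption made for simplicity; a real system might hash a version string or a toolchain manifest instead.

```python
import hashlib


def cache_key(compiler_bytes, flags, source_bytes):
    """Derive a cache key from the compiler, the flags, and the source.

    Each part is length-prefixed so that different splits of the same
    bytes cannot collide.
    """
    digest = hashlib.sha256()
    for part in (compiler_bytes, "\0".join(flags).encode(), source_bytes):
        digest.update(len(part).to_bytes(8, "little"))
        digest.update(part)
    return digest.hexdigest()


old = cache_key(b"compiler build 1", ["-O2"], b"int x;")
new = cache_key(b"compiler build 2", ["-O2"], b"int x;")
same = cache_key(b"compiler build 1", ["-O2"], b"int x;")
```

Because the compiler contributes to the key, updating the toolchain invalidates every cached object without any manual cleanup.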

Options are not processed until after the declarative syntax is processed (and the declarative syntax isn't really declarative). This means that you can't use an option to provide the SDK path. We'd like to be able to have build files that have a default path for the SDK, but let you override it. With xmake, we need to modify xmake.lua.

Similarly, it's unclear what order the on_load functions for different targets will run in. We tried using the support that was added for dynamically cloning targets for the allocator, but this was impossible because the new target's on_load things didn't run.

There probably is some internal model for how builds work, but the author has not written it down anywhere.

Other options

There are a few potentially promising alternatives that we should explore before we do a 1.0 release and need to support whatever we ship for a long time. For some reason, they seem to be things that match the pattern B.*2.

Build2

Build2 is a fairly new build system that is designed to support modern C++ features. @boris-kolpackov has built us a proof of concept that can build the hello-world example. We need to:

  • Port the other examples / tests / and so on to use this.
  • Refine the proof-of-concept
  • Either extend build2 to be able to parse hex literals in JSON or add a preprocessing tool for board description files.
  • Provide a flow to generate compile_commands.json (not yet working out of the box and a must-have feature).
  • If it works, add an xmake target that will generate build2 files from existing projects.
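The preprocessing option for board description files could be quite small. The Python sketch below rewrites hex literals as decimal so that a strict JSON parser accepts the result, while leaving string contents untouched. It handles only hex literals, not the rest of JSON5, and the keys in the example are made up for illustration.

```python
import json
import re

# Match either a complete JSON string (returned unchanged) or a hex literal.
_TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|\b0[xX][0-9a-fA-F]+\b')


def hex_to_decimal(text):
    """Replace hex literals outside strings with their decimal value."""
    def replace(match):
        token = match.group(0)
        if token.startswith('"'):
            return token  # string literal: leave as-is
        return str(int(token, 16))
    return _TOKEN.sub(replace, text)


board_text = '{"name": "0x10 board", "start": 0x1000, "length": 0x200}'
board = json.loads(hex_to_decimal(board_text))
```

Matching string literals first is what keeps text such as `"0x10 board"` intact.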

It looks quite promising. I don't like the fact that compilation is conceptually a separate step. To me, compilation flags depend on the target (library, executable, whatever) and the fact that you have to declare separate cxx things is not ideal.

I also don't like the focus on short names in build2. These are hard to search for and they place a higher cognitive load on the programmer. Most developers will spend 1% of their time touching the build system, so a build system needs to be easy for people to context switch back to. I find a lot of the build2 logic hard to read.

I also don't like the fact that every project needs three files in two directories to build.

Buck2

Buck2 is a rewrite of Facebook's Buck build tool. It uses Starlark, a Python-like language, for scripting and exposes all of the rules, which means that our build rules would be equivalent to first-class rules from other places.

Buck2 is written in Rust and provides a single statically linked binary that is easy to deploy.

We currently have no proof-of-concept in Buck2, but it is probably worth exploring.

Others?

CMake was used originally and its limitations were too great.

Bazel has some nice properties, but it has too big a dependency chain. Bazel requires both a working Java VM and a Python installation. This is fragile and far too big a supply-chain attack surface for something in every build.

tup / make are low-level tools that rely on third-party things if you need abstractions. We don't want to maintain a tup generator.

Any others that we should evaluate?

@waruqi
Contributor

waruqi commented Oct 4, 2024

Now we see some of the less-nice bits of xmake. You can't extend the xmake declarative syntax, you can only add properties in the imperative world, so we now need to write a Lua function that runs when this target is loaded.

Xmake supports custom description scope interfaces. You can customize the firmware and compartment scope api, as well as all configuration interfaces, like this:

https://github.com/xmake-io/xmake/blob/dev/xmake/includes/xpack/xmake.lua

https://github.com/xmake-io/xmake/blob/6c973f55a3130fe62342e4ab06e992eac6ee3a80/xmake/plugins/pack/xpack.lua#L521

xmake-io/xmake#1433 (comment)

Targets in xmake must have globally unique names.

v3.0 will provide namespace to support it. xmake-io/xmake#5527

If you add a compiler flag, xmake will drop it if it isn't supported. It doesn't try passing it to the compiler to see if it's supported

You can disable this policy: set_policy("check.auto_ignore_flags", false)

@davidchisnall
Collaborator Author

Xmake supports custom description scope interfaces. You can customize the firmware and compartment scope api, as well as all configuration interfaces, like this:

Are these documented? I thought they were only for internal use. If we can use them, we could tidy up a lot of our code.

It would be great if we could support add_threads and add_preshared_object on our target types and remove the need for downstream users to write on_load things in most cases.

v3.0 will provide namespace to support it

Yay!

you can disable this policy.

This comes back to the scoping problems. Yes, I can disable it, but I have to set this policy in every file. I can’t just put it at the top of the SDK file that everyone includes and have it fixed.

And that’s related to the ‘sane defaults’ problem. It is a very bad design choice for xmake to decide that it knows what flags an unknown version of a third-party tool supports. This will always be wrong because xmake has no visibility into what the tool is. As such, it gets added to the list of things where, in every project, I need to change the defaults.

@waruqi
Contributor

waruqi commented Oct 4, 2024

Are these documented? I thought they were only for internal use. If we can use them, we could tidy up a lot of our code.

There is no documentation yet, but it is a public API. See the links above.

and see xmake-io/xmake#4276

and whole example:

https://github.com/xmake-io/xmake/blob/dev/tests/apis/custom_scopeapis/xmake.lua

@davidchisnall
Collaborator Author

Thanks, that should be enough for us to define the library/firmware/compartment targets properly. The two big things that we're missing beyond that are:

  • The ability to pass an option that tells the project script where to find the SDK xmake.lua, which provides all of these things. Possibly if we restructured the SDK scripts as a plugin, that would be easy? It would be nice to have the option to specify a default location in the project's xmake.lua (so projects that vendor the SDK can just point to their submodule location and not need to specify it, projects that don't can just copy it)
  • Being able to specify the toolchain things in our SDK files (currently, the toolchain and other policy settings made in the included SDK file are not applied to the including project).

Ideally, the xmake file linked above would look like this:

set_project("CHERIoT Hello World")

-- Some mechanism to provide this on the command line.
sdkdir ?= "../../sdk"
includes(sdkdir)
-- No set_toolchains here, imported from sdkdir


option("board")
    set_default("sail")

compartment("hello")
    add_deps("freestanding", "debug")
    add_files("hello.cc")

-- Firmware image for the example.
firmware("hello_world")
    add_deps("hello")
    set_board("$(board)")
    add_thread({
                  compartment = "hello",
                  priority = 1,
                  entry_point = "say_hello",
                  stack_size = 0x200,
                  trusted_stack_frames = 1
               })

@waruqi
Contributor

waruqi commented Oct 4, 2024

sdkdir = os.getenv("SDK") or "../../sdk"
includes(sdkdir)

SDK=/xxxx/sdk xmake

@sleffler
Contributor

sleffler commented Oct 4, 2024 via email

@boris-kolpackov

A couple of questions/clarifications about build2:

a fairly new build system

The project is almost 10 years old. The first commit has the 2014-12-03 date (and there were months before that thinking with a pen and paper).

Provide a flow to generate compile_commands.json (not yet working out of the box and a must-have feature).

I am actually wrapping up the compilation database support and should have it available in the staged toolchain in a few days. Will ping this issue when it's ready with a link to the documentation.

I don't like the fact that compilation is conceptually a separate step. To me, compilation flags depend on the target (library, executable, whatever) and the fact that you have to declare separate cxx things is not ideal.

Hm, I am not sure I follow. Compilation is definitely not a separate step if we are talking about the build execution. Could you expand on what you mean here?

I also don't like the fact that every project needs three files in two directories to build.

You can have a project with a single buildfile though it has some limitations. We've been gradually lifting these limitations replacing them with sensible defaults. So we can definitely explore supporting what you have in mind with a single buildfile.

@davidchisnall
Collaborator Author

The project is almost 10 years old. The first commit has the 2014-12-03 date (and there were months before that thinking with a pen and paper).

CMake is 24 years old, Make is from the dawn of the Epoch. Anything from this Millennium (xmake, build2, bazel, buck2) is new to me.

I am actually wrapping up the compilation database support and should have it available in the staged toolchain in a few days. Will ping this issue when it's ready with a link to the documentation.

Yay!

Hm, I am not sure I follow. Compilation is definitely not a separate step if we are talking about the build execution. Could you expand on what you mean here?

In the rule descriptions that I read, you had cxx rules and then compartment rules that depended on the cxx rules.

This is one of the things I quite like about xmake: Rules have a map-reduce structure. There's a map phase (take a load of inputs, turn it into an equal number of outputs) and a reduce phase (take those outputs and combine them into a smaller number of outputs). Both of those steps are optional, but they follow my mental model of how I build software. The .cc -> .o step is not a distinct rule, it is an implementation detail of how I build a program/library/compartment.
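As a sketch of that map-reduce shape, in Python rather than in any particular build language: the `run_rule` helper and the two example steps are hypothetical, and the steps return strings instead of running a compiler.

```python
def run_rule(inputs, map_step=None, reduce_step=None):
    """Apply an optional per-input map step, then an optional reduce step."""
    outputs = [map_step(item) for item in inputs] if map_step else list(inputs)
    return [reduce_step(outputs)] if reduce_step else outputs


def compile_step(source):
    """Map: one source file becomes one object file."""
    return source.rsplit(".", 1)[0] + ".o"


def link_step(objects):
    """Reduce: all object files become one compartment."""
    return "compartment(" + ", ".join(objects) + ")"


built = run_rule(["hello.cc", "util.cc"], compile_step, link_step)
objects_only = run_rule(["hello.cc", "util.cc"], map_step=compile_step)
```

The .cc -> .o step lives inside the rule for the compartment. A user who only declares the compartment and its sources never names the object files.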

You can have a project with a single buildfile though it has some limitations. We've been gradually lifting these limitations replacing them with sensible defaults. So we can definitely explore supporting what you have in mind with a single buildfile.

Great! I really want people to have to read/write/maintain the smallest possible amount of boilerplate.

@boris-kolpackov

The .cc -> .o step is not a distinct rule, it is an implementation detail of how I build a program/library/compartment.

True typically, but not generally. The first example that comes to mind is unit testing: for a unit test you may need to link directly to the object file of a library rather than to the library itself (in order to get access to non-exported bits). So in build2 we chose to make the model general enough to accommodate such use-cases but also support automatic intermediate dependency synthesis (o: cc) for the typical case. The result is that as a user you are not exposed to such intermediate targets until you need to. Perhaps xmake does this differently, but the end result for the typical case should look the same, at least from the user's perspective.
