Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for pre_compile and post_compile steps #14

Open
ipmb opened this issue Mar 6, 2023 · 20 comments
Open

Support for pre_compile and post_compile steps #14

ipmb opened this issue Mar 6, 2023 · 20 comments
Labels
classic buildpack parity Features required for parity with the classic Heroku Python buildpack

Comments

@ipmb
Copy link

ipmb commented Mar 6, 2023

Just tested out v0.1.0 and noticed this was missing. It looks like you have some other issues to get it to parity with the legacy buildpack, so just dropping this one in too.

Thanks for your work on this!

@edmorley
Copy link
Member

edmorley commented Mar 6, 2023

Thank you! Yeah this is on the list of things to consider - though it's possible I'll be dropping support for this in favour of people using the built-in inline buildpack feature instead, since it's not ideal for each language to reimplement custom run support. (For languages where there are built-in hooks, it's perhaps still worth supporting those, but for Python post_compile is proprietary so doesn't seem worth preserving in the CNB.)

I'll leave this open to track that decision, and if the feature is dropped we'll need to add warning (or error) messages to ease the transition regardless.

@edmorley edmorley added the classic buildpack parity Features required for parity with the classic Heroku Python buildpack label Mar 6, 2023
@luzfcb
Copy link

luzfcb commented Mar 6, 2023

I haven't tried this new way of deployment yet, but I would like to leave a comment on the way I use post_compile today.
My use case for post_compile is as follows:

I have an e-commerce platform made with Django + Frontend application in React. The deployment process has integration with Sentry releases.

In Heroku Classic, you do not have access to GIT and repository, just an environment variable that informs which git commit hash of a particular deploy flow has started.
You also do not have access to information when using pipelines and use functionality to promote to Production a Heroku Slug running on Staging

The application has 2 independent react applications (one uses old things and another uses new and incompatible things, and we are gradually migrating to the new application)

We use pre_compile for:
1 - Perform compilation tasks for the 2 react applications, so that everything is prepared for when python/django compilation tasks are performed
2 - Upload to Sentry releases of the Source Maps JS files generated in the previous step
3 - Since Heroku Classic has no direct access to the Git Repository, Sentry-Cli is not able to automatically get all the information needed to create a new release in Sentry Releases, so I must do so manually by running a custom script to consult the Sentry API to find out which one was the hash of last previous deployment made with successful status (That is, a finalized release) and then generate the range of commits to which this deploy possibly covers and manually starts the creation of a new release on Sentry Releases

post_compile:

1 - We perform Django's Collectstatic
2 - If all previous steps are executed sucessfully, we execute the Finalize release step of Sentry Releases to confirm that the application was successfully implanted. https://docs.sentry.io/product/cli/releases/#finalizing-releaseseses

@edmorley edmorley changed the title Support for post_compile Support for pre_compile and post_compile steps Mar 6, 2023
@edmorley
Copy link
Member

For CNBs, the standardised way of running custom commands before/after buildpacks is now using the new built-in inline buildpacks feature, which is documented here:
https://buildpacks.io/docs/app-developer-guide/using-inline-buildpacks/

For language ecosystems where there is a standardised convention (for that language) way of running commands, I think it makes sense for the buildpacks for that language to keep supporting those conventions (for example, the Node.js buildpack supporting the package.json scripts feature for things like the build command etc).

However, for language ecosystems like Python where there is no such convention, I don't think it makes sense to preserve the classic Python buildpack's proprietary bin/{pre,post}_compile steps, given there is a built in feature for custom commands in the CNB spec (inline buildpacks).

As such, I think I may not end up adding support for this in the Python CNB. Though I may end up adding some explicit error handling to ease the transition. (Which could for example print an example inline buildpack config to the build log, which users can then copy directly into their project.toml.)

@ipmb
Copy link
Author

ipmb commented Jul 20, 2023

I understand the motivation here, but project.toml feels like a big regression in DX/ergonomics.

Previously I could do something like this:

pack build myimage -b heroku/nodejs,heroku/python -B heroku/buildpacks:20

If I'm understanding the inline buildpacks, I now need to add a project.toml like this:

[_]
schema-version = "0.2"
id = "io.buildpacks.my-app"

[[io.buildpacks.group]]
id = "heroku/nodejs"
version = "1.1.2"

[[io.buildpacks.group]]
id = "heroku/python"
version = "0.4.0"

[[io.buildpacks.group]]
id = "me/post-compile"

  [io.buildpacks.group.script]
  api = "0.9"
  inline = "echo hello"

and then I can run:

pack build myimage -B heroku/builder:22

Versions seem required in project.toml so you lose auto-updating. It also requires more knowledge of buildpack internals (schema_version, api, io.buildpacks...).

Maybe this is outside of Heroku's scope, but a thought I had was that a separate "run-script" buildpack could be published which executes a script from a predefined location (or location provided by env var)?

@edmorley
Copy link
Member

The buildpack version in project.toml is an optional field, and defaults to "latest", so auto-updating of buildpack versions won't be affected. See:
https://buildpacks.io/docs/reference/config/project-descriptor/#iobuildpacks-_table-optional_

I agree that users may get the syntax of the project.toml file incorrect, however, we can and should document it clearly + have clear validation error messages (either upstream in the CNB lifecycle, or else in Pack CLI + the Heroku next gen build system).

Ref having to create a project.toml - in your example you would already have had to create such a file, since that app uses both Python and Node.js, and the CNB auto-detection would have only picked one language. (Just like currently Heroku only auto-detects a single language when no buildpacks are set, and for anything else you have to set explicit buildpacks using heroku buildpacks:set.)

Whilst the project.toml file won't be mandatory, it will be needed for any app that:

  • uses more than one official language buildpack (as above)
  • uses a third party buildpack
  • wants to pin to a non-latest version of an official buildpack
  • wants to use a non-default Heroku stack (or more correctly, CNB builder), or wants to control when the stack is updated rather than relying on the new Heroku default builder concept

In addition, app.json may eventually end up being merged into project.toml, meaning that a project.toml file may be needed for the Heroku CI / Review App use-cases too.

As such, I think having a project.toml will be fairly common for anything but beginner apps. And beginner apps don't tend to use bin/{pre,post}_compile.

In general though, I much prefer having buildpacks/stacks be configured via code rather than heroku buildpacks:set / heroku stack:set, since implicit differing state between apps is a regular source of user confusion ("why does my staging app work but my production app not, I'm git pushing the same source to both" etc).

@edmorley
Copy link
Member

a thought I had was that a separate "run-script" buildpack could be published which executes a script from a predefined location

Yeah this is another viable solution too.

Though perhaps a third option would be to ask for a simpler project.toml syntax for "run commands" that don't require the full verbosity of the current "inline buildpack" concept?
https://github.com/buildpacks/rfcs#proposal

@ipmb

This comment was marked as resolved.

@ipmb
Copy link
Author

ipmb commented Jul 20, 2023

Thanks for the thoughtful write-up @edmorley! I see where you're coming from and the value in following a standard. If good examples are provided, the project.toml is bearable, but heroku buildpacks:set feels far more user friendly 🙂

@edmorley
Copy link
Member

edmorley commented Jul 20, 2023

What am I missing?

Oh strange! I know there are some heuristics around interpreting a plain URI identifier (ie: to work out if it's a Docker Hub reference, or say a CNB registry reference; xref https://buildpacks.io/docs/app-developer-guide/specify-buildpacks/#uri-examples), perhaps there is a bug around that and the default of latest? I personally haven't tried defining buildpacks in project.toml yet (we're still very early stages in the overall CNB / CNB tooling journey), so maybe the best place to ask for now might be https://github.com/buildpacks/pack/issues or https://slack.cncf.io/ (in one of the #buildpacks / #buildpacks-* channels).

Edit: This bug was fixed upstream by buildpacks/pack#1873.

but heroku buildpacks:set feels far more user friendly 🙂

I agree it is seemingly more friendly at first glance, but it actually does cause quite a few issues that likely do not surface for experienced/diligent users (such as yourself :-)) but often trip beginners up - such as:

  • the classic footgun of "buildpacks:set" clobbering existing buildpacks
  • users not realising buildpacks have an order, or knowing about --index to buildpacks:set or buildpacks:add
  • users updating the buildpacks in app.json after an app was first created (rather than running buildpacks:{add,set}), and not knowing why the app doesn't run the new buildpacks. (Particularly for Review Apps, since the buildpacks in app.json are used for the first build, but not for subsequent)
  • different apps in their account running the same codebase but showing different behaviour and users not knowing why (either because of them having completely different buildpacks, or even more subtle cases of one app being pinned to an older version of an official buildpack compared to another)

In addition, with CNBs we finally have a much better story for running a build locally that matches production in the form of pack build etc. However, given that heroku buildpacks:set stores state remotely in Heroku's API/database for a specific app, how would pack build know what buildpacks to run, unless that's defined in the codebase?

@ipmb

This comment was marked as resolved.

@edmorley

This comment was marked as resolved.

@ipmb

This comment was marked as resolved.

@edmorley

This comment was marked as resolved.

@jose-fully-ported
Copy link

As such, I think having a project.toml will be fairly common for anything but beginner apps. And beginner apps don't tend to use bin/{pre,post}_compile.

I've long used bin/{pre,post}_compile files in heroku apps to install binaries I needed in apps at runtime that were annoying to install otherwise. I don't think its something that is "optional" by any means, but maybe Heroku has telemetry on it's usage?

@jose-fully-ported
Copy link

Would it be best for someone (me, you, someone) to publishpre-compile and post-compile buildpacks that detects based on the existence of the respective files? I don't know if there is a way to control order here and wrap all other buildpacks other than specifying it in a particular order within project.toml (or --buildpack flag if using pack).

@edmorley
Copy link
Member

edmorley commented Nov 2, 2023

The issue here is that bin/{pre_post}_compile are an entirely proprietary Heroku classic Python buildpack thing - they are not used by any other Heroku language buildpack, nor are they used by any other non-Heroku Python ecosystem tool.

IMO one of the big wins of the migration from classic buildpacks to CNBs (for all Heroku languages) is that of switching from proprietary features/concepts to open standards and modern best practices. For example:

  • slugs -> OCI images
  • closed source tooling -> open source tools that can be run locally to build the exact same image (pack build)
  • app build configuration (eg stack, list of buildpacks) being stored in an external API -> taking more of an infrastructure as code approach (project.toml), so your local builds match production
  • For Python specifically:
    • runtime.txt -> .python-version (coming soon)
    • bin/{pre_post}_compile -> upstream inline buildpack feature

For some of these transitions (for example runtime.txt), the Python CNB will support both the old and the new method - but will recommend the new method in docs and show a deprecation warning for the old approach in the build logs.

For some other differences between the classic Python buildpack and the CNB, the feature will be dropped and result in an error that explains how to migrate.

I haven't decided yet which approach to use for bin/{pre_post}_compile (hard error with migration advice, or deprecation warning) - however, I don't feel "support it as the preferred/recommended approach" (or replacing it with a custom pre-compile buildpack replacement) is even an option we should be considering as a best practice moving forwards.

Ultimately "run a custom command before/after one of my buildpacks" is not a Python specific feature - and should be something handled by the upstream CNB project (and already is - though there are likely UX improvements that could be made).

If you have suggestions for UX improvements to how the upstream inline buildpack feature work, I'd strongly recommend opening some issues or starting a discussion upstream:
https://github.com/buildpacks/community/discussions
https://github.com/buildpacks/rfcs

Ultimately end users being able to influence the design of buildpack APIs and tooling is another of the advantages of switching to an open standard - so please do take advantage of that! 😄

@ipmb
Copy link
Author

ipmb commented Nov 2, 2023

It's not as pretty, but adding a project.toml like this now works and gives me a workaround for post_compile

[_]
schema-version = "0.2"
id = "io.buildpacks.my-app"

[[io.buildpacks.group]]
id = "heroku/nodejs"

[[io.buildpacks.group]]
id = "heroku/python"

[[io.buildpacks.group]]
id = "my-app/post-compile"

  [io.buildpacks.group.script]
  api = "0.9"
  inline = "bin/post_compile"

If your post_compile script is a shell script, this will get you a header that matches the others in the build output:

echo -e "\n\e[1;35m[post_compile]\e[0m"

@edmorley
Copy link
Member

edmorley commented Nov 2, 2023

Thank you for the example - glad to see that works!

Using an inline table allows for reducing the boilerplate a bit more (depending on personal taste for TOML style):

[[io.buildpacks.group]]
id = "my-app/post-compile"
script = { api = "0.9", inline = "bin/post_compile" }

I'll open a PR against the upstream docs to make them use the inline table approach in the rake package example on:
https://buildpacks.io/docs/app-developer-guide/using-inline-buildpacks/

@edmorley
Copy link
Member

edmorley commented Apr 29, 2024

@ipmb Hi! I happened to see:
https://apppack.io/blog/using-bin-post-compile-in-heroku-python-cnb-buildpacks/

In that you say:

I hope that in the future, the Python buildpack regains this functionality similar to how build scripts are handled by the Node.js buildpacks. Part of the joy of working with buildpacks is that "things just work" without needing to understand all the internals. Adding a project.toml file breaks that illusion.

I'm not sure the comparison to Node.js build scripts under the package.json scripts.build key makes sense. That feature is based around an ecosystem convention of (a) having script aliases under scripts in general (and mentioned in the spec for package.json), (b) it being very common for the build step's scripts entry to be called build and common tools (such as create-react-app) populating the build entry by default. It's those aspects that are why it makes a lot of sense for the Node.js buildpack to support this language convention since many apps will just work out of the box on Heroku. Plus if you were to ask an average Node.js developer how to perform a production build of their Node.js app, I suspect many of them would know about {npm,yarn} run build, since it's an existing/understood language concept.

However, the same cannot be said for Python, where there is (a) no built-in way to specify command aliases in general (even for use-cases like an app's custom internal "lint" command etc), (b) no common convention for what to call the "build" step or how to run it. The closest one can get to that is tool-proprietary features like Poetry's run alias feature, however, (a) not all package managers/tools support something like that, (b) for those that do, each tool uses a different way of specifying the list of aliases, (c) there isn't a standardised command alias naming convention for the step that should be run "for production builds, after package install" etc, (d) the scaffolding tools and example templates for popular frameworks don't configure/use these aliases (understandably, since there is no single way to support all the tools).

Perhaps in the future pyproject.toml might support a standardised app scripts/commands alias section with a convention for a specific command alias for post-install commands (if you're feeling adventurous, you could start a PEP?), at which point I would be happy to support that too.

In the meantime, the CNB inline buildpack feature seems like a pretty good standard to migrate people to, given that with CNBs we now actually have a standard (and not just some adhoc "run" classic buildpacks). (I'd also question the blog post calling it "internals" - buildpacks are part of the public API to users, and project.toml and/or the inline buildpack feature are public features of CNBs.)

However, I definitely want the need to migrate away from {pre,post}_compile to be discoverable, so will add some warning/error messages. (Doing this is in my queue after the a few other higher priority things, like Heroku-24, multi-arch support, migrating to Buildpack API 0.10/latest libcnb.rs etc).

@ipmb
Copy link
Author

ipmb commented Apr 29, 2024

I hear what you're saying... Node definitely has more of a standard here, but I don't think Python is so far off that you couldn't do something similar.

The Node buildpack already supports non-standard Heroku-specific keys to run scripts (heroku-postbuild, heroku-prebuild`, etc). So there is precedence in not just using the standard, but defining something that is used specifically by Heroku buildpacks.

There may be a little debate on it still, but pyproject.toml is the package.json of the Python world. It has a standard way for arbitrary tools to define configuration within it.

My take is that it would be much more ergonomic for Python developers to define the scripts using pyproject.toml (something they are probably already familiar with), than having to understand some of the internals of buildpacks to perform a common task. One of the huge benefits of buildpacks is that developers never have to leave their ecosystem. They use buildpacks because they don't want to learn how to use Docker. This breaks that model. For somebody who doesn't know buildpacks, it's confusing, non-intuitive, and error-prone.

Something like this in pyproject.toml would be superior from the end-user perspective and is very similar to how the heroku-* values work for Node:

[tool.heroku-buildpack]
postbuild = "./bin/post-compile"

Yes, there is no python run for that, but it is a very tiny feature for the buildpack to parse the TOML and execute the command if it exists.

I can see this is probably going to be an agree-to-disagree situation, just trying to clarify my stance here :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
classic buildpack parity Features required for parity with the classic Heroku Python buildpack
Projects
None yet
Development

No branches or pull requests

4 participants