Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable Django collectstatic's symlink mode #1060

Open
edmorley opened this issue Sep 1, 2020 · 3 comments
Open

Enable Django collectstatic's symlink mode #1060

edmorley opened this issue Sep 1, 2020 · 3 comments

Comments

@edmorley
Copy link
Member

edmorley commented Sep 1, 2020

For Django apps, the buildpack currently runs the Django management command collectstatic automatically as part of the build. This command by default copies the static files provided by any Django packages/the local app into the staticfiles directory.

The collectstatic command has a --link option which enables the use of symlinks instead of file copies:
https://docs.djangoproject.com/en/2.2/ref/contrib/staticfiles/#cmdoption-collectstatic-link

Using this option would save duplicating files within the app directory, reducing slug size and speeding up the build (since symlinking is quicker than copying).

This idea was proposed in #1033 - filing this issue to give somewhere to discuss the idea.

Things we'll need to investigate/consider:

  • what versions of Django support --link and are there any known bugs that prevent its use on older versions?
  • are there any scenarios where it's not a drop in replacement (eg alternative staticfiles backends, or ...?)
  • should we make this the default behaviour, or should it be controllable?
  • we should probably ensure the metrics emitted allow determining pass vs fail and duration for link vs standard

In the meantime, anyone wanting to enable symlinks can do so by disabling the built-in collectstatic (by setting the env var DISABLE_COLLECTSTATIC on their app, per docs), then running collectstatic with custom command in a bin/post_compile script (see #1026).

@illia-v
Copy link

illia-v commented Jul 24, 2023

In the meantime, anyone wanting to enable symlinks can do so by disabling the built-in collectstatic (by setting the env var `` on their app, per docs), then running collectstatic with custom command in a bin/post_compile script (see #1026).

./manage.py collectstatic --link creates absolute symlinks to a temporary build directory /tmp/build_* when it is executed in bin/post_compile. The symlinks become broken in a real dyno when the app directory becomes /app. So an additional step is needed in bin/post_compile to change the absolute links to relative to prevent the problem.

Something like this solution with find ./static -lname "$(pwd)/*" may help because links to files from external libraries should remain absolute.

@edmorley
Copy link
Member Author

edmorley commented Aug 7, 2023

@illia-v Ah that's a good point, thank you. I hadn't tried --link yet so didn't know it used absolute symlinks instead of relative. Shame Django doesn't allow controlling this.

For our in-progress next-generation buildpacks (Cloud Native Buildpacks; upstream info: https://buildpacks.io / our Python CNB: https://github.com/heroku/buildpacks-python) Django's use of absolute symlinks wouldn't be an issue, since the build time and run time paths are the same.

@heroku heroku deleted a comment from git2gus bot Aug 11, 2023
@heroku heroku deleted a comment from git2gus bot Aug 11, 2023
@edmorley
Copy link
Member Author

edmorley commented Aug 30, 2023

Another issue I've found, is that Django's ManifestStaticFilesStorage storage backend (on which WhiteNoise's backend is based) doesn't preserve symlinks when creating the hashed version of files. That is, staticfiles/example.css will be created as a symlink to the original asset, but staticfiles/example.019c8743b7cf.css will be a copy.

At first glance this seems like a bug or limitation of Django's ManifestStaticFilesStorage backend, however, given the target of the symlink could in theory be modified at any point (which would then mean the hash in the filename no longer matches the actual hash of the file contents), then I can see why they might do that. (In a Heroku context where the staticfiles directory is regenerated fresh each time this isn't a concern, but I suspect the upstream Django project might not accept a fix to make the hashed files symlinks too).

As such, using collectstatic --link will only help reduce the size of the "original" unhashed filename assets, and not the hashed versions (and also not the gzip/brotli compressed versions created by the WhiteNoise compression backend, since they have completely different contents).

These original filename assets already get deleted when using WhiteNoise's WHITENOISE_KEEP_ONLY_HASHED_FILES setting, which the Python getting started guide already uses:
https://github.com/heroku/python-getting-started/blob/7a862828b78de73e13f87c26254be0d16eec1354/gettingstarted/settings.py#L183-L185

So whilst I'll still use --link in the upcoming Python CNB Django collectstatic implementation regardless, it means:

  1. It will actually offer minimal benefits to projects already using the best-practice WhiteNoise configuration,
  2. People using the current buildpack can still get the same size reduction benefits (even before switching to the CNB), just be enabling WHITENOISE_KEEP_ONLY_HASHED_FILES too.

edmorley added a commit to heroku/buildpacks-python that referenced this issue Sep 13, 2023
The classic Heroku Python buildpack automatically runs the Django
`collectstatic` command:
https://github.com/heroku/heroku-buildpack-python/blob/main/bin/steps/collectstatic

This adds equivalent support, with a couple of improvements:
- This implementation performs more checks to see whether the app is
  actually using the static files feature before trying to run it
  (reducing the number of cases where users would need to manually
  disable it).
- The collectstatic symlink feature has been enabled, as requested in
  heroku/heroku-buildpack-python#1060.
- Symlinked `manage.py` files are now supported, as requested in
  heroku/heroku-buildpack-python#972.
- The error messages are finer grained/more useful.
- There are many more tests (including now testing legacy vs latest
  Django versions, to check the CLI arguments used work for both ends of
  the spectrum).

There is currently no way to force disable the feature (beyond removing
`django.contrib.staticfiles` from `INSTALLED_APPS` in the app's Django
config, or removing the `manage.py` script). Supporting this depends on
us deciding how best to handle buildpack options, so will be added
later, in #109.

The build log output and error messages are fairly reasonable already
(and a significant improvement over the classic buildpack), and will be
further polished as part of the future build output overhaul.

The implementation uses the new `utils::run_command_and_capture_output`
added in #106.

See:
* https://docs.djangoproject.com/en/4.2/howto/static-files/
* https://docs.djangoproject.com/en/4.2/ref/contrib/staticfiles/
* https://docs.djangoproject.com/en/4.2/ref/settings/#settings-staticfiles

Fixes #5.
GUS-W-9538294.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants