davidbarratt/cache-tag

Cache Tag

Cloudflare can index cached resources by tag, which allows those resources to be purged by tag. However, this feature is only available for Enterprise customers.

Despite this limitation, an index can be built using Workers, D1, and Queues.

Architecture

Architecture Diagram

This application is broken up into three Workers, three Queues, and one D1 database.

The Watcher worker watches requests to the Cloudflare Cache / Origin, captures the tags, and sends them to the Controller to be persisted.

Important

By the time a response from an origin reaches a Worker, Cloudflare has already swallowed the Cache-Tag header and it is no longer available. To get around this, the worker reads the custom X-Cache-Tag header instead.
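As a minimal sketch of how the tags might be read, assuming the X-Cache-Tag value is a comma-separated list like the standard Cache-Tag header. The helper name `parseCacheTags` is hypothetical and is not taken from this repository.

```typescript
// Hypothetical helper: split a comma-separated X-Cache-Tag value into tags.
// Assumption: the header uses the same comma-separated format as Cache-Tag.
function parseCacheTags(headerValue: string | null): string[] {
  if (headerValue === null) {
    return [];
  }
  const tags = headerValue
    .split(",")
    .map((tag) => tag.trim())
    .filter((tag) => tag.length > 0);
  // De-duplicate while preserving first-seen order.
  return Array.from(new Set(tags));
}
```

Inside a Worker, this could be called as `parseCacheTags(response.headers.get("X-Cache-Tag"))`.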

The Watcher also exposes a /.cloudflare/purge endpoint that allows tags to be purged. This endpoint matches the interface of the Cloudflare endpoint, but only accepts tags. Purges are scoped to the zone that receives the request. For example, a purge request to https://example.com/.cloudflare/purge would only purge resources from the example.com zone.

A Worker is an account-level resource, but Cache is a zone-level resource. Because of this, there is no way to know what zone a resource is being cached in from a Worker.

To mitigate this problem, we can leverage the CF-Worker header which gets added to outbound requests from a Worker. Unfortunately, this header does not exist when using Service Bindings. The only way to retrieve the header is by making a request to the worker on the provided workers.dev subdomain.

The Controller exists primarily as an intermediary between Watcher and Handler to collect zone information. It is not included as a part of Handler in order to ensure that the worker is collocated in the same data center as Watcher.

The Controller also exposes a /purge endpoint that allows tags to be purged. This endpoint matches the interface of the Cloudflare endpoint, but only accepts tags. If no zone information is provided (via the CF-Worker header), matching resources from all zones will be purged.
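The zone scoping described above could be sketched as follows. The message shape `PurgeTagMessage` and the helper `buildPurgeMessages` are hypothetical, and the sketch assumes the CF-Worker header value is the zone name.

```typescript
// Hypothetical message shape for the cache-purge-tag queue.
interface PurgeTagMessage {
  tag: string;
  // Absent when no zone information was provided, meaning "all zones".
  zone?: string;
}

// Build one message per tag, scoped to the zone from CF-Worker when present.
// Assumption: the CF-Worker header value is the zone name (e.g. example.com).
function buildPurgeMessages(
  tags: string[],
  cfWorker: string | null,
): PurgeTagMessage[] {
  const zone = cfWorker?.trim().toLowerCase();
  return tags.map((tag) => (zone ? { tag, zone } : { tag }));
}
```

With a header value of `example.com`, each message carries that zone. With no header, the zone is omitted and the purge would apply to every zone.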

After receiving and validating requests to either the /capture or /purge endpoints, the worker adds the requests to the cache-capture and cache-purge-tag queues respectively.
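A rough sketch of the validation and routing step, under stated assumptions. The queue names and paths come from the description above. The helper names are hypothetical, and the `{"tags": [...]}` body shape is assumed from Cloudflare's purge-by-tag interface.

```typescript
// Queue names as described above; the mapping helper itself is hypothetical.
type QueueName = "cache-capture" | "cache-purge-tag";

function queueForPath(pathname: string): QueueName | null {
  switch (pathname) {
    case "/capture":
      return "cache-capture";
    case "/purge":
      return "cache-purge-tag";
    default:
      return null;
  }
}

// Validate a purge body of the assumed form {"tags": ["a", "b"]}.
// Returns the tags, or null when the body is not a tags-only purge.
function parsePurgeTags(body: unknown): string[] | null {
  if (typeof body !== "object" || body === null) {
    return null;
  }
  const tags = (body as { tags?: unknown }).tags;
  if (!Array.isArray(tags) || tags.length === 0) {
    return null;
  }
  if (!tags.every((tag) => typeof tag === "string" && tag.length > 0)) {
    return null;
  }
  return tags as string[];
}
```

A request whose body fails validation, or whose path maps to no queue, would be rejected before anything is queued.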

The Handler worker listens to all three queues and processes their messages.

When a message is received from Controller in the cache-capture queue, the URL, zone, and tags are stored in the D1 database.

A message received from Controller in the cache-purge-tag queue results in the URLs for the provided tag being looked up in the D1 database and re-queued, with each one added to the cache-purge-url queue. Since this will result in the resource eventually being removed from the cache, the URL and all tags associated with it are removed from the D1 database.
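The capture and purge-by-tag behavior can be modeled with an in-memory stand-in for the database. This is only an illustration: the `TagIndex` class and its method names are hypothetical, and the actual project stores this data in D1 with a schema not shown here.

```typescript
// In-memory stand-in for the D1 tag index (hypothetical, for illustration).
interface CachedResource {
  url: string;
  zone: string;
  tags: string[];
}

class TagIndex {
  private resources: CachedResource[] = [];

  // cache-capture: store (or replace) the URL, zone, and tags.
  capture(url: string, zone: string, tags: string[]): void {
    this.resources = this.resources.filter(
      (r) => !(r.url === url && r.zone === zone),
    );
    this.resources.push({ url, zone, tags: tags.slice() });
  }

  // cache-purge-tag: return the matching URLs and forget them, along with
  // all of their tags. An undefined zone matches every zone.
  purgeTag(tag: string, zone?: string): string[] {
    const matches = (r: CachedResource): boolean =>
      r.tags.indexOf(tag) !== -1 && (zone === undefined || r.zone === zone);
    const urls = this.resources.filter(matches).map((r) => r.url);
    this.resources = this.resources.filter((r) => !matches(r));
    return urls;
  }
}
```

Because a purged URL is removed together with all of its tags, a later purge for one of its other tags finds nothing to do until the resource is captured again.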

Finally, when a message is received from the cache-purge-url queue, the URLs are purged with Cloudflare's API.
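One way the URLs might be grouped before calling the API is sketched below. It assumes the purge-by-URL request body has the form `{"files": [...]}`. The helper name and the batch size are hypothetical, so check Cloudflare's documentation for the current per-request limit.

```typescript
// Hypothetical helper: group URLs into purge request bodies of the assumed
// form {"files": [...]}, with at most `batchSize` URLs per body.
function buildPurgeBodies(
  urls: string[],
  batchSize: number,
): { files: string[] }[] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const bodies: { files: string[] }[] = [];
  for (let i = 0; i < urls.length; i += batchSize) {
    bodies.push({ files: urls.slice(i, i + batchSize) });
  }
  return bodies;
}
```

Each returned body would then be sent as one request to the purge endpoint for the relevant zone.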

Usage

I am not aware of a good way to distribute this application for your own use other than forking and modifying it. It is licensed under the AGPL-3.0 license, so you are free to modify it under the terms of that license. I thought about using Terraform to make it easier for others to deploy on their own, but it seemed like overkill for my purposes. I'm happy to accept PRs that make life easier.

Origin Setup

In addition to running the suite of Cloudflare Workers, there is a bit of work on the origin server that needs to be done. Thankfully, this is effectively the same as the setup for the standard cache tag purging:

  1. On cacheable responses, add an X-Cache-Tag header in the same format as the standard Cache-Tag header.
  2. When a change occurs, use the /.cloudflare/purge endpoint on Watcher (or for all zones, the /purge endpoint on Controller) to purge by tag.
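The two origin-side steps above might look like the following sketch. The helper names are hypothetical, and sending the token as a Bearer token is an assumption based on the endpoint matching Cloudflare's API interface.

```typescript
// Step 1 (hypothetical helper): format tags for the X-Cache-Tag header.
// Assumption: comma-separated, like the standard Cache-Tag header.
function formatCacheTagHeader(tags: string[]): string {
  return tags.join(",");
}

interface PurgeRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Step 2 (hypothetical helper): describe a purge-by-tag request to Watcher.
function buildPurgeRequest(
  siteOrigin: string,
  tags: string[],
  apiToken: string,
): PurgeRequest {
  return {
    url: `${siteOrigin.replace(/\/+$/, "")}/.cloudflare/purge`,
    method: "POST",
    headers: {
      // Assumption: the token is sent as a Bearer token.
      Authorization: `Bearer ${apiToken}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ tags }),
  };
}
```

The returned object can be passed to `fetch(request.url, request)` from the origin when content changes.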

If you are using Drupal, you can install and configure the Cloudflare Worker Purge module and these steps will be done for you.

Cloudflare Setup

Finally, there is some setup in Cloudflare that is identical to the setup for the standard cache tag purging:

  1. Ensure that the origin is proxied through Cloudflare.
  2. Create a Cache Rule and ensure that the appropriate resources are cached.

Authentication

I chose to use the API_TOKEN secret for authentication/authorization to the Controller and to use the same token to make requests to the Cloudflare API. This simplifies the approach: the worker needs only a single secret, which is shared with the origin server. The origin can then make requests to the Cloudflare API or the Worker seamlessly.

The minimum API Token permissions needed are:
