DM-40605: Improve Algolia audit job's reliability #144

Merged. 14 commits merged on Sep 5, 2023.

@jonathansick jonathansick commented Sep 1, 2023

  • Handle and log exceptions when trying to queue ingestion for missing documents.
  • Log a summary of how many documents were queued for reingestion.
  • Fix the base_url attribute's JSON alias for the Algolia DocumentRecord model. It was baseURL and is now restored to baseUrl.
  • Fix a typo in creating records for Lander content types (source_update_time and source_update_timestamp fields).

There can be all sorts of issues when queueing an ingest, so we'll just
log their exceptions and continue so we can get a sense of why these
documents are failing.
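
The log-and-continue approach described above could look roughly like the following. This is a minimal sketch using only the standard library, not the code in this PR: `queue_missing_documents`, `fake_queue_ingest`, and the example URLs are hypothetical names introduced for illustration.

```python
import logging
from typing import Callable, Iterable, Tuple

logger = logging.getLogger("audit_sketch")


def queue_missing_documents(
    missing_urls: Iterable[str],
    queue_ingest: Callable[[str], None],
) -> Tuple[int, int]:
    """Queue each missing document, logging failures instead of raising.

    Returns a (queued_count, failed_count) tuple for the summary log.
    """
    queued = 0
    failed = 0
    for url in missing_urls:
        try:
            queue_ingest(url)
        except Exception as exc:  # deliberately broad: any failure is logged
            failed += 1
            logger.error("Failed to queue ingest for %s: %r", url, exc)
            continue
        queued += 1
    # Summary of how many documents were queued for reingestion.
    logger.info("Queued %d documents for reingestion (%d failed)", queued, failed)
    return queued, failed


def fake_queue_ingest(url: str) -> None:
    """Hypothetical stand-in for the real queueing call; fails on 'bad' URLs."""
    if "bad" in url:
        raise RuntimeError("could not queue " + url)


result = queue_missing_documents(
    ["https://example.com/a", "https://example.com/bad", "https://example.com/c"],
    fake_queue_ingest,
)
```

One failing document does not stop the audit, and the returned counts feed the summary message.
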
This was baseURL in the Pydantic model, but should be baseUrl to match
the existing Algolia schema.
Rather than use logging.exception, which places the stack trace outside
the structured log, use logging.error instead and log the exception
within the JSON structured log message.
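
To illustrate what "log the exception within the JSON structured log message" means, here is a rough standard-library sketch. It is not how structlog or this PR does it: `exception_fields`, `render_error_event`, the field names, and the example URL are all assumptions made for the example.

```python
import json
import traceback
from typing import Any, Dict


def exception_fields(exc: BaseException) -> Dict[str, Any]:
    """Flatten an exception into JSON-serializable fields."""
    return {
        "exc_type": type(exc).__name__,
        "exc_message": str(exc),
        # The full stack trace becomes one string value inside the JSON object.
        "exc_traceback": "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        ),
    }


def render_error_event(event: str, exc: BaseException, **context: Any) -> str:
    """Render a single-line JSON log message that carries the exception."""
    payload = {"event": event, "level": "error", **context, **exception_fields(exc)}
    return json.dumps(payload)


try:
    raise ValueError("ingest failed")
except ValueError as err:
    line = render_error_event(
        "Failed to queue ingest", err, url="https://example.com/doc"
    )
```

Because the traceback is a value in the JSON object, the whole event stays on one line and can be searched alongside its other fields.
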
- Format the Kafka key/value data as a dict for natural JSON
  representation; the json() method was creating a string.
- Bind this context to the logger so that all log messages share it.
- Add the number of records and the surrogate key of those records so
  they can be searched later.
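
The difference between logging a pre-serialized string and logging a dict, plus the idea of shared bound context, can be sketched as follows. This uses plain dicts and `json.dumps` as a stand-in for the Pydantic json() method and the structlog bound logger; `bind`, `render`, and every field value shown are hypothetical.

```python
import json
from typing import Any, Dict


def bind(context: Dict[str, Any], **fields: Any) -> Dict[str, Any]:
    """Return a new context with extra fields, like binding to a logger."""
    return {**context, **fields}


def render(context: Dict[str, Any], event: str) -> str:
    """Render one JSON log line that includes the bound context."""
    return json.dumps({"event": event, **context})


# Hypothetical Kafka key/value payloads.
kafka_key = {"url": "https://example.com/doc"}
kafka_value = {"url": "https://example.com/doc", "edition": "main"}

# Before: the key is already a JSON string, so it is escaped a second time.
string_context = bind({}, kafka_key=json.dumps(kafka_key))
escaped = render(string_context, "Saving records")

# After: the key/value are dicts, so they nest naturally in the log message.
dict_context = bind({}, kafka_key=kafka_key, kafka_value=kafka_value)
# Record count and surrogate key are bound so later messages share them.
dict_context = bind(dict_context, record_count=12, surrogate_key="abc123")

first = render(dict_context, "Saving records")
second = render(dict_context, "Finished ingest")
```

In `escaped` the key is an opaque string, while in `first` and `second` it is a nested object whose fields, along with the record count and surrogate key, can be queried directly.
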
Configure the logger when running ook audit and other CLI apps to get
structured logging. This is useful since the CLI is primarily used for
Kubernetes Jobs.
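
As an illustration of configuring structured logging once at CLI startup, here is a sketch built on the standard library's logging module. It is an assumption-laden stand-in, not the configuration this project uses: `JsonFormatter`, `configure_logging`, `main`, and the logger name `cli_sketch` are all hypothetical.

```python
import io
import json
import logging
import sys
from typing import Optional, TextIO


class JsonFormatter(logging.Formatter):
    """Format each log record as a single-line JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "event": record.getMessage(),
            "level": record.levelname.lower(),
            "logger": record.name,
        }
        return json.dumps(payload)


def configure_logging(
    name: str = "cli_sketch", stream: Optional[TextIO] = None
) -> logging.Logger:
    """Attach a JSON handler to the named logger and return it."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]  # replace any handlers from earlier calls
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def main(stream: Optional[TextIO] = None) -> None:
    """Hypothetical CLI entry point: configure logging before doing work."""
    logger = configure_logging(stream=stream)
    logger.info("Starting audit")


# Capture the output in memory here; a Kubernetes Job would write to stdout.
buffer = io.StringIO()
main(stream=buffer)
```

Configuring the logger in the entry point means every command run as a Job emits the same machine-readable format.
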
See if this exception is now nicely formatted by structlog for us.
Turns out structlog does format the exception trace in the structured
log after all, so this is fine.
@jonathansick jonathansick merged commit b813002 into main Sep 5, 2023
4 checks passed
@jonathansick jonathansick deleted the tickets/DM-40605 branch September 5, 2023 17:24