Optimize retrieval of Status facet counts in Case Law Search #4539

Open
albertisfu opened this issue Oct 7, 2024 · 2 comments

Comments

@albertisfu
Contributor

Currently, for every search request in Case Law, we also query the facet counts for the status checkboxes.

[Screenshot, 2024-10-07: status checkboxes with facet counts]

We could optimize this by:

  • Adding a cache for these facets. Similar to how we micro-cache results for each query, we can cache these counts, likely using the same cache duration.
  • We use an aggregation on the status field to get the counts. It may be possible to combine this with a cardinality aggregation so the counts are approximate, making the query faster. Some testing with the production index will be required to assess the new query before implementing it.
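
For reference, the status aggregation mentioned in the second bullet could look roughly like the sketch below. It only builds the Elasticsearch request body and parses the response, so it runs without a cluster. The field name `status`, the aggregation name `status_counts`, and both function names are assumptions for illustration, not the project's actual code.

```python
def build_status_facet_query(search_query: dict) -> dict:
    """Build a request body that returns only per-status document counts.

    The field name "status" and the aggregation name "status_counts"
    are assumptions made for this sketch.
    """
    return {
        "size": 0,  # no hits needed, only the aggregation
        "query": search_query,
        "aggs": {"status_counts": {"terms": {"field": "status"}}},
    }


def parse_status_counts(response: dict) -> dict:
    """Turn the terms-aggregation buckets into a {status: count} mapping."""
    buckets = response["aggregations"]["status_counts"]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}
```

Setting `size` to 0 keeps the response small when only the counts are needed. Whether an approximate variant is faster would still need to be measured against the production index, as noted above.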
@mlissner
Member

mlissner commented Oct 8, 2024

> likely using the same cache duration.

Can we use the same cache key too?

> the counts are approximate

With the result count, we were able to say "About XXX results" but we won't be able to do that here since saying "About" would take too much room.

I think this might be time to make a fun little helper library that gives us approximations:

| Input | Example input | Output | Example output |
| --- | --- | --- | --- |
| n < 1,000 | 201 | The number itself | 201 |
| 1,000 ≤ n ≤ 999,999 | 5,899 | The number of thousands, rounded down, then k | 5k |
| 1,000,000 ≤ n ≤ 999,999,999 | 24,233,000 | The number of millions, a period, the number of hundreds of thousands, rounded down, then M | 24.2M |

I think that should cover the bases and would nicely imply that the count is approximate.
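
A minimal sketch of such a helper, following the three rows of the table above. The function name `approximate_count` is hypothetical, and raising an error for negative counts or counts of one billion and above is an assumption, since the table does not cover those cases.

```python
def approximate_count(n: int) -> str:
    """Format a facet count compactly, following the table above.

    Hypothetical helper: the name and the error handling for
    out-of-range values are assumptions, not part of the proposal.
    """
    if n < 0:
        raise ValueError("count must be non-negative")
    if n < 1_000:
        # Small counts are shown exactly.
        return str(n)
    if n < 1_000_000:
        # Whole thousands, rounded down.
        return f"{n // 1_000}k"
    if n < 1_000_000_000:
        # Whole millions, a period, then hundreds of thousands, rounded down.
        millions, remainder = divmod(n, 1_000_000)
        return f"{millions}.{remainder // 100_000}M"
    raise ValueError("counts of one billion or more are not covered")
```

With the example inputs from the table, `approximate_count(201)`, `approximate_count(5899)` and `approximate_count(24_233_000)` give `201`, `5k` and `24.2M`.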

@albertisfu
Contributor Author

> Can we use the same cache key too?

Yes, we can use the same cache key. All the Opinions queries also return facet counts, so it's fine to store them together.

> I think this might be time to make a fun little helper library that gives us approximations:

Great, yeah this should work nicely.
