Optimize retrieval of Status facet counts in Case Law Search #4539

albertisfu · 2024-10-07T21:41:54Z

Currently, for every search request in Case Law, we also query the facet counts for the status checkboxes.

We could optimize this by:

Adding a cache for these facets. Similar to how we micro-cache results for each query, we can cache these counts, likely using the same cache duration.
Making the counts approximate. We currently get the counts with an aggregation on the status field. It may be possible to combine this with a cardinality aggregation so the counts are approximate, which should make the query faster. We'd need to test the new query against the production index before implementing it.
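
For reference, here is a rough sketch of the kind of status aggregation described above, written as a plain Elasticsearch request body. The field name `status`, the aggregation name, and both helper names are assumptions for illustration, not the project's actual code. It does not attempt the approximate variant, since that still needs testing against the production index.

```python
from typing import Any


def build_status_facet_body(query: dict[str, Any]) -> dict[str, Any]:
    """Build a request body that returns only status facet counts.

    The field name "status" is an assumption; the real index may differ.
    """
    return {
        "size": 0,  # No hits needed, only the aggregation buckets.
        "query": query,
        "aggs": {"status": {"terms": {"field": "status"}}},
    }


def parse_status_counts(response: dict[str, Any]) -> dict[str, int]:
    """Turn the terms-aggregation buckets into a {status: count} mapping."""
    buckets = response["aggregations"]["status"]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}
```

The parsed mapping is what would feed the counts next to the status checkboxes.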


mlissner · 2024-10-08T21:11:28Z

likely using the same cache duration.

Can we use the same cache key too?

the counts are approximate

With the result count, we were able to say "About XXX results" but we won't be able to do that here since saying "About" would take too much room.

I think this might be time to make a fun little helper library that gives us approximations:

Input	Example input	Output	Example output
n < 1,000	201	The number itself	201
1,000 ≤ n ≤ 999,999	5,899	The number of thousands, rounded down, k	5k
1,000,000 ≤ n ≤ 999,999,999	24,233,000	The number of millions, a period, the number of hundreds of thousands, rounded down, then M	24.2M

I think that should cover the bases and would nicely imply that the count is approximate.
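
A minimal sketch of what that helper could look like, following the three rows of the table. The function name `approximate_count` is hypothetical. The table stops at 999,999,999, so the behavior above that (keep using the M format) and the rejection of negative numbers are assumptions of this sketch.

```python
def approximate_count(n: int) -> str:
    """Format a count as a short, visibly approximate label."""
    if n < 0:
        # Assumption: counts are never negative, so treat this as an error.
        raise ValueError("count must be non-negative")
    if n < 1_000:
        # Small numbers are shown as-is.
        return str(n)
    if n < 1_000_000:
        # Whole thousands, rounded down.
        return f"{n // 1_000}k"
    # Millions, a period, then hundreds of thousands, rounded down.
    # Assumption: values of one billion or more keep this format.
    millions = n // 1_000_000
    hundred_thousands = (n % 1_000_000) // 100_000
    return f"{millions}.{hundred_thousands}M"
```

With the examples from the table, `approximate_count(201)` gives `"201"`, `approximate_count(5899)` gives `"5k"`, and `approximate_count(24233000)` gives `"24.2M"`.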

albertisfu · 2024-10-08T21:53:28Z

Can we use the same cache key too?

Yes, we can use the same cache key. All the Opinions queries also return facet counts, so it's fine to store them together.
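
Roughly, storing both under one key could look like the sketch below. It uses a small in-memory TTL cache so the example is self-contained; `MicroCache`, `make_cache_key`, `search_with_facets`, and `run_query` are all hypothetical names, and the real change would use whatever cache and key scheme the result micro-cache already uses.

```python
import hashlib
import json
import time
from typing import Any, Callable


def make_cache_key(params: dict[str, Any]) -> str:
    """Build a stable key from the search parameters (hypothetical scheme)."""
    canonical = json.dumps(params, sort_keys=True)
    return "search:" + hashlib.sha256(canonical.encode()).hexdigest()


class MicroCache:
    """Tiny in-memory cache with a single TTL, standing in for the real one."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: dict[str, tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._store[key] = (self._clock() + self._ttl, value)


def search_with_facets(params, cache, run_query):
    """Return results and facet counts, cached together under one key."""
    key = make_cache_key(params)
    cached = cache.get(key)
    if cached is not None:
        return cached
    # run_query is assumed to return (results, facet_counts) in one call.
    results, facet_counts = run_query(params)
    payload = {"results": results, "facet_counts": facet_counts}
    cache.set(key, payload)
    return payload
```

Because the payload holds both pieces, a cache hit skips the results query and the facet query, and both expire at the same time.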

I think this might be time to make a fun little helper library that gives us approximations:

Great, yeah this should work nicely.
