Implement Pagination Features for FHIRClient and Bundle Classes #174

Open: wants to merge 1 commit into base: main

LanaNYC
Contributor

@LanaNYC LanaNYC commented Sep 3, 2024

Overview

This pull request aims to address pagination functionalities as outlined in issue #172. The implementation includes the following key changes:

  • Refactoring FHIRSearch with iterators.
  • Adding __iter__ and __next__ methods to the Bundle class to handle pagination.
  • Moving pagination logic to _utils.py for better modularity and reuse.
  • Adjusting tests to accommodate the new generator-based approach.
  • Updating the fhir-parser submodule to reflect the necessary changes in Bundle generation.

Cross-Reference: This PR depends on changes in the fhir-parser repository that modify the Bundle class to support generator behavior.
Please see the related PR in the fhir-parser repository: smart-on-fhir/fhir-parser#53.

Backporting: Changes were first introduced in client-py by modifying the Bundle class directly. This was backported to fhir-parser to maintain consistency and extend generator behavior to Bundle across the libraries.

@mikix
Contributor

mikix commented Sep 4, 2024

> I would greatly appreciate any guidance from the maintainers or community members on resolving the Superfluous entry error. This will allow me to complete the testing phase and ensure the robustness of the new pagination features.

I believe this is because of some bad indenting in fhirclient/models/bundle.py: def elementProperties lost its indenting and is now a top-level function instead of a class method. Adding back the indenting seems to surface more reasonable errors.

Note that manually editing bundle.py like that is fraught - most model files like that will get overwritten the next time ./generate_models.sh is run. But that's fine! We have options:

  • Copy/paste a copy of bundle.py and put it in ./fhir-parser-resources/ and edit the generation machinery to copy it over (this has downsides of code-drift over time)
  • Make some of these changes in fhir-parser first, then update this repo to consume the changes (forward-port the changes)
  • Make the changes here, where it's easier/faster, then go back and back-port the changes to fhir-parser too (and then probably also update this repo on top of that, removing the custom changes to get fully in sync again)

I'm down for whatever - that last option is probably easiest for just getting something that works. And maybe the solution also depends on whether we think this feature "belongs" in fhir-parser. Just wanted to flag the close relationship these two repos have.

@LanaNYC
Contributor Author

LanaNYC commented Sep 9, 2024

@mikix Thank you. I'll have more time this week and hope to finish this PR.

@LanaNYC LanaNYC changed the title from "WIP: Implement Pagination Features for FHIRClient and Bundle Classes" to "Implement Pagination Features for FHIRClient and Bundle Classes" on Sep 20, 2024
…next to Bundle, and moved pagination logic to _utils.py
Contributor

@mikix mikix left a comment


Looks great! Thank you! Comments below

raise e


def iter_pages(first_bundle: 'Bundle') -> Iterable['Bundle']:
Contributor

Does this type hinting work without fully specifying it? Like 'fhirclient.models.bundle.Bundle'? If so, great. If not and that seems too wordy for you, I think you could also do something like:

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from fhirclient.models.bundle import Bundle
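
For reference, a minimal sketch of that pattern. String annotations are not evaluated at runtime, so the guarded import only has to satisfy static type checkers. To keep the snippet runnable without fhirclient installed, the standard library's decimal.Decimal stands in for Bundle, and first_or_none is a hypothetical helper, not part of this PR.

```python
from typing import TYPE_CHECKING, Iterable, Optional

if TYPE_CHECKING:
    # Only seen by static type checkers, never executed at runtime, so it
    # cannot cause a circular import. Decimal is a stand-in for Bundle.
    from decimal import Decimal


def first_or_none(values: Iterable['Decimal']) -> Optional['Decimal']:
    """Return the first item, or None when the iterable is empty."""
    for value in values:
        return value
    return None
```

With the real import in place of the stand-in, the short 'Bundle' forward reference should resolve for type checkers without spelling out the full module path.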

("type", "type", str, False, None, True),
])
return js
js = super(Bundle, self).elementProperties()
Contributor

This section feels like accidental over-indentation?

Comment on lines +54 to +68
def __iter__(self):
    """ Makes the Bundle itself an iterator by returning an iterator over its entries. """
    if self.entry is None:
        self._entry_iter = iter([])
    else:
        self._entry_iter = iter(self.entry)
    return self

def __next__(self):
    """ Returns the next BundleEntry in the Bundle's entry list using the internal iterator. """
    # return next(self._entry_iter)

    if not hasattr(self, '_entry_iter'):
        self.__iter__()
    return next(self._entry_iter)
Contributor

nit: could this whole thing just be:

def __iter__(self):
  return iter(self.entry or [])

Like, what's the advantage of making Bundle an iterator itself instead of just an iterable? (Honest question, I'm not deep in the iterable weeds)
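
One practical difference, sketched with two hypothetical stand-in classes (neither is the real Bundle): an object that is its own iterator keeps a single cursor, so two loops over the same object interfere with each other, while a plain iterable hands out an independent iterator on every iter() call.

```python
class StatefulEntries:
    """Iterator style, mirroring the __iter__/__next__ pair in this diff."""

    def __init__(self, entry=None):
        self.entry = entry

    def __iter__(self):
        # Every call resets the single cursor stored on the object.
        self._entry_iter = iter(self.entry or [])
        return self

    def __next__(self):
        if not hasattr(self, "_entry_iter"):
            self.__iter__()
        return next(self._entry_iter)


class IterableEntries:
    """Iterable style: each iter() call returns an independent iterator."""

    def __init__(self, entry=None):
        self.entry = entry

    def __iter__(self):
        return iter(self.entry or [])


stateful = StatefulEntries(["a", "b"])
iterable = IterableEntries(["a", "b"])

# Nested loops over the same object: the stateful version shares one cursor,
# so the inner loop resets and exhausts it and the outer loop stops early.
stateful_pairs = [(x, y) for x in stateful for y in stateful]
iterable_pairs = [(x, y) for x in iterable for y in iterable]

print(len(stateful_pairs), len(iterable_pairs))
```

The stateful version produces only two of the four possible pairs, which is the kind of surprise the one-line iterable form avoids.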

Contributor

Also... having read both PRs now - maybe adding direct iteration support to Bundle is not worth the effort? Like the code in _utils.py in this PR could handle iterating on entries and we wouldn't need any fhir-parser changes after all?

That said! I think it's cool to have the direct iteration. But it is a new direction for us - adding custom cool Python features to individual classes. Which is nice! But if you want to avoid that complexity here, I'd be happy with all this iteration logic just living as fhirsearch-specific logic.

Contributor

@mikix mikix Sep 23, 2024


(I know I sent you down the path of fhir-parser - so this is probably annoying to hear - I am now just thinking that everyone is likely going to get their Bundles from fhirsearch anyway... I guess it shows up in other places... So direct iteration is still useful... But I think in my head, the iteration would be the pagination itself - which is annoying to do oneself and thus helpful for us to provide. But if the iteration is over the entries, and we do pagination all in fhirsearch, maybe this bit of syntactic sugar can be left off for now, since it reduces the PR complexity a fair bit.)

Contributor

@mikix mikix Sep 23, 2024


And to be clear, iterating over the entries makes sense too - my initial thought was over the pages, but entries feels more natural (for entry in bundle:). So I'm not advocating for changing the iteration to pagination. I'm just realizing that iterating over entries is a small bit of syntactic sugar instead of a big one.

if not first_bundle:
    return iter([])
yield first_bundle

def perform_resources(self, server):
Contributor

nit: while we're here, maybe add -> list['Resource']: on this method, which should help differentiate how this is different than the _iter version.

(And probably add -> 'Bundle': to perform() too.)



# Use forward references to avoid circular imports
def perform_iter(self, server) -> Iterator['Resource']:
Contributor

The way this is written, it will iterate on BundleEntry objects, yeah? But we actually want Bundles.

Should this be rewritten as return iter_pages(self.perform(server)) (and the return hint changed to Iterator['Bundle'])?
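
A rough sketch of page-level iteration, assuming pages are chained through a link with relation "next" as in FHIR Bundle paging. Page, Link, next_url, and the fetch callback are hypothetical stand-ins so the snippet runs without a server; the iter_pages in this PR takes only the first bundle and may differ in detail.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Optional


@dataclass
class Link:
    relation: str
    url: str


@dataclass
class Page:
    """Hypothetical stand-in for one Bundle page of search results."""
    entry: List[str] = field(default_factory=list)
    link: List[Link] = field(default_factory=list)


def next_url(page: Page) -> Optional[str]:
    """Return the URL of the 'next' link, if the page has one."""
    for link in page.link:
        if link.relation == "next":
            return link.url
    return None


def iter_pages(first_page: Page, fetch: Callable[[str], Page]) -> Iterator[Page]:
    """Yield the first page, then follow 'next' links until none remain."""
    page: Optional[Page] = first_page
    while page is not None:
        yield page
        url = next_url(page)
        page = fetch(url) if url else None


# Fake second page served from a dict instead of a real FHIR server.
pages = {"page2": Page(entry=["c"])}
first = Page(entry=["a", "b"], link=[Link("next", "page2")])
collected = [p.entry for p in iter_pages(first, pages.__getitem__)]
```

Because the function is a generator, the second page is only fetched once the caller asks for it.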

Comment on lines +153 to +158
# Old method with deprecation warning
warnings.warn(
    "perform_resources() is deprecated and will be removed in a future release. "
    "Please use perform_resources_iter() instead.",
    DeprecationWarning,
)
Contributor

Fun - I almost never use the warnings module.

I would personally say that we should deprecate perform too - any search can yield multiple Bundles, and not treating a search that way from the get-go is likely to encourage buggy code.

When we remove perform and perform_resources, I was thinking we might just rename the iter version into the base versions. Which maybe deserves a different phrasing than "will be removed" -- but maybe that's a bad plan, to cause "needless" name churn. The _iter isn't hurting anyone. So I guess ignore this paragraph. 😄

Comment on lines 160 to 162
bundle = self.perform(server)
resources = []
if bundle is not None and bundle.entry is not None:
Contributor

nit: this code could just be return list(self.perform_resources_iter()) yeah? (just to reduce duplicated code paths, and to emphasize that the method is now just a "worse" version of the _iter one)

Contributor

Actually, that change would be a behavior change - but I think an important one? Previously, this did no pagination. But that just means that we'd throw resources on the floor with no indication we had done so or way to work around that.

So I think switching this method to use pagination under the hood is a valuable fix.
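
A sketch of what that could look like: the deprecated method warns and then delegates to the iterator version, so it picks up pagination for free. Search is a hypothetical stand-in for FHIRSearch, pages are plain lists, and the server argument is omitted to keep the snippet self-contained.

```python
import warnings
from typing import Iterator, List


class Search:
    """Hypothetical stand-in for FHIRSearch; each page is a plain list."""

    def __init__(self, pages: List[List[str]]):
        self._pages = pages

    def perform_resources_iter(self) -> Iterator[str]:
        """Lazily yield resources across every page."""
        for page in self._pages:
            yield from page

    def perform_resources(self) -> List[str]:
        """Deprecated eager variant that now also covers every page."""
        warnings.warn(
            "perform_resources() is deprecated and will be removed in a future release. "
            "Please use perform_resources_iter() instead.",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller, not this method
        )
        return list(self.perform_resources_iter())


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    resources = Search([["a", "b"], ["c"]]).perform_resources()
```

The stacklevel=2 argument is an addition not present in the diff; it makes the warning point at the calling code, which is usually what a user needs to find.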

Comment on lines +178 to +179
if not first_bundle or not first_bundle.entry:
    return iter([])
Contributor

This check isn't needed, is it? perform can't return None and the not .entry check is handled by __iter__.

This method could be written as the following, I think?

for bundle in self.perform_iter(server):
  for entry in bundle:
    yield entry.resource
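
A self-contained sketch of why the guard can go, assuming Bundle gets the one-line __iter__ suggested earlier in this review. FakeBundle, FakeEntry, and iter_resources are hypothetical stand-ins, and the pages are passed in directly instead of coming from perform_iter(server).

```python
from typing import Iterable, Iterator, List, Optional


class FakeEntry:
    def __init__(self, resource: str):
        self.resource = resource


class FakeBundle:
    """Stand-in whose __iter__ treats a missing entry list as empty."""

    def __init__(self, entry: Optional[List[FakeEntry]] = None):
        self.entry = entry

    def __iter__(self) -> Iterator[FakeEntry]:
        return iter(self.entry or [])


def iter_resources(bundles: Iterable[FakeBundle]) -> Iterator[str]:
    """Flatten pages into resources; empty pages simply yield nothing."""
    for bundle in bundles:
        for entry in bundle:
            yield entry.resource


pages = [FakeBundle([FakeEntry("p1")]), FakeBundle(None), FakeBundle([FakeEntry("p2")])]
result = list(iter_resources(pages))
```

The page with entry set to None is skipped without any explicit check, since iterating it produces no items.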

Contributor

Thank you for adding these tests, love to see it ❤️

@LanaNYC
Contributor Author

LanaNYC commented Sep 25, 2024

@mikix Thank you for the detailed feedback. I'll look at everything this weekend.
