JP-3721: Simplify ModelContainer #8831

Open: 105 commits to be merged into base: main

emolter (Collaborator) commented Sep 26, 2024

Resolves JP-3721

Closes #8738

With the addition of ModelLibrary, ModelContainer can be substantially simplified. ModelLibrary renders the return_open and save_open options obsolete, as well as the get_sections method: if memory usage is a concern, ModelLibrary should be used instead. Some additional discussion about how ModelContainer used to perform far too many tasks can be found on this innerspace page.

During discussions related to JP-3715, it has become increasingly clear that the strictness of ModelLibrary and the additional borrow/shelve code required to access models are not necessary or desired for many use cases. For example, the calwebb_spec3 pipeline currently makes extensive use of ModelContainer but does not have the same memory issues as calwebb_image3. Learning to use ModelLibrary may also be a nuisance for users manipulating relatively small datasets.
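
To make the access-pattern difference concrete, here is a rough sketch. ToyLibrary is a hypothetical stand-in written for this comment, not the real ModelLibrary; only the borrow/shelve vocabulary comes from the discussion above, and the method signatures, the dict "models", and the exposure_time key are assumptions for illustration.

```python
# Toy stand-in only: this is NOT the real ModelLibrary API.
class ToyLibrary:
    """Hands out one model at a time; each must be shelved before exit."""

    def __init__(self, models):
        self._models = list(models)
        self._borrowed = set()
        self._is_open = False

    def __enter__(self):
        self._is_open = True
        return self

    def __exit__(self, exc_type, exc, tb):
        self._is_open = False
        # Only complain about unshelved models if nothing else went wrong.
        if exc_type is None and self._borrowed:
            raise RuntimeError(f"models not shelved: {sorted(self._borrowed)}")
        return False

    def borrow(self, index):
        if not self._is_open:
            raise RuntimeError("borrow() is only allowed inside a 'with' block")
        if index in self._borrowed:
            raise RuntimeError(f"model {index} is already borrowed")
        self._borrowed.add(index)
        return self._models[index]

    def shelve(self, model, index):
        if index not in self._borrowed:
            raise RuntimeError(f"model {index} was not borrowed")
        self._models[index] = model
        self._borrowed.discard(index)


def total_with_library(library, n_models):
    """Strict pattern: open the library, borrow, use, shelve."""
    total = 0.0
    with library:
        for i in range(n_models):
            model = library.borrow(i)
            total += model["exposure_time"]
            library.shelve(model, i)
    return total


def total_with_list(models):
    """List-like pattern: everything is already in memory, just iterate."""
    return sum(model["exposure_time"] for model in models)


# Plain dicts stand in for datamodels in this sketch.
models = [{"exposure_time": 10.0}, {"exposure_time": 20.0}]
library_total = total_with_library(ToyLibrary(models), len(models))
list_total = total_with_list(models)
```

Both functions compute the same sum; the strict version pays for its bookkeeping with extra code at every access, which is the overhead that seems unnecessary for small, in-memory datasets.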

Therefore, ModelContainer should not be removed, but should instead become a lightweight, easy-to-use class for loading a list of models and association metadata. This PR proposes a container satisfying the following constraints:

  • Loads association metadata dictionary
  • Can be manipulated as if it were a list
  • Loads all datamodels into memory (i.e., use ModelLibrary instead if we care about memory usage)
  • Is the default data structure loaded by datamodels.open() for asn-type data but is no longer itself a datamodel
  • Can be used as a context manager, i.e., with datamodels.open(asn.json) as container has the expected behavior
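
As a minimal sketch of what these constraints amount to (not the implementation in this PR), the following uses hypothetical names throughout: SketchContainer, FakeModel, and the asn_id key are invented for illustration, and closing every model on exit is an assumption about what "the expected behavior" means.

```python
from collections.abc import MutableSequence


class FakeModel:
    """Hypothetical stand-in for a datamodel; only tracks open/closed state."""

    def __init__(self, name):
        self.name = name
        self.closed = False

    def close(self):
        self.closed = True


class SketchContainer(MutableSequence):
    """List-like holder of in-memory models plus association metadata."""

    def __init__(self, models=(), asn=None):
        self._models = list(models)   # all models are held in memory
        self.asn = dict(asn or {})    # association metadata dictionary

    # The five methods MutableSequence needs; append, extend, pop,
    # iteration, and friends are then provided by the mixin.
    def __getitem__(self, index):
        return self._models[index]

    def __setitem__(self, index, value):
        self._models[index] = value

    def __delitem__(self, index):
        del self._models[index]

    def __len__(self):
        return len(self._models)

    def insert(self, index, value):
        self._models.insert(index, value)

    def close(self):
        for model in self._models:
            model.close()

    # Context-manager support (assumed behavior: close all models on exit).
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False


asn = {"asn_id": "a3001"}  # illustrative metadata only
with SketchContainer([FakeModel("a"), FakeModel("b")], asn=asn) as container:
    container.append(FakeModel("c"))
    names = [model.name for model in container]
    first_two = container[:2]
```

Note that the sketch is a plain sequence rather than a datamodel subclass, mirroring the "no longer itself a datamodel" constraint.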

Linked PR in stdatamodels: spacetelescope/stdatamodels#330
Linked PR in stpipe: spacetelescope/stpipe#190

Tasks

  • request a review from someone specific, to avoid making the maintainers review every PR
  • add a build milestone, i.e. Build 11.3 (use the latest build if not sure)
  • Does this PR change user-facing code / API? (if not, label with no-changelog-entry-needed)
    • write news fragment(s) in changes/: echo "changed something" > changes/<PR#>.<changetype>.rst (see below for change types)
    • update or add relevant tests
    • update relevant docstrings and / or docs/ page
    • start a regression test and include a link to the running job (click here for instructions)
      • Do truth files need to be updated ("okified")?
        • after the reviewer has approved these changes, run okify_regtests to update the truth files
  • if a JIRA ticket exists, make sure it is resolved properly
news fragment change types...
  • changes/<PR#>.general.rst: infrastructure or miscellaneous change
  • changes/<PR#>.docs.rst
  • changes/<PR#>.stpipe.rst
  • changes/<PR#>.datamodels.rst
  • changes/<PR#>.scripts.rst
  • changes/<PR#>.fits_generator.rst
  • changes/<PR#>.set_telescope_pointing.rst
  • changes/<PR#>.pipeline.rst

stage 1

  • changes/<PR#>.group_scale.rst
  • changes/<PR#>.dq_init.rst
  • changes/<PR#>.emicorr.rst
  • changes/<PR#>.saturation.rst
  • changes/<PR#>.ipc.rst
  • changes/<PR#>.firstframe.rst
  • changes/<PR#>.lastframe.rst
  • changes/<PR#>.reset.rst
  • changes/<PR#>.superbias.rst
  • changes/<PR#>.refpix.rst
  • changes/<PR#>.linearity.rst
  • changes/<PR#>.rscd.rst
  • changes/<PR#>.persistence.rst
  • changes/<PR#>.dark_current.rst
  • changes/<PR#>.charge_migration.rst
  • changes/<PR#>.jump.rst
  • changes/<PR#>.clean_flicker_noise.rst
  • changes/<PR#>.ramp_fitting.rst
  • changes/<PR#>.gain_scale.rst

stage 2

  • changes/<PR#>.assign_wcs.rst
  • changes/<PR#>.badpix_selfcal.rst
  • changes/<PR#>.msaflagopen.rst
  • changes/<PR#>.nsclean.rst
  • changes/<PR#>.imprint.rst
  • changes/<PR#>.background.rst
  • changes/<PR#>.extract_2d.rst
  • changes/<PR#>.master_background.rst
  • changes/<PR#>.wavecorr.rst
  • changes/<PR#>.srctype.rst
  • changes/<PR#>.straylight.rst
  • changes/<PR#>.wfss_contam.rst
  • changes/<PR#>.flatfield.rst
  • changes/<PR#>.fringe.rst
  • changes/<PR#>.pathloss.rst
  • changes/<PR#>.barshadow.rst
  • changes/<PR#>.photom.rst
  • changes/<PR#>.pixel_replace.rst
  • changes/<PR#>.resample_spec.rst
  • changes/<PR#>.residual_fringe.rst
  • changes/<PR#>.cube_build.rst
  • changes/<PR#>.extract_1d.rst
  • changes/<PR#>.resample.rst

stage 3

  • changes/<PR#>.assign_mtwcs.rst
  • changes/<PR#>.mrs_imatch.rst
  • changes/<PR#>.tweakreg.rst
  • changes/<PR#>.skymatch.rst
  • changes/<PR#>.exp_to_source.rst
  • changes/<PR#>.outlier_detection.rst
  • changes/<PR#>.tso_photometry.rst
  • changes/<PR#>.stack_refs.rst
  • changes/<PR#>.align_refs.rst
  • changes/<PR#>.klip.rst
  • changes/<PR#>.spectral_leak.rst
  • changes/<PR#>.source_catalog.rst
  • changes/<PR#>.combine_1d.rst
  • changes/<PR#>.ami.rst

other

  • changes/<PR#>.wfs_combine.rst
  • changes/<PR#>.white_light.rst
  • changes/<PR#>.cube_skymatch.rst
  • changes/<PR#>.engdb_tools.rst
  • changes/<PR#>.guider_cds.rst

braingram and others added 30 commits July 29, 2024 16:07
perrygreenfield (Contributor) left a comment

I only had minor comments so far. In the end, documentation on what the new ModelContainer class does and supports, with reference to the changes relative to the older version, should be provided.

Review comments (all resolved):
  • jwst/datamodels/tests/data/association.json
  • jwst/datamodels/container.py (outdated)
  • jwst/extract_1d/extract.py (outdated)
  • jwst/pipeline/calwebb_spec3.py (outdated)
  • jwst/badpix_selfcal/badpix_selfcal_step.py (outdated)
  • jwst/datamodels/tests/test_open_association.py (outdated)
emolter (Collaborator, Author) commented Oct 1, 2024

I only had minor comments so far. In the end, documentation on what the new ModelContainer class does and supports, with reference to the changes relative to the older version, should be provided.

Thanks for the review, Perry. I agree we should add some documentation. I'm still wondering what people think of the new-look container overall: is it complementary to ModelLibrary, does it make sense as a data structure, and do any methods still need to be removed?

emolter (Collaborator, Author) commented Oct 10, 2024

After lots of discussion, it's been decided that this PR should attempt to avoid needing to make any stpipe changes. This requires, among other things, putting back the save method. Regression tests with that change and stpipe pinned to main started here, to figure out what problems the current ModelContainer has when attempting to interface with stpipe/main.

edit: all regtests are passing

@emolter emolter added this to the Build 11.2 milestone Oct 11, 2024
@emolter emolter marked this pull request as ready for review October 11, 2024 13:33
@emolter emolter requested a review from a team as a code owner October 11, 2024 13:33
Development

Successfully merging this pull request may close these issues.

Simplify ModelContainer
3 participants