
Conversation


@melanieclarke melanieclarke commented Aug 4, 2025

Partially resolves JP-3930

This continues work on ensuring input models are not modified by Steps, following up on #9709, which introduced new data copies for some outlier_detection use cases.

For outlier_detection, I added an internal function that decides, based on the input, whether a copy is needed, and returns an open model. I realized something similar was needed for skymatch and tweakreg, so I'm proposing here to add a Step.prepare_output function to support this use case more generally, and to expand its use to the other image3 steps that expect ModelLibrary input.

This is drawing on @braingram's suggestion here: #8588 (comment)
Unlike the function proposed in that PR, though, this one would explicitly support the expectation that input models are not modified after they are provided to a step or pipeline. Also unlike that function, this one is not meant to be used as "with self.open_model(input_data) as model:" at the top of every step, because whether the open model should be closed within the step varies by use case.

Since the Step function can check whether it is part of a pipeline, I added one extra clause to the implementation from OutlierDetectionStep: if self.parent is not None, copies are not made. This recovers the memory costs introduced in #9709.

I think we could eventually expand the use of this function to the remaining Steps, and use it to be more intentional about when we copy input data and when we close the input models. If we don't make shallow copies with "with datamodels.open(model)", and we do make deep copies when needed, then the only time an opened model should need to be closed within a step is when the output is a completely new model generated within the step (not a modified version of the input model).
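To illustrate the copy decision, here is a rough sketch using toy stand-ins (FakeModel and fake_open are invented for the example) rather than the real jwst datamodels API; the actual prepare_output implementation differs in detail:

```python
import copy


class FakeModel:
    """Toy stand-in for a JWST datamodel."""

    def __init__(self, data=None):
        self.data = data


def fake_open(source):
    """Toy stand-in for datamodels.open: a filename yields a fresh model."""
    if isinstance(source, FakeModel):
        return source
    return FakeModel(data=source)


class Step:
    def __init__(self, parent=None):
        # `parent` is set when the step runs inside a pipeline
        self.parent = parent

    def prepare_output(self, input_data):
        """Return an open model that the step may safely modify."""
        if not isinstance(input_data, FakeModel):
            # A filename: opening it creates a fresh model, no copy needed.
            return fake_open(input_data)
        if self.parent is not None:
            # Inside a pipeline: the pipeline owns the model, modify in place.
            return input_data
        # Standalone call with an open model: copy to protect the user's input.
        return copy.deepcopy(input_data)
```

With this logic, a standalone call with an open model pays for one deep copy, while the same step inside a pipeline pays nothing, which is where the #9709 memory costs come back.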

See also the discussion here: spacetelescope/stdatamodels#512

Tasks

  • If you have a specific reviewer in mind, tag them.
  • add a build milestone, i.e. Build 12.0 (use the latest build if not sure)
  • Does this PR change user-facing code / API? (if not, label with no-changelog-entry-needed)
    • write news fragment(s) in changes/: echo "changed something" > changes/<PR#>.<changetype>.rst (see changelog readme for instructions)
    • update or add relevant tests
    • update relevant docstrings and / or docs/ page
    • start a regression test and include a link to the running job (click here for instructions)
      • Do truth files need to be updated ("okified")?
        • after the reviewer has approved these changes, run okify_regtests to update the truth files
  • if a JIRA ticket exists, make sure it is resolved properly


melanieclarke commented Aug 4, 2025

Initial regtests here:
https://github.com/spacetelescope/RegressionTests/actions/runs/16731840817

Comparing to regtests for #9709, performance is restored to what it was before the change in all cases.

@melanieclarke
Collaborator Author

@emolter @braingram @tapastro @penaguerrero - I'd like your thoughts on this proposed change to Step.open_model, when you have a chance. I'll leave it at draft for now.


codecov bot commented Aug 4, 2025

Codecov Report

❌ Patch coverage is 95.34884% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.43%. Comparing base (94c0254) to head (ad36c6c).
⚠️ Report is 7 commits behind head on main.

Files with missing lines Patch % Lines
jwst/datamodels/container.py 0.00% 1 Missing ⚠️
jwst/tweakreg/tweakreg_step.py 75.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9723      +/-   ##
==========================================
+ Coverage   83.38%   83.43%   +0.05%     
==========================================
  Files         366      366              
  Lines       37770    37784      +14     
==========================================
+ Hits        31493    31525      +32     
+ Misses       6277     6259      -18     



emolter commented Aug 5, 2025

Thanks for continuing to work on this Melanie! Overall I think something like this would be very beneficial. To re-state what I view as the design requirements,

  1. Steps or Pipelines should never modify their user-specified input model. This requires making a copy sometimes.
  2. Steps or Pipelines should avoid making copies whenever possible.
  3. All datamodels need to be closed upon exiting a Pipeline, or a Step that was called standalone, if they were not open to begin with.
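The first two requirements pull in opposite directions. As a toy illustration of requirement 1 (all names here are invented for the example, not real jwst code), a step that copies before modifying always leaves the caller's model untouched, at the price of the copy requirement 2 tries to avoid:

```python
import copy


class ToyModel:
    """Stand-in for a datamodel; only requirement 1 is exercised here."""

    def __init__(self, data):
        self.data = data


def toy_step(model):
    # Work on a deep copy so the user-specified input is never modified
    # (requirement 1), at the cost of one copy (the tension with requirement 2).
    result = copy.deepcopy(model)
    result.data = result.data * 2
    return result


original = ToyModel(data=3)
result = toy_step(original)
assert original.data == 3  # input untouched
assert result.data == 6
```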

I think your proposal achieves all of these, but I'm still trying to work through when/if closing needs to happen. My expectations would be as follows:
If I do

fname = "foo.fits"
result = ResampleStep.call(fname)

my expectation would be that result is an open datamodel, and the only open datamodel. But if I do

model = dm.open(fname)
result = ResampleStep.call(model)

now I expect model and result to both be open datamodels. If I do

result = Image3Pipeline.call(fname)

I would expect that the individual Steps within the pipeline will keep the model open when passed between them, to avoid having to close and re-open it.

Finally, if I call the pipeline from the command-line, I'd expect result to also close itself at the end.

Can you say a bit more about how this works?

@melanieclarke
Collaborator Author

Thanks for taking a look, Ned. I agree with all your requirements and expectations, except I might restate requirement 3 as:
"All datamodels opened within a Pipeline, or a Step that was called standalone, need to be closed upon exiting if they were not open to begin with, unless they are returned as the result of the step." That is, open input models should stay open, and result models should be returned open.

The way I would envision this working in the future is:

  1. A Step calls self.open_model on its input: output_model = self.open_model(input_data)
  2. If the model type is not changed by the step (e.g. flat correction), it proceeds to modify output_model as needed, and return it. No need for explicit copies, and no need to close anything.
    • If input_data was a model and the step is called standalone, it is still open and unmodified since a copy was made. The output_model is returned open.
    • If input_data was a model and the step is called as part of a pipeline, output_model is input_data and it remains open, but is modified in place.
    • If input_data was a file, only one model is opened, and it is returned as output_model.
  3. If a new model is generated by the step (e.g. resampling), it creates the new model with reference to output_model. Before returning, if output_model is input_data, leave it open; otherwise, close it because it is no longer needed. Return the new model as the result.
    • If input_data was a model and the step is called standalone, output_model is a copy. This is usually a necessary copy, because if the step is skipped, the status is set in output_model and it is returned open, as the result. If the step completes, the output_model copy is closed before returning and only the new model and the input_data stay open.
    • If input_data was a model and the step is called as part of a pipeline, output_model is input_data and it remains open. If the calling pipeline no longer needs input_data after the step completes, it can close it.
    • If input_data was a file, output_model is opened from it, but closed before returning, so only the new model is open after the step completes.
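The two patterns above can be sketched with toy objects (prepare_output here is a minimal stand-in with an explicit in_pipeline flag; the real jwst steps differ in detail):

```python
import copy


class ToyModel:
    """Stand-in for a datamodel that tracks whether it has been closed."""

    def __init__(self, data=0):
        self.data = data
        self.closed = False

    def close(self):
        self.closed = True


def prepare_output(input_data, in_pipeline=False):
    """Minimal stand-in for the proposed helper."""
    if not isinstance(input_data, ToyModel):
        return ToyModel(data=input_data)  # "open" a file: fresh model
    if in_pipeline:
        return input_data  # pipeline-owned: modify in place
    return copy.deepcopy(input_data)  # standalone: protect caller's model


def flat_like_step(input_data, in_pipeline=False):
    """Pattern 2: model type unchanged; modify and return, close nothing."""
    output_model = prepare_output(input_data, in_pipeline)
    output_model.data += 1
    return output_model


def resample_like_step(input_data, in_pipeline=False):
    """Pattern 3: a new model is generated with reference to output_model."""
    output_model = prepare_output(input_data, in_pipeline)
    new_model = ToyModel(data=output_model.data * 2)
    if output_model is not input_data:
        # We opened or copied it ourselves and it is no longer needed.
        output_model.close()
    return new_model
```

In the pattern-3 standalone case, only the new model and the caller's original stay open; in the pipeline case, the input is returned to the pipeline open and untouched by the close logic.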

@penaguerrero
Contributor

Thanks for this work, Melanie! I think the code you propose is a good generalization for both steps that take regular models and steps that take a model container or library as input, and I agree that it is better to decide whether to make this copy at a higher level than the step. This should improve the memory footprint overall in all pipelines! When I ran the test for Detector1 in #8588 (comment), the memory footprint on my laptop for that heavy 2.6 GB uncal MIRI file went down from about 60 GB to about 40 GB. I am curious to see the changes with your proposal.

Collaborator

@braingram braingram left a comment


Thanks for putting this together. I left some inline comments and overall the approach looks good to me.


melanieclarke commented Aug 6, 2025

It sounds like everyone so far generally likes this approach, so I will work on adding some more tests to verify the prepare_output function in different use cases. I'll leave it in draft until Tyler has a chance to weigh in.

Edit: tests added in bcd0064

@penaguerrero
Contributor

I remembered that when I worked on this for all the steps of detector 1, I made a copy when necessary and then immediately deleted the original object. This reduced memory quite a bit.

@melanieclarke
Collaborator Author

I remembered that when I worked on this for all the steps of detector 1, I made a copy when necessary and then immediately deleted the original object. This reduced memory quite a bit.

I think that won't work in general, because it would violate the user's assumption that a datamodel provided to a step will still exist after the step completes. I'm hoping, though, that we can get similar savings by not making copies when they're not needed. We'll also definitely need to think about where we can usefully delete objects after they're no longer needed, especially within the pipelines, which often know better than the steps which objects are still needed.

@melanieclarke melanieclarke changed the title WIP JP-3930: Step function for opening models JP-3930: Step function for opening models Aug 7, 2025
@melanieclarke melanieclarke added this to the Build 12.1 milestone Aug 7, 2025
@melanieclarke melanieclarke marked this pull request as ready for review September 15, 2025 15:04
@melanieclarke melanieclarke requested review from a team as code owners September 15, 2025 15:04
@melanieclarke
Collaborator Author

@tapastro - I'm taking this PR out of draft because it would be nice to get into this build if possible. It restores performance to previous benchmarks for some stage 3 use cases.



emolter commented Sep 15, 2025

The new test test_ngroup_1 is failing on the regtest run. It looks like just a network blip, but since the test is new it would be nice to be sure. Would you mind rerunning those?

Would it be helpful for me to re-review this?


melanieclarke commented Sep 15, 2025

Would it be helpful for me to re-review this?

Yes, please! It's been a minute, so it would be very helpful to have you recheck it.

I'll rerun the regtests.

Edit for regtest results: test_ngroup_1 is passing, but there are unrelated failures from a MIRI CRDS delivery. I'll run again when those are okified.

Edit again: all passing now.

Collaborator

@emolter emolter left a comment


Thanks for adding the detailed test suite! This looks good to me - regtests for memory usage were helpful to determine that this is doing what we expect.

@tapastro tapastro merged commit f76fe12 into spacetelescope:main Sep 16, 2025
28 checks passed
@melanieclarke melanieclarke deleted the open_model branch September 16, 2025 18:20
