A new interface for output #3793
There's an intriguing side benefit of this "unified" interface for output. It means that it is possible (though we don't have it now) for users to specify an "output preference" in a |
Another abstraction I think would be useful is a utility for building multiple outputs. Imagine this:
`indices = (xy=(:, :, k), xz=(:, 1, :), yz=(1, :, :))`
`sliced_outputs!(simulation, outputs, indices; schedule=TimeInterval(1), filename="sliced")`
this would append I personally find myself using |
Maybe I'm personally quite picky about specifying outputs and output file names, so I might always end up with verbose boilerplate for output writing (and I'm personally fine with that). But I'd support reducing boilerplate and maybe just a bit of flexibility would work even for picky people! I remember having this conversation in #1171 actually! One-line flexible output writing would be especially great for examples, new user friendliness, and quick iteration. Some thoughts:
Sounds good! Will be nice for derivatives to work by default. Although
Is the "so much boilerplate" just the extra one line But I really like the suggestions in #3543 of having the option to save output in unique directories be easily specifiable. If we want an easy default, then maybe it could do some version of the unique directories? Or maybe have I'm not sure of the best approach but as someone who's conservative about overwriting by default I'm tempted to err on the side of caution.
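To make the "unique directories" option concrete, here is a minimal sketch in plain Julia. It is not taken from #3543 or from Oceananigans: the helper name `unique_output_dir`, the `run_001`-style naming, and the `prefix` keyword are all assumptions chosen for illustration. The idea is only that each run gets a fresh directory, so nothing is overwritten even if `overwrite_existing=true` were the default.

```julia
using Printf

# Hypothetical helper (not part of Oceananigans): create and return the first
# directory of the form base/run_001, base/run_002, ... that does not exist yet.
function unique_output_dir(base; prefix="run")
    n = 1
    while true
        dir = joinpath(base, @sprintf("%s_%03d", prefix, n))
        if !ispath(dir)
            mkpath(dir)   # also creates `base` if needed
            return dir
        end
        n += 1
    end
end

# Usage: two successive "runs" land in different directories.
base = mktempdir()
first_run = unique_output_dir(base)
second_run = unique_output_dir(base)
```

The returned path could then be passed wherever an output directory is specified. The trade-off is that old runs pile up on disk until someone cleans them out.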
Love this idea! Hoping that you can also pass e.g. |
Totally and to be clear, when we think about the economy of an interface, we are thinking about prototyping, illustrating, testing, not necessarily "production". I think "production" places fewer demands on the user interface and what we have now is ok for production. This PR mainly improves the small stuff. Also arguably it's more helpful for experienced than new users.
I agree that with "add" and "writer" the meaning is cemented. I think it's important to recognize trade-offs though, because there is a limit to the benefit of being explicit (when things become hard to read or understand). I think in this case I accept that
Yes for sure! In that example the keys "xy", "xz", etc. would be names appended to the filename prefix.
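As a rough sketch of that naming rule, in plain Julia with no Oceananigans dependency: the helper `sliced_filenames`, the underscore separator, the `.jld2` extension, and the value of `k` are assumptions for illustration only. The proposed `sliced_outputs!` does not exist yet, and its real behavior may differ.

```julia
# Hypothetical helper (not part of Oceananigans): build one filename per slice
# by appending each key of `indices` to the filename prefix.
function sliced_filenames(prefix, indices::NamedTuple; extension=".jld2")
    names = keys(indices)                  # e.g. (:xy, :xz, :yz)
    files = map(name -> string(prefix, "_", name, extension), names)
    return NamedTuple{names}(files)        # same keys, filenames as values
end

k = 8  # assumed vertical index for the xy slice
indices = (xy=(:, :, k), xz=(:, 1, :), yz=(1, :, :))
filenames = sliced_filenames("sliced", indices)
```

With these assumptions, `filenames.xy` is `"sliced_xy.jld2"`, and likewise for `xz` and `yz`.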
Do you run with this option? Curious because I never use it. I think the cost of losing data is actually usually very small, it's only in a small 1% of cases that the data is valuable. I think that's actually the key insight behind the default, that expensive simulations are rare so it doesn't make sense to default it.
It's one line --- but it's in every script, sometimes many times! Add all that up and you get to a huge amount...
@navidcy any thoughts? I think the main feedback is to keep |
I agree with @ali-ramadhan, keeping a Another issue to add into this discussion is the fact of how to handle output (e.g. |
Do you run with |
I propose handling this by initializing output files within |
Ok, I see. I agree that that discussion can be left to the other PR.
I agree, that is not a common use case scenario. I have only used |
Thank you for pointing out this use case. I think this is another situation that could be solved by waiting until Another idea by the way would be to move the concept of "overwriting" to |
Another idea after talking with @josuemtzmo: add another function called |
Another utility that I believe is needed is a function that displays the information in an output file. For example something like `julia> outputinfo(filename)`, which displays things like
Anything else?
I'm not too picky about
I do actually haha. But what I also do is set different output directories for each run so I can always go back and compare different runs as I'm iteratively modifying stuff. So I guess setting different directories is guaranteeing that I never overwrite existing files, but then
We might almost already have this with the |
I've long been unsatisfied with how we build output. It requires a lot of typing --- that is, boilerplate. I often feel a sense of dread when I have to go beyond "visualizing the final iteration" of a prototype to defining an output writer. So much typing.
This PR is an attempt to make output easier and more fun. I use JLD2 as an example but if there is some consensus then I think this PR should extend the same to NetCDF.
The main thrust of this PR is a new function called `output!`. It works like this:
The default is `JLD2Format()`. For NetCDF, users would write
The function adds an output writer to `simulation`, choosing a "generic name" for the `simulation.output_writers` dictionary. Does this enable one line output writing?
I'd love to hear feedback about this design. I implemented it in the `two_dimensional_turbulence.jl` example for illustration:
Oceananigans.jl/examples/two_dimensional_turbulence.jl, line 109 in 6a6853d
There are two more things. First, we need to default `with_halos=true` for JLD2OutputWriter. The time has come because `FieldTimeSeries` is mature. We do, in fact, want halos.
The other conundrum is `overwrite_existing`, which is discussed on #3543. In short, I am wondering whether the best course of action is simply to default to `overwrite_existing=true` and solve so much boilerplate. Simulations are cheap, but life is short!
PS I also want to change `add_callback!` to just `callback!`.