Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cellarea docs, allow Literate.jl tutorials in the doc pipeline, fix typos #800

Open
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

asinghvi17
Copy link
Collaborator

@asinghvi17 asinghvi17 commented Oct 13, 2024

  • Switch all references for reproject from GDAL to Proj, improve docstring
  • Add Proj to the docs project
  • Slightly clarify some doc examples
  • Add Literate.jl example pipeline
  • Add detailed doc page and exploration of cellarea

@asinghvi17
Copy link
Collaborator Author

Supersedes #789

@asinghvi17 asinghvi17 mentioned this pull request Oct 13, 2024
5 tasks
@asinghvi17 asinghvi17 requested review from rafaqz and lazarusA and removed request for rafaqz October 13, 2024 13:32
Copy link
Owner

@rafaqz rafaqz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My main thought here is why doesn't the cellarea tutorial just go in the cellarea API docs? (With a link from toolbars)

Or in extended docs? This approach feels like a lot of text needs to be repeated and the overall complexity of the documentation is increased.

To me tutorials should be more about combining multiple methods into workflows.

(Or maybe when one method just has so many options to explain it needs a lot of examples, rasterization may fit this case)

@@ -2,15 +2,16 @@
"""
reproject(obj; crs)

Reproject the lookups of `obj` to a different crs.
Reproject the lookups (axes) of `obj` to a different crs.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We try to maintain the distinction between axes and lookups as different things

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you're saying, the problem is that even people who've used rasters for a while have no clue what "lookups" are. There has to be some frame of reference. "Dimension values" is also super unclear - what does that mean?

Maybe we can rewrite the parentheses to be more clear? Would "axis values" work there?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, this is meant to be for people who have no clue what cellarea is or why you might want it. So you have to go from the ground up, more or less.

Copy link
Owner

@rafaqz rafaqz Oct 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get it, but we just can't use a common base function name that also works on a Raster to explain a DD function that's really a different thing.

It's clearly difficult to find words that describe lookups without semantic overloads (see AxisKeys.jl... keys are a different base method again) but it's important that we try.

I think axis values is helpful, but maybe values is also a pretty empty word. Maybe this needs to be systematic from DD up, we could workshop the language over there.

(FWIW lookup was intentionally chosen to avoid the overloads with base concepts that other packages have. The problem with avoiding overloads is you end up with a less common word)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could say "x and y values" for now?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we're back to the problem of "what is a lookup" again?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could say "location information". But I would prefer to just use lookups here and have a glossary where we would explain these differences, because otherwise we would have to duplicate this info everywhere we use lookups.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would argue that we should duplicate this information everywhere we use lookups, at least in top-level, user accessible functions. I don't want to overestimate the curiosity of a new user v/s their unwillingness to click a link.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not to say that we shouldn't have a glossary with a more detailed explanation, but there should be something like:

[lookup values](link to DD lookup docs) (the positions of the cells in each dimension)

or something

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually "lookup values (indicating the positions of each cell)" sounds pretty descriptive

@lazarusA
Copy link
Collaborator

your cellarea tutorial is not running the lines as intended (no output in most). Plus, is hard to find, I needed to look for it in the search box, maybe add a new section to the nav bar?

@lazarusA
Copy link
Collaborator

In yax, we have something like, although is not very much maintained 🫨 .

Screenshot 2024-10-13 at 15 44 46

@lazarusA
Copy link
Collaborator

@rafaqz IMHO, API docs should not have extended docs, and be as clean as possible to the main description of input and output arguments. For developers it might be easier to see things there if they work as intended, another test, but documenting should have a low entry level, specially for people (me) that don't know how to use those functions 😄 .

@rafaqz
Copy link
Owner

rafaqz commented Oct 13, 2024

Sure, I just meant using extended docs so you can add more examples for the online docs that don't swamp the repl. Not so you would actually look for them in the repl.

My main point is not to have cellarea docs twice, I don't want too much duplicate text to maintain.

@asinghvi17
Copy link
Collaborator Author

asinghvi17 commented Oct 13, 2024

Even extended docs are a little button you can click on online, I don't see an alternative to tutorials unfortunately. These also serve very different uses since the tutorials are to introduce someone to a function and what you can do with it, REPL docs are to remind you of what the function itself does and how to call it

@rafaqz
Copy link
Owner

rafaqz commented Oct 13, 2024

rasterio also just uses the api docs for functions like this
https://rasterio.readthedocs.io/en/stable/cli.html#rasterize

And their tutorials are reserved for general things (though I'm not saying their docs are great)

From my perspective, contributors here and for DD have come and gone, and in some years there is a good chance I will have to maintain these docs. Minimizing that task across what are now very many packages is becoming central to maintaining my sanity.

If you make tutorials for some Rasters.jl methods, at some point we are committing to making and maintaining them for all methods.

I really want to try putting these tutorials in the regular API docs, and changing that when they are really too big and we are committed to fleshing out the docs for every other method here.

(possibly some more tutorial style stuff can go in the geocompx book too?)

@rafaqz
Copy link
Owner

rafaqz commented Oct 13, 2024

Currently we have methods listed here:
https://rafaqz.github.io/Rasters.jl/stable/methods

They could be better organised in a menu somehow. But to me just improving those is better than adding additional versions of them all. Like cellarea isn't even listed there yet.

@asinghvi17
Copy link
Collaborator Author

asinghvi17 commented Oct 13, 2024

My issue with putting these in docstrings is twofold:

  • They don't run, so they can't be used as additional sanity checks. Output may change over time and the REPL examples in the docstrings would stay static. This is another source of maintenance burden.

  • The kind of examples I want to show (which are, fundamentally, full-workflow examples) are unsuitable to REPL examples because they aren't visual enough. In the REPL, I have no way to show a raster's values without completely overwhelming the user with information. A plot, on the other hand, is simple and immediately understandable.

    Looking at the rasterio docs you sent - when will I ever actually run those commands? If I run them, how do I see the output? Those docs aren't the best, and a user will get disheartened seeing that they have to do things. For that matter, when will a user ever run the example commands in the REPL docs, if they don't know they are there? This is the missing source of information I want to correct here. People don't fundamentally care unless things are immediately in front of them, so they can see what it does and see the benefit.

That being said, I can see your point about implicit promises and maintaining stuff. How about this:

  • Move the cellarea tutorial to a new "Estimating spatial means" tutorial (and keeping the tutorial structure)
  • Move some of the explanatory text here to the cellarea docs

@asinghvi17
Copy link
Collaborator Author

If the goal of the geocompx books is to replicate the R and Python versions directly then I'm not sure if we can sneak these examples in there, though there will definitely be more examples of these kinds of functions.

@rafaqz
Copy link
Owner

rafaqz commented Oct 13, 2024

The Rasters API docs already have examples with plots! And they do actually run, as doctests :)

I like the split - estimating spatial means is a complete workflow task, perfect for a tutorial

It can link to the cellarea API docs for method details

@rafaqz
Copy link
Owner

rafaqz commented Oct 13, 2024

I actually think all of the current API docs can be better and longer with more explanation and examples, I just haven't had time


But fundamentally, this is all that `zonal` is doing under the hood -
masking and cropping the raster to the geometry, and then computing the statistic.
=#
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from experience (a painful one), I can tell you that for long explanations using literate and .jl files is not ideal. I did that for a while in the yax docs, and eventually I went back to just markdown. For small scripts that just run is fine, like in beautifulmakie, but once you start doing bigger things I would say that markdown should be the way to go.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, why not? It's worked pretty well for me so far, but those were all in situations where my markdown content doesn't exceed ~150 lines. Curious what your issues were...

Copy link
Collaborator Author

@asinghvi17 asinghvi17 Oct 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do at some point want to integrate a Quarto like system into most of my docs, since that seems to be the best at this stuff.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @lazarusA, Literate.jl adds a layer of complexity that's just not worth it in some cases.

I would prefer not to have it here

# To construct a raster, we'll need to specify the x and y dimensions. These are called "lookups" in Rasters.jl.
using Rasters.Lookups
# We can now construct the x and y lookups. Here we'll use a start-at-one, step-by-five grid.
# Note that we're specifying that the "sampling", i.e., what the coordinates actually mean,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Note that we're specifying that the "sampling", i.e., what the coordinates actually mean,
# Note that we're specifying the "sampling", i.e., what the coordinates actually mean,

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might take this section out, not sure yet.

# Now, let's compute the average precipitation per square meter across Chile.
# First, we need to get the area of each cell in square meters. We'll use the spherical method, since we're working with a geographic coordinate system. This is the default.
areas = cellarea(masked_precip)
masked_areas = mask(areas; with = chile)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have to mask again?

Copy link
Collaborator Author

@asinghvi17 asinghvi17 Oct 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#797, but this should be fixed once the cf branch is merged.

@asinghvi17
Copy link
Collaborator Author

asinghvi17 commented Oct 15, 2024

@rafaqz now I remember why I was against jldoctest blocks - they don't show up in the REPL at all! So they're fundamentally useless for in-REPL documentation. Unless there's some module level switch to change this (but I don't think there is...)

@asinghvi17
Copy link
Collaborator Author

Ah I guess we're missing the julia> prompt (that's a super weird thing to require IMO)

KristofferC/OhMyREPL.jl#293 (comment)

@rafaqz
Copy link
Owner

rafaqz commented Oct 15, 2024

Yeah for me e.g. rasterize has a full code example

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants