-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
User-friendly (ish) interface for metadata extraction #39
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some small suggested changes but looks good.
Guess big question I'm interested in is why you wanted the type conversion to be explicit instead of leaving it up to R to convert (which generally works fairly well). Just thinking ahead to other language implementations of this where the automatic conversion might not be possible or the same?
R/outpack_tools.R
Outdated
##' but you can always do this yourself. | ||
##' | ||
##' In order to use function you need to know what metadata are | ||
##' available; we will expand a the vignette with more worked examples |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
##' available; we will expand a the vignette with more worked examples | |
##' available; we will expand the vignette with more worked examples |
R/outpack_tools.R
Outdated
##' (per packet), each with columns `packet` (id), `query` (string, | ||
##' used to find `packet`) and `files` (another `data.frame` with | ||
##' columns `there` and `here` corresponding to filenames upstream | ||
##' and in this packet, repsectively) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
##' and in this packet, repsectively) | |
##' and in this packet, respectively) |
##' You must not provide `id`; it is always returned and always first | ||
##' as a character vector column. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this error if you provide id
? Wonder if that should just do nothing if users specify it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
currently, yes it errored. I wondered about being nice and just skipping over it but decided that every time I try and be helpful it creates complicated things to maintain later, so just threw instead. We can always walk that back later
R/outpack_tools.R
Outdated
##' Note that this does not do any coersion to number, it will error | ||
##' if a non-NULL non-numeric value is found. Valid types for use | ||
##' with `is <type>` are `boolean`, `number` and `string` (note that | ||
##' these differ slightly from R's names because we want to emphasise | ||
##' that these are *scalar* quantities; also note that there is no | ||
##' `integer` here as this may produce unexpected errors with | ||
##' integer-like numeric values). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And list
?
R/outpack_tools.R
Outdated
for (i in seq_len(nrow(extract))) { | ||
from_i <- extract$from[[i]] | ||
is_i <- extract$is[[i]] | ||
value_i <- lapply(meta, function(x) x[[from_i]]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did not know you could index into a list like that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, total madness. This was the bit that I meant when I said "bits that you'd think are hard end up easy"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this did remind me though that I wanted to prevent the opaque error you get when this fails (done now)
I don't think this works well in practice (thinking here of why we avoid |
This has been requested in a few guises over the years, and I think it will perhaps form the basis of other tooling too.
Extract some fields from the metadata of a bunch of packets. The implementation is fairly rich, and is described in some detail in the docs (which I used to sketch out the interface before implementation). I think this strikes a reasonable balance of usability, generalisability and extensibility, but very open to suggestions.
The query syntax is deliberately not-very R-ish as we might want to adopt this into the rust tool in some guise?
This is a big PR, but quite a lot of it is docs, error handling and tests. The docs (in the Rd file) is probably the best place to start