MINOR: [R][Docs] Improve error message around add_filename (#37372)
### Rationale for this change

Before this change, users could easily get stuck when they ran into an error while trying to use the result of add_filename in subsequent pipeline steps.

### What changes are included in this PR?

- Updated the error message string to include advice
- Updated the docs page for add_filename to include an example
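
As a rough illustration of the advice in the updated error message, the sketch below materializes the query with `collect()` before referring to the filename column again. The temporary directory, the toy data frame, and the column names `file` and `filename_length` are assumptions made for this example and are not part of the change itself.

```r
library(arrow)
library(dplyr)

# Write a tiny throwaway dataset so the example is self-contained.
# (Hypothetical data; any FileSystemDataset would do.)
dataset_dir <- tempfile("add_filename_demo")
write_dataset(data.frame(x = 1:3), dataset_dir, format = "parquet")

result <- open_dataset(dataset_dir) %>%
  mutate(file = add_filename()) %>%
  # Materialize the query as a tibble; from here on `file` is an
  # ordinary character column that later steps can reference.
  collect() %>%
  mutate(filename_length = nchar(file))

print(result)

# Clean up the temporary dataset.
unlink(dataset_dir, recursive = TRUE)
```

Calling `compute()` instead of `collect()` keeps the data in Arrow memory rather than converting it to a tibble, which may be preferable for larger datasets.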

### Are these changes tested?

Yes. Tests were updated and confirmed to pass.

### Are there any user-facing changes?

No.

Authored-by: Bryce Mecum <[email protected]>
Signed-off-by: Dewey Dunnington <[email protected]>
amoeba authored Aug 25, 2023
1 parent c079dac commit fa0af70
Showing 4 changed files with 43 additions and 12 deletions.
24 changes: 19 additions & 5 deletions r/R/dplyr-funcs-augmented.R
@@ -20,13 +20,27 @@
 #' This function only exists inside `arrow` `dplyr` queries, and it only is
 #' valid when quering on a `FileSystemDataset`.
 #'
-#' @return A `FieldRef` `Expression` that refers to the filename augmented
-#' column.
-#' @examples
-#' \dontrun{
+#' To use filenames generated by this function in subsequent pipeline steps, you
+#' must either call \code{\link[dplyr:compute]{compute()}} or
+#' \code{\link[dplyr:collect]{collect()}} first. See Examples.
+#'
+#' @return A `FieldRef` \code{\link{Expression}} that refers to the filename
+#' augmented column.
+#'
+#' @examples \dontrun{
+#' open_dataset("nyc-taxi") %>% mutate(
+#'   file =
+#'     add_filename()
+#' )
+#'
+#' # To use a verb like mutate() with add_filename() we need to first call
+#' # compute()
 #' open_dataset("nyc-taxi") %>%
-#'   mutate(file = add_filename())
+#'   mutate(file = add_filename()) %>%
+#'   compute() %>%
+#'   mutate(filename_length = nchar(file))
 #' }
+#'
 #' @keywords internal
 add_filename <- function() Expression$field_ref("__filename")

5 changes: 3 additions & 2 deletions r/R/util.R
@@ -223,8 +223,9 @@ handle_augmented_field_misuse <- function(msg, call) {
       msg,
       i = paste(
         "`add_filename()` or use of the `__filename` augmented field can only",
-        "be used with with Dataset objects, and can only be added before doing",
-        "an aggregation or a join."
+        "be used with Dataset objects, can only be added before doing",
+        "an aggregation or a join, and cannot be referenced in subsequent",
+        "pipeline steps until either compute() or collect() is called."
       )
     )
     abort(msg, call = call)
21 changes: 18 additions & 3 deletions r/man/add_filename.Rd

(Generated file; diff not rendered.)

5 changes: 3 additions & 2 deletions r/tests/testthat/test-dataset.R
@@ -1440,8 +1440,9 @@ test_that("can add in augmented fields", {

   error_regex <- paste(
     "`add_filename()` or use of the `__filename` augmented field can only",
-    "be used with with Dataset objects, and can only be added before doing",
-    "an aggregation or a join."
+    "be used with Dataset objects, can only be added before doing",
+    "an aggregation or a join, and cannot be referenced in subsequent",
+    "pipeline steps until either compute() or collect() is called."
   )

   # errors appropriately with ArrowTabular objects