Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix by_group processing for factors. #135

Merged
merged 7 commits into from
Feb 23, 2024
Merged

Fix by_group processing for factors. #135

merged 7 commits into from
Feb 23, 2024

Conversation

topepo
Copy link
Member

@topepo topepo commented Feb 21, 2024

Closes #127

purrr::transpose() was converting factors to integers.

@topepo topepo marked this pull request as ready for review February 22, 2024 15:16
@topepo topepo removed the request for review from edgararuiz February 22, 2024 18:37
@simonpcouch
Copy link
Contributor

Previously, split_dplyr_groups() was able to handle multiple grouping variables:

library(modeldata)
library(dplyr)

penguins_grouped2 <- penguins %>% group_by(species, island)

probably:::split_dplyr_groups(penguins_grouped2)
#> [[1]]
#> [[1]]$data
#> # A tibble: 0 × 7
#> # ℹ 7 variables: species <fct>, island <fct>, bill_length_mm <dbl>,
#> #   bill_depth_mm <dbl>, flipper_length_mm <int>, body_mass_g <int>, sex <fct>
#> 
#> [[1]]$filter
#> species == 1L & island == 1L
#> 
#> [[1]]$rows
#> [1] 0
#> 
#> 
#> [[2]]
#> [[2]]$data
#> # A tibble: 0 × 7
#> # ℹ 7 variables: species <fct>, island <fct>, bill_length_mm <dbl>,
#> #   bill_depth_mm <dbl>, flipper_length_mm <int>, body_mass_g <int>, sex <fct>
#> 
#> [[2]]$filter
#> species == 1L & island == 2L
#> 
#> [[2]]$rows
#> [1] 0
#> 
#> 
#> [[3]]
#> [[3]]$data
#> # A tibble: 0 × 7
#> # ℹ 7 variables: species <fct>, island <fct>, bill_length_mm <dbl>,
#> #   bill_depth_mm <dbl>, flipper_length_mm <int>, body_mass_g <int>, sex <fct>
#> 
#> [[3]]$filter
#> species == 1L & island == 3L
#> 
#> [[3]]$rows
#> [1] 0
#> 
#> 
#> [[4]]
#> [[4]]$data
#> # A tibble: 0 × 7
#> # ℹ 7 variables: species <fct>, island <fct>, bill_length_mm <dbl>,
#> #   bill_depth_mm <dbl>, flipper_length_mm <int>, body_mass_g <int>, sex <fct>
#> 
#> [[4]]$filter
#> species == 2L & island == 2L
#> 
#> [[4]]$rows
#> [1] 0
#> 
#> 
#> [[5]]
#> [[5]]$data
#> # A tibble: 0 × 7
#> # ℹ 7 variables: species <fct>, island <fct>, bill_length_mm <dbl>,
#> #   bill_depth_mm <dbl>, flipper_length_mm <int>, body_mass_g <int>, sex <fct>
#> 
#> [[5]]$filter
#> species == 3L & island == 1L
#> 
#> [[5]]$rows
#> [1] 0

Created on 2024-02-22 with reprex v2.1.0

With this PR, it cannot:

library(modeldata)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

penguins_grouped2 <- penguins %>% group_by(species, island)

probably:::split_dplyr_groups(penguins_grouped2)
#> Error in `rlang::sym()`:
#> ! Can't convert a character vector to a symbol.

Created on 2024-02-22 with reprex v2.1.0

It doesn't like any tests fail as a result of this, but the comments on the function and the reduce() call seem to indicate it ought to, so I'll assume it should.

R/cal-estimate-utils.R Outdated Show resolved Hide resolved
R/cal-estimate-utils.R Show resolved Hide resolved
R/cal-estimate-utils.R Outdated Show resolved Hide resolved
R/cal-estimate-utils.R Show resolved Hide resolved
@topepo
Copy link
Member Author

topepo commented Feb 22, 2024

Thanks. We'll need to put some better error checking in. It is designed to support a single grouping variable. The docs for .by has

The column identifier for the grouping variable. This should be a single unquoted column name that selects a qualitative variable for grouping. Default to NULL. When .by = NULL no grouping will take place.

@topepo topepo merged commit a5bb2c4 into main Feb 23, 2024
8 checks passed
@topepo topepo deleted the factor-groups-127 branch February 23, 2024 01:08
jrwinget added a commit to jrwinget/probably that referenced this pull request Aug 15, 2024
This commit removes the unused helper function create_filter_expr from
probably/R/cal-estimate-utils.R. The function became obsolete after a
partial revert in tidymodels#135.

Fixes tidymodels#139
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cal_estimate_*() with factor variable passed to .by fails
2 participants