Error when running step_bsmote with a single predictor #151

Open
koenniem opened this issue Aug 14, 2024 · 1 comment
Labels
bug an unexpected problem or unintended behavior

Comments

@koenniem

The problem

When running step_bsmote() with only a single predictor, the function throws an error saying that a matrix cannot be created. The cause is in themis:::bsmote_impl() at line 19: the data argument for smote_data() is obtained by subsetting data_mat with min_class_in, but because the [ operator drops dimensions by default (drop = TRUE), a one-column matrix is simplified to a plain vector. As a result, running step_bsmote() with a single predictor always throws this error.

The fix is to specify drop = FALSE when subsetting data_mat, so that line 19 becomes:

tmp_df <- as.data.frame(
  smote_data(
    data = data_mat[min_class_in, , drop = FALSE],
    k = k,
    n_samples = samples_needed[i],
    smote_ids = which(danger_ids[min_class_in])
  )
)
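
The dimension-dropping behaviour behind this can be seen in base R alone. The sketch below uses made-up stand-ins for data_mat and min_class_in (the values are illustrative and not taken from themis) and only shows how [ behaves on a one-column matrix; it does not call any themis internals.

```r
# Hypothetical stand-ins for the objects named above; values are made up.
data_mat <- matrix(
  c(1.5, 2.0, 3.5, 4.0),
  ncol = 1,
  dimnames = list(NULL, "compounds")
)
min_class_in <- c(TRUE, FALSE, TRUE, TRUE)

# Default subsetting (drop = TRUE) simplifies the result to a numeric vector.
dropped <- data_mat[min_class_in, ]

# drop = FALSE keeps the result as a one-column matrix.
kept <- data_mat[min_class_in, , drop = FALSE]

is.matrix(dropped) # FALSE
ncol(dropped)      # NULL, because a plain vector has no dim attribute
dim(kept)          # 3 1
colnames(kept)     # "compounds"
```

Any downstream code that relies on ncol() or dim() of the subset receives NULL in the dropped case, which is consistent with the matrix() error reported here, assuming smote_data() sizes its output from those dimensions.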

Reproducible example

library(tidymodels)
library(themis)

recipe(class ~ compounds, data = hpc_data) |> 
  step_bsmote(all_outcomes(), all_neighbors = FALSE) |> 
  prep() |> 
  bake(NULL)
#> Error in `step_bsmote()`:
#> Caused by error in `matrix()`:
#> ! non-numeric matrix extent

Created on 2024-08-14 with reprex v2.1.1

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.4.1 (2024-06-14 ucrt)
#>  os       Windows 10 x64 (build 19045)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  English_United Kingdom.utf8
#>  ctype    English_United Kingdom.utf8
#>  tz       Europe/Brussels
#>  date     2024-08-14
#>  pandoc   3.1.11 @ C:/Workdir/MyApps/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package      * version    date (UTC) lib source
#>  backports      1.5.0      2024-05-23 [1] CRAN (R 4.4.0)
#>  broom        * 1.0.6      2024-05-17 [1] CRAN (R 4.4.0)
#>  class          7.3-22     2023-05-03 [1] CRAN (R 4.3.0)
#>  cli            3.6.3      2024-06-21 [1] CRAN (R 4.4.1)
#>  codetools      0.2-20     2024-03-31 [1] CRAN (R 4.3.3)
#>  colorspace     2.1-0      2023-01-23 [1] CRAN (R 4.2.2)
#>  data.table     1.15.4     2024-03-30 [1] CRAN (R 4.3.3)
#>  dials        * 1.2.1      2024-02-22 [1] CRAN (R 4.3.2)
#>  DiceDesign     1.10       2023-12-07 [1] CRAN (R 4.3.2)
#>  digest         0.6.36     2024-06-23 [1] CRAN (R 4.4.1)
#>  dplyr        * 1.1.4      2023-11-17 [1] CRAN (R 4.3.2)
#>  evaluate       0.24.0     2024-06-10 [1] CRAN (R 4.4.1)
#>  fansi          1.0.6      2023-12-08 [1] CRAN (R 4.3.2)
#>  fastmap        1.2.0      2024-05-15 [1] CRAN (R 4.4.0)
#>  foreach        1.5.2      2022-02-02 [1] CRAN (R 4.1.3)
#>  fs             1.6.4      2024-04-25 [1] CRAN (R 4.4.0)
#>  furrr          0.3.1      2022-08-15 [1] CRAN (R 4.2.1)
#>  future         1.33.2     2024-03-26 [1] CRAN (R 4.3.3)
#>  future.apply   1.11.2     2024-03-28 [1] CRAN (R 4.3.3)
#>  generics       0.1.3      2022-07-05 [1] CRAN (R 4.2.1)
#>  ggplot2      * 3.5.1      2024-04-23 [1] CRAN (R 4.3.3)
#>  globals        0.16.3     2024-03-08 [1] CRAN (R 4.3.3)
#>  glue           1.7.0      2024-01-09 [1] CRAN (R 4.3.2)
#>  gower          1.0.1      2022-12-22 [1] CRAN (R 4.2.2)
#>  GPfit          1.0-8      2019-02-08 [1] CRAN (R 4.0.0)
#>  gtable         0.3.5      2024-04-22 [1] CRAN (R 4.3.3)
#>  hardhat        1.4.0      2024-06-02 [1] CRAN (R 4.4.1)
#>  htmltools      0.5.8.1    2024-04-04 [1] CRAN (R 4.3.3)
#>  infer        * 1.0.7      2024-03-25 [1] CRAN (R 4.3.3)
#>  ipred          0.9-15     2024-07-18 [1] CRAN (R 4.4.0)
#>  iterators      1.0.14     2022-02-05 [1] CRAN (R 4.1.3)
#>  knitr          1.48       2024-07-07 [1] CRAN (R 4.4.1)
#>  lattice        0.22-6     2024-03-20 [1] CRAN (R 4.3.3)
#>  lava           1.8.0      2024-03-05 [1] CRAN (R 4.3.3)
#>  lhs            1.2.0      2024-06-30 [1] CRAN (R 4.4.1)
#>  lifecycle      1.0.4      2023-11-07 [1] CRAN (R 4.3.1)
#>  listenv        0.9.1      2024-01-29 [1] CRAN (R 4.3.2)
#>  lubridate      1.9.3      2023-09-27 [1] CRAN (R 4.3.2)
#>  magrittr       2.0.3      2022-03-30 [1] CRAN (R 4.1.3)
#>  MASS           7.3-61     2024-06-13 [1] CRAN (R 4.4.1)
#>  Matrix         1.7-0      2024-03-22 [1] CRAN (R 4.4.0)
#>  modeldata    * 1.4.0      2024-06-19 [1] CRAN (R 4.4.1)
#>  munsell        0.5.1      2024-04-01 [1] CRAN (R 4.3.3)
#>  nnet           7.3-19     2023-05-03 [1] CRAN (R 4.3.0)
#>  parallelly     1.37.1     2024-02-29 [1] CRAN (R 4.3.2)
#>  parsnip      * 1.2.1      2024-03-22 [1] CRAN (R 4.3.3)
#>  pillar         1.9.0      2023-03-22 [1] CRAN (R 4.2.3)
#>  pkgconfig      2.0.3      2019-09-22 [1] CRAN (R 4.0.0)
#>  prodlim        2024.06.25 2024-06-24 [1] CRAN (R 4.4.1)
#>  purrr        * 1.0.2      2023-08-10 [1] CRAN (R 4.3.1)
#>  R6             2.5.1      2021-08-19 [1] CRAN (R 4.1.1)
#>  RANN           2.6.1      2019-01-08 [1] CRAN (R 4.0.0)
#>  Rcpp           1.0.13     2024-07-17 [1] CRAN (R 4.4.0)
#>  recipes      * 1.1.0      2024-07-04 [1] CRAN (R 4.4.1)
#>  reprex         2.1.1      2024-07-06 [1] CRAN (R 4.4.1)
#>  rlang          1.1.4      2024-06-04 [1] CRAN (R 4.4.1)
#>  rmarkdown      2.27       2024-05-17 [1] CRAN (R 4.4.0)
#>  ROSE           0.0-4      2021-06-14 [1] CRAN (R 4.3.3)
#>  rpart          4.1.23     2023-12-05 [1] CRAN (R 4.3.2)
#>  rsample      * 1.2.1      2024-03-25 [1] CRAN (R 4.3.3)
#>  rstudioapi     0.16.0     2024-03-24 [1] CRAN (R 4.3.3)
#>  scales       * 1.3.0      2023-11-28 [1] CRAN (R 4.3.2)
#>  sessioninfo    1.2.2      2021-12-06 [1] CRAN (R 4.1.2)
#>  survival       3.7-0      2024-06-05 [1] CRAN (R 4.4.1)
#>  themis       * 1.0.2      2023-08-14 [1] CRAN (R 4.3.3)
#>  tibble       * 3.2.1      2023-03-20 [1] CRAN (R 4.2.3)
#>  tidymodels   * 1.2.0      2024-03-25 [1] CRAN (R 4.3.3)
#>  tidyr        * 1.3.1      2024-01-24 [1] CRAN (R 4.3.2)
#>  tidyselect     1.2.1      2024-03-11 [1] CRAN (R 4.3.3)
#>  timechange     0.3.0      2024-01-18 [1] CRAN (R 4.3.2)
#>  timeDate       4032.109   2023-12-14 [1] CRAN (R 4.3.2)
#>  tune         * 1.2.1      2024-04-18 [1] CRAN (R 4.3.3)
#>  utf8           1.2.4      2023-10-22 [1] CRAN (R 4.3.2)
#>  vctrs          0.6.5      2023-12-01 [1] CRAN (R 4.3.2)
#>  withr          3.0.0      2024-01-16 [1] CRAN (R 4.3.2)
#>  workflows    * 1.1.4      2024-02-19 [1] CRAN (R 4.3.2)
#>  workflowsets * 1.1.0      2024-03-21 [1] CRAN (R 4.3.3)
#>  xfun           0.46       2024-07-18 [1] CRAN (R 4.4.0)
#>  yaml           2.3.9      2024-07-05 [1] CRAN (R 4.4.1)
#>  yardstick    * 1.3.1      2024-03-21 [1] CRAN (R 4.3.3)
#> 
#>  [1] C:/Workdir/MyApps/R-Library/4.0
#>  [2] C:/Workdir/MyApps/R/R-4.4.1/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────
EmilHvitfeldt added the bug label Aug 15, 2024
@EmilHvitfeldt
Member

Thank you for reporting!
