Allowing .diagonal argument to vary in colpair_map #176

Open
luifrancgom opened this issue Feb 27, 2024 · 0 comments

luifrancgom commented Feb 27, 2024

I know colpair_map is a solution related to #42. For cor, colpair_map gives the following:

library(corrr)
colpair_map(.data = mtcars, .f = cor, .diagonal = NA)
#> # A tibble: 11 × 12
#>    term     mpg    cyl   disp     hp    drat     wt    qsec     vs      am
#>    <chr>  <dbl>  <dbl>  <dbl>  <dbl>   <dbl>  <dbl>   <dbl>  <dbl>   <dbl>
#>  1 mpg   NA     -0.852 -0.848 -0.776  0.681  -0.868  0.419   0.664  0.600 
#>  2 cyl   -0.852 NA      0.902  0.832 -0.700   0.782 -0.591  -0.811 -0.523 
#>  3 disp  -0.848  0.902 NA      0.791 -0.710   0.888 -0.434  -0.710 -0.591 
#>  4 hp    -0.776  0.832  0.791 NA     -0.449   0.659 -0.708  -0.723 -0.243 
#>  5 drat   0.681 -0.700 -0.710 -0.449 NA      -0.712  0.0912  0.440  0.713 
#>  6 wt    -0.868  0.782  0.888  0.659 -0.712  NA     -0.175  -0.555 -0.692 
#>  7 qsec   0.419 -0.591 -0.434 -0.708  0.0912 -0.175 NA       0.745 -0.230 
#>  8 vs     0.664 -0.811 -0.710 -0.723  0.440  -0.555  0.745  NA      0.168 
#>  9 am     0.600 -0.523 -0.591 -0.243  0.713  -0.692 -0.230   0.168 NA     
#> 10 gear   0.480 -0.493 -0.556 -0.126  0.700  -0.583 -0.213   0.206  0.794 
#> 11 carb  -0.551  0.527  0.395  0.750 -0.0908  0.428 -0.656  -0.570  0.0575
#> # ℹ 2 more variables: gear <dbl>, carb <dbl>
colpair_map(.data = mtcars, .f = cor, .diagonal = 1)
#> # A tibble: 11 × 12
#>    term     mpg    cyl   disp     hp    drat     wt    qsec     vs      am
#>    <chr>  <dbl>  <dbl>  <dbl>  <dbl>   <dbl>  <dbl>   <dbl>  <dbl>   <dbl>
#>  1 mpg    1     -0.852 -0.848 -0.776  0.681  -0.868  0.419   0.664  0.600 
#>  2 cyl   -0.852  1      0.902  0.832 -0.700   0.782 -0.591  -0.811 -0.523 
#>  3 disp  -0.848  0.902  1      0.791 -0.710   0.888 -0.434  -0.710 -0.591 
#>  4 hp    -0.776  0.832  0.791  1     -0.449   0.659 -0.708  -0.723 -0.243 
#>  5 drat   0.681 -0.700 -0.710 -0.449  1      -0.712  0.0912  0.440  0.713 
#>  6 wt    -0.868  0.782  0.888  0.659 -0.712   1     -0.175  -0.555 -0.692 
#>  7 qsec   0.419 -0.591 -0.434 -0.708  0.0912 -0.175  1       0.745 -0.230 
#>  8 vs     0.664 -0.811 -0.710 -0.723  0.440  -0.555  0.745   1      0.168 
#>  9 am     0.600 -0.523 -0.591 -0.243  0.713  -0.692 -0.230   0.168  1     
#> 10 gear   0.480 -0.493 -0.556 -0.126  0.700  -0.583 -0.213   0.206  0.794 
#> 11 carb  -0.551  0.527  0.395  0.750 -0.0908  0.428 -0.656  -0.570  0.0575
#> # ℹ 2 more variables: gear <dbl>, carb <dbl>

Created on 2024-02-27 with reprex v2.1.0

While this approach makes sense for the sample correlation, it's not suitable for the sample covariance (cov).

library(corrr)
colpair_map(.data = mtcars, .f = cov, .diagonal = NA)
#> # A tibble: 11 × 12
#>    term      mpg     cyl   disp      hp     drat      wt     qsec       vs
#>    <chr>   <dbl>   <dbl>  <dbl>   <dbl>    <dbl>   <dbl>    <dbl>    <dbl>
#>  1 mpg     NA     -9.17  -633.  -321.     2.20    -5.12    4.51     2.02  
#>  2 cyl     -9.17  NA      200.   102.    -0.668    1.37   -1.89    -0.730 
#>  3 disp  -633.   200.      NA   6721.   -47.1    108.    -96.1    -44.4   
#>  4 hp    -321.   102.    6721.    NA    -16.5     44.2   -86.8    -25.0   
#>  5 drat     2.20  -0.668  -47.1  -16.5   NA       -0.373   0.0871   0.119 
#>  6 wt      -5.12   1.37   108.    44.2   -0.373   NA      -0.305   -0.274 
#>  7 qsec     4.51  -1.89   -96.1  -86.8    0.0871  -0.305  NA        0.671 
#>  8 vs       2.02  -0.730  -44.4  -25.0    0.119   -0.274   0.671   NA     
#>  9 am       1.80  -0.466  -36.6   -8.32   0.190   -0.338  -0.205    0.0423
#> 10 gear     2.14  -0.649  -50.8   -6.36   0.276   -0.421  -0.280    0.0766
#> 11 carb    -5.36   1.52    79.1   83.0   -0.0784   0.676  -1.89    -0.464 
#> # ℹ 3 more variables: am <dbl>, gear <dbl>, carb <dbl>

Created on 2024-02-27 with reprex v2.1.0

Having NA or a constant value on the diagonal is not ideal because the desired information here is the sample variance, which is specific to each variable. Instead, users expect a complete sample covariance matrix, similar to the output obtained using:

cov(mtcars)
#>              mpg         cyl        disp          hp         drat          wt
#> mpg    36.324103  -9.1723790  -633.09721 -320.732056   2.19506351  -5.1166847
#> cyl    -9.172379   3.1895161   199.66028  101.931452  -0.66836694   1.3673710
#> disp -633.097208 199.6602823 15360.79983 6721.158669 -47.06401915 107.6842040
#> hp   -320.732056 101.9314516  6721.15867 4700.866935 -16.45110887  44.1926613
#> drat    2.195064  -0.6683669   -47.06402  -16.451109   0.28588135  -0.3727207
#> wt     -5.116685   1.3673710   107.68420   44.192661  -0.37272073   0.9573790
#> qsec    4.509149  -1.8868548   -96.05168  -86.770081   0.08714073  -0.3054816
#> vs      2.017137  -0.7298387   -44.37762  -24.987903   0.11864919  -0.2736613
#> am      1.803931  -0.4657258   -36.56401   -8.320565   0.19015121  -0.3381048
#> gear    2.135685  -0.6491935   -50.80262   -6.358871   0.27598790  -0.4210806
#> carb   -5.363105   1.5201613    79.06875   83.036290  -0.07840726   0.6757903
#>              qsec           vs           am        gear        carb
#> mpg    4.50914919   2.01713710   1.80393145   2.1356855 -5.36310484
#> cyl   -1.88685484  -0.72983871  -0.46572581  -0.6491935  1.52016129
#> disp -96.05168145 -44.37762097 -36.56401210 -50.8026210 79.06875000
#> hp   -86.77008065 -24.98790323  -8.32056452  -6.3588710 83.03629032
#> drat   0.08714073   0.11864919   0.19015121   0.2759879 -0.07840726
#> wt    -0.30548161  -0.27366129  -0.33810484  -0.4210806  0.67579032
#> qsec   3.19316613   0.67056452  -0.20495968  -0.2804032 -1.89411290
#> vs     0.67056452   0.25403226   0.04233871   0.0766129 -0.46370968
#> am    -0.20495968   0.04233871   0.24899194   0.2923387  0.04637097
#> gear  -0.28040323   0.07661290   0.29233871   0.5443548  0.32661290
#> carb  -1.89411290  -0.46370968   0.04637097   0.3266129  2.60887097

Created on 2024-02-27 with reprex v2.1.0

Would it be possible to display the diagonal elements of the covariance matrix by adding an option to the .diagonal argument?
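In the meantime, a possible workaround is to fill the diagonal after the fact. The sketch below is only an illustration of the behavior requested here, not part of corrr: `colpair_map_diag` is a hypothetical helper name, and it assumes that applying `.f` to a column paired with itself gives the desired diagonal value (true for `cov`, where `cov(x, x)` equals `var(x)`).

```r
library(corrr)

# Hypothetical helper (not part of corrr): run colpair_map() as usual,
# then overwrite each diagonal cell with .f applied to the column and itself.
colpair_map_diag <- function(.data, .f, ...) {
  res <- colpair_map(.data = .data, .f = .f, ..., .diagonal = NA)
  for (v in res$term) {
    # The row whose term equals the column name is the diagonal cell.
    res[[v]][res$term == v] <- .f(.data[[v]], .data[[v]], ...)
  }
  res
}

cov_tbl <- colpair_map_diag(mtcars, cov)
cov_tbl
```

With `cov`, the diagonal of `cov_tbl` should then hold the per-variable sample variances, matching `diag(cov(mtcars))`.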
