From 8fd2775eb64c4fe3b495ecac042b3bee9aa4669b Mon Sep 17 00:00:00 2001
From: Elin Waring
Date: Sat, 26 Oct 2019 06:38:42 -0400
Subject: [PATCH] Update README.md

---
 README.md                   | 343 ++++++++++++++++++------------------
 inst/other_docs/blog_v2.Rmd |   2 +-
 2 files changed, 177 insertions(+), 168 deletions(-)

diff --git a/README.md b/README.md
index 6f71a541..b76f2eff 100644
--- a/README.md
+++ b/README.md
@@ -9,6 +9,15 @@ Status](https://travis-ci.org/ropensci/skimr.svg?branch=master)](https://travis-
 [![cran
 checks](https://cranchecks.info/badges/summary/skimr)](https://cranchecks.info/pkgs/skimr)
 
+This is a release candidate for skimr version 2.
+------------------------------------------------
+
+Warning: There are important differences between this and version 1.
+--------------------------------------------------------------------
+
+Use caution if updating a version 1 instance used programmatically.
+-------------------------------------------------------------------
+
 `skimr` provides a frictionless approach to summary statistics which
 conforms to the [principle of least
 surprise](https://en.wikipedia.org/wiki/Principle_of_least_astonishment),
@@ -47,7 +56,7 @@ Skim statistics in the console
 missing, complete, n, and sd.
 - reports each data types separately
 - handles dates, logicals, and a variety of other types
-- supports spark-bar and spark-line based on [Hadley Wickham’s pillar
+- supports spark-bar and spark-line based on [Hadley Wickham's pillar
 package](https://github.com/hadley/pillar).
 ### Separates variables by class:
@@ -66,11 +75,11 @@ Skim statistics in the console
     ## ________________________
     ## Group variables None
     ##
-    ## ── Variable type: factor ───────────────────────────────────────────────────────────────────────────────────────────────
+    ## ── Variable type: factor ────────────────────────────────────────────────────────────────
     ## skim_variable n_missing complete_rate ordered n_unique top_counts
     ## 1 feed 0 1 FALSE 6 soy: 14, cas: 12, lin: 12, sun: 12
     ##
-    ## ── Variable type: numeric ──────────────────────────────────────────────────────────────────────────────────────────────
+    ## ── Variable type: numeric ───────────────────────────────────────────────────────────────
     ## skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
     ## 1 weight 0 1 261. 78.1 108 204. 258 324. 423 ▆▆▇▇▃
@@ -90,11 +99,11 @@ Skim statistics in the console
     ## ________________________
     ## Group variables None
     ##
-    ## ── Variable type: factor ───────────────────────────────────────────────────────────────────────────────────────────────
+    ## ── Variable type: factor ────────────────────────────────────────────────────────────────
     ## skim_variable n_missing complete_rate ordered n_unique top_counts
     ## 1 Species 0 1 FALSE 3 set: 50, ver: 50, vir: 50
     ##
-    ## ── Variable type: numeric ──────────────────────────────────────────────────────────────────────────────────────────────
+    ## ── Variable type: numeric ───────────────────────────────────────────────────────────────
     ## skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
     ## 1 Sepal.Length 0 1 5.84 0.828 4.3 5.1 5.8 6.4 7.9 ▆▇▇▅▂
     ## 2 Sepal.Width 0 1 3.06 0.436 2 2.8 3 3.3 4.4 ▁▆▇▂▁
@@ -118,7 +127,7 @@ Skim statistics in the console
     ## ________________________
     ## Group variables None
     ##
-    ## ── Variable type: character ────────────────────────────────────────────────────────────────────────────────────────────
+    ## ── Variable type: character ─────────────────────────────────────────────────────────────
     ## skim_variable n_missing complete_rate min max empty n_unique whitespace
     ## 1 name 0 1 3 21 0 87 0
     ## 2 hair_color 5 0.943 4 13 0 12 0
@@ -128,13 +137,13 @@ Skim statistics in the console
     ## 6 homeworld 10 0.885 4 14 0 48 0
     ## 7 species 5 0.943 3 14 0 37 0
     ##
-    ## ── Variable type: list ─────────────────────────────────────────────────────────────────────────────────────────────────
+    ## ── Variable type: list ──────────────────────────────────────────────────────────────────
     ## skim_variable n_missing complete_rate n_unique min_length max_length
     ## 1 films 0 1 24 1 7
     ## 2 vehicles 0 1 11 0 2
     ## 3 starships 0 1 17 0 5
     ##
-    ## ── Variable type: numeric ──────────────────────────────────────────────────────────────────────────────────────────────
+    ## ── Variable type: numeric ───────────────────────────────────────────────────────────────
     ## skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
     ## 1 height 6 0.931 174. 34.8 66 167 180 191 264 ▁▁▇▅▁
     ## 2 mass 28 0.678 97.3 169. 15 55.6 79 84.5 1358 ▇▁▁▁▁
@@ -172,7 +181,7 @@ Skim statistics in the console
     ## ________________________
     ## Group variables None
     ##
-    ## ── Variable type: numeric ──────────────────────────────────────────────────────────────────────────────────────────────
+    ## ── Variable type: numeric ───────────────────────────────────────────────────────────────
     ## skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
     ## 1 Sepal.Length 0 1 5.84 0.828 4.3 5.1 5.8 6.4 7.9 ▆▇▇▅▂
     ## 2 Petal.Length 0 1 3.76 1.77 1 1.6 4.35 5.1 6.9 ▇▁▆▇▂
@@ -197,7 +206,7 @@ Skim statistics in the console
     ## ________________________
     ## Group variables Species
     ##
-    ## ── Variable type: numeric ──────────────────────────────────────────────────────────────────────────────────────────────
+    ## ── Variable type: numeric ───────────────────────────────────────────────────────────────
     ## skim_variable Species n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
     ## 1 Sepal.Length setosa 0 1 5.01 0.352 4.3 4.8 5 5.2 5.8 ▃▃▇▅▁
     ## 2 Sepal.Length versicolor 0 1 5.94 0.516 4.9 5.6 5.9 6.3 7 ▂▇▆▃▃
@@ -229,7 +238,7 @@ Skim statistics in the console
     ## ________________________
     ## Group variables None
     ##
-    ## ── Variable type: numeric ──────────────────────────────────────────────────────────────────────────────────────────────
+    ## ── Variable type: numeric ───────────────────────────────────────────────────────────────
     ## skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
     ## 1 Petal.Length 0 1 3.76 1.77 1 1.6 4.35 5.1 6.9 ▇▁▆▇▂
@@ -249,36 +258,36 @@ chunk.

Data summary

|                          |            |
|:-------------------------|:-----------|
| Name                     | Piped data |
| Number of rows           | 272        |
| Number of columns        | 2          |
| _______________________  |            |
| Column type frequency:   |            |
| numeric                  | 2          |
| ________________________ |            |
| Group variables          | None       |

@@ -288,45 +297,45 @@ chunk.

| skim_variable | n_missing | complete_rate | mean  | sd    | p0   | p25   | p50 | p75   | p100 | hist  |
|:--------------|----------:|--------------:|------:|------:|-----:|------:|----:|------:|-----:|:------|
| eruptions     | 0         | 1             | 3.49  | 1.14  | 1.6  | 2.16  | 4   | 4.45  | 5.1  | ▇▂▂▇▇ |
| waiting       | 0         | 1             | 70.90 | 13.59 | 43.0 | 58.00 | 76  | 82.00 | 96.0 | ▃▃▂▇▂ |

@@ -361,36 +370,36 @@ skimmers.

Data summary

|                          |      |
|:-------------------------|:-----|
| Name                     | iris |
| Number of rows           | 150  |
| Number of columns        | 5    |
| _______________________  |      |
| Column type frequency:   |      |
| numeric                  | 1    |
| ________________________ |      |
| Group variables          | None |

@@ -400,41 +409,41 @@ skimmers.

| skim_variable | n_missing | complete_rate | mean | sd   | p0  | p25 | p50 | p75 | p100 | hist  | mad  |
|:--------------|----------:|--------------:|-----:|-----:|----:|----:|----:|----:|-----:|:------|-----:|
| Sepal.Length  | 0         | 1             | 5.84 | 0.83 | 4.3 | 5.1 | 5.8 | 6.4 | 7.9  | ▆▇▇▅▂ | 1.04 |

 But you can also use the dummy argument pattern from `dplyr::funs` to
 set particular function arguments. Setting the `append = FALSE` argument
-uses only those functions that you’ve provided.
+uses only those functions that you've provided.
 
     my_skim <- skim_with(
       numeric = sfl(iqr = IQR, p99 = ~ quantile(., probs = .99)),
       append = FALSE
@@ -445,36 +454,36 @@ uses only those functions that you’ve provided.

Data summary

|                          |      |
|:-------------------------|:-----|
| Name                     | iris |
| Number of rows           | 150  |
| Number of columns        | 5    |
| _______________________  |      |
| Column type frequency:   |      |
| numeric                  | 1    |
| ________________________ |      |
| Group variables          | None |

@@ -484,20 +493,20 @@ uses only those functions that you’ve provided.

| skim_variable | n_missing | complete_rate | iqr | p99 |
|:--------------|----------:|--------------:|----:|----:|
| Sepal.Length  | 0         | 1             | 1.3 | 7.7 |

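
The hunk boundary above cuts the `skim_with()` call short. A minimal, self-contained sketch of the same pattern follows, assuming skimr version 2 is installed. The final call `my_skim(iris, Sepal.Length)` is an assumption inferred from the one-row table; the diff does not show it.

```r
library(skimr)

# Build a custom skim function. With `append = FALSE`, the default numeric
# skimmers (mean, sd, quantiles, hist) are dropped and only `iqr` and `p99`
# are computed for numeric columns.
my_skim <- skim_with(
  numeric = sfl(iqr = IQR, p99 = ~ quantile(., probs = .99)),
  append = FALSE
)

# Assumption: a single column is skimmed, as the one-row table suggests.
res <- my_skim(iris, Sepal.Length)
res
```

In the returned data frame the custom statistics are stored in columns prefixed with the type name (`numeric.iqr`, `numeric.p99`); the printed and knitted tables drop the prefix.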
@@ -511,36 +520,36 @@ And you can default skimmers by setting them to `NULL`.

Data summary

|                          |      |
|:-------------------------|:-----|
| Name                     | iris |
| Number of rows           | 150  |
| Number of columns        | 5    |
| _______________________  |      |
| Column type frequency:   |      |
| numeric                  | 1    |
| ________________________ |      |
| Group variables          | None |

@@ -550,30 +559,30 @@ And you can default skimmers by setting them to `NULL`.

| skim_variable | n_missing | complete_rate | mean | sd   | p0  | p25 | p50 | p75 | p100 |
|:--------------|----------:|--------------:|-----:|-----:|----:|----:|----:|----:|-----:|
| Sepal.Length  | 0         | 1             | 5.84 | 0.83 | 4.3 | 5.1 | 5.8 | 6.4 | 7.9  |

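
The table above has every default numeric column except `hist`. A minimal sketch of how a default skimmer can be removed follows, assuming skimr version 2 is installed and assuming `hist` is the skimmer that was set to `NULL`. The diff does not show the call itself, and `no_hist` is an illustrative name.

```r
library(skimr)

# Setting a skimmer to NULL removes it from the defaults for that type.
# Here the inline histogram is dropped for numeric columns; the other
# default numeric skimmers are kept because `append` defaults to TRUE.
no_hist <- skim_with(numeric = sfl(hist = NULL))

# Assumption: a single column is skimmed, as the one-row table suggests.
res <- no_hist(iris, Sepal.Length)
res
```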
@@ -597,7 +606,7 @@ default:
 their own default summary functions for data types not covered above. It
 relies on R S3 methods for the `get_skimmers` function. This function
 should return a `sfl`, similar to customization within `skim_with()`,
-but you should also provide a value for the `class` argument. Here’s an
+but you should also provide a value for the `class` argument. Here's an
 example.
 
     get_skimmers.my_data_type <- function(column) {
@@ -648,13 +657,13 @@ knit them to a specific document format. The same session that produces
 a correctly rendered HTML document may produce an incorrectly rendered
 PDF, for example. This issue can generally be addressed by changing
 fonts to one with good building block (for histograms) and Braille
-support (for line graphs). For example, the open font “DejaVu Sans” from
+support (for line graphs). For example, the open font "DejaVu Sans" from
 the `extrafont` package supports these. You may also want to try
 wrapping your results in `knitr::kable()`. Please see the vignette on
 using fonts for details.
 
 Displays in documents of different types will vary. For example, one
-user found that the font “Yu Gothic UI Semilight” produced consistent
+user found that the font "Yu Gothic UI Semilight" produced consistent
 results for Microsoft Word and Libre Office Write.
 
 Contributing
@@ -662,7 +671,7 @@ Contributing
 We welcome issue reports and pull requests, including potentially adding
 support for commonly used variable classes. However, in general, we
-encourage users to take advantage of skimr’s flexibility to add their
+encourage users to take advantage of skimr's flexibility to add their
 own customized classes. Please see the
 [contributing](CONTRIBUTING.md) and [conduct](CONDUCT.md) documents.
diff --git a/inst/other_docs/blog_v2.Rmd b/inst/other_docs/blog_v2.Rmd
index 5c7ba439..1247ecb2 100644
--- a/inst/other_docs/blog_v2.Rmd
+++ b/inst/other_docs/blog_v2.Rmd
@@ -64,7 +64,7 @@ Just kidding! We love our little histograms, even when they don't love us back!
 
 For those of you that might have never seen `skimr`, using the package typically boils down to a single function call:
 
-```{r render = knitr::normal_print}
+```{r render = knitr::normal_print, message=FALSE}
 library(skimr)
 library(dplyr)
 options(width = 90)