Generate a statistics table

Generate a statistics table with the chosen statistical functions. The table is nested if called with a grouped dataframe.

Usage

desc_table(data, ..., .auto, .labels)

# S3 method for default
desc_table(data, ..., .auto, .labels)

# S3 method for data.frame
desc_table(data, ..., .labels = NULL, .auto = stats_auto)

# S3 method for grouped_df
desc_table(data, ..., .auto = stats_auto, .labels = NULL)

Arguments

data: The dataframe to analyze
...: A list of named statistics to apply to each element of the dataframe, or a function returning a list of named statistics
.auto: A function to automatically determine appropriate statistics
.labels: A named character vector of variable labels

Value

A simple or grouped descriptive table

Stats

The statistical functions to use in the table are passed as additional arguments. If an argument is named (e.g. N = length), the name is used as the column title instead of the function name (here, N instead of length).

Any R function can be a statistical function, as long as it returns a single value when applied to a vector or, when applied to a factor, as many values as the factor has levels, plus one.
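The return-length rule above can be checked in base R without calling desc_table. The helper below, pct_sketch, is a hypothetical name used only for illustration and is not part of the package. It assumes that, for a factor, the extra value corresponds to the variable as a whole and the remaining values to its levels, which is how the factor rows appear in the example output.

```r
# Hypothetical helper (not part of the package): share of each level, in percent.
pct_sketch <- function(x) {
  if (is.factor(x)) {
    # One leading NA for the variable as a whole, then one value per level
    c(NA, as.numeric(prop.table(table(x))) * 100)
  } else {
    # A single value for any non-factor vector
    NA_real_
  }
}

length(pct_sketch(iris$Sepal.Length))  # one value for a numeric vector
length(pct_sketch(iris$Species))       # nlevels(iris$Species) + 1 values
```

A function shaped like this should be usable as a statistic in the same way as length or mean, for instance as desc_table(iris, "%" = pct_sketch), though that call is untested here.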

Users can also use purrr::map-like formulas as quick anonymous functions (e.g. Q1 = ~ quantile(., .25) to get the first quartile in a column named Q1).
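As a rough illustration, the formula shorthand can be read as an ordinary one-argument function in which the dot stands for the column being summarised. The sketch below writes that function out by hand in base R. The name q1 is illustrative only.

```r
# Plain-function reading of the formula shorthand ~ quantile(., .25)
q1 <- function(x) quantile(x, .25)

# Still a single value per vector (quantile() names it "25%")
q1(iris$Sepal.Length)
```

Passing Q1 = q1 or Q1 = ~ quantile(., .25) to desc_table should therefore describe the same statistic.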

If no statistical function is given to desc_table, the .auto argument is used to provide a function that automatically determines the most appropriate statistical functions to use based on the contents of the table.

Labels

.labels is a named character vector to provide "pretty" labels to variables.

If given, the variable names for which there is a label will be replaced by their corresponding label.

Not all variables need to have a label, and labels for non-existing variables are ignored.

.labels must be given in the form c(unquoted_variable_name = "label").
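A minimal sketch of a labelled table, assuming the desctable package is installed. The label texts are made up for illustration, and Not.A.Column is a deliberately non-existing variable to show that extra labels are tolerated.

```r
library(desctable)

# Hypothetical labels: two real columns and one name that is not in iris
my_labels <- c(Sepal.Length = "Sepal length",
               Sepal.Width  = "Sepal width",
               Not.A.Column = "Ignored label")

tab <- desc_table(iris, .labels = my_labels)

# Labelled variables show their label, the others keep their name
tab$Variables
```

Variables without a label, such as Petal.Length here, are expected to appear under their original name.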

Output

The output is either a dataframe in the case of a simple descriptive table, or nested dataframes in the case of a comparative table.

Examples

library(dplyr)      # provides %>% and group_by()
library(desctable)

iris %>%
  desc_table()
#>                   Variables   N        % Min  Q1  Med     Mean  Q3 Max
#> 1              Sepal.Length 150       NA 4.3 5.1 5.80 5.843333 6.4 7.9
#> 2               Sepal.Width 150       NA 2.0 2.8 3.00 3.057333 3.3 4.4
#> 3              Petal.Length 150       NA 1.0 1.6 4.35 3.758000 5.1 6.9
#> 4               Petal.Width 150       NA 0.1 0.3 1.30 1.199333 1.8 2.5
#> 5               **Species** 150       NA  NA  NA   NA       NA  NA  NA
#> 6     **Species**: *setosa*  50 33.33333  NA  NA   NA       NA  NA  NA
#> 7 **Species**: *versicolor*  50 33.33333  NA  NA   NA       NA  NA  NA
#> 8  **Species**: *virginica*  50 33.33333  NA  NA   NA       NA  NA  NA
#>          sd IQR
#> 1 0.8280661 1.3
#> 2 0.4358663 0.5
#> 3 1.7652982 3.5
#> 4 0.7622377 1.5
#> 5        NA  NA
#> 6        NA  NA
#> 7        NA  NA
#> 8        NA  NA

# Nearly the same as stats_auto here (no % column)
iris %>%
  desc_table("N"      = length,
             "Min"    = min,
             "Q1"     = ~quantile(., .25),
             "Med"    = median,
             "Mean"   = mean,
             "Q3"     = ~quantile(., .75),
             "Max"    = max,
             "sd"     = sd,
             "IQR"    = IQR)
#>                   Variables   N Min  Q1  Med     Mean  Q3 Max        sd IQR
#> 1              Sepal.Length 150 4.3 5.1 5.80 5.843333 6.4 7.9 0.8280661 1.3
#> 2               Sepal.Width 150 2.0 2.8 3.00 3.057333 3.3 4.4 0.4358663 0.5
#> 3              Petal.Length 150 1.0 1.6 4.35 3.758000 5.1 6.9 1.7652982 3.5
#> 4               Petal.Width 150 0.1 0.3 1.30 1.199333 1.8 2.5 0.7622377 1.5
#> 5               **Species** 150  NA  NA   NA       NA  NA  NA        NA  NA
#> 6     **Species**: *setosa*  50  NA  NA   NA       NA  NA  NA        NA  NA
#> 7 **Species**: *versicolor*  50  NA  NA   NA       NA  NA  NA        NA  NA
#> 8  **Species**: *virginica*  50  NA  NA   NA       NA  NA  NA        NA  NA

# With grouping on a factor
iris %>%
  group_by(Species) %>%
  desc_table(.auto = stats_auto)
#> # A tibble: 3 × 4
#> # Groups:   Species [3]
#>   Species    data              .stats       .vars       
#>   <fct>      <list>            <list>       <list>      
#> 1 setosa     <tibble [50 × 4]> <df [4 × 8]> <df [4 × 1]>
#> 2 versicolor <tibble [50 × 4]> <df [4 × 8]> <df [4 × 1]>
#> 3 virginica  <tibble [50 × 4]> <df [4 × 8]> <df [4 × 1]>