create_table() summarizes one or more variables in a one-way table with percentages. It is mostly a wrapper around tabyl() that allows more flexibility in ordering the output table. It handles multiple variables at once using tidyselect helpers and can compute percentages from the total number of observations in either wide (input) or long (pivoted) form.

create_table(
  .data,
  ...,
  to = NULL,
  infreq = NULL,
  total_wide = TRUE,
  to_na = c("unknown", "missing", "NA", "N/A", "<NA>", "^$"),
  show_missing_levels = FALSE
)

Arguments

.data

A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr)

...

The variable(s) in .data to analyze; can be specified as normal (unquoted) variables, strings, or using tidyselect helpers (such as starts_with)

to

The name of the variable to "pivot" to; this defaults to the longest common prefix in the input variable names, or "value" if none exists
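
To make the default concrete, the following base R sketch computes a longest common prefix with a "value" fallback. The helper common_prefix() is hypothetical and is not part of the documented API; the real function may also post-process the prefix (for example, by trimming a trailing separator), which this sketch does not attempt.

```r
# Hypothetical helper (an assumption, not the package's implementation):
# longest common prefix of the variable names, or `fallback` if none exists.
common_prefix <- function(names, fallback = "value") {
  if (length(names) == 0) return(fallback)
  chars <- strsplit(names, "")
  n <- min(lengths(chars))
  prefix <- character(0)
  for (i in seq_len(n)) {
    # Collect the i-th character of every name
    ith <- vapply(chars, function(x) x[[i]], character(1))
    # Stop at the first position where the names disagree
    if (length(unique(ith)) != 1) break
    prefix <- c(prefix, ith[[1]])
  }
  if (length(prefix) == 0) fallback else paste(prefix, collapse = "")
}

common_prefix(c("symptom_fever", "symptom_cough"))  # "symptom_"
common_prefix(c("age", "sex"))                      # "value"
```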

infreq

Should the output be ordered by frequency? The default depends on the input type; see Details.

total_wide

Should the total used for percentages come from the number of input observations (wide) or the number of pivoted observations (long)? This only matters when ... selects multiple variables.
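
The difference between the two denominators can be illustrated with a small base R sketch. The data frame and column names are made up, and the sketch assumes that the long total is simply the number of values after pivoting; how the function treats missing values in either total is not stated here.

```r
# Toy data (an assumption): each row is one person, each column one symptom.
df <- data.frame(
  symptom_1 = c("fever", "cough", "fever"),
  symptom_2 = c("cough", "fever", "rash"),
  stringsAsFactors = FALSE
)

# "Pivot" the selected columns into a single vector of values.
long <- unlist(df, use.names = FALSE)
counts <- table(long)

# Wide total: the number of input rows (3 people).
pct_wide <- counts / nrow(df)

# Long total: the number of pivoted values (6 values).
pct_long <- counts / length(long)

pct_wide
pct_long
```

With the wide total, the proportions can sum to more than 1 because one input row can contribute to several categories; with the long total they sum to exactly 1.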

to_na

A character vector of values that should be considered missing, as regular expressions. Case is ignored.
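
As a rough illustration of case-insensitive regular expression matching against the default to_na patterns, consider the base R sketch below. It is an assumption about the general technique, not the package's implementation; in particular, the sketch matches each pattern anywhere in a value, so a pattern like "NA" would also match any value that merely contains "na". Whether create_table() anchors its patterns is not stated here.

```r
to_na <- c("unknown", "missing", "NA", "N/A", "<NA>", "^$")
x <- c("Yes", "Unknown", "", "MISSING", "No", "n/a")

# Combine the patterns into one alternation and ignore case when matching.
pattern <- paste(to_na, collapse = "|")
is_missing <- grepl(pattern, x, ignore.case = TRUE)

# Replace matched values with NA.
x_clean <- replace(x, is_missing, NA_character_)
x_clean
# "Yes" NA NA NA "No" NA
```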

show_missing_levels

Should all levels be shown, even if empty?

Value

The output of tabyl(), modified as above and coerced to a tibble

Details

By default, create_table() will order factor inputs by their levels and all other input by frequency. If infreq = TRUE, it will order all input by frequency; if infreq = FALSE, it will order all input alphanumerically. Note that the .by variable will be converted to a factor with levels ordered according to the output table, regardless of input type or ordering.
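
The three orderings described above can be mimicked in base R as follows. This is only a sketch of the ordering rules using table() on made-up data; it does not call create_table() or tabyl(), and tie-breaking between equally frequent values is not addressed.

```r
x_chr <- c("b", "a", "b", "c", "b", "a")
x_fct <- factor(x_chr, levels = c("c", "b", "a"))

# Frequency order (the default for non-factor input): most common first.
by_freq <- sort(table(x_chr), decreasing = TRUE)
names(by_freq)   # "b" "a" "c"

# Level order (the default for factor input): table() keeps factor levels.
by_level <- table(x_fct)
names(by_level)  # "c" "b" "a"

# Alphanumeric order: table() sorts the values of a character vector.
by_alpha <- table(x_chr)
names(by_alpha)  # "a" "b" "c"
```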