std_invl() standardizes the various representations of numeric intervals
found in the ML in HCT dataset. These intervals are assumed to be percentages
and thus to lie between 0 and 100. Explicit intervals with lower and upper
bounds, as well as implicit intervals using < and >, are handled (<= and >=
are currently not supported as input). The return value is simplified to a
<, >, <=, or >= expression or a single numeric value where possible;
otherwise it uses standard interval notation.
Arguments
- x
A character vector
- less_than
Regex patterns to consider "<". Passed to stringr::str_replace(). Can be a vector of patterns.
- greater_than
Regex patterns to consider ">". Passed to stringr::str_replace(). Can be a vector of patterns.
- na
Regex patterns to consider NA. Passed to stringr::str_detect(). Can be a vector of patterns.
- std_chr
Whether to standardize the strings before parsing
- warn
Whether to emit a warning when potential numeric values cannot be converted to an interval
- ...
Arguments passed on to chr_to_num
  - std
    Whether to standardize the vector before cleaning and converting
  - convert
    Whether to actually convert to numeric
  - replace
    A data.frame of regular expressions and strings to replace them; regular expressions should be in a column named pattern, and replacements should be in a column named replacement. Each row is passed to stringr::str_replace().
  - per_action
    How to treat %/percent/per million/etc. labels. "drop" simply removes the labels, "divide" divides the value by the appropriate denominator, and "ignore" does nothing.
  - multiple_decimals
    How to handle multiple decimals within a number
  - donor_host
    Which value to use when values for both a donor and a host are given
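
As a rough illustration of how the pattern arguments described above might be used, the sketch below applies a replacement table row by row with stringr::str_replace() and flags missing values with stringr::str_detect(). The patterns, the example strings, the variable names, and the loop itself are assumptions made for illustration; they are not the package's defaults and not its actual implementation.

```r
library(stringr)

# Hypothetical replacement table (not the package defaults). The column
# names `pattern` and `replacement` follow the argument description.
replace <- data.frame(
  pattern = c("(?i)less than\\s*", "(?i)greater than\\s*"),
  replacement = c("<", ">"),
  stringsAsFactors = FALSE
)

# Made-up example inputs
x <- c("less than 5", "Greater than 95", "not reported")

# Hypothetical NA patterns: a value counts as missing if any pattern matches
na_patterns <- c("(?i)not reported", "(?i)^unknown$")
is_na <- Reduce(`|`, lapply(na_patterns, function(p) str_detect(x, p)))

# Apply each row of the table in turn with str_replace()
cleaned <- x
for (i in seq_len(nrow(replace))) {
  cleaned <- str_replace(cleaned, replace$pattern[i], replace$replacement[i])
}

# Blank out the values flagged as missing
cleaned[is_na] <- NA_character_

cleaned
```

Note that str_replace() replaces only the first match in each string, so a table like this is applied once per row rather than repeatedly within a string.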