estimate_delay.Rd
estimate_delay() estimates the time it takes for a given percentage of
samples collected on a certain date to be reported.
A data frame containing one incident observation per row
<tidy-select> A Date column to use as the collection date of the observed case
<tidy-select> A Date column to use as the report date of the observed case
The quantile to use when computing the delay
The number of days to average over for the rolling comparison
The date to consider "today"
What to return. By default, this is a single-row tibble containing the last
complete .collection_date; it can also return either incomplete dates only or
all dates. All return values are tibbles with the same columns; see Value for
details.
The minimum date to consider; by default, this is the first reporting date in SCHD data
Should information on observations excluded from the estimation be shown?
A tibble containing one row per date and columns for .collection_date,
prior_delay, delay, and incomplete status
To estimate reporting delay, estimate_delay() calculates quantiles of the
delay distribution corresponding to pct for each .collection_date in the
data. If reporting is complete, these quantiles are interpretable as the
time needed for pct of the samples from a given date to be reported. If
reporting is incomplete, these will be biased towards the portion of the
delay distribution that is prioritized in the reporting process. In SCHD
data, cases have been mostly processed in temporal order, so this bias is
upwards (towards longer delays).
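As a rough illustration of this first step, the per-date quantile can be sketched in base R. This is not the package's implementation: the data values are made up, the column names collection_date and report_date are placeholders, and pct is assumed here to be a proportion between 0 and 1.

```r
# Toy data: one row per observed case (all values are made up)
cases <- data.frame(
  collection_date = as.Date(c("2021-03-01", "2021-03-01", "2021-03-01",
                              "2021-03-02", "2021-03-02")),
  report_date     = as.Date(c("2021-03-02", "2021-03-03", "2021-03-05",
                              "2021-03-03", "2021-03-06"))
)
pct <- 0.9  # assumption: pct is a proportion in [0, 1]

# Delay in days between collection and report for each case
cases$delay <- as.numeric(
  difftime(cases$report_date, cases$collection_date, units = "days")
)

# Quantile of the delay distribution and sample size per collection date
q <- tapply(cases$delay, cases$collection_date, quantile,
            probs = pct, names = FALSE)
n <- tapply(cases$delay, cases$collection_date, length)

delay_by_date <- data.frame(
  collection_date = as.Date(names(q)),
  delay = as.vector(q),
  n = as.vector(n)
)
```

The sample size n is kept alongside each quantile because the description weights the quantiles by it when averaging.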
Next, quantiles are weighted by the sample size on each date, and a rolling
average is calculated with a window equal to period. This is the
continuous domain equivalent of calculating the quantile over period days.
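A minimal base R sketch of a sample-size-weighted rolling average is shown here. It is an illustration under stated assumptions, not the package's code: the values are made up, dates are assumed to be consecutive with no gaps, and the window is assumed to cover the period dates before each date t (t-period to t-1). The helper prior_avg() is hypothetical.

```r
# Toy per-date quantiles and sample sizes (made-up values, consecutive dates)
daily <- data.frame(
  collection_date = as.Date("2021-03-01") + 0:5,
  delay = c(3, 4, 2, 5, 6, 1),
  n     = c(10, 20, 10, 5, 15, 2)
)
period <- 3L

# Hypothetical helper: weighted mean over the `period` rows before row i.
# Returns NA when fewer than `period` prior dates are available.
prior_avg <- function(i, delay, n, period) {
  if (i <= period) return(NA_real_)
  idx <- (i - period):(i - 1L)
  weighted.mean(delay[idx], w = n[idx])
}

daily$prior_delay <- vapply(
  seq_len(nrow(daily)), prior_avg, numeric(1),
  delay = daily$delay, n = daily$n, period = period
)
```

Weighting by n means that a date with few samples, whose quantile is noisy, contributes less to the average than a date with many samples.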
Finally, the averages for t-period to t-1 are compared to the time between
today and each date t. If the average is larger than this time difference,
reporting is considered incomplete; otherwise, reporting is considered
complete.
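The comparison in this final step can be sketched in base R as follows. The prior_delay values and the value of today are made up for illustration, and the column name elapsed is a placeholder rather than something the function returns.

```r
today <- as.Date("2021-03-10")

# Toy rolling averages per collection date (made-up values)
est <- data.frame(
  collection_date = as.Date("2021-03-01") + 0:5,
  prior_delay = c(4.5, 4.5, 5, 5, 5.5, 6)
)

# Days elapsed between each collection date and "today"
est$elapsed <- as.numeric(
  difftime(today, est$collection_date, units = "days")
)

# Reporting is incomplete when the expected delay exceeds the elapsed time
est$incomplete <- est$prior_delay > est$elapsed

# Last complete collection date
last_complete <- max(est$collection_date[!est$incomplete])
```

In this toy example, the two most recent dates are flagged as incomplete because fewer days have passed than the averaged delay would require.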