estimate_delay() estimates the time it takes for a given percentage of samples collected on a certain date to be reported.

estimate_delay(
  .data,
  .collection_date = "collection_date",
  .report_date = "report_date",
  pct = 0.9,
  period = 14L,
  today = Sys.Date(),
  rtn = c("last_complete", "incomplete_only", "all"),
  min_dt = as.Date("2020-04-12"),
  quiet = FALSE
)

Arguments

.data

A data frame containing one incident observation per row

.collection_date

<tidy-select> A Date column to use as the collection date of the observed case

.report_date

<tidy-select> A Date column to use as the report date of the observed case

pct

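The quantile of the delay distribution to use when computing the delay, given as a proportion between 0 and 1 (the default, 0.9, is the 90th percentile)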

period

The number of days to average over for the rolling comparison

today

The date to consider "today"

rtn

What to return. By default, this is a single-row tibble containing the last complete .collection_date; it can also return either incomplete dates only or all dates. All return values are tibbles with the same columns; see Value for details.

min_dt

The minimum date to consider; defaults to the first reporting date in SCHD data

quiet

Should information on observations excluded from the estimation be suppressed? The default, FALSE, shows it.

Value

A tibble containing one row per date and columns for .collection_date, prior_delay, delay, and incomplete status

Details

To estimate reporting delay, estimate_delay() first calculates the quantile of the delay distribution corresponding to pct for each .collection_date in the data. If reporting is complete, these quantiles are interpretable as the time needed for a proportion pct of the samples collected on a given date to be reported. If reporting is incomplete, the quantiles are biased towards the portion of the delay distribution that is prioritized in the reporting process. In SCHD data, cases have mostly been processed in temporal order, so this bias is upwards (towards longer delays).
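
The per-date quantile step can be sketched in base R as follows. This is only an illustration under stated assumptions, not the function's internal code: the `cases` data and the `daily` summary are hypothetical, and `quantile()` is used with its default interpolation type, which may differ from what estimate_delay() does.

```r
# Hypothetical line list: one row per reported sample (dates are made up)
cases <- data.frame(
  collection_date = as.Date(c(
    "2020-06-01", "2020-06-01", "2020-06-01", "2020-06-01",
    "2020-06-02", "2020-06-02", "2020-06-02"
  )),
  report_date = as.Date(c(
    "2020-06-02", "2020-06-03", "2020-06-03", "2020-06-06",
    "2020-06-03", "2020-06-04", "2020-06-09"
  ))
)

# Delay in days between collection and report
cases$delay <- as.numeric(
  difftime(cases$report_date, cases$collection_date, units = "days")
)

pct <- 0.9

# Quantile of the observed delays for each collection date
# (assumption: default quantile type; the package may use another)
daily <- aggregate(
  delay ~ collection_date,
  data = cases,
  FUN = function(x) quantile(x, probs = pct, names = FALSE)
)

# Sample size per collection date, kept for later weighting
daily$n <- as.vector(table(cases$collection_date))

print(daily)
```

Each row of `daily` holds one collection date, the delay quantile observed so far for that date, and the number of samples behind it.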

Next, the quantiles are weighted by the sample size on each date, and a rolling average is calculated over a window of period days. This is the continuous-domain equivalent of calculating the quantile over period days.
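
A minimal sketch of a sample-size-weighted rolling average is shown below. It assumes a trailing window that covers the period days before each date; the `daily` data frame and the `rolling_delay` column name are hypothetical, and the actual windowing inside estimate_delay() may differ.

```r
# Hypothetical per-date summaries: delay quantile and sample size
daily <- data.frame(
  collection_date = as.Date("2020-06-01") + 0:5,
  delay = c(4, 6, 5, 3, 8, 2),
  n = c(10, 20, 10, 40, 5, 15)
)

period <- 3L

# For each date, average the quantiles from the `period` preceding days,
# weighting each day's quantile by its sample size
daily$rolling_delay <- vapply(
  seq_len(nrow(daily)),
  function(i) {
    d <- daily$collection_date[i]
    in_window <- daily$collection_date >= d - period &
      daily$collection_date <= d - 1
    # No earlier dates available: the average is undefined
    if (!any(in_window)) return(NA_real_)
    weighted.mean(daily$delay[in_window], w = daily$n[in_window])
  },
  numeric(1)
)

print(daily)
```

Weighting by `n` keeps dates with few samples from dominating the average, which matters when daily quantiles from small samples are noisy.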

Finally, the average over t-period to t-1 is compared to the time between today and each date t. If the average is larger than this time difference, reporting for date t is considered incomplete; otherwise, it is considered complete.
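
The comparison can be illustrated with a short base R sketch. The values, the `rolling_delay` and `days_elapsed` column names, and the choice of the latest complete date as the "last complete" date are assumptions made for illustration only.

```r
today <- as.Date("2020-06-10")

# Hypothetical rolling average delays (in days) per collection date
daily <- data.frame(
  collection_date = as.Date("2020-06-01") + 0:5,
  rolling_delay = c(3, 4, 5, 6, 6, 6)
)

# Days elapsed between each collection date and "today"
daily$days_elapsed <- as.numeric(
  difftime(today, daily$collection_date, units = "days")
)

# Incomplete when the expected delay exceeds the time elapsed so far
daily$incomplete <- daily$rolling_delay > daily$days_elapsed

# Latest collection date whose reporting is considered complete
complete <- daily[!daily$incomplete, ]
last_complete <- complete[
  complete$collection_date == max(complete$collection_date),
]

print(daily)
print(last_complete)
```

In this example, a date whose rolling delay equals the elapsed time counts as complete, because the rule only flags averages strictly larger than the time difference.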