cv_decomposition() applies rolling cross-validation to the portion of an STL decomposition that is conditional on future data. It returns the predicted values and associated "errors" for each "forecast" horizon and decomposed component across all stably estimable time points.

cv_decomposition(
  .data,
  .col = "observed_cleaned",
  trend = "31 days",
  period = "7 days"
)

Arguments

.data

A data frame containing the time series data to resample

.col

The column containing the data to resample

trend

The length of time to use in trend decomposition; can be a time-based definition (e.g. "1 month") or an integer number of days. If NULL or "auto", trend is set automatically using the tunable heuristics in the timetk package.

period

The length of time to use in seasonal decomposition; can be a time-based definition (e.g. "1 week") or an integer number of days. If NULL or "auto", period is set automatically using the tunable heuristics in the timetk package.

Value

A list of tibble objects, each containing the results of one sampling step for the dates from start_date + trend to end_date - trend/2, where end_date is the last completely observed date. See the Value section of validate_decomposition() for information on the components of each sample.

Details

When smoothing time series using LOESS, a portion of the smooth will change as future data comes in (approximately the last trend/2 data points). These points are said to be conditional on future data, and the difference between the smooth with and without future data can be computed for all time points that have already reached the "stable" portion of the series. This is done by rolling the decomposition step across the time series and comparing the smoothed point at each "forecast" horizon to the stable estimates. In effect, this is rolling cross-validation of the STL decomposition using the fully smoothed data as the reference values.
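
The rolling comparison described above can be sketched in base R. This is only an illustration of the idea under stated assumptions, not the package's implementation: it uses stats::stl() directly on simulated data, looks only at the trend component, and every name in it (stl_trend, origins, horizon, cv) is hypothetical. Here horizon means the number of days between a point and the latest observation available when it was smoothed.

```r
# Illustrative data (an assumption): daily series with trend, weekly cycle, noise
set.seed(1)
n <- 200
t <- seq_len(n)
y <- 0.05 * t + 2 * sin(2 * pi * t / 7) + rnorm(n, sd = 0.5)

period <- 7L          # seasonal period in days
trend  <- 31L         # trend window in days (odd, as stl() expects)
half   <- trend %/% 2 # approximate number of points still conditional on future data

# Hypothetical helper: trend component of an STL fit on a numeric vector
stl_trend <- function(x) {
  fit <- stats::stl(stats::ts(x, frequency = period),
                    s.window = "periodic", t.window = trend)
  as.numeric(fit$time.series[, "trend"])
}

# Reference values: trend from the full series
reference <- stl_trend(y)

# Roll the decomposition forward; stop at n - half so every compared point
# lies in the (approximately) stable part of the full-series smooth
origins <- seq(from = 2 * trend, to = n - half)

cv <- do.call(rbind, lapply(origins, function(i) {
  partial <- stl_trend(y[seq_len(i)])  # smooth using data up to day i only
  h <- 0:half                          # days before the latest observation
  data.frame(
    origin    = i,
    horizon   = h,
    predicted = partial[i - h],
    reference = reference[i - h]
  )
}))
cv$error <- cv$predicted - cv$reference

# Root mean squared error of the trend at each horizon
rmse_by_horizon <- tapply(cv$error, cv$horizon, function(e) sqrt(mean(e^2)))
```

Inspecting rmse_by_horizon shows how far the provisional trend sits from the reference at each distance from the end of the series. The same comparison could be repeated for the seasonal and remainder columns of the stl() fit.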

See also

The workhorse function validate_decomposition() and the higher-level function cv_linelist_decomposition().