folds_rolling_origin on multivariate data #60

@Shafi2016

Hello @jeremyrcoyle and @nhejazi, can folds_rolling_origin be applied to multivariate data that has one target and a few independent variables? The reproducible example below fails with the following error:

Error in seq.int(first_window, last_window, by = batch) :
'to' must be of length 1

library(origami)
library(tidymodels)

# Core Packages
library(tidyverse)
library(lubridate)
library(timetk)
library(Quandl)  # provides Quandl(), used below to download the FRED series
df1 <- Quandl(code = "FRED/PINCOME",
              type = "raw",
              collapse = "monthly",
              order = "asc",
              end_date="2017-12-31")
df2 <- Quandl(code = "FRED/GDP",
              type = "raw",
              collapse = "monthly",
              order = "asc",
              end_date="2017-12-31")

per <- df1 %>% rename(PI = Value) %>% select(-Date)
gdp <- df2 %>% rename(GDP = Value)

data <- cbind(gdp, per)

data1 <- tk_augment_differences(
  .data = data,
  .value = GDP:PI,
  .lags = 1,
  .differences = 1,
  .log = TRUE,
  .names = "auto") %>%
  select(-GDP, -PI) %>%
  rename(GDP = GDP_lag1_diff1, PI = PI_lag1_diff1) %>%
  drop_na()

horizon    <- 15
lag_period <- 15

data_pre_full <- data1 %>%
  # Add future window----
  bind_rows(
    future_frame(.data = ., .date_var = Date, .length_out = horizon)
  ) %>%
  # add lags----
  tk_augment_lags(
    .value = GDP:PI,
    .lags = lag_period)

data_prepared_tbl <- data_pre_full %>%
  filter(!is.na(GDP)) %>%
  dplyr::select(-GDP:-PI) %>%
  drop_na()

folds <- folds_rolling_origin(
  data_prepared_tbl,
  first_window = 50, validation_size = 1, gap = 0, batch = 10
)
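
A possible cause, sketched under assumptions: the error is raised by the `seq.int(first_window, last_window, by = batch)` call quoted in the message, which suggests that the first argument of `folds_rolling_origin()` is expected to be a single number of observations rather than a data frame. The sketch below passes `nrow()` of the table instead. `toy_tbl` is a hypothetical stand-in for `data_prepared_tbl` (simulated values, so no Quandl download is needed), and its column names are assumptions, not the columns produced above.

```r
library(origami)

# Hypothetical stand-in for data_prepared_tbl: 100 rows, one target (GDP)
# and two lagged predictors. Values are simulated for illustration only.
set.seed(1)
toy_tbl <- data.frame(
  GDP_lag15 = rnorm(100),
  PI_lag15  = rnorm(100),
  GDP       = rnorm(100)
)

# Pass the number of observations, not the data frame itself.
folds <- folds_rolling_origin(
  nrow(toy_tbl),
  first_window = 50, validation_size = 1, gap = 0, batch = 10
)

# Each fold stores row indices, so all columns are subset together.
train_1 <- toy_tbl[folds[[1]]$training_set, ]
valid_1 <- toy_tbl[folds[[1]]$validation_set, ]

length(folds)
dim(train_1)
dim(valid_1)
```

If this reading is right, multivariate data should need no special handling: the folds only index rows, and the same indices can be applied to the full table with the target and all predictors.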
