A Python module for time series cross-validation using Combinatorial Purged Cross-Validation (CPCV) with embargo to prevent data leakage.

Yosri-Ben-Halima/cpcv-train-test-data-split-module

Combinatorial Purged Cross-Validation with Embargo for Time Series Data

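CPCV with embargo prevents leakage in time series cross-validation in two ways: it purges training observations whose label periods overlap a test fold, and it applies an embargo that drops training observations immediately adjacent to each test fold.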
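To make the idea concrete, here is a minimal sketch of how combinatorial purged splits with an embargo could be generated from scratch. It is not this module's implementation: the function name `cpcv_indices` and the parameters `n_groups`, `n_test_groups`, and `label_horizon` are hypothetical, and the sketch assumes the common formulation in which the embargo is applied only after each test block.

```python
from itertools import combinations

import numpy as np


def cpcv_indices(n_samples, n_groups=5, n_test_groups=2,
                 embargo_pct=0.01, label_horizon=0):
    """Yield (train_idx, test_idx) for every combination of test groups.

    Hypothetical helper for illustration only, not the cpcv package API.
    """
    indices = np.arange(n_samples)
    # Contiguous, time-ordered blocks of observations.
    groups = np.array_split(indices, n_groups)
    # Number of observations dropped after each test block.
    embargo = int(n_samples * embargo_pct)

    for combo in combinations(range(n_groups), n_test_groups):
        test_idx = np.concatenate([groups[g] for g in combo])
        keep = np.ones(n_samples, dtype=bool)
        keep[test_idx] = False
        for g in combo:
            start, end = groups[g][0], groups[g][-1] + 1
            # Purge: training labels within label_horizon of the test
            # block start would overlap the test period.
            keep[max(0, start - label_horizon):start] = False
            # Embargo: drop training observations right after the block.
            keep[end:end + embargo] = False
        yield indices[keep], test_idx
```

With 100 observations, 5 groups, and 2 test groups per split, this yields one split for each of the 10 possible pairs of groups, and no training index falls inside a test block, its purge window, or its embargo window.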

Installation

pip install git+https://github.com/yosri-bh/cpcv-train-test-data-split-module.git

or

pip install cpcv

Usage

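import pandas as pd
from cpcv import CPCV

# Toy dataset: 100 sequential observations
df = pd.DataFrame({'feature': range(100)})

# Configure the splitter, then generate the train/test splits
cpcv = CPCV(n_folds=5, test_size=1, embargo_pct=0.1)
splits = cpcv.split(df)

# Each split is a (train, test) pair; print the shape of each part
for train, test in splits:
    print(train.shape, test.shape)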
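As a rough guide to how many splits to expect, the standard CPCV formulation produces one split per combination of test groups. The sketch below computes that count along with the number of complete backtest paths. Treating `n_folds` as the number of groups and `test_size` as the number of test groups per split is an assumption about this module, and `cpcv_counts` is a hypothetical helper, not part of the package.

```python
from math import comb


def cpcv_counts(n_groups, n_test_groups):
    """Return (number of splits, number of backtest paths) for CPCV."""
    # One split per way of choosing the test groups.
    n_splits = comb(n_groups, n_test_groups)
    # Each group is tested in n_splits * k / N splits, which is the
    # number of complete paths that can be assembled.
    n_paths = n_splits * n_test_groups // n_groups
    return n_splits, n_paths


print(cpcv_counts(5, 1))  # (5, 1)
print(cpcv_counts(6, 2))  # (15, 5)
```

Under that assumption, the usage example above (5 groups, 1 test group per split) would correspond to 5 splits forming a single path.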

Connect with Me

Thank you for visiting my GitHub profile! Feel free to reach out if you have any questions or opportunities to collaborate. Let's connect and explore new possibilities together:

GitHub LinkedIn Facebook Instagram Email Personal Web Page Google Drive PyPI
