Index error in MGCX #409

Open
@Rayerdyne

Description

I get an IndexError when calling MGCX.test(). The error is raised inside scipy's multiscale_graphcorr (see the stack trace below).

I'm very surprised that this depends on the random number generation: it fails for some seeds but not for others. Increasing the number of replications (reps) seems to increase the probability of an error. For example, setting reps=1000 makes seed 16 fail as well.
The seed dependence makes me think I messed up somewhere, but I can't see where.

Reproducing code example:

import sys

import pandas as pd
import numpy as np

from hyppo.time_series import MGCX

def test(seed):
    print(f"Testing seed {seed}")
    reps=100

    df = pd.DataFrame([[1, 1],
                       [2, 1],
                       [3, 1],
                       [4, 4],
                       [5, 5],
                       [6, 6]], columns=["a", "b"])

    i_test = MGCX()
    rstate = np.random.RandomState(seed)

    stat, pval, d = i_test.test(df["a"].values, df["b"].values, random_state=rstate, reps=reps)
    print(f"stat: {stat}, pval: {pval}, d: {d}")

if __name__ == "__main__":
    if len(sys.argv) > 1:
        seed = int(sys.argv[1])
        test(seed)
    
    else:
        test(16)
        test(0)
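
To get a rough picture of how often the failure occurs, the test function above can be run over a range of seeds. The helper below is only a sketch: find_failing_seeds is a hypothetical name, not part of hyppo, and it simply records the seeds for which the call raises IndexError.

```python
def find_failing_seeds(run, seeds):
    """Return the seeds for which run(seed) raises IndexError.

    `run` is any callable taking a seed, e.g. the test() function
    from the script above. This helper is hypothetical, not hyppo API.
    """
    failing = []
    for seed in seeds:
        try:
            run(seed)
        except IndexError:
            # Only IndexError is treated as "the bug"; other errors propagate.
            failing.append(seed)
    return failing


# Usage with the script above (requires hyppo to be installed):
# print(find_failing_seeds(test, range(50)))
```

Comparing the length of the returned list for different values of reps would give a rough check of whether the failure rate really grows with the number of replications.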

Error message

Testing seed 16
stat: 0.886004262777708, pval: 0.0297029702970297, d: {'opt_lag': 0, 'opt_scale': [6, 4]}
Testing seed 0
Traceback (most recent call last):
  File "/home/f/TRAVAIL/csod/misc/hyppo/problem.py", line 32, in <module>
    test(0)
  File "/home/f/TRAVAIL/csod/misc/hyppo/problem.py", line 22, in test
    stat, pval, d = i_test.test(df["a"].values, df["b"].values, random_state=rstate, reps=reps)
  File "/home/f/TRAVAIL/csod/misc/hyppo/.env/lib/python3.10/site-packages/hyppo/time_series/mgcx.py", line 194, in test
    stat, pvalue, stat_list = super(MGCX, self).test(
  File "/home/f/TRAVAIL/csod/misc/hyppo/.env/lib/python3.10/site-packages/hyppo/time_series/base.py", line 130, in test
    Parallel(n_jobs=workers)(
  File "/home/f/TRAVAIL/csod/misc/hyppo/.env/lib/python3.10/site-packages/joblib/parallel.py", line 1863, in __call__
    return output if self.return_generator else list(output)
  File "/home/f/TRAVAIL/csod/misc/hyppo/.env/lib/python3.10/site-packages/joblib/parallel.py", line 1792, in _get_sequential_output
    res = func(*args, **kwargs)
  File "/home/f/TRAVAIL/csod/misc/hyppo/.env/lib/python3.10/site-packages/hyppo/time_series/base.py", line 159, in _perm_stat
    perm_stat = calc_stat(distx, permy)[0]
  File "/home/f/TRAVAIL/csod/misc/hyppo/.env/lib/python3.10/site-packages/hyppo/time_series/mgcx.py", line 106, in statistic
    stat, opt_lag = compute_stat(
  File "/home/f/TRAVAIL/csod/misc/hyppo/.env/lib/python3.10/site-packages/hyppo/time_series/_utils.py", line 93, in compute_stat
    indep_test_stat = indep_test.statistic(x, y)
  File "/home/f/TRAVAIL/csod/misc/hyppo/.env/lib/python3.10/site-packages/hyppo/independence/mgc.py", line 161, in statistic
    mgc = multiscale_graphcorr(distx, disty, compute_distance=None, reps=0)
  File "/home/f/TRAVAIL/csod/misc/hyppo/.env/lib/python3.10/site-packages/scipy/stats/_stats_py.py", line 6490, in multiscale_graphcorr
    stat, stat_dict = _mgc_stat(x, y)
  File "/home/f/TRAVAIL/csod/misc/hyppo/.env/lib/python3.10/site-packages/scipy/stats/_stats_py.py", line 6541, in _mgc_stat
    stat = stat_mgc_map[m - 1][n - 1]
IndexError: index 5 is out of bounds for axis 0 with size 1
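
One possible reading of the last frame: the message says axis 0 has size 1 while the index used is 5, so m - 1 must be 5 at that point. The NumPy-only sketch below reproduces the same message under the assumption that the map has a single row, i.e. shape (1, 6). The shape and the values of m and n are guesses for illustration, not taken from scipy.

```python
import numpy as np

# Assumption: the local-correlation map has one row and six columns.
stat_mgc_map = np.zeros((1, 6))
m, n = 6, 1  # hypothetical values chosen so that m - 1 == 5

try:
    # Same indexing expression as in the last traceback frame.
    stat_mgc_map[m - 1][n - 1]
    message = None
except IndexError as exc:
    message = str(exc)

print(message)  # index 5 is out of bounds for axis 0 with size 1

# With the two indices swapped, the lookup succeeds on this shape.
value = stat_mgc_map[n - 1][m - 1]
```

If this reading is right, the failure would occur whenever a permuted sample yields a map with only one row but more than one column, which would explain why it depends on the seed.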

Version information

  • OS: Arch Linux 6.6.7-arch1-1 (64-bit)
  • Python Version: 3.10
  • Package Versions: hyppo==0.4.0, scipy==1.11.4, joblib==1.3.2

Labels: bug (Something isn't working)
