
Incorrect fitting results when n_fits is very large #136

@DDAWX

Description


Hi, I'm encountering an issue when running gpufit with a large number of fits.

When n_fits = 1,000,000 or 1,700,000, the fitting results are incorrect, yet with n_fits = 1,600,000 they are correct. The failures are not monotonic in the number of fits, which may indicate a memory-related bug or an integer overflow.
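Until the root cause is identified, one possible workaround (assuming the corruption depends only on the number of fits passed to a single `gf.fit` call) is to split the work into batches below the failing size. The helper below is my own sketch, not part of the pygpufit API; it takes the fit routine as a callable so the batching logic itself can be exercised on the CPU.

```python
import numpy as np

def fit_in_batches(fit_fn, arrays, batch_size=500_000):
    """Apply fit_fn to matching row-wise slices of each array in `arrays`
    and concatenate the per-batch results along axis 0."""
    n = arrays[0].shape[0]
    parts = [fit_fn(*(a[i:i + batch_size] for a in arrays))
             for i in range(0, n, batch_size)]
    return np.concatenate(parts, axis=0)
```

With gpufit this could be invoked as, e.g., `fit_in_batches(lambda y, p, x: gf.fit(data=y, weights=None, model_id=gf.ModelID.LINEAR_1D, initial_parameters=p, user_info=x)[0], (y, init_params, x))`, taking only the fitted parameters from the returned tuple.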

Below is a minimal example to reproduce the issue:

```python
import numpy as np
import pygpufit.gpufit as gf

print('CUDA available: {}'.format(gf.cuda_available()))
print('CUDA versions runtime: {}, driver: {}'.format(*gf.get_cuda_version()))

def test(n_fits):
    params = np.random.rand(n_fits, 2).astype(np.float32)

    x = np.random.rand(n_fits, 300).astype(np.float32) * 100
    y = params[:, :1] + params[:, 1:] * x

    init_params = np.array([0.1, 0.1], dtype=np.float32)
    init_params = np.tile(init_params, (n_fits, 1))

    results = gf.fit(data=y,
                     weights=None,
                     model_id=gf.ModelID.LINEAR_1D,
                     initial_parameters=init_params,
                     tolerance=1e-8,
                     user_info=x)
    print("n_fits=", n_fits, "true:", params[-1], "preds:", results[0][-1])

test(1000000)
test(1600000)
test(1700000)
```
Output:

```
n_fits= 1000000 true: [0.73556924 0.9035081 ] preds: [ 4.8269367e+01 -8.0045573e-03]
n_fits= 1600000 true: [0.08675532 0.55433005] preds: [0.086756 0.55433005]
n_fits= 1700000 true: [0.61472124 0.9756066 ] preds: [54.71934 -0.07528704]
```

As shown, for n_fits = 1,000,000 and 1,700,000, the results are clearly incorrect, while 1,600,000 gives the expected values. The model being used is LINEAR_1D, which is normally very stable, so this behavior is unexpected.
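Independent of the GPU, the data generator can be checked with a closed-form CPU least-squares solve of the same model (y = a + b*x, as in LINEAR_1D). The sketch below uses a smaller n_fits since it is only a CPU sanity check; the helper name `linear_lstsq_cpu` is mine, not part of gpufit. Recovering the true parameters this way would suggest the wrong values above come from the fitting step rather than from the test harness.

```python
import numpy as np

def linear_lstsq_cpu(x, y):
    """Per-row ordinary least squares for y = a + b*x, the same model
    gpufit's LINEAR_1D fits; solved in closed form in float64."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    xm = x.mean(axis=1, keepdims=True)
    ym = y.mean(axis=1, keepdims=True)
    b = ((x - xm) * (y - ym)).sum(axis=1) / ((x - xm) ** 2).sum(axis=1)
    a = ym[:, 0] - b * xm[:, 0]
    return np.stack([a, b], axis=1)

# Same data generation as the repro script, at a size the CPU handles easily.
rng = np.random.default_rng(0)
n_fits = 10_000
params = rng.random((n_fits, 2)).astype(np.float32)
x = rng.random((n_fits, 300)).astype(np.float32) * 100
y = params[:, :1] + params[:, 1:] * x

est = linear_lstsq_cpu(x, y)
print(np.abs(est - params).max())  # maximum absolute error; tiny if the generator is correct
```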
