forked from clbustos/distribution
Pearson's chi-squared test is a more reliable way of ascertaining whether a sequence of numbers belongs to a distribution or follows a pattern than checking the mean and variance alone. A test that only checks for the correct mean and variance is easy to fool: dummy values can be inserted to adjust a sequence so that it appears to fit any distribution.
However, we should not ignore that the mean and variance must be reproduced correctly. The suggestion here is to use Pearson's chi-squared test to refactor the test cases into the following structure:
- It should pass a chi-squared test for a specific distribution, perhaps through a test helper. For example, for the uniform distribution, `pearson_chi_squared(candidate: Distribution::Uniform.rng(0.1, 1), target: :uniform, samples: 1000)` would return the significance level of the test as a double.
- It should return correct metadata and moments of the distribution, say through a function that simulates the distribution for a specified confidence or sample size: `metadata_for(candidate: Distribution::Normal.rng(0.1, 1), target: :normal, confidence: 0.99, samples: 100)` returns `{mean: 0.1, variance: 0.96, skewness: 0.15 ... }`.
  - Alternatively, the return value can just be an array where entry `i` is moment `i` of the sequence.
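To make the two proposed helpers concrete, here is a minimal sketch in plain Ruby. None of it exists in the gem: `pearson_chi_squared`, `chi_squared_survival_even` and `sample_moments` are hypothetical names, and the `lower`, `upper` and `bins` keywords are assumptions added for illustration. It assumes the candidate is a callable that yields one sample per call, handles only the `:uniform` target, and uses an odd number of bins so that the degrees of freedom are even, where the chi-squared upper tail has a simple closed form. The `confidence` parameter from the proposal is left out.

```ruby
# Hypothetical test helpers, sketched for discussion only.

# Upper-tail probability of a chi-squared variable with an EVEN number of
# degrees of freedom: Q(x; 2m) = exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
def chi_squared_survival_even(x, df)
  raise ArgumentError, "df must be even and positive" unless df.even? && df > 0

  half = x / 2.0
  term = 1.0 # the k = 0 term
  sum = 1.0
  (1...(df / 2)).each do |k|
    term *= half / k # builds (x/2)^k / k! incrementally
    sum += term
  end
  Math.exp(-half) * sum
end

# Returns the significance level (p-value) of Pearson's chi-squared test of
# the candidate's samples against a uniform distribution on [lower, upper].
def pearson_chi_squared(candidate:, target: :uniform, samples: 1000,
                        lower: 0.0, upper: 1.0, bins: 11)
  raise NotImplementedError, "only :uniform is sketched" unless target == :uniform
  raise ArgumentError, "bins must be odd and >= 3" unless bins.odd? && bins >= 3

  counts = Array.new(bins, 0)
  samples.times do
    x = candidate.call
    # A sample outside the support rejects the hypothesis outright.
    return 0.0 if x < lower || x > upper

    idx = ((x - lower) / (upper - lower) * bins).floor.clamp(0, bins - 1)
    counts[idx] += 1
  end

  expected = samples.to_f / bins
  statistic = counts.sum { |observed| (observed - expected)**2 / expected }
  chi_squared_survival_even(statistic, bins - 1)
end

# Returns the sample mean, the (biased, divide-by-n) variance and skewness.
def sample_moments(candidate:, samples: 100)
  xs = Array.new(samples) { candidate.call }
  mean = xs.sum / samples.to_f
  m2 = xs.sum { |x| (x - mean)**2 } / samples
  m3 = xs.sum { |x| (x - mean)**3 } / samples
  { mean: mean, variance: m2, skewness: m3 / m2**1.5 }
end
```

A spec could then assert something like `pearson_chi_squared(candidate: rng, lower: 0.1, upper: 1) > 0.01`, where `rng` is whatever callable wraps the generator under test. The threshold is a judgment call, since a correct generator will still fall below any fixed significance level some fraction of the time.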
Let me know what you think about this. Right now I feel a lot of test cases are repeated. This issue would of course require that all the methods listed in README.md are already implemented, so that there is something to compare against.