PhD-level statistical analysis engine for structured and semi-structured data, focused purely on statistics, probability, and decision science.
Stat Analyzer is a PhD-level statistical analysis and decision science engine designed to work with structured and semi-structured data.
It focuses purely on Statistics and Probability, delivering deep reasoning, diagnostics, and decision-oriented insights.
🚫 No machine learning or model-performance evaluation
✅ 100% statistics, inference, probability, and decision science
Stat Analyzer is built to behave like a PhD-educated statistician who can:
- Understand messy real-world data
- Decide which statistical techniques apply
- Explain why they apply
- Interpret results correctly
- Translate statistics into business and research decisions
- Mean, Median, Mode
- Variance, Standard Deviation
- Range, IQR
- Skewness, Kurtosis
- Frequency distributions
- Probability rules
- Conditional probability
- Bayes' theorem
- Permutations & combinations
- Certainty vs uncertainty reasoning
- Normal
- Binomial
- Poisson
- Uniform
- Exponential
- Chi-square
- Student’s t
- F distribution
- Central Limit Theorem
- Confidence intervals
- Hypothesis testing
- P-values
- Type I & Type II errors
- Z-test
- One-sample & two-sample t-tests
- Paired t-test
- ANOVA
- F-test
- Chi-square test
- Non-parametric alternatives
- Correlation analysis
- Simple & multiple linear regression
- Ordinary Least Squares (OLS)
- Residual diagnostics
- Omnibus test
- Assumption validation
- Mean Error
- Mean Absolute Error
- Mean Percentage Error
- Absolute & percentage errors
- Error-based decision guidance
- Descriptive analysis
- Diagnostic analysis
- Predictive analysis (statistics-only)
- Prescriptive analysis
- CSV files
- Excel files
- Semi-structured text tables
- Mixed data types (int, float, categorical, missing values, outliers)
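To illustrate what reading a semi-structured text table with mixed types might involve, here is a minimal sketch using only the Python standard library. It is not Stat Analyzer's actual loader: the `coerce` and `parse_table` helpers, the pipe delimiter, the list of missing-value tokens, and the sample table are all assumptions made for this example.

```python
import csv
import io

# Assumed spellings of "missing"; real files may use others.
MISSING_TOKENS = {"", "na", "n/a", "nan", "null", "none", "-"}

def coerce(cell):
    """Turn one raw text cell into None, int, float, or a stripped string."""
    text = cell.strip()
    if text.lower() in MISSING_TOKENS:
        return None
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text  # anything non-numeric is kept as a categorical label

def parse_table(raw, delimiter="|"):
    """Parse a delimited text table (header row first) into a list of row dicts."""
    lines = [ln for ln in raw.splitlines() if ln.strip()]  # drop blank lines
    reader = csv.reader(io.StringIO("\n".join(lines)), delimiter=delimiter)
    header = [h.strip() for h in next(reader)]
    return [dict(zip(header, (coerce(c) for c in row))) for row in reader]

# Hypothetical sample table with a missing revenue and a missing region.
sample = """
id | region | revenue | units
1  | North  | 1200.50 | 10
2  | South  | N/A     | 7
3  |        | 980     | 12
"""
rows = parse_table(sample)
print(rows[1])
```

Keeping missing cells as `None` rather than dropping the row leaves the decision about imputation or exclusion to a later analysis step.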
Sample datasets are available in the /data directory:
- stat_analyzer_test.csv
- stat_analyzer_financial_risk.csv
- stat_analyzer_semi_structured.txt
These datasets are intentionally designed with:
- Missing values
- Outliers
- Mixed numeric types
- Real-world inconsistencies
The system automatically:
- Detects data types
- Identifies missing values and outliers
- Checks statistical assumptions
- Recommends appropriate tests
- Explains results in simple and advanced terms
- Links statistical outcomes to decisions
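The first few steps above (type detection, missing values, outliers, and a recommendation) can be sketched roughly as follows. This is an illustrative approximation under stated assumptions, not the project's implementation: `iqr_outliers` and `profile_column` are hypothetical names, outliers are flagged with Tukey's 1.5 × IQR rule, and the suggestion text is a simple heuristic chosen for the example.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]

def profile_column(raw):
    """Summarise one column: size, missing count, kind, outliers, suggestion."""
    present = [v for v in raw if v is not None]
    is_numeric = bool(present) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    report = {
        "n": len(raw),
        "missing": len(raw) - len(present),
        "kind": "numeric" if is_numeric else "categorical",
    }
    if is_numeric and len(present) >= 4:
        report["outliers"] = iqr_outliers(present)
        # Heuristic (an assumption of this sketch): prefer robust summaries
        # and rank-based tests when outliers are present.
        if report["outliers"]:
            report["suggestion"] = "report median/IQR; consider a non-parametric test"
        else:
            report["suggestion"] = "report mean/SD; a t-test may be appropriate"
    return report

# Hypothetical column with one missing value and one extreme observation.
column = [10, 12, 11, None, 13, 12, 95, 11]
report = profile_column(column)
print(report)
```

A fuller version would also check assumptions such as normality and equal variances before recommending a test; this sketch stops at the outlier check to stay short.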
pip install -r requirements.txt
python app/main.py