Raajarapu/stat-analyzer

Stat-Analyzer

PhD-level statistical analysis engine for structured and semi-structured data, focused purely on statistics, probability, and decision science.

Stat Analyzer 📊🧠

Stat Analyzer is a PhD-level statistical analysis and decision science engine designed to work with structured and semi-structured data.
It focuses purely on statistics and probability, delivering deep reasoning, diagnostics, and decision-oriented insights.

🚫 No machine learning model performance evaluation
✅ 100% statistics, inference, probability, and decision science


🎯 Purpose

Stat Analyzer is built to behave like a PhD-educated statistician who can:

  • Understand messy real-world data
  • Decide which statistical techniques apply
  • Explain why they apply
  • Interpret results correctly
  • Translate statistics into business and research decisions

🔍 What This Project Covers

📌 Descriptive Statistics

  • Mean, Median, Mode
  • Variance, Standard Deviation
  • Range, IQR
  • Skewness, Kurtosis
  • Frequency distributions
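
As an illustration of the measures listed above, here is a minimal sketch using only Python's standard `statistics` module. The function name `describe` and the sample values are hypothetical and are not taken from this repository's code.

```python
import statistics


def describe(values):
    """Return basic descriptive statistics for a list of numbers.

    Hypothetical helper for illustration; not part of Stat Analyzer's API.
    """
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "stdev": statistics.stdev(values),        # sample standard deviation
        "range": max(values) - min(values),
        "iqr": q3 - q1,
    }


summary = describe([2, 4, 4, 4, 5, 5, 7, 9])
for name, value in summary.items():
    print(f"{name}: {value}")
```

Sample (n - 1) variance is used here on the assumption that the data is a sample and not a full population; `statistics.pvariance` would be the population counterpart.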

📌 Probability & Uncertainty

  • Probability rules
  • Conditional probability
  • Bayes' theorem
  • Permutations & combinations
  • Certainty vs uncertainty reasoning
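
To make conditional probability and Bayes' theorem concrete, the sketch below computes a posterior probability for a binary hypothesis. The function name `posterior` and the input rates are made-up illustrative values, not figures from this project.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Return P(H | E) for a binary hypothesis H given evidence E.

    prior               = P(H)
    sensitivity         = P(E | H)
    false_positive_rate = P(E | not H)
    """
    # Law of total probability: P(E) = P(E|H)P(H) + P(E|not H)P(not H)
    evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / evidence


# Illustrative numbers only: a rare condition and a fairly accurate test.
p = posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(f"P(H | E) = {p:.4f}")
```

Even with a sensitive test, a low prior keeps the posterior well below 50%, which is the kind of uncertainty reasoning this section refers to.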

📌 Statistical Distributions

  • Normal
  • Binomial
  • Poisson
  • Uniform
  • Exponential
  • Chi-square
  • Student’s t
  • F distribution

📌 Inferential Statistics

  • Central Limit Theorem
  • Confidence intervals
  • Hypothesis testing
  • P-values
  • Type I & Type II errors
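
A confidence interval for a mean can be sketched with the standard library alone. This example uses the normal approximation justified by the Central Limit Theorem; for small samples a Student's t critical value would normally be used instead, which the standard library does not provide. The function name and the data are hypothetical.

```python
import math
import statistics
from statistics import NormalDist


def mean_confidence_interval(values, confidence=0.95):
    """Approximate two-sided confidence interval for the mean (z-based)."""
    n = len(values)
    mean = statistics.mean(values)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(values) / math.sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * sem, mean + z * sem


sample = [48.2, 51.5, 49.9, 50.3, 52.1, 47.8, 50.6, 49.4, 51.0, 50.2]
low, high = mean_confidence_interval(sample)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```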

📌 Statistical Tests

  • Z-test
  • One-sample & two-sample t-tests
  • Paired t-test
  • ANOVA
  • F-test
  • Chi-square test
  • Non-parametric alternatives
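
As one example from the list above, a two-sided one-sample Z-test can be written in a few lines. This is a sketch that assumes the population standard deviation is known; the function name `z_test` and the inputs are illustrative and not taken from the repository.

```python
import math
from statistics import NormalDist


def z_test(sample_mean, mu0, sigma, n):
    """Two-sided one-sample Z-test with known population sigma.

    Returns the z statistic and the two-sided p-value.
    """
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Illustrative inputs: H0 says the population mean is 50.
z, p_value = z_test(sample_mean=52, mu0=50, sigma=10, n=100)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

When sigma is unknown and the sample is small, a t-test is the usual choice.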

📌 Regression & Relationships

  • Correlation analysis
  • Simple & multiple linear regression
  • Ordinary Least Squares (OLS)
  • Residual diagnostics
  • Omnibus test
  • Assumption validation
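
For simple linear regression, the OLS estimates have a closed form, and the residuals feed directly into diagnostics. The sketch below is a hand-rolled illustration with hypothetical names and data; it does not show how Stat Analyzer itself fits models.

```python
def ols_fit(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, residuals).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squares of x and sum of cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return slope, intercept, residuals


x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept, residuals = ols_fit(x, y)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print("residuals:", [round(r, 3) for r in residuals])
```

With an intercept in the model, OLS residuals sum to zero up to floating-point error, which makes a quick sanity check before looking at residual plots or normality tests.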

📌 Error & Diagnostic Analysis

  • Mean Error
  • Mean Absolute Error
  • Mean Percentage Error
  • Absolute & percentage errors
  • Error-based decision guidance
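
The error measures above can be sketched as follows. The sign convention (error = actual minus predicted) and the function name are assumptions made for this example; the project may define them differently.

```python
def error_metrics(actual, predicted):
    """Compute mean error, mean absolute error, and mean percentage error.

    Assumes error = actual - predicted and that no actual value is zero.
    """
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    return {
        "mean_error": sum(errors) / n,
        "mean_absolute_error": sum(abs(e) for e in errors) / n,
        "mean_percentage_error": 100 * sum(e / a for e, a in zip(errors, actual)) / n,
    }


# Illustrative values only.
metrics = error_metrics(actual=[100, 200, 400], predicted=[110, 190, 400])
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A mean error near zero alongside a non-zero mean absolute error indicates that over- and under-estimates cancel out, so the two measures are best read together.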

📌 Analytics Types

  • Descriptive analysis
  • Diagnostic analysis
  • Predictive analysis (statistics-only)
  • Prescriptive analysis

📂 Supported Data Formats

  • CSV files
  • Excel files
  • Semi-structured text tables
  • Mixed data types (int, float, categorical, missing values, outliers)

🧪 Sample Datasets

Sample datasets are available in the /data directory:

  • stat_analyzer_test.csv
  • stat_analyzer_financial_risk.csv
  • stat_analyzer_semi_structured.txt

These datasets are intentionally designed with:

  • Missing values
  • Outliers
  • Mixed numeric types
  • Real-world inconsistencies

🖥️ Application Behavior

The system automatically:

  • Detects data types
  • Identifies missing values and outliers
  • Checks statistical assumptions
  • Recommends appropriate tests
  • Explains results in simple and advanced terms
  • Links statistical outcomes to decisions
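
One common way to identify missing values and outliers is to drop missing entries and then apply the 1.5 × IQR rule. The sketch below shows that approach under the assumption that missing values appear as `None` or NaN; it is not necessarily the method this application uses, and `find_outliers` is a hypothetical name.

```python
import math
import statistics


def find_outliers(values):
    """Return values outside the 1.5 * IQR fences, ignoring missing entries."""
    # Treat None and NaN as missing.
    clean = [v for v in values if v is not None and not math.isnan(v)]
    q1, _, q3 = statistics.quantiles(clean, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in clean if v < low or v > high]


# Illustrative column with one missing value and one extreme value.
column = [10, 12, 11, None, 13, 12, 95, 11]
print("missing:", sum(v is None for v in column))
print("outliers:", find_outliers(column))
```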

🚀 How to Run (Basic)

pip install -r requirements.txt
python app/main.py
