# Python & SQL Server Orders Analysis

This project demonstrates loading order data from a CSV file into SQL Server
and analysing it using SQL queries and Python.

The project focuses on validating order data and producing reliable summary
metrics, with SQL and Python results checked against each other.


## Problem

Order data sourced from flat files (such as CSVs) often contains inconsistencies
that make analysis unreliable without additional validation.

Common issues include:

- duplicate order records

- missing or invalid date values

- discrepancies between stored and calculated totals

The aim of this project is to identify and address these issues so that order
and revenue summaries are based on trustworthy data.
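As an illustration of the three issues listed above, the following is a minimal sketch of how they could be detected in plain Python before or after loading. The column names (`order_id`, `order_date`, `quantity`, `unit_price`, `total`), the date format, and the sample rows are all assumptions made for this example and are not taken from the project's actual data.

```python
import csv
import io
from collections import Counter
from datetime import datetime
from decimal import Decimal

# Hypothetical sample data; the column names and values are assumptions
# for illustration only, not the project's real schema.
SAMPLE_CSV = """order_id,order_date,quantity,unit_price,total
1001,2024-01-05,2,9.99,19.98
1002,,1,25.00,25.00
1003,2024-13-40,3,4.50,13.50
1001,2024-01-05,2,9.99,19.98
1004,2024-02-11,4,2.25,10.00
"""


def is_valid_date(value, fmt="%Y-%m-%d"):
    """Return True if value parses as a date in the assumed format."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


def validate_orders(rows):
    """Return the order IDs affected by each of the three data quality issues."""
    rows = list(rows)

    # Duplicate order records: the same order_id appears more than once.
    id_counts = Counter(r["order_id"] for r in rows)
    duplicate_ids = sorted(oid for oid, n in id_counts.items() if n > 1)

    # Missing or invalid dates: empty strings and impossible dates both fail.
    bad_date_ids = sorted(
        {r["order_id"] for r in rows if not is_valid_date(r["order_date"])}
    )

    # Stored total differs from quantity * unit_price.
    # Decimal avoids floating-point rounding noise in the comparison.
    mismatched_ids = sorted(
        {
            r["order_id"]
            for r in rows
            if Decimal(r["quantity"]) * Decimal(r["unit_price"])
            != Decimal(r["total"])
        }
    )

    return {
        "duplicate_ids": duplicate_ids,
        "bad_date_ids": bad_date_ids,
        "mismatched_total_ids": mismatched_ids,
    }


report = validate_orders(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(report)
```

Each check returns order IDs rather than a count, so the flagged rows can be looked up and reviewed individually before deciding whether to correct or exclude them.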


## Approach

1. **Load data into SQL Server**

   - Imported order data from a CSV file into a SQL Server table

   - Ensured appropriate data types for analysis

2. **Validate and analyse using SQL**

   - Ran summary queries to calculate order counts and totals

   - Investigated potential data quality issues such as duplicates and null values

3. **Use Python for additional analysis**

   - Connected to SQL Server using pyodbc

   - Queried and analysed the data using Python

   - Cross-checked SQL results to ensure consistency
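The cross-check in step 3 could look roughly like the sketch below: compute the order count and revenue total once in SQL and once in Python, then compare. The table name `orders`, the column `total`, and the helper names are hypothetical. So that the example runs without a SQL Server instance, it is demonstrated against an in-memory `sqlite3` database; with pyodbc, the same `cross_check` function would receive the connection returned by `pyodbc.connect(...)`.

```python
import sqlite3


def connect_sql_server(conn_str):
    """Open a SQL Server connection via pyodbc.

    pyodbc is imported lazily so the rest of this sketch runs without the
    driver installed. The connection string is supplied by the caller, e.g.
    "DRIVER={ODBC Driver 18 for SQL Server};SERVER=<server>;DATABASE=<db>;..."
    (placeholder values, not the project's settings).
    """
    import pyodbc

    return pyodbc.connect(conn_str)


# Assumed table and column names; adjust to the real schema.
SUMMARY_SQL = "SELECT COUNT(*), SUM(total) FROM orders"
ROWS_SQL = "SELECT total FROM orders"


def cross_check(conn):
    """Compare SQL aggregates with the same figures computed in Python."""
    cur = conn.cursor()

    # Aggregates computed by the database.
    cur.execute(SUMMARY_SQL)
    sql_count, sql_total = cur.fetchone()
    sql_total = sql_total or 0  # SUM returns NULL for an empty table

    # The same aggregates computed in Python from the raw rows.
    cur.execute(ROWS_SQL)
    totals = [row[0] for row in cur.fetchall()]
    py_count = len(totals)
    py_total = sum(totals)

    return {
        "sql_count": sql_count,
        "py_count": py_count,
        "count_match": sql_count == py_count,
        # Round to cents so float representation noise is not reported
        # as a discrepancy.
        "total_match": round(float(sql_total), 2) == round(float(py_total), 2),
    }


# Demonstration against an in-memory SQLite database with made-up rows.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE orders (order_id INTEGER, total REAL)")
demo.executemany(
    "INSERT INTO orders (order_id, total) VALUES (?, ?)",
    [(1, 19.98), (2, 25.00), (3, 13.50)],
)
result = cross_check(demo)
print(result)
demo.close()
```

Because `cross_check` relies only on the standard DB-API cursor methods (`execute`, `fetchone`, `fetchall`), it should work unchanged with a pyodbc connection, provided the table and column names match.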


## Tech Stack

- Python

- SQL Server

- pyodbc


## Current Status

The initial data load and exploratory analysis are complete.

Further validation checks and analysis can be added as the project evolves.