Handwritten Digit Recognition using a Fully Connected Neural Network

Overview

This project implements a handwritten digit recognition system trained on the MNIST dataset using a custom neural network built from scratch with NumPy. A GUI built with Tkinter allows users to draw digits for real-time prediction using the trained model.

I learned the mathematics behind this implementation from 3Blue1Brown's neural network series.

Dataset

The MNIST dataset consists of 60,000 training images and 10,000 testing images. Each image is a grayscale 28x28 pixel representation of a handwritten digit (0–9). The dataset is used in CSV format where the first column represents the digit label and the remaining 784 columns represent pixel intensities.
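As an illustration of that layout, the CSV can be loaded with NumPy roughly as follows. The file name and the header row are assumptions; the repository may read its data differently:

```python
import numpy as np

# Hypothetical file name. Each row: column 0 is the label,
# columns 1-784 are pixel intensities in [0, 255].
# skiprows=1 assumes a header row; drop it if the file has none.
data = np.loadtxt("mnist_train.csv", delimiter=",", skiprows=1)

Y = data[:, 0].astype(int)    # digit labels, shape (60000,)
X = data[:, 1:].T / 255.0     # pixels scaled to [0, 1], shape (784, 60000)

print(X.shape, Y.shape)
```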

Architecture

The neural network follows a feedforward architecture with the following layers:

  • Input Layer: 784 neurons (corresponding to the 28×28 image size)
  • First Hidden Layer: 1024 neurons with ReLU activation
  • Second Hidden Layer: 256 neurons with ReLU activation
  • Output Layer: 10 neurons with Softmax activation

The model is trained using mini-batch gradient descent with a batch size of 64.
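The layer sizes above imply the following parameter shapes. The initialization below is only a sketch (He initialization is a common choice for ReLU layers; the repository may initialize differently):

```python
import numpy as np

rng = np.random.default_rng(0)

def init_params():
    # He initialization: scale by sqrt(2 / fan_in), suited to ReLU activations.
    W1 = rng.standard_normal((1024, 784)) * np.sqrt(2 / 784)
    b1 = np.zeros((1024, 1))
    W2 = rng.standard_normal((256, 1024)) * np.sqrt(2 / 1024)
    b2 = np.zeros((256, 1))
    W3 = rng.standard_normal((10, 256)) * np.sqrt(2 / 256)
    b3 = np.zeros((10, 1))
    return W1, b1, W2, b2, W3, b3
```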

Mathematics

ReLU Activation

The ReLU (Rectified Linear Unit) activation function is defined as:

$$\text{ReLU}(x) = \max(0, x)$$
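In NumPy, ReLU and its derivative (needed later for backpropagation) can be written element-wise:

```python
import numpy as np

def relu(z):
    # max(0, z) applied element-wise
    return np.maximum(0, z)

def relu_derivative(z):
    # 1 where z > 0, 0 elsewhere
    return (z > 0).astype(float)
```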

Softmax Activation

The softmax function outputs a probability distribution over classes:

$$\text{Softmax}(z_i) = \frac{e^{z_i - \max(z)}}{\sum_j e^{z_j - \max(z)}}$$
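Subtracting the maximum in the formula keeps the exponentials numerically stable. A column-wise NumPy version might look like this:

```python
import numpy as np

def softmax(z):
    # z has shape (10, batch_size); subtract the column max for numerical stability.
    shifted = z - z.max(axis=0, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=0, keepdims=True)
```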

Forward Pass

For an input vector $X$, the forward propagation steps are:

$$Z^{[1]} = W^{[1]}X + b^{[1]}$$
$$A^{[1]} = \text{ReLU}(Z^{[1]})$$
$$Z^{[2]} = W^{[2]}A^{[1]} + b^{[2]}$$
$$A^{[2]} = \text{ReLU}(Z^{[2]})$$
$$Z^{[3]} = W^{[3]}A^{[2]} + b^{[3]}$$
$$A^{[3]} = \text{Softmax}(Z^{[3]})$$
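A direct translation of these equations into NumPy, reusing the relu and softmax helpers sketched above (names are illustrative, not necessarily those in recognition.py):

```python
def forward(X, W1, b1, W2, b2, W3, b3):
    # X has shape (784, batch_size); each column is one flattened image.
    Z1 = W1 @ X + b1
    A1 = relu(Z1)
    Z2 = W2 @ A1 + b2
    A2 = relu(Z2)
    Z3 = W3 @ A2 + b3
    A3 = softmax(Z3)
    return Z1, A1, Z2, A2, Z3, A3
```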

Loss Function

The loss function used is cross-entropy loss:

$$\mathcal{L} = -\frac{1}{m} \sum_{i=1}^{m} \sum_{j=1}^{10} Y_j^{(i)} \log\left(A_j^{[3](i)}\right)$$

where $Y^{(i)}$ is the one-hot encoded label of example $i$ and $m$ is the batch size.
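With one-hot labels of shape (10, m), the loss can be computed as below; the small epsilon guarding against log(0) is an implementation detail assumed here:

```python
import numpy as np

def cross_entropy_loss(A3, Y, eps=1e-12):
    # A3, Y: shape (10, m); Y is one-hot encoded.
    m = Y.shape[1]
    return -np.sum(Y * np.log(A3 + eps)) / m
```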

Backpropagation

The gradients are computed as follows:

$$dZ^{[3]} = A^{[3]} - Y$$
$$dW^{[3]} = \frac{1}{m} dZ^{[3]} A^{[2]^T}$$
$$db^{[3]} = \frac{1}{m} \sum dZ^{[3]}$$
$$dA^{[2]} = W^{[3]^T} dZ^{[3]}$$
$$dZ^{[2]} = dA^{[2]} * \text{ReLU}'(Z^{[2]})$$
$$dW^{[2]} = \frac{1}{m} dZ^{[2]} A^{[1]^T}$$
$$db^{[2]} = \frac{1}{m} \sum dZ^{[2]}$$
$$dA^{[1]} = W^{[2]^T} dZ^{[2]}$$
$$dZ^{[1]} = dA^{[1]} * \text{ReLU}'(Z^{[1]})$$
$$dW^{[1]} = \frac{1}{m} dZ^{[1]} X^T$$
$$db^{[1]} = \frac{1}{m} \sum dZ^{[1]}$$
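The same gradients in NumPy, using the cached activations from the forward pass and the relu_derivative helper from above (again an illustrative sketch rather than the exact repository code):

```python
def backward(X, Y, Z1, A1, Z2, A2, A3, W2, W3):
    m = X.shape[1]
    dZ3 = A3 - Y                                   # (10, m)
    dW3 = dZ3 @ A2.T / m
    db3 = dZ3.sum(axis=1, keepdims=True) / m
    dZ2 = (W3.T @ dZ3) * relu_derivative(Z2)       # (256, m)
    dW2 = dZ2 @ A1.T / m
    db2 = dZ2.sum(axis=1, keepdims=True) / m
    dZ1 = (W2.T @ dZ2) * relu_derivative(Z1)       # (1024, m)
    dW1 = dZ1 @ X.T / m
    db1 = dZ1.sum(axis=1, keepdims=True) / m
    return dW1, db1, dW2, db2, dW3, db3
```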

GUI Application

The GUI allows the user to draw a digit, which is then:

  • Centered using the center of mass
  • Denoised by removing low intensity noise
  • Normalized using training set statistics
  • Passed through the neural network for classification
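A minimal sketch of the first three steps above, using scipy.ndimage. The threshold value and function name are assumptions, not the repository's exact settings; mean and std are the training-set statistics saved with the model:

```python
import numpy as np
from scipy import ndimage

def preprocess(canvas, mean, std, threshold=0.1):
    # canvas: 28x28 grayscale drawing with values in [0, 1]
    img = np.where(canvas < threshold, 0.0, canvas)          # drop low-intensity noise

    # Shift the digit so its centre of mass sits at the image centre.
    cy, cx = ndimage.center_of_mass(img)
    if not np.isnan(cy):
        img = ndimage.shift(img, (np.round(13.5 - cy), np.round(13.5 - cx)), order=0)

    x = img.reshape(784, 1)
    return (x - mean) / std                                   # normalise like the training data
```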

Features

  • Real-time digit drawing on a canvas
  • Display of predicted digit
  • Uses saved model parameters from an .npz file

Libraries Used

  • numpy
  • tkinter
  • scipy.ndimage

File Structure

  • recognition.npz: Contains all model weights, biases, and normalization parameters
  • app.py: Tkinter-based GUI for drawing and recognizing digits
  • recognition.py: Neural network training and saving
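As a sketch, the saved parameters could be loaded back like this; the key names inside recognition.npz are assumptions and should match whatever recognition.py uses when saving:

```python
import numpy as np

params = np.load("recognition.npz")
print(params.files)                     # lists whichever arrays were actually saved

# Hypothetical key names; adjust to the actual keys reported above.
W1, b1 = params["W1"], params["b1"]
```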

Accuracy

  • Final test accuracy: ~98%
  • Training accuracy is evaluated at regular intervals during training

Example GUI & Output

GUI

(screenshot of the drawing canvas)

Output

  • Training log (screenshot)
  • Drawing '5' on the canvas yields prediction: 5 (screenshot)
  • Drawing '3' yields prediction: 3 (screenshot)

Usage

  1. Train the model by running:

python recognition.py

This will generate a recognition.npz file containing all model parameters.

  2. Launch the GUI by running:

python app.py

Ensure the model file (recognition.npz) is in the same directory.

Future Work

  • Improve accuracy with convolutional layers
  • Save training progress with checkpoints
  • Allow user to load their own images
  • Experiment with different layer sizes and architectures

License

This project is open-source and free to use.
