Fixed Main.c #9


Open · wants to merge 18 commits into main

Conversation

@Advaitgaur004 (Contributor) commented Mar 6, 2025

Description

This PR implements proper fixes and enhancements to the neural network library, with a focus on correcting the main.c training loop and gradient computation.

Changes

Core Operations

  • Added Tensor_transpose function for matrix transposition
  • Fixed GradFn_matmul to properly handle gradient computation using transposition
  • Added Tensor_sub operation with appropriate gradient functions
  • Enhanced broadcasting to properly handle scalar broadcasting
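The transpose-based matmul backward the PR describes can be sketched as follows. This is a minimal illustration using raw row-major float arrays rather than the library's Tensor type; `transpose` and `matmul` here are hypothetical helpers, not the actual `Tensor_transpose`/`GradFn_matmul` signatures. For C = A·B, the gradients are dA = dC·Bᵀ and dB = Aᵀ·dC.

```c
#include <assert.h>
#include <stddef.h>

/* Transpose an m x n matrix a into the n x m matrix out (row-major). */
static void transpose(const float *a, float *out, size_t m, size_t n) {
    for (size_t i = 0; i < m; i++)
        for (size_t j = 0; j < n; j++)
            out[j * m + i] = a[i * n + j];
}

/* c = a (m x k) times b (k x n), row-major. */
static void matmul(const float *a, const float *b, float *c,
                   size_t m, size_t k, size_t n) {
    for (size_t i = 0; i < m; i++)
        for (size_t j = 0; j < n; j++) {
            float acc = 0.0f;
            for (size_t p = 0; p < k; p++)
                acc += a[i * k + p] * b[p * n + j];
            c[i * n + j] = acc;
        }
}

/* Backward rule for C = A*B:
 *   dA = dC * B^T   (m x n times n x k -> m x k)
 *   dB = A^T * dC   (k x m times m x n -> k x n)
 */
```

Computing both gradients thus only needs the existing matmul plus a transpose, which is exactly why adding `Tensor_transpose` unblocks a correct `GradFn_matmul`.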

Neural Network Components

  • Added nn_random_init function implementing Xavier/Glorot initialization
  • Fixed ReLU gradient function to properly use 0.0f/1.0f values
  • Corrected softmax gradient computation to handle batched inputs properly
  • Added GradNode naming for better debugging and gradient tracking

Loss Functions

  • Fixed nn_crossentropy to avoid numerical issues by adding epsilon
  • Added nn_softmax_crossentropy to stabilize training
  • Implemented proper gradients for both loss functions

Training Improvements

  • Added data normalization to preprocess inputs
  • Implemented dataset shuffling for better training
  • Enhanced batch handling with proper actual_batch_size calculations
  • Increased the learning rate from 0.001 to 0.01
  • Added epoch average loss tracking for monitoring
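The shuffling and batch-handling items above can be sketched as two small helpers. These are illustrative stand-ins, assuming index-based shuffling each epoch; `shuffle_indices` and `actual_batch_size` are hypothetical names, not necessarily what main.c uses.

```c
#include <assert.h>
#include <stdlib.h>
#include <stddef.h>

/* Fill idx with 0..n-1, then Fisher-Yates shuffle; call once per epoch
 * so each epoch visits samples in a fresh order. */
static void shuffle_indices(size_t *idx, size_t n) {
    for (size_t i = 0; i < n; i++)
        idx[i] = i;
    if (n < 2)
        return;
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % (i + 1);
        size_t tmp = idx[i];
        idx[i] = idx[j];
        idx[j] = tmp;
    }
}

/* The final batch of an epoch may be short; clamp it instead of
 * reading past the end of the dataset. */
static size_t actual_batch_size(size_t start, size_t batch, size_t total) {
    return (start + batch <= total) ? batch : total - start;
}
```

The clamped batch size also matters for the loss average: dividing the epoch loss by the true number of samples processed, not by `num_batches * batch_size`, keeps the reported average honest when the last batch is partial.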

Memory Management

  • Fixed memory allocation and deallocation
  • Better variable initialization

@Advaitgaur004 Advaitgaur004 changed the title Add nn_random_init function for tensor initialization and update model weight initialization Fix Main.c Mar 9, 2025
@Advaitgaur004 (Contributor, Author)

[image attachment]
Works !!

@Advaitgaur004 Advaitgaur004 changed the title Fix Main.c Fixed Main.c Mar 17, 2025
Reason: with the MSVC compiler, one sees consistent -431602080.0000 values (likely MSVC's debug-heap fill pattern for uninitialized memory).

The key problem is that after allocating memory for the tensor data, the code doesn't initialize the actual floating-point values. It only sets the numel field of the FloatBuffer structure.
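A sketch of the fix, assuming a `FloatBuffer` shaped like the description (a `data` pointer plus a `numel` count; the real struct and constructor name may differ): allocating the element array with `calloc` instead of `malloc` guarantees the floats start at zero, so MSVC's debug-heap fill bytes never show up as -431602080.0f.

```c
#include <assert.h>
#include <stdlib.h>
#include <stddef.h>

/* Hypothetical buffer mirroring the description above. With plain malloc,
 * data holds garbage (MSVC's debug heap fills it with a fixed byte
 * pattern); calloc zero-fills it instead. */
typedef struct {
    float *data;
    size_t numel;
} FloatBuffer;

static FloatBuffer *floatbuffer_create(size_t numel) {
    FloatBuffer *buf = malloc(sizeof *buf);
    if (!buf)
        return NULL;
    buf->data = calloc(numel, sizeof *buf->data); /* zero-initialized */
    if (!buf->data) {
        free(buf);
        return NULL;
    }
    buf->numel = numel;
    return buf;
}
```

Zeroing at allocation is the conservative choice; if every element is about to be overwritten anyway (e.g. by `nn_random_init`), a plain `malloc` followed by an explicit initialization pass works too, but the allocation itself must never be the only "initialization".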