Hi there! I'm Prashant 👋

AI/ML Engineer | MS @NYU 🎓 | Ex-LTIMindtree

🚀 About Me

I'm passionate about building AI solutions and LLM-powered systems. Previously worked at LTIMindtree developing enterprise AI applications.

🔭 I enjoy working on:

• 🤖 Large Language Models (LLMs)
• 🧠 AI/ML Systems
• 📝 Natural Language Processing

🌟 Open Source Contributions:

🔗 LangChain

Integrated a Java code parser into the framework, enabling simplified parsing of Java source code for LLM consumption and improving its code-analysis capabilities.
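The idea behind that kind of parser can be sketched in a few lines of plain Python. This is a hypothetical, regex-based illustration only, not the actual LangChain integration; `split_java_methods` and the regex are mine:

```python
import re

# Hypothetical sketch: split a Java source file into per-method chunks
# so each chunk can be fed to an LLM independently. A real parser is
# far more robust; this only illustrates the idea.
JAVA_METHOD = re.compile(r'(?:public|private|protected)[^{;]*\([^)]*\)\s*\{')

def split_java_methods(source: str) -> list[str]:
    """Return one chunk per method, from signature to its closing brace."""
    chunks = []
    for m in JAVA_METHOD.finditer(source):
        depth, i = 0, m.start()
        while i < len(source):
            if source[i] == '{':
                depth += 1
            elif source[i] == '}':
                depth -= 1
                if depth == 0:
                    chunks.append(source[m.start():i + 1])
                    break
            i += 1
    return chunks

demo = """
public class Greeter {
    public String hello(String name) {
        return "Hi " + name;
    }
    private int add(int a, int b) {
        return a + b;
    }
}
"""
chunks = split_java_methods(demo)  # one chunk per method
```

A brace-counting approach like this breaks on braces inside string literals or comments, which is exactly why a proper language-aware parser is worth contributing.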

Contributed to an LLM deployment toolkit by resolving a critical runtime issue in NeoX-based models, improving the project's model-serving capabilities.

📫 Let's Connect!

Feel free to reach out: prashant.shihora@nyu.edu

Pinned

  1. Large-Scale-Training-Inference-and-Continuous-Deployment-Platform (Public)

    Forked from theomthakur/ece-gy-9183-group19

    Jupyter Notebook

  2. CUDA-performance-optimization-project (Public)

    GPU optimization techniques for ML: memory coalescing, tiling, and convolution implementations

    Cuda

  3. edge-ml-optimization (Public)

    Post-training quantization (PTQ) for edge deployment. INT8 implementation with per-channel scaling, activation calibration, and bias correction for 4x compression with <1% accuracy loss.

    Jupyter Notebook

  4. multi-gpu-training (Public)

    PyTorch DDP implementation exploring gradient synchronization and scaling dynamics on multi-GPU clusters. Benchmarks communication bottlenecks and large-batch convergence strategies.

    Python

  5. Low-Rank-RoBERTa (Public)

    Repository for Low-Rank RoBERTa: Achieving High Accuracy Under 1M Parameters

    Jupyter Notebook

  6. speculative-moe (Public)

    Python
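The tiling technique named in CUDA-performance-optimization-project can be illustrated with a NumPy sketch. This is an analogy only; the repository's actual kernels are CUDA, and `tiled_matmul` is an invented name:

```python
import numpy as np

def tiled_matmul(A, B, tile=32):
    # Blocked matrix multiply: working on tile x tile sub-blocks mirrors
    # how a CUDA kernel stages tiles in shared memory so each loaded
    # operand is reused many times before being evicted.
    n, k = A.shape
    k2, m = B.shape
    assert k == k2
    C = np.zeros((n, m), dtype=A.dtype)
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for k0 in range(0, k, tile):
                C[i0:i0 + tile, j0:j0 + tile] += (
                    A[i0:i0 + tile, k0:k0 + tile]
                    @ B[k0:k0 + tile, j0:j0 + tile])
    return C
```

The blocked loop computes exactly the same result as a plain matrix product; only the memory-access order changes, which is the point of the optimization.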
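The per-channel symmetric INT8 scheme described for edge-ml-optimization can be sketched minimally as below. The function names are mine and the repository additionally performs activation calibration and bias correction; this shows only the weight-quantization core:

```python
import numpy as np

def quantize_per_channel(W):
    # Symmetric INT8 PTQ with one scale per output channel (row),
    # so channels with small weights keep their precision.
    scales = np.abs(W).max(axis=1, keepdims=True) / 127.0
    scales = np.where(scales == 0, 1.0, scales)  # guard all-zero rows
    q = np.clip(np.round(W / scales), -127, 127).astype(np.int8)
    return q, scales

def dequantize(q, scales):
    return q.astype(np.float32) * scales

W = np.random.randn(8, 16).astype(np.float32)
q, s = quantize_per_channel(W)
err = np.abs(dequantize(q, s) - W).max()  # bounded by half a scale step
```

Storing INT8 instead of FP32 gives the 4x compression the description mentions; the per-channel scales add only one float per row.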
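One property multi-gpu-training builds on can be shown in a few lines: averaging equal-sized per-worker gradients (what DDP's all-reduce effects) reproduces the full-batch gradient. This NumPy simulation is an illustration under that assumption, not code from the repository:

```python
import numpy as np

def grad_mse(w, X, y):
    # Gradient of mean((Xw - y)^2) with respect to w.
    return 2 * X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
y = rng.normal(size=8)
w = rng.normal(size=3)

# Two "workers", each holding an equal shard of the batch.
shards = [(X[:4], y[:4]), (X[4:], y[4:])]
ddp_grad = np.mean([grad_mse(w, Xs, ys) for Xs, ys in shards], axis=0)
full_grad = grad_mse(w, X, y)  # identical to the averaged shard gradients
```

The equivalence holds only for equal shard sizes and mean-reduced losses, which is one reason large-batch scaling and communication cost are worth benchmarking separately.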
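The core trick behind Low-Rank-RoBERTa, factorizing weight matrices, can be sketched as a parameter-count comparison. The dimensions below are illustrative assumptions, not values from the repository:

```python
import numpy as np

# Replace a d_out x d_in weight W with B @ A, where A is r x d_in and
# B is d_out x r. For small r this slashes the parameter count, which
# is how a model can stay under a tight parameter budget.
d_in, d_out, r = 768, 768, 16
full_params = d_out * d_in              # 589,824
low_rank_params = r * d_in + d_out * r  # 24,576 -> 24x fewer at rank 16

A = np.random.randn(r, d_in)
B = np.random.randn(d_out, r)
W = B @ A  # the effective weight never exceeds rank r
```

The trade-off is expressiveness: the factored layer can only represent rank-r maps, so the rank must be tuned against accuracy.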