SanketBani bridges the gap between hearing and speech-impaired individuals and others by translating text, images, and gestures into 3D Indian Sign Language (ISL) animations. It’s a simple, inclusive solution for seamless communication.


SanketBani - An ISL Sign Language-Based App

SanketBani is an innovative application designed to bridge the communication gap between hearing and speech-impaired communities and those who can hear and speak. It leverages advanced AI and 3D technology to convert text, images, and real-time gestures into Indian Sign Language (ISL) gestures, fostering seamless communication and understanding.


Features

1. Text-to-ISL Conversion

  • Converts any input text into 3D animated Indian Sign Language gestures.
  • Allows users to type or paste text and view the corresponding ISL gestures in real time.

2. Image-to-ISL Conversion

  • Recognizes text from images and converts it into ISL gestures.
  • Uses advanced Optical Character Recognition (OCR) and AI models to extract and process text from images.

3. Real-Time Gesture Conversations

  • Facilitates real-time conversations between individuals who can and cannot hear or speak.
  • Provides smooth interaction using gesture recognition and 3D animated ISL responses.

4. Gesture-to-Text Conversion

  • Converts sign language gestures from videos into readable text.
  • Enables understanding of ISL gestures by those who are unfamiliar with sign language.

5. Custom Dataset

  • Incorporates a structured dataset of ISL gestures, including alphabets, numbers, common sentences, and frequently used words.
  • Expands functionality by combining gestures for new or uncommon phrases.

Technology Stack

1. Frontend

  • React Native for the mobile application.
  • Three.js for rendering 3D models.
  • Expo Go for development and testing.

2. Backend

  • Node.js for API handling.
  • APIs for OCR, text-to-sign gesture generation, and other AI functionalities.

3. Recognition Model

  • OpenCV and MediaPipe to extract pose landmarks and features from videos.
  • TensorFlow and Python for building and training the neural networks.
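As a rough illustration of this step, the per-frame keypoint extraction could look like the following. This is a hedged sketch, not the repository's actual code: the 33-point pose and 21-point hand layouts come from MediaPipe Holistic, and the `extract_keypoints` function and its zero-fill fallback are assumptions.

```python
import numpy as np

# Hypothetical sketch: flatten one frame of MediaPipe Holistic output into a
# fixed-length feature vector, as commonly done before stacking frames into an
# .npy file. `results` is assumed to expose .pose_landmarks,
# .left_hand_landmarks, and .right_hand_landmarks (each may be None).

def extract_keypoints(results):
    # Pose: 33 landmarks x (x, y, z, visibility); zeros when nothing detected
    pose = (np.array([[p.x, p.y, p.z, p.visibility]
                      for p in results.pose_landmarks.landmark]).flatten()
            if results.pose_landmarks else np.zeros(33 * 4))
    # Hands: 21 landmarks x (x, y, z) each
    left = (np.array([[p.x, p.y, p.z]
                      for p in results.left_hand_landmarks.landmark]).flatten()
            if results.left_hand_landmarks else np.zeros(21 * 3))
    right = (np.array([[p.x, p.y, p.z]
                       for p in results.right_hand_landmarks.landmark]).flatten()
             if results.right_hand_landmarks else np.zeros(21 * 3))
    return np.concatenate([pose, left, right])  # 33*4 + 21*3 + 21*3 = 258 values
```

Stacking one such vector per frame yields the (frames, 258) arrays that a sequence classifier can consume.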

Folder Structure and Setup

The SanketBani project consists of three main folders:

1. Frontend

  • Contains the React Native application.
  • Setup:
    cd frontend
    npm install
  • For the first build, run:
    npm run android
  • For subsequent runs:
    npm run start

2. Backend

  • Contains the Node.js server for handling APIs and other backend functionalities.
  • Setup:
    cd backend
    npm install
    npm run dev

3. Recognition Model

  • Contains the Python-based model for video gesture recognition.
  • Setup (note the quotes, since the folder names contain spaces):
    cd "Recognition Model"
    cd "Video Recognition"
    pip install -r requirements.txt
    python app.py

Environment Variables

To ensure proper functioning, create .env files in both the frontend and backend folders with the following structure:

Frontend .env Example

API_URL=https://example.com/api # your backend URL

Backend .env Example

PORT=5000
MONGO_URI=mongodb+srv://user:[email protected]/dbname
JWT_SECRET=your_jwt_secret
[email protected]
PASSWORD="addf ada sdss wdsd" # app password for Nodemailer
CLOUDINARY_CLOUD_NAME="example"
CLOUDINARY_API_KEY="example"
CLOUDINARY_API_SECRET="example"
GOOGLE_APPLICATION_CREDENTIALS="your json file path"
GEMINI_API_KEY="example"
ELEVENLABS_API_KEY="example"

How It Works

Input Processing

  1. The user provides input via text, image, or gestures.
  2. Text is processed to correct grammar and spelling.
  3. A sequence of ISL gestures is generated if the input matches the dataset.
  4. For new inputs, the AI combines existing gestures or generates a new sequence.
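The lookup in the last two steps can be sketched roughly as below. This is an illustrative assumption, not the app's actual code: `GESTURE_DATASET`, its structure, and the letter-by-letter fingerspelling fallback are all hypothetical.

```python
# Hypothetical sketch of the input-processing step: look up whole phrases in
# the gesture dataset first, then fall back to word-level gestures, and
# finally to fingerspelling individual letters for unknown words.

GESTURE_DATASET = {  # assumed structure: phrase/word/letter -> animation id
    "how are you": "anim_how_are_you",
    "thank you": "anim_thank_you",
    "hello": "anim_hello",
    "a": "anim_letter_a", "c": "anim_letter_c", "t": "anim_letter_t",
}

def text_to_gesture_sequence(text):
    normalized = text.lower().strip()
    if normalized in GESTURE_DATASET:      # exact phrase match
        return [GESTURE_DATASET[normalized]]
    sequence = []
    for word in normalized.split():
        if word in GESTURE_DATASET:        # known word
            sequence.append(GESTURE_DATASET[word])
        else:                              # fingerspell unknown words
            sequence.extend(GESTURE_DATASET.get(ch, f"anim_letter_{ch}")
                            for ch in word if ch.isalpha())
    return sequence

# text_to_gesture_sequence("How are you") -> ["anim_how_are_you"]
```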

Output Rendering

  • Processed input is converted into a series of prebuilt 3D animations.
  • Animations are displayed in the app’s 3D viewer for users to follow or understand.

Machine Learning

Converting Videos into Sentences

To convert videos into sentences, we built a repository of 15 sentences, listed below. For each video, we used OpenCV and MediaPipe to extract the pose and features and saved them to an .npy file. These files were then used to train an LSTM neural network as a classification model. Currently the app supports only these 15 sentences, but we are working to add more.

- 'Are you free today' 
- 'Can you repeat that please'
- 'Congratulations' 
- 'Help me please' 
- 'How are you' 
- 'I am fine'
- 'I love you' 
- 'No' 
- 'Please come, Welcome' 
- 'Talk slower please'
- 'Thank you' 
- 'What are you doing' 
- 'What do you do'
- 'What happened' 
- 'Yes'
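Assembling the training arrays from those per-video .npy files might look like this. A hedged sketch with synthetic stand-in data: the sequence length, feature size, and function names are assumptions, not the project's actual values.

```python
import numpy as np

SENTENCES = ["How are you", "I am fine", "Thank you"]  # subset of the 15
SEQ_LEN, N_FEATURES = 30, 258  # assumed: 30 frames, 258 keypoint values/frame

def build_training_arrays(sequences_per_label):
    # sequences_per_label: {label_index: [array of shape (SEQ_LEN, N_FEATURES), ...]}
    X, y = [], []
    for label, seqs in sequences_per_label.items():
        for seq in seqs:
            X.append(seq)
            y.append(label)
    X = np.stack(X)                # (n_samples, SEQ_LEN, N_FEATURES)
    y = np.eye(len(SENTENCES))[y]  # one-hot labels for the classifier
    return X, y

# Synthetic stand-ins for the real .npy keypoint files:
data = {i: [np.random.rand(SEQ_LEN, N_FEATURES) for _ in range(2)]
        for i in range(len(SENTENCES))}
X, y = build_training_arrays(data)
# X and y are now shaped to feed a sequence classifier such as a Keras LSTM
# stack ending in a softmax over the sentence classes.
```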

Converting Sentences into 3D Animations

Currently this is done by calling an API backed by a Large Language Model, with the classification task passed as context. We are working on an embedded, lightweight sentence-classification solution that maps our sentences to animation data.
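One possible shape for that embedded classifier is sketched below. This is only an illustration: it uses simple token overlap (Jaccard similarity) where the planned solution would presumably use sentence embeddings, and all names here are hypothetical.

```python
# Hypothetical sketch: map free-form input to the closest supported sentence
# by token overlap, as a lightweight stand-in for an embedding model.

SUPPORTED = ["How are you", "I am fine", "Thank you", "What happened"]

def classify_sentence(text):
    tokens = set(text.lower().split())
    def score(candidate):
        cand = set(candidate.lower().split())
        # Jaccard similarity: shared tokens over total distinct tokens
        return len(tokens & cand) / len(tokens | cand)
    return max(SUPPORTED, key=score)

# classify_sentence("how are you doing") -> "How are you"
```

The returned sentence can then index directly into the prebuilt animation data.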

Additional Resources

  • DataSet
  • Recognition Model

Use Cases

  • Education: Helps students interactively learn Indian Sign Language.
  • Accessibility: Enables hearing and speech-impaired individuals to communicate with those who do not know sign language.
  • Translation: Acts as a translator for conversations between different communities.
  • Awareness: Promotes ISL adoption and awareness in workplaces, schools, and public spaces.

Challenges Solved

  • Communication Barriers: Reduces the gap between communities with and without knowledge of ISL.
  • Resource Scarcity: Provides prebuilt animations for 285+ gestures, including alphabets, numbers, and common phrases.
  • Ease of Use: User-friendly interface for inputting and viewing ISL gestures in various formats.

Installation

Prerequisites

  • Node.js
  • Python (for recognition model)
  • Expo Go

Steps

  1. Clone the repository:
    git clone https://github.com/Phinix-BI/SanketBani
  2. Navigate to the project directory:
    cd SanketBani
  3. Set up each folder per the Folder Structure and Setup section.

Roadmap

Future Enhancements

  • Support for more languages beyond English.
  • Additional gesture datasets and animations.
  • Offline functionality for remote areas.
  • Integration with voice recognition for enhanced accessibility.
  • Real-time gesture conversion during video calls.

Contributions

We welcome contributions to enhance SanketBani. To contribute:

  • Fork the repository.
  • Create a new branch for your feature or bug fix.
  • Submit a pull request with a detailed description of your changes.

License

SanketBani is licensed under the MIT License. See the LICENSE file for details.


Contact

For queries, feature requests, or support:
