SanketBani is an innovative application designed to bridge the communication gap between hearing and speech-impaired communities and those who can hear and speak. It leverages advanced AI and 3D technology to convert text, images, and real-time gestures into Indian Sign Language (ISL) gestures, fostering seamless communication and understanding.
- Converts any input text into 3D animated Indian Sign Language gestures.
- Allows users to type or paste text and view the corresponding ISL gestures in real time.
- Recognizes text from images and converts it into ISL gestures.
- Uses advanced Optical Character Recognition (OCR) and AI models to extract and process text from images.
- Facilitates real-time conversations between individuals who can and cannot hear or speak.
- Provides smooth interaction using gesture recognition and 3D animated ISL responses.
- Converts sign language gestures from videos into readable text.
- Enables understanding of ISL gestures by those who are unfamiliar with sign language.
- Incorporates a structured dataset of ISL gestures, including alphabets, numbers, common sentences, and frequently used words.
- Expands functionality by combining gestures for new or uncommon phrases.
- React Native for the mobile application.
- Three.js for rendering 3D models.
- Expo Go for development and testing.
- Node.js for API handling.
- APIs for OCR, text-to-sign gesture generation, and other AI functionalities.
- OpenCV and MediaPipe for extracting pose and feature keypoints from videos.
- TensorFlow and Python for building and training the neural networks.
The SanketBani project consists of three main folders:
- `frontend`: contains the React Native application.
- Guides for setup: run `cd frontend` and `npm install`.
- For the first build, run `npm run android`.
- For subsequent runs, run `npm run start`.
- `backend`: contains the Node.js server for handling APIs and other backend functionalities.
- Guides for setup: run `cd backend`, `npm install`, then `npm run dev`.
- `Recognition Model`: contains the Python-based model for video gesture recognition.
- Guides for setup: run `cd "Recognition Model/Video Recognition"`, `pip install -r requirements.txt`, then `python app.py`.
To ensure proper functioning, create .env files in both the frontend and backend folders with the following structure (all values shown are placeholders):

    # Your backend URL
    API_URL=https://example.com/api
    PORT=5000
    MONGO_URI=mongodb+srv://user:[email protected]/dbname
    JWT_SECRET=your_jwt_secret
    EMAIL=[email protected]
    # App password used by Nodemailer
    PASSWORD="addf ada sdss wdsd"
    CLOUDINARY_CLOUD_NAME="example"
    CLOUDINARY_API_KEY="example"
    CLOUDINARY_API_SECRET="example"
    GOOGLE_APPLICATION_CREDENTIALS="your json file path"
    GEMINI_API_KEY="example"
    ELEVENLABS_API_KEY="example"
- The user provides input via text, image, or gestures.
- Text is processed to correct grammar and spelling.
- A sequence of ISL gestures is generated if the input matches the dataset.
- For new or unmatched inputs, the AI combines existing gestures or generates a new sequence.
- Processed input is converted into a series of prebuilt 3D animations.
- Animations are displayed in the app’s 3D viewer for users to follow or understand.
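The lookup described above can be sketched as a small Python routine: a full-sentence match takes precedence, known words map to prebuilt clips, and anything unmatched falls back to letter-by-letter fingerspelling. The dataset entries and clip names here are illustrative placeholders, not the app's actual gesture data.

```python
# Sketch of the text-to-gesture lookup. Precedence: whole sentence >
# individual word > fingerspelling. All names below are placeholders.
SENTENCE_CLIPS = {"how are you": ["clip_how_are_you"]}
WORD_CLIPS = {"hello": "clip_hello", "thank": "clip_thank"}
LETTER_CLIPS = {ch: f"clip_letter_{ch}" for ch in "abcdefghijklmnopqrstuvwxyz"}

def to_gesture_sequence(text: str) -> list[str]:
    """Return an ordered list of 3D animation clip names for `text`."""
    normalized = " ".join(text.lower().split())
    if normalized in SENTENCE_CLIPS:           # exact sentence match
        return list(SENTENCE_CLIPS[normalized])
    clips = []
    for word in normalized.split():
        if word in WORD_CLIPS:                 # known word clip
            clips.append(WORD_CLIPS[word])
        else:                                  # fingerspell unknown words
            clips.extend(LETTER_CLIPS[ch] for ch in word if ch in LETTER_CLIPS)
    return clips
```

For example, `to_gesture_sequence("Hello ISL")` yields one word clip followed by three letter clips, while an exact sentence match returns its single prebuilt sequence.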
To convert videos into sentences, we built a repository of 15 sentences, listed below. For each video, we used OpenCV and MediaPipe to extract pose and feature keypoints and saved them into .npy files. These files were then used to train an LSTM neural network as a classification model. Currently the app supports only these 15 sentences, but we are working to add more.
- 'Are you free today'
- 'Can you repeat that please'
- 'Congratulations'
- 'Help me please'
- 'How are you'
- 'I am fine'
- 'I love you'
- 'No'
- 'Please come, Welcome'
- 'Talk slower please'
- 'Thank you'
- 'What are you doing'
- 'What do you do'
- 'What happened'
- 'Yes'
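The per-video preprocessing can be sketched as follows. In the real pipeline the per-frame keypoints come from OpenCV and MediaPipe; here the extractor is a stub so the shape handling can be shown on its own. The sequence length of 30 and the feature count of 1662 (MediaPipe Holistic pose, face, and both hands) are assumed values, not confirmed by this README.

```python
import numpy as np

# Sketch of turning one video into a fixed-shape .npy training sample.
SEQUENCE_LENGTH = 30   # frames kept per video (assumed value)
NUM_FEATURES = 1662    # assumed MediaPipe Holistic landmark value count

def extract_keypoints(frame) -> np.ndarray:
    """Placeholder for the MediaPipe keypoint extraction step."""
    return np.zeros(NUM_FEATURES, dtype=np.float32)

def video_to_npy(frames, out_path: str) -> np.ndarray:
    """Stack per-frame keypoints into (SEQUENCE_LENGTH, NUM_FEATURES)."""
    keypoints = [extract_keypoints(f) for f in frames[:SEQUENCE_LENGTH]]
    # Pad short videos with zero frames so every sample has equal length,
    # which the LSTM classifier requires.
    while len(keypoints) < SEQUENCE_LENGTH:
        keypoints.append(np.zeros(NUM_FEATURES, dtype=np.float32))
    sample = np.stack(keypoints)
    np.save(out_path, sample)   # one .npy file per video
    return sample

sample = video_to_npy([object()] * 12, "sample.npy")
```

Fixing the sample shape up front keeps the .npy files uniform, so they can be batched directly when training the LSTM classifier.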
Currently this is done via an API call to a Large Language Model, with the classification task passed as context. We are working on an embedded, lightweight sentence-classification solution that maps our sentences to animation data.
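One way such a lightweight, embedded mapper could look is sketched below: free-form text is matched to the closest supported sentence by simple word overlap. The Jaccard heuristic is an illustrative assumption, not the shipped implementation.

```python
# Sketch of a lightweight sentence mapper: pick the supported sentence
# sharing the most words with the input. Heuristic is an assumption.
SUPPORTED = [
    "are you free today", "can you repeat that please", "congratulations",
    "help me please", "how are you", "i am fine", "i love you", "no",
    "please come, welcome", "talk slower please", "thank you",
    "what are you doing", "what do you do", "what happened", "yes",
]

def closest_sentence(text: str) -> str:
    """Return the supported sentence with the highest word overlap."""
    words = set(text.lower().replace(",", "").split())
    def overlap(candidate: str) -> float:
        cand = set(candidate.replace(",", "").split())
        return len(words & cand) / len(words | cand)  # Jaccard similarity
    return max(SUPPORTED, key=overlap)
```

Because it needs no network call, a mapper like this could run on-device and feed the matched sentence straight into the animation lookup.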
- Education: Helps students interactively learn Indian Sign Language.
- Accessibility: Enables hearing and speech-impaired individuals to communicate with those who do not know sign language.
- Translation: Acts as a translator for conversations between different communities.
- Awareness: Promotes ISL adoption and awareness in workplaces, schools, and public spaces.
- Communication Barriers: Reduces the gap between communities with and without knowledge of ISL.
- Resource Scarcity: Provides prebuilt animations for 285+ gestures, including alphabets, numbers, and common phrases.
- Ease of Use: User-friendly interface for inputting and viewing ISL gestures in various formats.
- Node.js
- Python (for recognition model)
- Expo Go
- Clone the repository: `git clone https://github.com/Phinix-BI/SanketBani`
- Navigate to the project directory: `cd ISL-Recognition`
- Set up each folder per the Folder Structure and Setup section.
- Support for more languages beyond English.
- Additional gesture datasets and animations.
- Offline functionality for remote areas.
- Integration with voice recognition for enhanced accessibility.
- Real-time gesture conversion during video calls.
We welcome contributions to enhance SanketBani. To contribute:
- Fork the repository.
- Create a new branch for your feature or bug fix.
- Submit a pull request with a detailed description of your changes.
SanketBani is licensed under the MIT License. See the LICENSE file for details.
For queries, feature requests, or support:
- Email: [email protected]
- Website: www.sanketbani.com