A Windows desktop tool that uses PhoWhisper (VinAI) and Whisper (OpenAI) to generate Vietnamese text from audio files

Phuc75nguyen/Speech2Text4Vietnamese

🚀 Python Desktop Tool – Mp3toText

The Mp3toText tool automatically transcribes Vietnamese audio files into text.
Its simple interface lets users turn full conversations into usable text in just a few clicks.


✨ Key Features

  • 📝 Choose between the Whisper (OpenAI) and PhoWhisper (VinAI) models, both downloaded to the local machine.
  • Automatically transcribe audio into text that can then be fed into other systems.
  • 🔄 Speaker gender detection (male vs. female) by clustering voice frequency (Hz).
  • 🔄 Export results with a clear and intuitive display.
  • 🖥️ User-friendly interface, no programming skills required.
  • 🛠️ Compare models: users can evaluate whether OpenAI’s Whisper or VinAI’s PhoWhisper transcribes Vietnamese more accurately.
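
The gender-detection feature above is described only as clustering on voice frequency. The following is a minimal sketch of that idea, not the tool's actual code. It assumes one pitch estimate (fundamental frequency, in Hz) per speech segment has already been extracted, and it uses 1-D k-means with two clusters as a stand-in for whatever clustering the app performs. The function name `cluster_pitches` and the sample values are hypothetical.

```python
from statistics import mean


def cluster_pitches(pitches_hz, iterations=20):
    """Split pitch values (Hz) into two groups with 1-D k-means (k=2).

    Assumption: the lower-pitched cluster is labeled "male" and the
    higher-pitched one "female". This is a heuristic, not a guarantee.
    """
    if len(pitches_hz) < 2:
        raise ValueError("need at least two pitch values to form two clusters")

    # Start the centroids at the extremes so the low cluster stays low.
    low, high = min(pitches_hz), max(pitches_hz)
    labels = []
    for _ in range(iterations):
        # Assign each value to the nearer centroid (ties go to the low one).
        new_labels = [
            "male" if abs(p - low) <= abs(p - high) else "female"
            for p in pitches_hz
        ]
        if new_labels == labels:
            break  # assignments stopped changing: converged
        labels = new_labels

        # Move each centroid to the mean of its assigned values.
        low_group = [p for p, l in zip(pitches_hz, labels) if l == "male"]
        high_group = [p for p, l in zip(pitches_hz, labels) if l == "female"]
        if low_group:
            low = mean(low_group)
        if high_group:
            high = mean(high_group)
    return labels, (low, high)


# Hypothetical per-segment pitch estimates in Hz.
segments = [110.0, 125.0, 118.0, 210.0, 225.0, 198.0]
labels, centroids = cluster_pitches(segments)
print(labels)  # ['male', 'male', 'male', 'female', 'female', 'female']
```

Because the labels depend only on which cluster is lower, a recording with a single speaker, or with two speakers of similar pitch, would still be split in two. A real implementation would need a sanity check on the distance between centroids.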

📖 Approach

Whisper Model – OpenAI

Whisper-OpenAI

PhoWhisper Model – VinAI

📄 View PhoWhisper Paper (VinAI)


📸 Application Interface

Main screen on launch

[Screenshot: App Launch]

Processing data with models

[Screenshots: PhoWhisper Processing, Whisper Processing]

Final transcription results

[Screenshots: PhoWhisper Result, Whisper Result]
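
To compare the two models' output with a number rather than by eye, one option is word error rate (WER) against a transcript you have written by hand. The README does not say how the app itself compares models, so this is only a sketch of a common metric. The function name `word_error_rate` and the sample sentences are hypothetical. Since Vietnamese separates syllables with spaces, splitting on whitespace makes this effectively a syllable-level error rate.

```python
def word_error_rate(reference, hypothesis):
    """Return WER: word-level edit distance divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words (Levenshtein, row by row).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: a hand-written reference and two model outputs.
reference = "xin chào các bạn"
model_a_output = "xin chào các bạn"
model_b_output = "xin chào cac bạn"

print(word_error_rate(reference, model_a_output))  # 0.0
print(word_error_rate(reference, model_b_output))  # 0.25
```

A lower value means fewer word-level mistakes. Punctuation and number formatting are not normalized here, so both transcripts should be cleaned the same way before comparing.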


🛠️ How to Use

  1. Prepare your audio file in a supported format such as .mp3 or .mp4 (example: fileAudio.mp3).
  2. Open the application Mp3toText.exe.
  3. Select the audio file to process.
  4. Click Run to start automatic transcription.
  5. View results directly on the interface or from the output file.

📦 Installation

  1. Clone the repository:
    git clone https://github.com/Phuc75nguyen/MP3toText.git
    cd MP3toText
  2. Set up a virtual environment:
    python -m venv venv
    venv\Scripts\activate
  3. Build the Windows executable with PyInstaller (for example, from the VS Code terminal):
    python -m PyInstaller --noconfirm --onefile --windowed --name MP3toText --icon=app.ico --add-data "app.ico;." app.py
  4. Or run the app directly from source:
    python app.py
    

❤️ Made with Tons of Love (Tan Phuc)
