Hamna-Kaleem/Speech-Signals-Features

Speech Signal Processing - Fourier Transform, Spectrograms, and MFCCs

This repository implements core speech signal processing techniques: the Fourier Transform, spectrograms, and Mel-Frequency Cepstral Coefficients (MFCCs).

📌 Features

Fourier Transform & Spectrograms: Convert speech signals from the time domain to the frequency domain.

Mel-Frequency Cepstral Coefficients (MFCCs): Extract meaningful features for speech recognition.

What is the Fourier Transform?

Sound waves are typically represented in the time domain (waveforms), but analyzing their frequency components is crucial. The Fourier Transform (FT) converts a time-domain signal into its frequency components.
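As a minimal sketch of this idea (not necessarily the code used in this repository), the snippet below applies NumPy's FFT to a synthetic signal. The 8 kHz sample rate and the 440 Hz and 1000 Hz test tones are assumptions chosen for illustration, not values taken from the repository.

```python
import numpy as np

# Assumed for illustration: 1 second of audio sampled at 8 kHz,
# made of a 440 Hz tone plus a weaker 1000 Hz tone.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)

# rfft returns the non-negative-frequency half of the spectrum,
# which is all that is needed for a real-valued signal.
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(len(signal), d=1 / sample_rate)
magnitude = np.abs(spectrum)

# The two largest magnitude peaks should sit at the two tone frequencies.
top_two = np.sort(freqs[np.argsort(magnitude)[-2:]])
print(top_two)
```

The waveform itself gives no obvious hint of which tones it contains, while the magnitude spectrum isolates them as two distinct peaks.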

The Short-Time Fourier Transform (STFT) is commonly used in speech processing to create spectrograms, which display how frequencies change over time.

Spectrograms: A spectrogram is a visual representation of how the frequency content of a signal varies over time. Unlike a waveform, which shows only amplitude over time, a spectrogram shows how much energy is present at each frequency at each moment.
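The STFT can be sketched as a plain FFT applied to short, overlapping, windowed frames. The example below is one possible NumPy implementation under stated assumptions: the helper name `stft_magnitude`, the frame length of 256 samples, the hop of 128 samples, and the test signal are all hypothetical choices for illustration.

```python
import numpy as np

def stft_magnitude(signal, frame_length=256, hop_length=128):
    """Return a magnitude spectrogram of shape (num_frames, frame_length // 2 + 1)."""
    window = np.hanning(frame_length)  # taper each frame to reduce spectral leakage
    num_frames = 1 + (len(signal) - frame_length) // hop_length
    frames = np.stack([
        signal[i * hop_length : i * hop_length + frame_length] * window
        for i in range(num_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1))

# Assumed test signal: 500 Hz for the first half second, then 1500 Hz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
signal = np.where(t < 0.5,
                  np.sin(2 * np.pi * 500 * t),
                  np.sin(2 * np.pi * 1500 * t))

spec = stft_magnitude(signal)
spec_db = 20 * np.log10(spec + 1e-10)  # log scale, as usually plotted

# Dominant frequency in each frame: it should jump from 500 Hz to 1500 Hz.
freqs = np.fft.rfftfreq(256, d=1 / sample_rate)
peak_freqs = freqs[np.argmax(spec, axis=1)]
print(peak_freqs[0], peak_freqs[-1])
```

A single FFT over the whole signal would report both tones but not when each one occurs. The frame-by-frame view is what recovers that timing. Plotting `spec_db.T` as an image (for example with matplotlib's `imshow`) gives the familiar spectrogram picture.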

What are Mel-Frequency Cepstral Coefficients (MFCCs)?

MFCCs are widely used in speech recognition because they approximate how humans perceive sound. The human ear resolves differences between low frequencies more finely than differences between high frequencies, so MFCCs use the Mel scale, which is roughly linear at low frequencies and logarithmic at high frequencies, to emphasize perceptually important features. The result is a compact description of the speech signal's spectral envelope.
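A common MFCC pipeline is: frame the signal, take the power spectrum of each frame, pool it with triangular filters spaced evenly on the Mel scale, take the logarithm, and apply a discrete cosine transform (DCT). The sketch below follows that recipe in plain NumPy. It is an illustrative assumption, not the repository's implementation: the function names, the widely used formula mel = 2595 · log10(1 + f / 700), the 26 filters, the 13 coefficients, and the random-noise input are all hypothetical choices, and libraries such as librosa differ in normalization details.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the Mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2), num_filters + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bank = np.zeros((num_filters, len(fft_freqs)))
    for m in range(num_filters):
        left, center, right = hz_points[m : m + 3]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        bank[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def dct_ii(x, num_coeffs):
    """Unnormalized DCT-II along the last axis, keeping the first num_coeffs terms."""
    n = x.shape[-1]
    k = np.arange(num_coeffs)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n) + 1) / (2 * n))
    return x @ basis.T

def mfcc(signal, sample_rate, n_fft=512, hop_length=256, num_filters=26, num_coeffs=13):
    window = np.hanning(n_fft)
    num_frames = 1 + (len(signal) - n_fft) // hop_length
    frames = np.stack([
        signal[i * hop_length : i * hop_length + n_fft] * window
        for i in range(num_frames)
    ])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2              # power spectrum per frame
    mel_energy = power @ mel_filterbank(num_filters, n_fft, sample_rate).T
    log_mel = np.log(mel_energy + 1e-10)                           # compress dynamic range
    return dct_ii(log_mel, num_coeffs)                             # decorrelate filter outputs

# Assumed input: 1 second of random noise at 16 kHz standing in for speech.
sample_rate = 16000
rng = np.random.default_rng(0)
signal = rng.standard_normal(sample_rate)

features = mfcc(signal, sample_rate)
print(features.shape)  # one row of 13 coefficients per frame
```

The Mel filterbank is where the perceptual weighting happens: filters are narrow at low frequencies and wide at high frequencies. The final DCT keeps only the first few coefficients, which summarize the overall spectral shape.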
