This repository contains a deep learning model for fake scene classification, utilizing data from CIDAUT AI Fake Scene Classification 2024. The notebook includes Python code for loading data, applying data augmentations, and training a model using PyTorch.
- Introduction
- Installation
- Data Loading
- Data Augmentations
- Advanced Custom Dataset Class
- Running the Code
- Gaussian Mixture Model (GMM)
- Results
- Usage
- License

Fake scene classification is a challenging computer vision task. This project detects fake scenes with a PyTorch model trained on augmented data, and uses a Gaussian Mixture Model to analyze how fake and real images cluster.
To set up the environment, install the required dependencies:

    pip install -r requirements.txt

The dataset is loaded and batched using the PyTorch DataLoader.

    from torch.utils.data import DataLoader

    # dataset is any torch Dataset, such as the CustomDataset class defined later in this README
    data_loader = DataLoader(dataset, batch_size=32, shuffle=True)

We apply various augmentations to improve model generalization:
- Random Resized Cropping
- Horizontal Flipping
- Color Jittering
- Gaussian Noise Injection
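
The list above names Gaussian noise injection, but the transform pipeline in this README has no noise step, and depending on the torchvision version a small custom transform may be needed. The following is a minimal sketch of one way to write it, not the project's actual implementation. The class name `AddGaussianNoise` and the `std=0.05` value are hypothetical, and the input is assumed to be a float tensor in [0, 1], as produced by `ToTensor()`.

```python
import torch


class AddGaussianNoise:
    """Add zero-mean Gaussian noise to a float image tensor (hypothetical helper)."""

    def __init__(self, std=0.05):
        # Noise standard deviation; 0.05 is an illustrative value, not a tuned one
        self.std = std

    def __call__(self, img):
        # img is assumed to be a float tensor in [0, 1]
        noisy = img + torch.randn_like(img) * self.std
        # Clamp so the result stays inside the valid [0, 1] range
        return noisy.clamp(0.0, 1.0)


# Example on a random stand-in "image" (3 channels, 224x224)
image = torch.rand(3, 224, 224)
noisy_image = AddGaussianNoise(std=0.05)(image)
```

Because it operates on tensors, a transform like this would go after `transforms.ToTensor()` in a `Compose` list.
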
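
    import torchvision.transforms as transforms

    transform = transforms.Compose([
        transforms.RandomResizedCrop(224),   # random crop, resized to 224x224
        transforms.RandomHorizontalFlip(),   # flip left-right with probability 0.5
        transforms.ColorJitter(),            # note: with default arguments this leaves the image unchanged
        transforms.ToTensor(),               # PIL image -> float tensor in [0, 1]
    ])

A custom dataset class is implemented for more control over data processing: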

    from torch.utils.data import Dataset

    class CustomDataset(Dataset):
        def __init__(self, data, labels, transform=None):
            self.data = data            # indexable collection of images
            self.labels = labels        # labels aligned with data by index
            self.transform = transform  # optional callable applied to each image

        def __getitem__(self, index):
            img, label = self.data[index], self.labels[index]
            if self.transform:
                img = self.transform(img)
            return img, label

        def __len__(self):
            return len(self.data)

To train the model, execute:

    python train.py

A GMM-based approach is implemented to analyze clusters of fake and real images:
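
    from sklearn.mixture import GaussianMixture

    # features must be a 2D array of shape (n_samples, n_features)
    # Two components, intended to correspond to the fake and real clusters
    gmm = GaussianMixture(n_components=2, random_state=42)
    gmm.fit(features)

The final trained model achieves 97% accuracy on the validation set.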
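
This README does not show how the accuracy figure is computed. As a rough illustration, a typical validation-accuracy loop in PyTorch might look like the sketch below. It assumes a classifier that outputs one score per class. The names `evaluate_accuracy`, `toy_model`, and `val_loader`, along with the random tensors, are hypothetical stand-ins so the example runs on its own; they are not the project's model or data.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def evaluate_accuracy(model, loader, device="cpu"):
    """Return the fraction of correctly classified samples in loader."""
    model.eval()  # switch layers such as dropout and batch norm to eval mode
    correct, total = 0, 0
    with torch.no_grad():  # gradients are not needed for evaluation
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            preds = model(images).argmax(dim=1)  # predicted class per sample
            correct += (preds == labels).sum().item()
            total += labels.size(0)
    return correct / total


# Stand-in data and model (random, untrained), only to make the sketch runnable
torch.manual_seed(0)
val_images = torch.rand(64, 3, 32, 32)
val_labels = torch.randint(0, 2, (64,))
val_loader = DataLoader(TensorDataset(val_images, val_labels), batch_size=32)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 2))

val_accuracy = evaluate_accuracy(toy_model, val_loader)
print(f"validation accuracy: {val_accuracy:.2%}")
```

With an untrained model and random labels, the printed value carries no meaning; the point is the structure of the loop (eval mode, no gradients, argmax over class scores).
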
To test a new image, run:

    python predict.py --image sample.jpg

This project is licensed under the MIT License.

Note: This code is intended for educational and research purposes only and may not be optimized for production use.