hiwakurdy/CNN_PROJ_k2_to_k10_V2

CNN_PROJ_KU_AR

Kurdish-vs-Arabic word-image classification project with CNN-based models, projection features, contextual postprocessing, and saved local/external evaluation artifacts.

Start Here

Supervisors and reviewers opening this repository for the first time should read in this order:

  1. docs/SUPERVISOR_GUIDE.md
  2. docs/model_architectures.md
  3. outputs/ann_v1_2class_01/all/report_all.md
  4. outputs/kparts_sweep_evarest_real_eval_fixed/ARTICLE_REPORT.md

The supervisor guide explains:

  • where the project starts from annotated JSON data,
  • which file is used for what,
  • where the legacy K=3,4 stage ends,
  • where the newer K=2..10 sweep begins,
  • and where the final local and EvArEST results are stored.

What This Repo Contains

  • src/: reusable training, prediction, preprocessing, and model code
  • scripts/: local dataset preparation and external dataset helpers
  • experiments/: evaluation, KNN/SVM analysis, article reports, and K=2..10 sweep drivers
  • data/anotatd_lines/: local annotated source images and JSON files
  • outputs/checkpoints/: public best checkpoints
  • outputs/training/: per-run configs, histories, summaries, and checkpoints
  • outputs/ann_v1_2class_01/: legacy local annotated evaluation outputs
  • outputs/kparts_sweep_evarest_real_eval_fixed/: saved EvArEST evaluation outputs

Project Phases

Legacy phase

The earlier version mainly compared manually chosen projection settings, especially K_parts=3 and K_parts=4, on the local annotated data. The main saved outputs for this phase are in outputs/ann_v1_2class_01/.

New phase

The newer version changes the goal from a fixed K=3,4 comparison to a systematic search over K_parts=2..10. The main saved outputs for this phase are the run folders in outputs/training/ and the external evaluation reports in outputs/kparts_sweep_evarest_real_eval_fixed/.
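
To make the role of K_parts concrete, here is a minimal sketch of one plausible projection feature. It assumes that K_parts is the number of equal-width vertical strips a binarized word image is divided into, and that each strip contributes its ink density. Both the interpretation and the name `strip_densities` are assumptions for illustration; the actual feature definition lives in src/ and may differ.

```python
from typing import List


def strip_densities(image: List[List[int]], k_parts: int) -> List[float]:
    """Split a binary image (1 = ink) into k_parts vertical strips and
    return the fraction of ink pixels in each strip.

    Hypothetical illustration only; not the repository's implementation.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    if k_parts < 1 or width < k_parts:
        raise ValueError("k_parts must be between 1 and the image width")

    features = []
    for part in range(k_parts):
        # Integer boundaries so the strips cover every column exactly once.
        start = part * width // k_parts
        stop = (part + 1) * width // k_parts
        ink = sum(sum(row[start:stop]) for row in image)
        features.append(ink / (height * (stop - start)))
    return features


# A sweep over K_parts=2..10 yields one feature vector per K, of length K.
toy = [[1, 1, 0, 0, 0, 0, 1, 1, 1, 1]] * 4  # 4 rows, 10 columns
sweep = {k: strip_densities(toy, k) for k in range(2, 11)}
```

Under this reading, a larger K gives a finer description of where the ink sits along the word, at the cost of a longer feature vector. That trade-off is what a systematic search over K would probe.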

Workflow In One View

  1. Local annotations start in data/anotatd_lines/outs/*.json.
  2. scripts/prepare_local_datasets.py converts annotations into cropped class-folder datasets.
  3. src/train.py trains models and writes run artifacts to outputs/training/.
  4. Legacy fixed-K models are evaluated on local annotated data.
  5. The newer K=2..10 sweep is driven by experiments/run_kparts_sweep.py.
  6. KNN and SVM are applied as contextual postprocessing during evaluation.
  7. External Arabic-only transfer is tested on EvArEST through experiments/evaluate_kparts_sweep_real_data.py.
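
Step 2 of the workflow above can be sketched as follows. The JSON schema (an `image` name plus a list of `words` with `label` and `box`), the function name `plan_crops`, and the output file naming are all hypothetical; the real format is defined by the files in data/anotatd_lines/outs/ and by scripts/prepare_local_datasets.py. The sketch only plans the crops (source, box, destination) and does not touch image data.

```python
import json
from pathlib import Path
from typing import Iterator, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def plan_crops(annotation_text: str, out_root: Path) -> Iterator[Tuple[str, Box, Path]]:
    """Yield (source image, crop box, destination path) for each word.

    Assumed JSON schema (hypothetical):
    {"image": "...", "words": [{"label": "...", "box": [x, y, w, h]}]}
    """
    record = json.loads(annotation_text)
    stem = Path(record["image"]).stem
    for index, word in enumerate(record["words"]):
        x, y, w, h = word["box"]
        box = (x, y, x + w, y + h)
        # One sub-folder per class, which is what class-folder loaders expect.
        dest = out_root / word["label"] / f"{stem}__ann_{index:05d}.png"
        yield record["image"], box, dest


example = json.dumps({
    "image": "line_001.png",
    "words": [
        {"label": "kurdish", "box": [10, 5, 40, 20]},
        {"label": "arabic", "box": [60, 5, 35, 20]},
    ],
})
crops = list(plan_crops(example, Path("data/prepared/2class")))
```

The boxes are expressed as (left, top, right, bottom), which is the order expected by, for example, Pillow's `Image.crop`, so an actual cropping step could consume the plan directly.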

Important Files

  • src/config.py: runtime defaults for data paths and model settings
  • src/dataset.py: class-folder loading, preprocessing, and train/val/test splitting
  • src/train.py: main training entrypoint
  • scripts/prepare_local_datasets.py: local annotation JSON to cropped training folders
  • experiments/run_kparts_sweep.py: K_parts=2..10 sweep driver
  • experiments/report_kparts_sweep.py: sweep summary generator
  • experiments/evaluate_ann_v1_2class.py: legacy local annotated evaluation
  • experiments/evaluate_ann_v1_2class_svm_01.py: local annotated evaluation with SVM postprocessing
  • experiments/evaluate_kparts_sweep_real_data.py: external real-data evaluation for sweep runs
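
As an illustration of the train/val/test splitting that src/dataset.py is described as handling, the following sketch splits each class separately with a fixed seed. The function name `split_by_class`, the fractions, and the seed are assumptions, not the project's settings.

```python
import random
from typing import Dict, List, Tuple


def split_by_class(
    files_by_class: Dict[str, List[str]],
    val_frac: float = 0.15,
    test_frac: float = 0.15,
    seed: int = 0,
) -> Dict[str, List[Tuple[str, str]]]:
    """Split each class separately so all subsets keep the class balance.

    Returns {"train" | "val" | "test": [(path, label), ...]}.
    Fractions and seed are illustrative defaults, not the project's settings.
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    splits = {"train": [], "val": [], "test": []}
    for label in sorted(files_by_class):
        paths = sorted(files_by_class[label])
        rng.shuffle(paths)
        n_val = int(len(paths) * val_frac)
        n_test = int(len(paths) * test_frac)
        splits["val"] += [(p, label) for p in paths[:n_val]]
        splits["test"] += [(p, label) for p in paths[n_val:n_val + n_test]]
        splits["train"] += [(p, label) for p in paths[n_val + n_test:]]
    return splits


files = {
    "arabic": [f"arabic/{i:03d}.png" for i in range(20)],
    "kurdish": [f"kurdish/{i:03d}.png" for i in range(20)],
}
splits = split_by_class(files)
```

Sorting the labels and paths before shuffling makes the result independent of directory listing order, so the same seed gives the same split on any machine.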

Reproducibility Note

This repository preserves code, saved checkpoints, saved run configurations, saved histories, and saved evaluation outputs. However, data/prepared/ is not currently present in this checkout, so exact retraining requires rebuilding or restoring the prepared cropped datasets first.
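
Before retraining, it may help to confirm that the prepared data is actually in place. The check below is a minimal sketch: the helper name `missing_class_folders` and the `data/prepared/2class/<class name>/*.png` layout are assumptions, loosely modeled on the example paths used elsewhere in this README.

```python
from pathlib import Path
from typing import List


def missing_class_folders(prepared_root: Path, classes: List[str]) -> List[str]:
    """Return the class names whose folder is absent or has no PNG files."""
    missing = []
    for name in classes:
        folder = prepared_root / name
        if not folder.is_dir() or not any(folder.glob("*.png")):
            missing.append(name)
    return missing


# Hypothetical layout: data/prepared/2class/<class name>/*.png
missing = missing_class_folders(Path("data/prepared/2class"), ["arabic", "kurdish"])
if missing:
    print("Rebuild or restore prepared data for:", ", ".join(missing))
```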

Minimal Setup

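These commands assume PowerShell, run from the repo root. The first command uses a machine-specific Miniconda interpreter path; substitute the path to your own Python installation if it differs.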

C:\ProgramData\miniconda3\python.exe -m venv .venv
.\.venv\Scripts\python.exe -m pip install --upgrade pip
.\.venv\Scripts\python.exe -m pip install -r requirements.txt
.\.venv\Scripts\python.exe scripts\prepare_local_datasets.py

Example Commands

Run a prediction with a saved checkpoint:

.\.venv\Scripts\python.exe src\predict.py `
  --img "data\prepared\3class\arabic\C_F0_S26_W2__ann_00001.png" `
  --model projections `
  --num-classes 3

Run batch prediction on a folder:

.\.venv\Scripts\python.exe src\predict_batch.py `
  --dir data\anotatd_lines\outs\2_classes\v1_2classes\images `
  --out outputs\tst_results `
  --model projections `
  --num-classes 2

Train on the prepared local 2-class dataset:

.\.venv\Scripts\python.exe src\train.py --model projections

Run the newer K=2..10 sweep:

.\.venv\Scripts\python.exe experiments\run_kparts_sweep.py
