English | Türkçe
You can use this project to extract information (name, surname, date of birth, etc.) from an identity card. To do this, the problem is broken down into the following sub-problems:
- [this project] Identify Regions of Interest (ROI) containing the required information with deep learning
- [this project] Crop the regions identified above
- OCR on the identified region of interest
This project performs object detection, object classification, and multi-object detection at the same time.
Use case diagram
Sample id cards
| Sample 1 | Sample 2 | Sample 3 |
|---|---|---|
| ![]() | ![]() | ![]() |
This project has been refreshed after 7+ years. It now uses TensorFlow 2 (SavedModel) with optional OCR (EasyOCR).
- Removed TensorFlow 1 graph code and the TF Object Detection API dependency. No more `object_detection.*` imports.
- Loads the TF2 SavedModel from `model/saved_model/`; automatically resolves and calls the `serving_default` signature.
- `id_card_detection_image.py` CLI flags:
  - `--image`: Image path (absolute/relative)
  - `--min_score`: Score threshold (0–1, default 0.60)
  - `--ocr`: Run OCR (EasyOCR) on the cropped ROI
- Lightweight label map parser added; reads the ID→name mapping from `data/labelmap.pbtxt`.
- Visualization via OpenCV; the top-scoring box is drawn in green, others in red. The cropped ROI is saved as `output_cropped.png`.
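For reference, a label map parser for this file format can fit in a few lines of regex. The sketch below is illustrative only (the repo's actual parser may differ), assuming the standard `item { id: ... name: ... }` pbtxt layout:

```python
import re

def load_labelmap(text):
    """Parse a TF-style labelmap.pbtxt into an {id: name} dict.

    Minimal sketch: pairs each `id:` with the `name:` in the same item block.
    """
    mapping = {}
    for block in re.findall(r"item\s*\{(.*?)\}", text, re.DOTALL):
        id_m = re.search(r"\bid:\s*(\d+)", block)
        name_m = re.search(r"name:\s*['\"]([^'\"]+)['\"]", block)
        if id_m and name_m:
            mapping[int(id_m.group(1))] = name_m.group(1)
    return mapping

sample = """
item {
  id: 1
  name: 'id_card'
}
"""
print(load_labelmap(sample))  # {1: 'id_card'}
```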
Note: SavedModel output keys can vary between models. The defaults expect `detection_boxes`, `detection_scores`, and `detection_classes`. Adjust if your model differs.
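One way to handle such variation is a small remapping layer so the rest of the pipeline always sees canonical key names. This is a sketch, not the script's actual code; the alias names below are hypothetical, so inspect your model's signature to confirm them:

```python
# Canonical keys the scripts expect, mapped to possible aliases
# (aliases are illustrative; check your SavedModel's serving signature).
KEY_ALIASES = {
    "detection_boxes": ("detection_boxes", "boxes"),
    "detection_scores": ("detection_scores", "scores"),
    "detection_classes": ("detection_classes", "classes"),
}

def normalize_outputs(raw):
    """Remap a SavedModel output dict to the canonical key names."""
    out = {}
    for canonical, aliases in KEY_ALIASES.items():
        for alias in aliases:
            if alias in raw:
                out[canonical] = raw[alias]
                break
        else:
            raise KeyError(f"no output matching {canonical!r}; got {sorted(raw)}")
    return out
```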
Install either a single platform-specific requirements file (includes TensorFlow), or install TensorFlow separately plus `requirements-modern.txt`. YOLO is optional and can be installed via `ultralytics`.
- Apple Silicon (macOS):

  ```
  pip3 install -r requirements-macos-apple.txt
  ```

- Intel macOS / Linux / Windows (CPU):

  ```
  pip3 install -r requirements-cpu.txt
  ```

Alternatively, install TensorFlow first, then the rest from `requirements-modern.txt`:

- macOS (Apple Silicon, M-series):

  ```
  pip3 install tensorflow-macos==2.16.1 tensorflow-metal==1.2.0
  ```

- macOS (Intel) or Linux (CPU):

  ```
  pip3 install tensorflow==2.20.0
  ```

- Windows (CPU):

  ```
  pip3 install tensorflow==2.17.1
  ```

Tip: on Python 3.12, keep pip up to date:

```
python3 -m pip install --upgrade pip
```

Then install the remaining dependencies:

```
pip3 install -r requirements-modern.txt
```

Run the image script:

```
python3 id_card_detection_image.py --image /absolute/or/relative/path.jpg --ocr --min_score 0.6
```

- Windows example path: `C:\path\to\image.jpg`
- The cropped ROI is written as `output_cropped.png` at the project root.
- With `--ocr`, extracted text lines are printed to the terminal.
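For intuition on how the crop is produced: TF2 detection models typically return boxes as normalized `[ymin, xmin, ymax, xmax]` coordinates, so cropping the best ROI is a matter of scaling to pixel indices. A minimal sketch (not the script's exact code):

```python
import numpy as np

def crop_best_roi(image, boxes, scores, min_score=0.6):
    """Crop the highest-scoring box from an HxWxC image array.

    Boxes are normalized [ymin, xmin, ymax, xmax], the usual TF2
    detection convention. Returns None if nothing clears the threshold.
    """
    idx = int(np.argmax(scores))
    if scores[idx] < min_score:
        return None
    h, w = image.shape[:2]
    ymin, xmin, ymax, xmax = boxes[idx]
    y0, y1 = int(ymin * h), int(ymax * h)
    x0, x1 = int(xmin * w), int(xmax * w)
    return image[y0:y1, x0:x1]

img = np.zeros((100, 200, 3), dtype=np.uint8)
roi = crop_best_roi(img, np.array([[0.1, 0.2, 0.5, 0.8]]), np.array([0.9]))
print(roi.shape)  # (40, 120, 3)
```

The resulting array can then be written out with OpenCV, e.g. `cv2.imwrite("output_cropped.png", roi)`.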
Install the YOLO backend:

```
pip3 install ultralytics
```

Run with YOLO instead of TF2 (image):

```
python3 id_card_detection_image.py --image /path/to/img.jpg --yolo_model yolov8n.pt --min_score 0.4 --ocr
```

Run with YOLO (camera) and enable the OCR snapshot panel (keys 1–9 to OCR the selected crop):

```
python3 id_card_detection_camera.py --yolo_model yolov8n.pt --min_score 0.4 --ocr
```

Camera window controls: `q` quit, `p` pause/resume, `s` stop camera, `b` start camera, `1`–`9` OCR selected snapshot.
Image script (`id_card_detection_image.py`):

- `--image` (string): Absolute or relative path to the image file. If omitted, defaults to `test_images/image1.png`.
- `--min_score` (float, default 0.60): Minimum confidence score threshold in [0, 1] to visualize and crop detections. Lower it (e.g., 0.3–0.5) to see more candidates.
- `--ocr` (flag): If provided, runs EasyOCR on the cropped ROI and prints extracted text lines to the terminal.
- `--yolo_model` (string, optional): If provided, switches the detector backend to YOLO. Accepts a local `.pt` path or a model name like `yolov8n.pt`. If omitted, the TF2 SavedModel in `model/saved_model/` is used.
Camera script (`id_card_detection_camera.py`):

- `--camera` (int, default 0): Video device index. Try 1 or 2 if you have multiple cameras.
- `--min_score` (float, default 0.50): Minimum confidence score threshold for drawing detections and listing snapshots.
- `--yolo_model` (string, optional): YOLO model path/name; same behavior as in the image script.
- `--ocr` (flag): Enables the OCR workflow on detected snapshots. When enabled:
  - The right panel shows up to 9 cropped detections sorted by score.
  - Press number keys 1–9 to run OCR on the corresponding crop; results are shown under "OCR Result" in the right panel and printed to the terminal.
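The snapshot panel behavior described above (top 9 crops by score, selected by number key) amounts to simple list bookkeeping. The helpers below are hypothetical, not the script's exact code:

```python
def top_snapshots(detections, limit=9):
    """Keep the highest-scoring detections, best first, capped at `limit`."""
    return sorted(detections, key=lambda d: d["score"], reverse=True)[:limit]

def pick_by_hotkey(snapshots, key_char):
    """Map number keys '1'-'9' to a snapshot; None if out of range."""
    if not key_char.isdigit():
        return None
    idx = int(key_char) - 1
    if 0 <= idx < len(snapshots):
        return snapshots[idx]
    return None
```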
Window hotkeys (camera):

- `q`: Quit.
- `p`: Pause/resume the live feed (freezes on the last frame when paused).
- `s`: Stop (release) the camera device.
- `b`: Start/restart the camera device.
- `1`–`9`: Run OCR on the Nth snapshot in the right panel (only if `--ocr` was provided).
Outputs:

- Image script: draws boxes in the image window and saves the best ROI as `output_cropped.png` (project root). OCR text lines print to the terminal.
- Camera script: draws boxes over the live feed; the right panel lists thumbnails and shows OCR results if `--ocr` is enabled.
Requires Python 3.7 and TensorFlow 1.15. Hard to set up on modern systems.

```
pip3 install -r requirements.txt
python3 id_card_detection_image.py
```

Warning: TF1 dependencies (especially on macOS/ARM) are incompatible with modern Python. Prefer Docker or a dedicated Python 3.7 environment if needed.
- macOS Apple Silicon: `tensorflow-macos` + `tensorflow-metal` required. The first run may build font caches.
- macOS Intel / Linux: CPU `tensorflow` is sufficient; a GPU is not required.
- Windows: `pip install tensorflow==2.17.1` (CPU). Ensure the Visual C++ Redistributable is installed if needed.
- Recommended flow (macOS Apple Silicon):

  ```
  pip3 install tensorflow-macos==2.16.1 tensorflow-metal==1.2.0
  pip3 install -r requirements-modern.txt
  python3 id_card_detection_image.py --image /Users/you/Downloads/test_data.jpg --ocr --min_score 0.6
  ```

- Recommended flow (Linux / macOS Intel):

  ```
  pip3 install tensorflow==2.20.0
  pip3 install -r requirements-modern.txt
  python3 id_card_detection_image.py --image ./test_images/image1.png --min_score 0.6
  ```



