Enhanced Segmentation Capabilities with Annotation Export capabilities #106

Open
fyzanahammad wants to merge 2 commits into luca-medeiros:main from fyzanahammad:segment-masks

@fyzanahammad

Enhanced Segmentation Capabilities with Annotation Export

Overview

This PR adds annotation export (COCO JSON and YOLO formats) and batch processing improvements to Lang-SAM, making it better suited to production computer vision pipelines.

Key Features Added

Annotation Export System

  • Multiple Format Support: Added export capabilities for COCO JSON and YOLO formats
  • Precise Segmentation Masks: Implemented polygon extraction from binary masks using OpenCV contours
  • Dual YOLO Output: Standard bounding box format and segmentation format (compatible with YOLOv8-seg)
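As a rough illustration of the two YOLO outputs listed above, the sketch below formats one detection both ways. The function names and the pixel-space inputs are assumptions made for this example, not the code in this PR; the line layouts follow the usual YOLO conventions (class id first, then values normalized by image width and height).

```python
def yolo_bbox_line(class_id, box_xyxy, img_w, img_h):
    """Format one box as 'class cx cy w h' with values normalized to [0, 1]."""
    x1, y1, x2, y2 = box_xyxy
    cx = (x1 + x2) / 2 / img_w
    cy = (y1 + y2) / 2 / img_h
    w = (x2 - x1) / img_w
    h = (y2 - y1) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


def yolo_seg_line(class_id, polygon_xy, img_w, img_h):
    """Format one polygon as 'class x1 y1 x2 y2 ...' with normalized points."""
    coords = []
    for x, y in polygon_xy:
        coords.append(f"{x / img_w:.6f}")
        coords.append(f"{y / img_h:.6f}")
    return f"{class_id} " + " ".join(coords)


# Hypothetical detection in a 200x100 image (pixel coordinates).
bbox_line = yolo_bbox_line(0, (50, 20, 150, 60), 200, 100)
seg_line = yolo_seg_line(0, [(50, 20), (150, 20), (150, 60), (50, 60)], 200, 100)
```

The bounding box line always has five fields, while the segmentation line grows with the number of polygon points.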

Batch Processing Enhancements

  • Organized Output Structure: Created dedicated folders for images, bounding boxes, and masks
  • Consistent File Naming: Implemented uniform naming across output types for better traceability
  • Consolidated Class Mapping: Single classes.txt file for the entire batch
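The layout described above could be set up roughly as follows. This is a minimal sketch, not the PR's code: the subfolder names (images, bounding_boxes, masks) come from the commit message, while the `batch_<timestamp>` folder name, the file extensions, and all function names are assumptions for illustration.

```python
import tempfile
from datetime import datetime
from pathlib import Path


def create_batch_dirs(root, now=None):
    """Create <root>/batch_<timestamp>/{images,bounding_boxes,masks}."""
    now = now or datetime.now()
    batch_dir = Path(root) / f"batch_{now:%Y%m%d_%H%M%S}"  # naming scheme is assumed
    subdirs = {}
    for name in ("images", "bounding_boxes", "masks"):
        subdirs[name] = batch_dir / name
        subdirs[name].mkdir(parents=True, exist_ok=True)
    return batch_dir, subdirs


def output_paths(subdirs, image_name):
    """Use the same stem in every folder so outputs can be matched up."""
    stem = Path(image_name).stem
    return {
        "image": subdirs["images"] / f"{stem}.jpg",       # extension is assumed
        "bbox": subdirs["bounding_boxes"] / f"{stem}.txt",
        "mask": subdirs["masks"] / f"{stem}.txt",
    }


def write_classes(batch_dir, class_names):
    """Write one classes.txt for the whole batch, one class name per line."""
    path = batch_dir / "classes.txt"
    path.write_text("\n".join(class_names) + "\n", encoding="utf-8")
    return path


with tempfile.TemporaryDirectory() as tmp:
    batch_dir, subdirs = create_batch_dirs(tmp, datetime(2024, 1, 2, 3, 4, 5))
    paths = output_paths(subdirs, "street_01.PNG")
    classes_file = write_classes(batch_dir, ["car", "person"])
    batch_name = batch_dir.name
    created = sorted(p.name for p in batch_dir.iterdir())
    stems = {p.stem for p in paths.values()}
    classes_text = classes_file.read_text(encoding="utf-8")
```

Deriving every output path from the input image's stem is what keeps the three folders traceable to one another.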

Performance & UX Improvements

  • Progress Tracking: Added real-time progress indicators with percentage completion
  • Optimized Storage: Optional mask image saving to reduce disk usage
  • Robust Error Handling: Graceful continuation when individual images fail
  • GPU Support: Added Windows-specific requirements file with CUDA support
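The progress tracking and graceful continuation described above can be sketched as a loop that reports a percentage after each image and records failures instead of raising. The names `process_batch`, `process_one`, and `fake_segment` are hypothetical; the real per-image work in this PR is the segmentation call.

```python
def process_batch(image_paths, process_one, report=print):
    """Run process_one on each path, reporting progress and collecting failures."""
    total = len(image_paths)
    succeeded, failed = [], []
    for index, path in enumerate(image_paths, start=1):
        try:
            process_one(path)
            succeeded.append(path)
        except Exception as exc:  # keep going: one bad image should not stop the batch
            failed.append((path, str(exc)))
        percent = 100 * index // total
        report(f"[{percent:3d}%] {index}/{total}: {path}")
    return succeeded, failed


def fake_segment(path):
    """Stand-in for the real per-image work; fails on one hypothetical file."""
    if path == "broken.jpg":
        raise ValueError("could not decode image")


messages = []
done, errors = process_batch(
    ["a.jpg", "broken.jpg", "c.jpg", "d.jpg"], fake_segment, messages.append
)
```

Passing the reporter in as a callable keeps the loop independent of the UI; a Gradio progress callback could be supplied in place of `print`.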

Technical Implementation Details

Annotation Processing

  • Used OpenCV for mask-to-polygon conversion with contour detection
  • Implemented proper normalization for YOLO format coordinates
  • Added area calculation from binary masks for COCO format

Architecture Changes

  • Modified inference pipeline to use direct model access for reliability
  • Added error handling in server endpoints
  • Created structured output directories with timestamp-based naming

Windows-Specific Support

  • Added win_requirements.txt with proper CUDA-enabled PyTorch installation
  • Fixed compatibility issues with Windows paths and Python versions

Testing

The implementation has been tested with various image inputs and prompts, confirming:

  • Correct generation of annotation files in both formats
  • Proper polygon extraction from segmentation masks
  • Successful batch processing with progress tracking
  • GPU acceleration with CUDA support

Future Work

Potential future enhancements could include:

  • Additional annotation formats (Pascal VOC, CVAT)
  • Polygon simplification options for more efficient annotations
  • Multi-GPU support for faster batch processing

This commit adds functionality to export segmentation masks and bounding boxes in standard annotation formats:

  • Added COCO JSON export with proper polygon segmentation
  • Added YOLO bounding box format export
  • Added YOLO segmentation format export (compatible with YOLOv8-seg)
  • Implemented mask-to-polygon conversion using OpenCV contour detection
  • Added individual mask image export for visualization
  • Modified the Gradio interface to include annotation format selection
  • Improved error handling in the server API
  • Switched to direct model access for more reliable predictions
  • Created automatic directory structure for organized annotation storage

This commit adds several improvements to the batch image segmentation functionality:

Organized Output Structure:

  • Created a structured output with three dedicated subfolders: images, bounding_boxes, and masks
  • Implemented consistent file naming across all three folders for better traceability
  • Added single classes.txt file generation for the entire batch instead of per-image

Optimized Mask Handling:

  • Disabled individual mask image saving to reduce disk usage and processing time
  • Modified mask handling to properly convert list masks to numpy arrays
  • Added error handling for empty masks

UI and Progress Improvements:

  • Added real-time progress tracking with percentage completion
  • Implemented detailed progress messages showing current image and count
  • Fixed model selection dropdown to use correct SAM model names
  • Added completion indicators with checkmarks

Bug Fixes:

  • Fixed image counting to avoid double-counting files with different case extensions
  • Improved error handling to continue processing when individual images fail
  • Fixed compatibility issues with older Python versions
  • Ensured consistent annotation file paths across COCO and YOLO formats
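
On the double-counting fix: one common cause of this kind of bug is globbing `*.jpg` and `*.JPG` separately, which matches the same file twice on a case-insensitive filesystem. The sketch below shows one way to avoid it by scanning the folder once and comparing lowercased suffixes. It is an assumption about the shape of the fix, not the PR's actual code, and `IMAGE_EXTENSIONS` is a hypothetical set.

```python
import tempfile
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}  # assumed set of supported types


def list_images(folder):
    """Return each image file once, matching extensions case-insensitively."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.jpg", "b.JPG", "c.PNG", "notes.txt"):
        (Path(tmp) / name).touch()
    found = [p.name for p in list_images(tmp)]
```

Because each directory entry is visited exactly once, the count cannot exceed the number of files regardless of how the filesystem treats case.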
@luca-medeiros
Owner

Hey @fyzanahammad, appreciate the PR!
I find these utilities, the new demo, and the new endpoint cool, but I feel we should move some things out of app.py.
I propose:

  • move the mask handling and folder handling from app.py to another file.
  • remove win_requirements, since it brings redundancy that might confuse people.
  • add some docs to the README about the new endpoint.
  • update the screenshot assets for the new UI.

What do you think? Let me know if you need help.
