This project improves upon the original Water Meter Monitor by taking advantage of several deep learning neural networks. The original project requires the position and perspective of the odometer to be fixed in every snapshot, since the x-axis and y-axis are plotted manually at the beginning. This limitation is now resolved by training HRNet to localize three key points on the odometer and performing perspective correction. The latest approach also uses YOLACT to extract the target mask through instance segmentation instead of HSV color filtering, which improves the accuracy of the pipeline.
The pipeline consists of:
- Key Point Localization through HRNet
- Bounding Box Estimation and Perspective Correction
- Instance Segmentation through YOLACT
- Finding the Eigenvectors & Principal Axis
The three key points are defined as the three screws on the odometer. The Gaussian ground-truth heatmaps shown above are generated following the methods described on the official HRNet GitHub page. MSE loss between the heatmaps produced by the model and the ground truths is minimized to optimize the model parameters.
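As a minimal sketch of the heatmap generation (the heatmap size, sigma, and screw coordinates below are illustrative, not the project's actual settings), each key point can be rendered as a 2D Gaussian peak:

```python
import numpy as np

def gaussian_heatmap(height, width, center, sigma=2.0):
    """Render a 2D Gaussian peak centered at `center` (x, y)."""
    xs = np.arange(width, dtype=np.float32)
    ys = np.arange(height, dtype=np.float32)[:, None]
    cx, cy = center
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))

# One heatmap per key point; training minimizes the MSE between the
# model's predicted heatmaps and these ground truths.
screws = [(12, 20), (30, 10), (24, 40)]  # hypothetical screw positions
heatmaps = np.stack([gaussian_heatmap(64, 48, c) for c in screws])
```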
A rectangular bounding box is constructed by estimating the likely intersections between the three located key points. The four vertices of the bounding box are then passed to OpenCV's warpPerspective to crop the odometer and correct its perspective.
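The correction step can be sketched with OpenCV as follows; the output size and the corner ordering (top-left, top-right, bottom-right, bottom-left) are assumptions for illustration:

```python
import cv2
import numpy as np

def warp_odometer(image, corners, out_w=400, out_h=300):
    """Warp the estimated bounding-box quadrilateral to a frontal view.

    `corners` holds the four vertices estimated from the three key
    points, ordered tl, tr, br, bl.
    """
    src = np.float32(corners)
    dst = np.float32([[0, 0], [out_w, 0], [out_w, out_h], [0, out_h]])
    M = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(image, M, (out_w, out_h))
```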
YOLACT performs instance segmentation on the warped image to extract the mask of the red pointer. The previous method used HSV color filtering, but that approach proved unreliable when lighting conditions were not ideal. The training dataset is annotated with Labelme and converted into COCO-style annotations for YOLACT.
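As a rough sketch of the annotation conversion (ids and categories are simplified, and a real COCO file also needs `images` and `categories` sections), Labelme polygons can be flattened into COCO-style segmentation records like this:

```python
import glob
import json

annotations = []
for image_id, path in enumerate(sorted(glob.glob("labelme/*.json"))):
    with open(path) as f:
        data = json.load(f)
    for shape in data["shapes"]:
        # Flatten Labelme's [[x1, y1], [x2, y2], ...] polygon into
        # COCO's [x1, y1, x2, y2, ...] layout.
        flat = [coord for point in shape["points"] for coord in point]
        annotations.append({
            "id": len(annotations),
            "image_id": image_id,
            "category_id": 1,   # single class: the red pointer
            "segmentation": [flat],
            "iscrowd": 0,
        })
```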
Once we obtain the mask of the pointer, we compute its eigenvectors and principal axis to find the direction of the arm.
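A minimal sketch of this step, assuming `mask` is a binary NumPy array of pointer pixels from the segmentation stage:

```python
import numpy as np

def pointer_direction(mask):
    """Return the unit eigenvector along the mask's principal axis.

    The covariance matrix of the pointer's pixel coordinates is
    diagonalized; the eigenvector with the largest eigenvalue points
    along the direction of greatest spread, i.e. the pointer arm.
    """
    ys, xs = np.nonzero(mask)
    pts = np.stack([xs, ys], axis=1).astype(np.float64)
    cov = np.cov(pts, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    return eigvecs[:, np.argmax(eigvals)]
```

The resulting vector gives the pointer's orientation up to a 180° ambiguity, which can be resolved by, for example, checking which half of the mask lies farther from the dial center.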
More results can be seen here.
The following are links to the trained models and datasets used in this project: