UI Element–Aware Data Masking (PoC)

Summary

This PoC demonstrates a UI-semantic–aware data masking system for UI screenshots. It detects UI elements using a YOLO model, extracts text using EasyOCR + Tesseract, and masks sensitive information based on user intent.

Supported masking:

Table Column → masks all values under a column name
Text Field → masks only the field value
Label Text → masks text appearing after a label

Detect UI → OCR → match user intent → mask only the required data
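
To make the three masking modes concrete, here is a minimal sketch of how a mask rectangle might be derived from a detected element box in each case. The `Box` type, the function names, and the geometry rules (for example, treating everything below the header's bottom edge as column values) are assumptions for illustration, not the PoC's actual logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixel coordinates, (x1, y1) to (x2, y2)."""
    x1: int
    y1: int
    x2: int
    y2: int


def column_mask(column: Box, header: Box) -> Box:
    """Table column: keep the header visible, mask every value beneath it."""
    return Box(column.x1, header.y2, column.x2, column.y2)


def field_mask(field: Box, label: Box) -> Box:
    """Text field: mask only the value area to the right of the field's label."""
    return Box(max(field.x1, label.x2), field.y1, field.x2, field.y2)


def label_mask(label: Box, line_right_edge: int) -> Box:
    """Label text: mask the text that follows the label on the same line."""
    return Box(label.x2, label.y1, line_right_edge, label.y2)


# Hypothetical detections: a full column box and the header cell inside it.
column = Box(100, 50, 220, 400)
header = Box(100, 50, 220, 80)
print(column_mask(column, header))  # Box(x1=100, y1=80, x2=220, y2=400)
```

Keeping the header or label outside the mask leaves the screenshot readable while hiding only the values.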

How It Works (High Level)

YOLO detects UI elements (table column, text field, label)
OCR extracts text inside detected regions
User provides what to mask (column / field / label name)
Matching text regions are masked on the image
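
The last two steps can be sketched with the standard library alone. This is an illustrative sketch, not the repository's code: the `(text, box)` detection format, the fuzzy-match threshold, and all function names are assumptions, and a nested list stands in for the image so the example runs without OpenCV.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so OCR spacing noise matters less."""
    return " ".join(text.lower().split())


def matches(ocr_text: str, target: str, threshold: float = 0.8) -> bool:
    """Fuzzy comparison, since OCR output often contains small errors."""
    ratio = SequenceMatcher(None, normalize(ocr_text), normalize(target)).ratio()
    return ratio >= threshold


def select_regions(detections, targets, threshold=0.8):
    """Return the boxes whose OCR text matches any user-requested name.

    `detections` is a list of (text, (x1, y1, x2, y2)) pairs, standing in
    for the combined detector + OCR output (an assumed format).
    """
    return [box for text, box in detections
            if any(matches(text, t, threshold) for t in targets)]


def apply_masks(image, boxes, fill=0):
    """Overwrite each box with a solid fill; `image` is a list of pixel rows."""
    for x1, y1, x2, y2 in boxes:
        for y in range(max(y1, 0), min(y2, len(image))):
            row = image[y]
            for x in range(max(x1, 0), min(x2, len(row))):
                row[x] = fill
    return image


# Hypothetical OCR results; "Nurnber" imitates a common OCR misread.
detections = [("Order Nurnber", (1, 1, 3, 3)), ("Supplier", (4, 0, 6, 2))]
boxes = select_regions(detections, ["Order Number"])
image = apply_masks([[255] * 6 for _ in range(4)], boxes)
```

On a real image array, the fill step would typically be a filled `cv2.rectangle` call (`thickness=-1`) instead of the nested loops.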

How to Run

Install Dependencies

pip install -r requirements.txt

Run Demo

Download the model weights from the drive and place them in the assets/ folder. Then run the pipeline:

python main.py \
  --model assets/best.pt \
  --image src/images/ss-1.jpeg \
  --headers "Order Number" "Supplier Description" "Order Date"
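
For readers wondering how several quoted names reach the program, the sketch below shows one way such a command line could be parsed with `argparse`. It mirrors the flags in the command above but is an assumption about their handling, not a copy of the repository's main.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Mask selected UI data in a screenshot.")
    parser.add_argument("--model", required=True, help="Path to the YOLO weights file")
    parser.add_argument("--image", required=True, help="Path to the input screenshot")
    # nargs="+" collects one or more quoted names into a single list
    parser.add_argument("--headers", nargs="+", default=[], help="Names to mask")
    return parser


# Parse the same arguments as the demo command, passed as a list for illustration.
args = build_parser().parse_args([
    "--model", "assets/best.pt",
    "--image", "src/images/ss-1.jpeg",
    "--headers", "Order Number", "Supplier Description", "Order Date",
])
print(args.headers)  # ['Order Number', 'Supplier Description', 'Order Date']
```

Quoting each name on the shell keeps multi-word headers such as "Order Number" together as one list entry.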

Example Input

Example-1

For the input image below, if the user wants to mask the columns Line Number, Sold To Name, Description 1, and Secondary Quality, the output will be as shown.

Input Image:

Output Image (masked):

Example-2

For the input image below, if the user wants to mask the fields Order No/Type, Item Number, Planned Effective, and Worker Order, the output will be as shown.

Input Image:

Output Image (masked):

Train YOLO Model From CLI

To train the YOLO model, run the following command. Update the --data path to point to your dataset config file. For the expected dataset layout, see the coco8 dataset format file.

python train_yolo.py \
  --data config/coco8.yaml \
  --epochs 100 \
  --imgsz 640 \
  --batch 8
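
As a rough guide to what the dataset config might contain, the sketch below renders a coco8-style YAML file from Python. It assumes the Ultralytics dataset layout (a `path` root, `train` and `val` image folders relative to it, and an index-to-name `names` map), which the coco8 reference suggests. The dataset root and class names are hypothetical placeholders, not values taken from this repository.

```python
def render_dataset_config(root: str, class_names: list[str]) -> str:
    """Build a coco8-style dataset config as YAML text."""
    lines = [
        f"path: {root}",        # dataset root directory
        "train: images/train",  # training images, relative to path
        "val: images/val",      # validation images, relative to path
        "names:",               # class index -> class name
    ]
    lines += [f"  {idx}: {name}" for idx, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


config_text = render_dataset_config(
    "datasets/ui-elements",                   # hypothetical dataset root
    ["table_column", "text_field", "label"],  # hypothetical class names
)
print(config_text)
```

The resulting text could be saved to a file and passed through the --data argument.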

Tech Stack

YOLO (UI element detection)
EasyOCR + Tesseract (OCR)
OpenCV
Python

Note

This is a Proof of Concept, focused on demonstrating intent-based, UI-aware masking rather than full production coverage.
