This is the official repository of paper "Toward real text manipulation detection: New dataset and new solution" (Pattern Recogntion, 2024).
The RTM dataset consists of 9,000 text images in total, including 6,000 manually manipulated text images and 3,000 authentic images. The dataset is available at Google Drive.
Before running the srcipt, please make sure the prediction folder is renamed following the format:
{MethodName}_mask
For example: ascformer_mask
cd EvalRTM
python run_eval.py --pred_dir ${PRED_FOLDER} --gt_dir ${RTM_GT_FOLDER}
We use pqdm to accelerate the evaluating process. The evaluation results will be saved in Json file and shown using PrettyTable.
This repo depends on This repo depends on PyTorch, MMCV, MMSegmentation. Below are quick steps for installation. Please refer to MMSegmentation Install Guide for more detailed instruction.
Python 3.8 + PyTorch 2.0.0 + CUDA 11.8 + mmsegmentation (1.0.0rc6)
conda create --name rtm python=3.8 -y
conda activate rtm
pip install torch==2.0.0 torchvision==0.15.1 torchaudio==2.0.1 --index-url https://download.pytorch.org/whl/cu118
pip install -U openmim
mim install "mmengine==0.7.0"
mim install "mmcv==2.0.0"
git clone https://github.com/DrLuo/RTM.git
cd RTM
cd ASCFormer
pip install -r requirements.txt
pip install -v -e .
Note that the working directory is ASCFormer
Place the RTM dataset at ASCFormer/data/ttd/RealTextMan
Organize the files as follows
|- ./data
|- ttd
|- RealTextMan
|- JPEGImages
|- SegmentationClass
|- train.txt
|- val.txt
└ test.txt
We leverage the tampered images in the test set for validation during training.
Make sure the working directory is under ASCFormer
cd ASCFormer
For distributed training on multiple GPUs, please use
bash ./tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM}
For training on a single GPU, please use
python ./tools/train.py ${CONFIG_FILE} ${GPU_NUM}
For example, we use this script to train the model:
bash tools/dist_train.sh configs/ascformer/ascformer_rtm.py 2
For inference on multiple GPUs, please use
bash tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${NUM_GPUS} --mask
For inference on single GPU, please use
python ${CONFIG_FILE} ${CHECKPOINT_FILE} --mask
For example, we use this script to inference and evaluate:
bash tools/dist_test.sh configs/ascformer/ascformer_rtm.py work_dirs/ascformer_rtm/ascformer_model.pth ${NUM_GPUS} --mask
After obtaining the binary masks, please use the evaluation tool of RTM for more detailed evaluation.
Method | CM | SP | GN | CV | IP | CB | Tamper | All | download |
---|---|---|---|---|---|---|---|---|---|
ASC-Former | 18.57 | 32.79 | 18.89 | 16.06 | 27.63 | 19.35 | 21.57 | 19.71 | model |
Please cite the following paper when using the RTM dataset or this repo.
@article{luo2024toward,
title={Toward real text manipulation detection: New dataset and new solution},
author={Luo, Dongliang and Liu, Yuliang and Yang, Rui and Liu, Xianjin and Zeng, Jishen and Zhou, Yu and Bai, Xiang},
journal={Pattern Recognition},
pages={110828},
year={2024},
publisher={Elsevier}
}
This repo is based on MMSegmentation 1.0.0rc6. We appreciate this wonderful open-source toolbox.