Commit 3545661 (1 parent: 7e09625): initial code commit

33 files changed: +3855, -0 lines

README.md

Lines changed: 84 additions & 0 deletions
# Strikethrough Removal From Handwritten Words Using CycleGANs

[![License](https://img.shields.io/badge/License-MIT-blue.svg?style=flat-square)](https://opensource.org/licenses/MIT)

### [Raphaela Heil](mailto:raphaela.heil@it.uu.se) :envelope:, [Ekta Vats](mailto:ekta.vats@it.uu.se) and [Anders Hast](mailto:anders.hast@it.uu.se)

Code and related resources for the [ICDAR 2021](https://icdar2021.org/) paper **Strikethrough Removal From Handwritten Words Using CycleGANs**.

## Table of Contents

1. [Code](#code)
   1. [Strikethrough Removal](#strikethrough-removal)
   2. [Strikethrough Classification](#strikethrough-classification)
   3. [Strikethrough Identification](#strikethrough-identification)
   4. [Running the Code](#running-the-code)
2. [Data](#data)
3. [Citation](#citation)
4. [Acknowledgements](#acknowledgements)

## Code

Each of the following subdirectories contains the code that was used in the context of this paper, together with its Python requirements and the original configuration(s). Configuration files have to be modified with local paths to input and output directories before running.

Model checkpoints are attached to the release of this repository.

### Strikethrough Removal

- code for training various forms of CycleGANs to remove strikethrough from handwritten words
- the CycleGAN code is based on [https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix)
```
@inproceedings{CycleGAN2017,
  title={Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks},
  author={Zhu, Jun-Yan and Park, Taesung and Isola, Phillip and Efros, Alexei A},
  booktitle={Computer Vision (ICCV), 2017 IEEE International Conference on},
  year={2017}
}
```
### Strikethrough Classification

- code to train a DenseNet121 to classify a struck-through word image into one of seven types of strikethrough

### Strikethrough Identification

- code to train a DenseNet121 to identify whether a given word image is struck-through or not (i.e. 'clean')

### Running the Code

#### Train

In order to train any of the three models, run:

```
python src/train.py -configfile <path to config file> -config <name of section from config file>
```

If no `configfile` is defined, the script will assume `config.cfg` in the current working directory. If no `config` is defined, the script will assume `DEFAULT`.
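The fallback behaviour described above can be sketched with a small, self-contained `argparse` snippet (a minimal sketch mirroring the flag handling in `configuration.py`; `resolve_config_args` is a hypothetical helper name, while the defaults `config.cfg` and `DEFAULT` come from this README):

```python
import argparse

def resolve_config_args(argv):
    """Mimic the fallback logic: missing flags default to config.cfg / DEFAULT."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-config", required=False, help="section of config-file to use")
    parser.add_argument("-configfile", required=False, help="path to config-file")
    args = vars(parser.parse_args(argv))
    file_name = args["configfile"] if args["configfile"] else "config.cfg"
    file_section = args["config"] if args["config"] else "DEFAULT"
    return file_name, file_section

# no flags given: both defaults apply
print(resolve_config_args([]))                      # ('config.cfg', 'DEFAULT')
# explicit section, default file
print(resolve_config_args(["-config", "PAD_INV"]))  # ('config.cfg', 'PAD_INV')
```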
#### Test

For testing, run:

```
python src/test.py -configfile <path to config file> -data <path to data dir>
```

- `configfile` should point to the config file in an output directory of a train run (or one of the checkpoint config files)
- `data` should point to a directory containing `struck` and `struck_gt` sub-directories, e.g. one of the datasets presented in [Data](#data)
- an additional flag `-save` can be specified to save the cleaned images, otherwise only performance metrics (F<sub>1</sub> score and RMSE) will be logged
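The two logged metrics, F<sub>1</sub> score over binarised pixels and RMSE over raw intensities, can be illustrated for a pair of images as follows (a minimal sketch, not the repository's evaluation code; the 0.5 threshold and the dark-ink-on-light-background convention are assumptions):

```python
import numpy as np

def f1_and_rmse(cleaned, ground_truth, threshold=0.5):
    """F1 over binarised foreground pixels, RMSE over raw intensities in [0, 1]."""
    pred = cleaned < threshold          # assumption: dark ink on light background
    target = ground_truth < threshold
    tp = np.logical_and(pred, target).sum()
    fp = np.logical_and(pred, ~target).sum()
    fn = np.logical_and(~pred, target).sum()
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom > 0 else 0.0
    rmse = float(np.sqrt(np.mean((cleaned - ground_truth) ** 2)))
    return f1, rmse

# toy 2x2 "images": one foreground pixel missed by the prediction
a = np.array([[0.0, 1.0], [1.0, 0.0]])  # cleaned output
b = np.array([[0.0, 1.0], [0.0, 0.0]])  # ground truth
f1, rmse = f1_and_rmse(a, b)
print(f1, rmse)  # 0.8 0.5
```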
## Data

- Synthetic strikethrough dataset on Zenodo: [https://doi.org/10.5281/zenodo.4767094](https://doi.org/10.5281/zenodo.4767094)
  - based on the [IAM](https://fki.tic.heia-fr.ch/databases/iam-handwriting-database) database
  - multi-writer
  - generated using [https://doi.org/10.5281/zenodo.4767062](https://doi.org/10.5281/zenodo.4767062)
- Genuine strikethrough dataset on Zenodo: [https://doi.org/10.5281/zenodo.4765062](https://doi.org/10.5281/zenodo.4765062)
  - single-writer
  - blue ballpoint pen
  - clean and struck word images registered based on:
    > J. Öfverstedt, J. Lindblad and N. Sladoje, "Fast and Robust Symmetric Image Registration Based on Distances Combining Intensity and Spatial Information," in IEEE Transactions on Image Processing, vol. 28, no. 7, pp. 3584-3597, July 2019, doi: 10.1109/TIP.2019.2899947.

    ([Paper](https://ieeexplore.ieee.org/document/8643403), [Code](https://github.com/MIDA-group/py_alpha_amd_release))

## Citation

ICDAR 2021:

```
@INPROCEEDINGS{heil2021strikethrough,
  author={Heil, Raphaela and Vats, Ekta and Hast, Anders},
  booktitle={2021 International Conference on Document Analysis and Recognition (ICDAR)},
  title={{Strikethrough Removal from Handwritten Words Using CycleGANs}},
  year={2021},
  pubstate={to appear}}
```

## Acknowledgements

- R. Heil would like to thank [Nicolas Pielawski](https://scholar.google.se/citations?user=MmqXB5oAAAAJ), [Håkan Wieslander](https://scholar.google.se/citations?user=PLJ8O9MAAAAJ), [Johan Öfverstedt](https://scholar.google.se/citations?user=GMminVMAAAAJ) and [Anders Brun](https://scholar.google.se/citations?user=LQ4p1qQAAAAJ) for their helpful comments and fruitful discussions.
- The computations were enabled by resources provided by the Swedish National Infrastructure for Computing ([SNIC](https://snic.se/)) at the High Performance Computing Center North ([HPC2N](https://www.hpc2n.umu.se/)), partially funded by the Swedish Research Council through grant agreement no. 2018-05973.
Lines changed: 24 additions & 0 deletions

```ini
[DEFAULT]
outdir = tmp
trainimgagebasedir = train/struck
testimagedir = validation/struck
imageheight = 128
imagewidth = 512
epochs = 30
batchsize = 128
validationepochinterval = 1
modelsaveepoch = -1
invertimages = True
model = dense
padscale = False
padwidth = 1024
padheight = 256

[PAD_INV]
model = dense
batchsize = 64
invertimages = True
padscale = True
padwidth = 1024
padheight = 256
```
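Note that `configparser` treats `[DEFAULT]` as a fallback for every other section, so `[PAD_INV]` inherits keys such as `outdir` and `epochs` while overriding `batchsize` and `padscale`. A minimal sketch of that behaviour (using a trimmed-down inline config, not the repository's file):

```python
from configparser import ConfigParser

cfg_text = """
[DEFAULT]
outdir = tmp
batchsize = 128
padscale = False

[PAD_INV]
batchsize = 64
padscale = True
"""

parser = ConfigParser()
parser.read_string(cfg_text)
section = parser["PAD_INV"]
print(section.get("outdir"))           # inherited from DEFAULT: tmp
print(section.getint("batchsize"))     # overridden: 64
print(section.getboolean("padscale"))  # overridden: True
```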
Lines changed: 103 additions & 0 deletions

```
# This file may be used to create an environment using:
# $ conda create --name <env> --file <this file>
# platform: linux-64
_libgcc_mutex=0.1=main
blas=1.0=mkl
blosc=1.20.1=hd408876_0
brotli=1.0.9=he6710b0_2
brunsli=0.1=h2531618_0
bzip2=1.0.8=h7b6447c_0
ca-certificates=2021.1.19=h06a4308_0
certifi=2020.12.5=py38h06a4308_0
charls=2.1.0=he6710b0_2
cloudpickle=1.6.0=py_0
cudatoolkit=10.2.89=hfd86e86_1
cycler=0.10.0=py38_0
cytoolz=0.11.0=py38h7b6447c_0
dask-core=2021.1.1=pyhd3eb1b0_0
dbus=1.13.18=hb2f20db_0
decorator=4.4.2=pyhd3eb1b0_0
expat=2.2.10=he6710b0_2
fontconfig=2.13.0=h9420a91_0
freetype=2.10.4=h5ab3b9f_0
giflib=5.1.4=h14c3975_1
glib=2.66.1=h92f7085_0
gst-plugins-base=1.14.0=h8213a91_2
gstreamer=1.14.0=h28cd5cc_2
icu=58.2=he6710b0_3
imagecodecs=2021.1.11=py38h581e88b_1
imageio=2.9.0=py_0
intel-openmp=2020.2=254
joblib=1.0.0=pyhd3eb1b0_0
jpeg=9b=h024ee3a_2
jxrlib=1.1=h7b6447c_2
kiwisolver=1.3.0=py38h2531618_0
lcms2=2.11=h396b838_0
ld_impl_linux-64=2.33.1=h53a641e_7
lerc=2.2.1=h2531618_0
libaec=1.0.4=he6710b0_1
libdeflate=1.7=h27cfd23_5
libedit=3.1.20191231=h14c3975_1
libffi=3.3=he6710b0_2
libgcc-ng=9.1.0=hdf63c60_0
libgfortran-ng=7.3.0=hdf63c60_0
libpng=1.6.37=hbc83047_0
libstdcxx-ng=9.1.0=hdf63c60_0
libtiff=4.1.0=h2733197_1
libuuid=1.0.3=h1bed415_2
libuv=1.40.0=h7b6447c_0
libwebp=1.0.1=h8e7db2f_0
libxcb=1.14=h7b6447c_0
libxml2=2.9.10=hb55368b_3
libzopfli=1.0.3=he6710b0_0
lz4-c=1.9.3=h2531618_0
matplotlib=3.3.2=h06a4308_0
matplotlib-base=3.3.2=py38h817c723_0
mkl=2020.2=256
mkl-service=2.3.0=py38he904b0f_0
mkl_fft=1.2.0=py38h23d657b_0
mkl_random=1.1.1=py38h0573a6f_0
ncurses=6.2=he6710b0_1
networkx=2.5=py_0
ninja=1.10.2=py38hff7bd54_0
numpy=1.19.2=py38h54aff64_0
numpy-base=1.19.2=py38hfa32c7d_0
olefile=0.46=py_0
openjpeg=2.3.0=h05c96fa_1
openssl=1.1.1i=h27cfd23_0
pandas=1.2.1=py38ha9443f7_0
pcre=8.44=he6710b0_0
pillow=8.1.0=py38he98fc37_0
pip=20.3.3=py38h06a4308_0
pyparsing=2.4.7=pyhd3eb1b0_0
pyqt=5.9.2=py38h05f1152_4
python=3.8.5=h7579374_1
python-dateutil=2.8.1=pyhd3eb1b0_0
pytorch=1.7.1=py3.8_cuda10.2.89_cudnn7.6.5_0
pytz=2021.1=pyhd3eb1b0_0
pywavelets=1.1.1=py38h7b6447c_2
pyyaml=5.4.1=py38h27cfd23_1
qt=5.9.7=h5867ecd_1
readline=8.1=h27cfd23_0
scikit-image=0.17.2=py38hdf5156a_0
scikit-learn=0.23.2=py38h0573a6f_0
scipy=1.5.2=py38h0b6359f_0
setproctitle=1.2.2=py38h27cfd23_1004
setuptools=52.0.0=py38h06a4308_0
sip=4.19.13=py38he6710b0_0
six=1.15.0=py38h06a4308_0
snappy=1.1.8=he6710b0_0
sqlite=3.33.0=h62c20be_0
threadpoolctl=2.1.0=pyh5ca1d4c_0
tifffile=2021.1.14=pyhd3eb1b0_1
tk=8.6.10=hbc83047_0
toolz=0.11.1=pyhd3eb1b0_0
torchvision=0.8.2=py38_cu102
tornado=6.1=py38h27cfd23_0
typing_extensions=3.7.4.3=pyh06a4308_0
wheel=0.36.2=pyhd3eb1b0_0
xz=5.2.5=h7b6447c_0
yaml=0.2.5=h7b6447c_0
zfp=0.5.5=h2531618_4
zlib=1.2.11=h7b6447c_3
zstd=1.4.5=h9ceee32_0
```
Lines changed: 6 additions & 0 deletions

```python
from .configuration import ModelName, Configuration, getConfiguration
from .dataset import StrikeThroughType, StruckDataset
from .utils import PadToSize, composeTransformations, getModelByName

__all__ = ["ModelName", "Configuration", "getConfiguration", "StrikeThroughType", "StruckDataset", "PadToSize",
           "composeTransformations", "getModelByName"]
```
Lines changed: 134 additions & 0 deletions

```python
"""
Contains all code related to the configuration of experiments.
"""
import argparse
import random
import time
from configparser import SectionProxy, ConfigParser
from enum import Enum, auto
from pathlib import Path
from typing import Tuple

import torch


class ModelName(Enum):
    """
    Encodes the names of supported models.
    """
    DENSE = auto()
    RESNET = auto()

    @staticmethod
    def getByName(name: str) -> "ModelName":
        """
        Returns the ModelName corresponding to the given string. Returns ModelName.RESNET in case an unknown name
        is provided.

        Parameters
        ----------
        name : str
            string representation that should be converted to a ModelName

        Returns
        -------
        ModelName representation of the provided string, default: ModelName.RESNET
        """
        if name.upper() in [model.name for model in ModelName]:
            return ModelName[name.upper()]
        else:
            return ModelName.RESNET


class Configuration:
    """
    Holds the configuration for the current experiment.
    """

    def __init__(self, parsedConfig: SectionProxy, test: bool = False, fileSection: str = "DEFAULT"):
        self.fileSection = fileSection
        self.outDir = Path(parsedConfig.get('outdir')) / '{}_{}_{}'.format(fileSection, str(int(time.time())),
                                                                           random.randint(0, 100000))
        if not self.outDir.exists() and not test:
            self.outDir.mkdir(parents=True, exist_ok=True)
        if torch.cuda.is_available():
            self.device = 'cuda'
        else:
            self.device = 'cpu'

        self.epochs = parsedConfig.getint('epochs', 100)
        self.learningRate = parsedConfig.getfloat('learning_rate', 0.0002)
        self.betas = self.parseBetas(parsedConfig.get("betas", "0.5,0.999"))

        self.batchSize = parsedConfig.getint('batchsize', 4)
        self.imageHeight = parsedConfig.getint('imageheight', 128)
        self.imageWidth = parsedConfig.getint('imagewidth', 256)
        self.modelSaveEpoch = parsedConfig.getint('modelsaveepoch', 10)
        self.validationEpoch = parsedConfig.getint('validationEpochInterval', 10)
        self.trainImageDir = Path(parsedConfig.get('trainimgagebasedir'))
        self.testImageDir = Path(parsedConfig.get('testimagedir'))
        self.invertImages = parsedConfig.getboolean('invertImages', False)
        self.padScale = parsedConfig.getboolean('padscale', False)
        self.padWidth = parsedConfig.getint('padwidth', 512)
        self.padHeight = parsedConfig.getint('padheight', 256)

        self.modelName = ModelName.getByName(parsedConfig.get("model", "RESNET"))

        if not test:
            configOut = self.outDir / 'config.cfg'
            with configOut.open('w+') as cfile:
                parsedConfig.parser.write(cfile)

    @staticmethod
    def parseBetas(betaString: str) -> Tuple[float, float]:
        """
        Parses a comma-separated string into a tuple of two floats.

        Parameters
        ----------
        betaString : str
            String to be parsed.

        Returns
        -------
        Tuple of floats.

        Raises
        ------
        ValueError
            if fewer than two values are specified
        """
        betas = betaString.split(',')
        if len(betas) < 2:
            raise ValueError("found fewer than two values for betas")
        return float(betas[0]), float(betas[1])


def getConfiguration() -> Configuration:
    """
    Reads the required arguments from the command line and parses the respective configuration file/section.

    Returns
    -------
    parsed :class:`Configuration`
    """
    cmdParser = argparse.ArgumentParser()
    cmdParser.add_argument("-config", required=False, help="section of config-file to use")
    cmdParser.add_argument("-configfile", required=False, help="path to config-file")
    args = vars(cmdParser.parse_args())
    fileSection = 'DEFAULT'
    fileName = 'config.cfg'
    if args["config"]:
        fileSection = args["config"]

    if args['configfile']:
        fileName = args['configfile']
    configParser = ConfigParser()
    configParser.read(fileName)
    parsedConfig = configParser[fileSection]
    sections = configParser.sections()
    for s in sections:
        if s != fileSection:
            configParser.remove_section(s)
    return Configuration(parsedConfig, fileSection=fileSection)
```
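The `ModelName.getByName` lookup above is case-insensitive and silently falls back to `RESNET` for unknown names. This can be exercised with a self-contained reproduction of the enum (reproduced here only because the package itself is not importable standalone):

```python
from enum import Enum, auto

class ModelName(Enum):
    DENSE = auto()
    RESNET = auto()

    @staticmethod
    def getByName(name: str) -> "ModelName":
        # case-insensitive lookup; unknown names fall back to RESNET
        if name.upper() in [model.name for model in ModelName]:
            return ModelName[name.upper()]
        return ModelName.RESNET

print(ModelName.getByName("dense"))    # ModelName.DENSE
print(ModelName.getByName("unknown"))  # ModelName.RESNET
```

This fallback means a misspelled `model` key in the config file trains a ResNet-based model rather than raising an error, which is worth keeping in mind when editing configurations.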
