PaddlePaddle quantization #575
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
<?xml version="1.0" encoding="utf-8"?> | ||
<QuantizationConfigs> | ||
<Config> | ||
<Model> | ||
<Name></Name> | ||
<PathPrefix></PathPrefix> | ||
<ModelDir></ModelDir> | ||
<ModelFileName></ModelFileName> | ||
<ParamsFileName></ParamsFileName> | ||
</Model> | ||
<Dataset> | ||
<Name></Name> | ||
<Path></Path> | ||
<Mean></Mean> | ||
<Std></Std> | ||
<BatchSize></BatchSize> | ||
<BatchNum></BatchNum> | ||
<ResizeResolution></ResizeResolution> | ||
</Dataset> | ||
<QuantizationParameters> | ||
<InputShape></InputShape> | ||
<InputName></InputName> | ||
<SaveDir></SaveDir> | ||
</QuantizationParameters> | ||
</Config> | ||
</QuantizationConfigs> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
# PaddlePaddle quantization script | ||
|
||
Script name: | ||
|
||
```bash | ||
quantization_paddlepaddle.py | ||
``` | ||
|
||
Required arguments: | ||
|
||
- `-c / --config` is the path to an XML file that describes the | ||
quantization process. A template of the configuration file is | ||
located [here][config_path]. | ||
|
||
Description of parameters: | ||
|
||
`Model` contains information about the model to be quantized: | ||
- `Name` is the name of the model. | ||
- `PathPrefix` is the path to the model files without the extensions (.pdmodel, .pdiparams). | ||
- `ModelDir` is the directory that contains the model. | ||
- `ModelFileName` is the file name of the model description. | ||
- `ParamsFileName` is the file name of the model parameters. | ||
|
||
`Dataset` contains information about the dataset used for model calibration: | ||
- `Name` is the dataset name. | ||
- `Path` is the path to the directory that contains input data. | ||
- `Mean` is the mean value subtracted from the data during preprocessing. | ||
- `Std` is the scale value by which the data is divided during preprocessing. | ||
- `ResizeResolution` is the image size used for preprocessing. Example: [224, 224]. | ||
- `BatchSize` is the batch size. | ||
- `BatchNum` is the total number of batches. | ||
|
||
`QuantizationParameters` contains information about the model input layer and the output location: | ||
- `InputShape` is the shape of the model's input layer. | ||
- `InputName` is the name of the model's input layer. | ||
- `SaveDir` is the directory where the quantized model is saved. | ||
|
||
|
||
<!-- LINKS --> | ||
[config_path]: ../../configs/paddle_quantization_config_template.xml |
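
To make the parameter descriptions above concrete, here is a minimal sketch of how such a configuration could be read. It assumes that list-valued fields are written as Python literals, which matches the `ast.literal_eval` calls in the script. The project's own `ConfigParser` is not shown in this diff, so the helper `read_quantization_configs` and the structure it returns are hypothetical; only the standard library is used.

```python
import ast
import xml.etree.ElementTree as ET

# Inline copy of a filled-in config; the values come from the example
# configuration added in this pull request (Model section shortened).
EXAMPLE_CONFIG = """<QuantizationConfigs>
  <Config>
    <Model>
      <Name>resnet50-paddle</Name>
    </Model>
    <Dataset>
      <BatchSize>1</BatchSize>
      <BatchNum>10</BatchNum>
      <ResizeResolution>[224, 224]</ResizeResolution>
    </Dataset>
    <QuantizationParameters>
      <InputShape>[3, 224, 224]</InputShape>
      <InputName>inputs</InputName>
      <SaveDir>res_dir</SaveDir>
    </QuantizationParameters>
  </Config>
</QuantizationConfigs>"""


def read_quantization_configs(xml_text):
    """Hypothetical helper: return one dict per <Config>, keyed by section name."""
    root = ET.fromstring(xml_text)
    configs = []
    for config in root.findall('Config'):
        sections = {}
        for section in config:
            # Empty tags such as <Mean></Mean> yield None as their text.
            sections[section.tag] = {param.tag: param.text for param in section}
        configs.append(sections)
    return configs


configs = read_quantization_configs(EXAMPLE_CONFIG)
dataset = configs[0]['Dataset']

# Values arrive as strings and must be converted before use.
resize = ast.literal_eval(dataset['ResizeResolution'])
batch_size = int(dataset['BatchSize'])
input_shape = ast.literal_eval(configs[0]['QuantizationParameters']['InputShape'])
print(resize, batch_size, input_shape)
```

Because every value is read as text, a field left empty in the template comes back as `None` and has to be handled before conversion.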
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,138 @@ | ||
import sys | ||
from pathlib import Path | ||
import random | ||
import numpy as np | ||
import paddle | ||
from paddle.io import Dataset | ||
import ast | ||
from paddle.io import DataLoader | ||
from paddleslim.quant import quant_post_static | ||
import importlib | ||
sys.path.append(str(Path(__file__).resolve().parents[1])) | ||
from utils import ArgumentsParser # noqa: E402 | ||
|
||
random.seed(0) | ||
np.random.seed(0) | ||
Review comment: Why was the seed fixed? So that shuffle orders the dataset the same way on every run? If so, this should be removed. |
||
|
||
default_crop_scale = (0.08, 1.0) | ||
default_crop_ratio = (3. / 4., 4. / 3.) | ||
Review comment: I don't see where these variables are used. |
||
|
||
|
||
class PaddleDatasetReader(Dataset): | ||
def __init__(self, args, log): | ||
super(PaddleDatasetReader, self).__init__() | ||
self.log = log | ||
self.log.info('Parsing dataset arguments.') | ||
self.cv2 = importlib.import_module('cv2') | ||
Review comment: Why does this need to be imported dynamically? Why not simply do |
||
self.data_dir = args['Path'] | ||
|
||
self.resize_size = ast.literal_eval(args['ResizeResolution']) | ||
self.mean = np.array((np.asarray(ast.literal_eval(args['Mean']), dtype=np.float32) | ||
if args['Mean'] is not None else [0., 0., 0.])).reshape((3, 1, 1)) | ||
self.std = np.array((np.asarray(ast.literal_eval(args['Std']), dtype=np.float32) | ||
if args['Std'] is not None else [1., 1., 1.])).reshape((3, 1, 1)) | ||
|
||
self.batch_size = int(args['BatchSize']) | ||
self.batch_num = int(args['BatchNum']) | ||
self.dataset = list(Path(self.data_dir).glob('*')) | ||
random.shuffle(self.dataset) | ||
self.dataset_iter = iter(self.dataset) | ||
|
||
def __getitem__(self, index): | ||
image_path = str(self.dataset[index].absolute()) | ||
data = self.process_image(image_path) | ||
return data | ||
|
||
def __len__(self): | ||
return len(self.dataset) | ||
|
||
def process_image(self, image_path): | ||
|
||
img = self.cv2.imread(image_path) | ||
if img is None:  # cv2.imread returns None when the file cannot be read | ||
self.log.info(f'Failed to read: {image_path}') | ||
return None | ||
img = self.cv2.resize(img, self.resize_size) | ||
|
||
img = img.astype('float32').transpose((2, 0, 1)) / 255 | ||
img -= self.mean | ||
img /= self.std | ||
|
||
return img | ||
|
||
|
||
class PaddleQuantizationProcess: | ||
def __init__(self, log, model_reader, dataset, quant_params): | ||
self.log = log | ||
self.model_reader = model_reader | ||
self.dataset = dataset | ||
self.quant_params = quant_params | ||
|
||
def transform_fn(self): | ||
for data in self.dataset: | ||
yield [data.astype(np.float32)] | ||
|
||
def quantization_paddle(self): | ||
place = paddle.CPUPlace() | ||
exe = paddle.static.Executor(place) | ||
data_loader = DataLoader( | ||
self.dataset, | ||
places=place, | ||
feed_list=[self.quant_params.image], | ||
drop_last=False, | ||
return_list=False, | ||
batch_size=self.dataset.batch_size, | ||
shuffle=False) | ||
|
||
quant_post_static( | ||
executor=exe, | ||
model_dir=self.model_reader.model_dir, | ||
quantize_model_path=self.quant_params.save_dir, | ||
data_loader=data_loader, | ||
model_filename=self.model_reader.model_filename, | ||
params_filename=self.model_reader.params_filename, | ||
batch_size=self.dataset.batch_size, | ||
batch_nums=self.dataset.batch_num, | ||
algo='avg', | ||
round_type='round', | ||
hist_percent=0.9999, | ||
is_full_quantize=False, | ||
bias_correction=False, | ||
onnx_format=False) | ||
Review comment: These parameters are varied in |
||
|
||
|
||
class PaddleModelReader(ArgumentsParser): | ||
def __init__(self, log): | ||
super().__init__(log) | ||
|
||
def _get_arguments(self): | ||
self._log.info('Parsing model arguments.') | ||
self.model_name = self.args['Name'] | ||
self.path_prefix = self.args['PathPrefix'] | ||
self.model_dir = self.args['ModelDir'] | ||
self.model_filename = self.args['ModelFileName'] | ||
self.params_filename = self.args['ParamsFileName'] | ||
|
||
def dict_for_iter_log(self): | ||
return { | ||
'Name': self.model_name, | ||
'Model path prefix': self.path_prefix, | ||
} | ||
|
||
|
||
class PaddleQuantParamReader(ArgumentsParser): | ||
def __init__(self, log): | ||
super().__init__(log) | ||
|
||
def dict_for_iter_log(self): | ||
return { | ||
'InputShape': self.input_shape, | ||
'InputName': self.input_name, | ||
'SaveDir': self.save_dir, | ||
} | ||
|
||
def _get_arguments(self): | ||
self.input_shape = ast.literal_eval(self.args['InputShape']) | ||
self.image = paddle.static.data(name=self.args['InputName'], shape=[None] + self.input_shape, dtype='float32') | ||
self.input_name = self.args['InputName'] | ||
self.save_dir = self.args['SaveDir'] |
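
One way to address the review comment about the fixed seed is to keep the shuffle local to the dataset reader and make the seed optional, instead of seeding the global `random` and `numpy` generators at import time. The sketch below is an assumption about how that could look, not code from this pull request; `shuffled_paths` and its `seed` parameter are hypothetical names.

```python
import random
from pathlib import Path


def shuffled_paths(paths, seed=None):
    """Hypothetical helper: return the paths in random order.

    With seed=None the order differs between runs; passing an integer
    makes the order reproducible without touching global random state.
    """
    rng = random.Random(seed)
    # Sort first so the result does not depend on filesystem listing order.
    result = sorted(paths)
    rng.shuffle(result)
    return result


paths = [Path(f'img_{i}.jpg') for i in range(10)]
first = shuffled_paths(paths, seed=0)
second = shuffled_paths(paths, seed=0)
unseeded = shuffled_paths(paths)
```

A private `random.Random` instance keeps reproducibility an explicit choice of the caller, so importing the module no longer changes the behavior of other code that uses `random`.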
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,55 @@ | ||
import paddle | ||
import argparse | ||
import sys | ||
import traceback | ||
from pathlib import Path | ||
from parameters import PaddleModelReader, PaddleDatasetReader, PaddleQuantizationProcess, PaddleQuantParamReader | ||
sys.path.append(str(Path(__file__).resolve().parents[3])) | ||
from src.utils.logger_conf import configure_logger # noqa: E402 | ||
from src.quantization.utils import ConfigParser # noqa: E402 | ||
|
||
|
||
log = configure_logger() | ||
|
||
|
||
def cli_argument_parser(): | ||
parser = argparse.ArgumentParser() | ||
parser.add_argument('-c', '--config', | ||
help='Path to the configuration file in the xml-format.', | ||
type=str, | ||
required=True, | ||
dest='config') | ||
args = parser.parse_args() | ||
return args | ||
|
||
|
||
def main(): | ||
args = cli_argument_parser() | ||
try: | ||
log.info(f'Parsing the configuration file {args.config}') | ||
parser = ConfigParser(args.config) | ||
paddle.enable_static() | ||
config = parser.parse() | ||
exit_code = 0 | ||
quant_params = PaddleQuantParamReader(log) | ||
model_reader = PaddleModelReader(log) | ||
for model_quant_config in config: | ||
|
||
try: | ||
data_reader = PaddleDatasetReader(model_quant_config[1]['Dataset'], log) | ||
model_reader.add_arguments(model_quant_config[0]['Model']) | ||
quant_params.add_arguments(model_quant_config[2]['QuantizationParameters']) | ||
proc = PaddleQuantizationProcess(log, model_reader, data_reader, quant_params) | ||
proc.quantization_paddle() | ||
|
||
except Exception: | ||
log.error(traceback.format_exc()) | ||
exit_code += 1 | ||
if exit_code: | ||
sys.exit(1) | ||
except Exception: | ||
log.error(traceback.format_exc()) | ||
sys.exit(1) | ||
|
||
|
||
if __name__ == '__main__': | ||
sys.exit(main() or 0) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
<?xml version="1.0" encoding="utf-8"?> | ||
<QuantizationConfigs> | ||
<Config> | ||
<Model> | ||
<Name>resnet50-paddle</Name> | ||
<PathPrefix>../models_dir/resnet50_paddle/inference</PathPrefix> | ||
<ModelDir>../models_dir/resnet50_paddle</ModelDir> | ||
<ModelFileName>resnet50.pdmodel</ModelFileName> | ||
<ParamsFileName>resnet50.pdiparams</ParamsFileName> | ||
</Model> | ||
<Dataset> | ||
<Name>test</Name> | ||
<Path>../test_images/classification_images</Path> | ||
<Mean>[123.675, 116.28, 103.53]</Mean> | ||
<Std>[58.395, 57.12, 57.375]</Std> | ||
|
||
<BatchSize>1</BatchSize> | ||
<BatchNum>10</BatchNum> | ||
<ResizeResolution>[224, 224]</ResizeResolution> | ||
</Dataset> | ||
<QuantizationParameters> | ||
<InputShape>[3, 224, 224]</InputShape> | ||
<InputName>inputs</InputName> | ||
<SaveDir>res_dir</SaveDir> | ||
</QuantizationParameters> | ||
</Config> | ||
</QuantizationConfigs> |
Review comment: It is better to pin a known working version, because an update may break something.
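
The reviewer's advice to pin a working version can be complemented by a runtime check that the installed dependency matches the pin. The comment does not say which dependency is meant, so the package name and version numbers used below are placeholders, and both helper functions are hypothetical. The sketch relies only on `importlib.metadata` from the standard library.

```python
from importlib import metadata


def matches_pin(installed, pinned):
    """Return True when the installed version string equals the pinned one."""
    return installed.strip() == pinned.strip()


def check_installed(package, pinned):
    """Hypothetical helper: compare the installed version of a package to a pin.

    Returns False when the package is missing or its version differs.
    """
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return False
    return matches_pin(installed, pinned)


# Placeholder package name and version; substitute the real pin.
ok = check_installed('some-quantization-package', '1.0.0')
```

Exact string comparison is deliberately strict: it reports any drift from the tested version rather than trying to decide which newer releases are compatible.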