190 commits
e5efe98
Complete the code flow, but the computed results still need correction
Jun 2, 2017
9a6a17c
Fix cuMask computation results
Jun 2, 2017
3345026
Restructure the cu code
Jun 2, 2017
5d49f24
Adjust code
Jun 2, 2017
b4d0ffe
Simplify code
Jun 2, 2017
18f9672
Adjust cu compilation
Jun 3, 2017
601e367
Add a macro switch for CUDA compilation
Jun 3, 2017
c0bab47
Optimize clSetKernelArg code
Jun 4, 2017
39bcbd1
Streamline code
Jun 4, 2017
1cb6e52
Switch cu compilation back to ahead-of-time compilation with nvcc
Jun 4, 2017
cd2e614
Change the mode approach
Jun 4, 2017
598603b
Copy memory asynchronously
Jun 4, 2017
8c29f1f
Complete CUDA parallel optimization; computed results are correct
Jun 5, 2017
d13a9ba
Fix command-line message; Max Thread Per MP and SP are different concepts
Jun 5, 2017
f9ba50e
Tune parameters to see how performance changes
Jun 5, 2017
cce5bc3
Fix the macro for 64-bit/32-bit detection
Jun 6, 2017
3237a50
Optimize
Jun 6, 2017
9f8597d
Restore factor=2 support; performance differs little, but compile time is longer
Jun 6, 2017
61fde3c
Improve the build and Test scripts
Jun 6, 2017
8fe8454
Reduce some redundant data copies in the kernel
Jun 6, 2017
c90b88a
Merge branch 'master' of https://github.com/ianhuang-777/guetzli
Jun 6, 2017
1e4b4f4
Optimize clDiffmapOpsinDynamicsImageEx
Jun 6, 2017
3995006
Add some debug information
Jun 6, 2017
1aa86d5
Use float instead of double in kernel computations to save compute time
Jun 7, 2017
8ed0ce3
Fix array length
Jun 7, 2017
7aff164
I don't know why either, but deleting this blank line makes the computed results correct
Jun 7, 2017
f795ad1
Fix build configuration
Jun 7, 2017
13abc16
Fix warnings
Jun 7, 2017
0c85b8f
Switch to a different set of compile flags
Jun 7, 2017
45300b4
Merge branch 'googleMaster'
Jun 7, 2017
a9ebb86
fix build
Apr 21, 2017
9ff693c
add sample picture
Apr 25, 2017
fb1032b
Merge branch 'master' of https://github.com/ianhuang-777/guetzli
Apr 26, 2017
e89cdcf
float is enough
Apr 26, 2017
fe645a9
Add OpenCL Support
Apr 27, 2017
c72cece
MinSquareVal with OpenCL
Apr 27, 2017
c354348
Optimize convolution with OpenCL
Apr 27, 2017
5d8ba53
fix setupopencl
Apr 28, 2017
82265a6
Merge branch 'master' of https://github.com/ianhuang-777/guetzli
May 2, 2017
775c63c
Add comment for understanding.
ianhuang-777 May 2, 2017
4061ccb
Try a fully OpenCL version of the Blur function; the computation error is currently rather large. Is there a bug?
May 3, 2017
d9a87af
add opencl process line
May 4, 2017
b618843
Merge branch 'master' of https://github.com/ianhuang-777/guetzli
May 4, 2017
0ba6817
add function
May 4, 2017
d4c9ed9
Set up the computation flow for clDiffmapOpsinDynamicsImage
May 4, 2017
dba4c85
Convert OpsinDynamicsImage to opencl
ianhuang-777 May 4, 2017
ac4254e
fix opencl compile error
May 4, 2017
0afb0a3
open cl compiler error fix
ianhuang-777 May 4, 2017
2cbc518
Implement clConvolutionEx
ianhuang-777 May 4, 2017
fad11fc
Remove useless code
ianhuang-777 May 4, 2017
b501393
Implement clUpsampleEx
ianhuang-777 May 4, 2017
8909cda
Implement clMinSquareValEx
ianhuang-777 May 4, 2017
5ea138c
Implement clMaskEx
ianhuang-777 May 4, 2017
437fa09
Implement clScaleImageEx
ianhuang-777 May 4, 2017
a31adf1
Verify the results of clOpinDynamicImage
May 4, 2017
c30b44d
Try double-precision arithmetic support
May 5, 2017
2e4cf39
Print More DeviceInfo
May 5, 2017
fd520d3
Merge branch 'master' of https://github.com/ianhuang-777/guetzli
May 5, 2017
56ac179
Implement clMaskHighIntensityChangeEx
ianhuang-777 May 5, 2017
a024ec1
Implement clDiffPrecomputeEx
ianhuang-777 May 5, 2017
13637b2
Implement clDiffPrecomputeEx
ianhuang-777 May 5, 2017
b7b19ed
Merge branch 'master' of https://github.com/ianhuang-777/guetzli
May 6, 2017
4aeec41
test for clDiffmapOpsinDynamicsImage
May 6, 2017
da654cb
fix runtime bug
May 6, 2017
2ceb635
remove useless code
May 6, 2017
6981d9f
Add test cases
May 6, 2017
7e1ad82
Add more test cases
May 7, 2017
7ef1b6d
Modify the test case framework
May 7, 2017
8d35692
Divide up the test case work
May 7, 2017
5864a11
Unmap is required after MapBuffer
May 7, 2017
8474de0
First investigate the computation precision issue for sizes > 100*100
May 7, 2017
1e8972f
Remove _constant for opencl 1.2
ianhuang-777 May 7, 2017
9400c21
Remove _constant for opencl 2.0
ianhuang-777 May 8, 2017
6962f20
Fix issue on nVidia graphics cards
May 8, 2017
c1f83bb
Fix __constant issue on NVIDIA cards
May 8, 2017
a8aba9b
Fix __constant error for nvidia device
ianhuang-777 May 8, 2017
d9e3808
Optimize clDoMask
ianhuang-777 May 8, 2017
7731427
Build configuration for 32-bit platforms
May 8, 2017
3116d6a
Move some local constant array to __constant
ianhuang-777 May 8, 2017
95f10c7
for test
May 8, 2017
1a8fcc2
Test the convolution function; save the use of one intermediate buffer
May 9, 2017
e919c9b
Merge branch 'master' of https://github.com/ianhuang-777/guetzli
May 9, 2017
f947da9
Fix blockDiffMap computation
May 9, 2017
6ba5810
Merge remote-tracking branch 'origin/master'
crazyks May 10, 2017
853222f
add clMinSquareVal test
crazyks May 10, 2017
920de33
Merge branch 'master' of https://github.com/ianhuang-777/guetzli
May 10, 2017
6ce7175
Fix OpsinDynamicsImage computation results
May 10, 2017
44df712
remove redundant parameter
crazyks May 11, 2017
79cb8cd
Merge branch 'master' of https://github.com/ianhuang-777/guetzli
ianhuang-777 May 11, 2017
7b9cf14
Add tclAverage55
ianhuang-777 May 11, 2017
81c4354
Fix computation results and add a comparator subclass
May 12, 2017
389777f
Fix: mapbuffer length does not match the required length
May 12, 2017
aaddc93
Add clButteraugliComparator to avoid overly invasive changes to the third-party library code
May 12, 2017
36905d7
Standardize kernel function names to start with cl
May 12, 2017
7c97e95
Fix compilation issue on NVIDIA cards
May 13, 2017
5eb14f3
Merge branch 'master' of https://github.com/ianhuang-777/guetzli
May 13, 2017
f12e272
Add batched processing via SelectFrequencyMaskingBatch
May 13, 2017
dae1673
Set up batched ComputeZeroingOrder on the cl side
May 14, 2017
148927e
Adjust the division of work
May 15, 2017
55f60a4
Assign work
May 15, 2017
16e27ab
clComputeBlockZeroingOrder
May 15, 2017
87b462a
Fix compile compatibility issue on NVIDIA cards
May 15, 2017
0925579
Implement part of BlurEx
ianhuang-777 May 15, 2017
b3455dd
Merge remote-tracking branch 'origin/master'
crazyks May 15, 2017
5e53802
Fix BlurEx
ianhuang-777 May 16, 2017
e69365c
fix data type of coeff_t
crazyks May 16, 2017
8d82c8e
modify MakeInputOrder
crazyks May 16, 2017
fecac92
Add BlockToImage
crazyks May 16, 2017
e2b3830
Add MaskHighIntensityChangeBlock
crazyks May 16, 2017
b76def6
Fix the SelectFrequencyMaskingBatch computation flow; it finally runs correctly
May 16, 2017
caa4fbb
Merge branch 'master' of https://github.com/ianhuang-777/guetzli
May 16, 2017
8c63e20
Skip the check for 8x8 blocks for now; otherwise it is too slow
May 17, 2017
6b8bebf
Merge branch 'master' of https://github.com/ianhuang-777/guetzli
May 17, 2017
6482c67
Add accessor interface, mainly for data verification
May 18, 2017
c5a08a1
Add code to verify changes to the original image data; to be deleted
May 18, 2017
d587e66
Add a batching prototype for factor_x = factor_y = 2
May 18, 2017
8d28110
Port ComputeBlockEx2 to OpenCL
May 19, 2017
08db770
Debug clComputeBlockZeroingOrderFactor
May 20, 2017
d931558
Streamline code
May 20, 2017
0bda30e
Complete factor 2 support
May 20, 2017
999585d
Merge type declarations and include them in the OpenCL code
May 20, 2017
643e8db
Fix parameter-passing bug in clEnqueueUnmapMemObject
May 20, 2017
b2d8639
Streamline code
May 20, 2017
5a54624
Clean up code
May 20, 2017
d0949f1
Clean up code
May 20, 2017
cc746ff
Clean up code
May 20, 2017
1f87bb2
Clean up code
May 20, 2017
add8436
Remove build events
May 22, 2017
8f80356
Streamline code
May 22, 2017
f766120
Streamline code
May 22, 2017
264209c
const control
May 22, 2017
ea15082
Fix Average5x5
ianhuang-777 May 22, 2017
6496886
Inline ScaleIamge in kernel Average5x5
ianhuang-777 May 22, 2017
89cda39
Avoid const value computing in work item
ianhuang-777 May 22, 2017
f54bc0e
Fix tclCalculateDiffmap
ianhuang-777 May 22, 2017
36f2e52
Merge branch 'master' of https://github.com/ianhuang-777/guetzli
May 24, 2017
ec42b7b
const control
May 24, 2017
a469c02
Streamline code
May 24, 2017
e68cea4
Adjust parameter order
May 24, 2017
7c9c34a
Adjust parameter rules
May 24, 2017
b0d7b80
Adjust parameter conventions
May 24, 2017
f5fcd1b
Merge branch 'master' of https://github.com/ianhuang-777/guetzli
May 25, 2017
bb1e067
Adjust code; fix parameter-passing rules
May 25, 2017
34af91d
Merge branch 'master' of https://github.com/ianhuang-777/guetzli
May 31, 2017
b47cb8d
Add CUDA compilation; update with care, because the build fails if CUDA is not installed
May 31, 2017
99631b8
support cuda opt
Jun 1, 2017
ef025da
Compile .cu at runtime
Jun 1, 2017
a8bcf1f
Make the code compatible with CUDA compilation; compiler syntax checks
Jun 1, 2017
4533a02
cuScaleImage now runs end to end
Jun 2, 2017
6240ace
cuOpsinDynamicsImage complete
Jun 2, 2017
49d74ab
Add the remaining cu entry functions
Jun 2, 2017
63ac064
Simplify the code a bit
Jun 2, 2017
12cd120
fix linux build
crazyks Jun 7, 2017
7c2e57d
After merging Google's changes, every compare StartBlockComparisons call recomputes the opsin of the original image
Jun 7, 2017
79bce89
Fix crash when processing PNG files
Jun 7, 2017
9e8bdb3
Eliminate redundant computation in clComputeBlockZeroingOrderEx
Jun 8, 2017
43834f7
Static library build
Jun 8, 2017
e922dbf
Compile flags
Jun 8, 2017
230924b
Adjust the test script to support batch optimization of a directory
Jun 9, 2017
0e0edb1
Merge branch 'master' of https://github.com/ianhuang-777/guetzli
Jun 12, 2017
c31af45
C optimization option
Jun 12, 2017
891def1
Optimize the C code
Jun 12, 2017
1f26bc0
Stop optimizing the C code
Jun 13, 2017
742b284
Optimize the C version
Jun 15, 2017
b67b00d
Modify the flag for creating CUDA context
crazyks Jun 19, 2017
c1bc10c
Add macro for opencl version
crazyks Jun 20, 2017
66a8d9f
Add simple cuda memory pool
ianhuang-777 Jun 21, 2017
e11a712
Add missing files
ianhuang-777 Jun 21, 2017
36a3ce6
Clean code
ianhuang-777 Jun 21, 2017
e42fdab
Modify makefile
Jun 21, 2017
644f563
Enable CUDA and OpenCL by default
Jun 23, 2017
340d914
Remove tcmalloc; it has little effect on performance
Jun 23, 2017
6f2726b
Change memory block status to enum
ianhuang-777 Jun 29, 2017
46367ce
Remove tcmalloc
ianhuang-777 Jul 5, 2017
8031985
Support non-mainstream JPEG formats
zhantong Jul 7, 2017
eda913f
Mofidy makefile
Jul 9, 2017
4058d6e
Fix libjpeg build failure in debug and 32-bit configurations
zhantong Jul 10, 2017
c100839
Translate the comment.
ianhuang-777 Jul 11, 2017
5f309e7
Remove some redundant files
crazyks Jul 11, 2017
5aa73ae
Modify makefile
Jul 11, 2017
c525adf
Disable CUDA & OpenCL by default
crazyks Jul 12, 2017
ba21943
Add netpbm
crazyks Jul 13, 2017
93fd3f3
Fix type cast error on Mac
crazyks Jul 13, 2017
1c1d7e6
Update bazel version to 0.5.2
crazyks Jul 13, 2017
1cb26c7
Add oracle-java8-installer
crazyks Jul 13, 2017
40665e2
Try to fix Bazel build
crazyks Jul 13, 2017
05ee2f8
Add author information
crazyks Jul 13, 2017
808e624
Update ReadMe
crazyks Jul 13, 2017
af12f12
Update ReadMe & fix some mistakes
crazyks Jul 18, 2017
14ef86d
Update appveyor.xml
crazyks Jul 19, 2017
2 changes: 2 additions & 0 deletions .gitignore
@@ -15,3 +15,5 @@ ipch/
*.cachefile
*.VC.db
*.VC.VC.opendb
guetzli.vcxproj.user
clguetzli/clguetzli.cu.ptx*
1 change: 1 addition & 0 deletions .travis.sh
@@ -14,6 +14,7 @@ case "$1" in
"bazel")
case "${TRAVIS_OS_NAME}" in
"linux")
sudo apt-get remove oracle-java9-installer
wget https://github.com/bazelbuild/bazel/releases/download/0.4.5/bazel_0.4.5-linux-x86_64.deb
echo 'b494d0a413e4703b6cd5312403bea4d92246d6425b3be68c9bfbeb8cc4db8a55 bazel_0.4.5-linux-x86_64.deb' | sha256sum -c --strict || exit 1
sudo dpkg -i bazel_0.4.5-linux-x86_64.deb
3 changes: 3 additions & 0 deletions .travis.yml
@@ -13,6 +13,8 @@ matrix:
packages:
- wget
- libjpeg-progs
- netpbm
- oracle-java8-installer

- os: osx
env: BUILD_SYSTEM=bazel
@@ -29,6 +31,7 @@ matrix:
- libpng-dev
- pkg-config
- libjpeg-progs
- netpbm

- os: osx
env: BUILD_SYSTEM=make
3 changes: 3 additions & 0 deletions BUILD
@@ -8,6 +8,9 @@ cc_library(
"guetzli/*.h",
"guetzli/*.cc",
"guetzli/*.inc",
"clguetzli/*.cpp",
"clguetzli/*.h",
"clguetzli/*.hpp"
],
exclude = ["guetzli/guetzli.cc"],
),
56 changes: 56 additions & 0 deletions README.md
@@ -99,3 +99,59 @@ attempts made.
Please note that JPEG images do not support alpha channel (transparency). If the
input is a PNG with an alpha channel, it will be overlaid on black background
before encoding.

# Extra features

**Note:** Please make sure that you can build guetzli successfully before adding the following features.

## Enable CUDA/OpenCL support

**Note:** Before adding [CUDA](https://developer.nvidia.com/cuda-zone) support, please [check](http://developer.nvidia.com/cuda-gpus) whether your GPU supports CUDA.

**Note:** If you don't have an NVIDIA card that supports CUDA, you can try [OpenCL](https://www.khronos.org/opencl/) instead. You can install any of the OpenCL SDKs, such as [Intel OpenCL SDK](https://software.intel.com/en-us/intel-opencl), [AMD OpenCL SDK](http://developer.amd.com/tools-and-sdks/opencl-zone/), etc.

**Note:** The steps for adding OpenCL support are very similar to those for adding CUDA support, so the following instructions cover only CUDA.

### On POSIX systems
1. Follow the [Installation Guide for Linux](https://developer.nvidia.com/compute/cuda/8.0/Prod2/docs/sidebar/CUDA_Installation_Guide_Linux-pdf) to set up the [CUDA Toolkit](https://developer.nvidia.com/cuda-toolkit).
2. Edit `premake5.lua`, add `defines { "__USE_OPENCL__", "__USE_CUDA__" }` and `links { "OpenCL", "cuda" }` under `filter "action:gmake"`. Then run `premake5 --os=linux gmake` to update the makefile.
3. Edit `clguetzli/clguetzli.cl` and add `#define __USE_OPENCL__` as the first line.
4. Run `make` and expect the binary to be created in `bin/Release/guetzli`.
5. Run `./compile.sh 64` or `./compile.sh 32` to build the 64-bit or 32-bit [ptx](http://docs.nvidia.com/cuda/parallel-thread-execution) file. The ptx file is copied to `bin/Release/clguetzli`.
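
Step 3 can also be scripted. The following is only a sketch: `prepend_opencl_define` is a hypothetical helper that is not part of guetzli, and it assumes the file should start with exactly the line `#define __USE_OPENCL__`. It leaves the file unchanged if that line is already first.

```shell
# Hypothetical helper, not part of guetzli: make `#define __USE_OPENCL__`
# the first line of the given file unless it is already there.
prepend_opencl_define() {
  target="$1"
  define_line='#define __USE_OPENCL__'
  # Leave the file alone if the define is already the first line.
  if [ "$(head -n 1 "$target")" = "$define_line" ]; then
    return 0
  fi
  tmp_file="$(mktemp)" || return 1
  # Write the define followed by the original contents, then copy the result
  # back in place so the original file keeps its permissions.
  { printf '%s\n' "$define_line"; cat "$target"; } > "$tmp_file" &&
    cat "$tmp_file" > "$target"
  rc=$?
  rm -f "$tmp_file"
  return "$rc"
}

# Example use from the guetzli source directory (steps 3 to 5 above):
#   prepend_opencl_define clguetzli/clguetzli.cl
#   make
#   ./compile.sh 64
```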

### On Windows
1. Follow the [Installation Guide for Microsoft Windows](https://developer.nvidia.com/compute/cuda/8.0/Prod2/docs/sidebar/CUDA_Installation_Guide_Windows-pdf) to set up the `CUDA Toolkit`.
2. Copy `<vs2015 dir>\VC\bin\amd64\vcvars64.bat` to `<guetzli dir>\vcvars64.bat`.
3. Open the Visual Studio project and edit the project `Property Pages` as follows:
* Add `__USE_OPENCL__` and `__USE_CUDA__` to preprocessor definitions.
* Add `OpenCL.lib` and `cuda.lib` to additional dependencies.
* Add `$(CUDA_PATH)\include` to include directories.
* Add `$(CUDA_PATH)\lib\Win32` or `$(CUDA_PATH)\lib\x64` to library directories.
4. Edit `clguetzli/clguetzli.cl` and add `#define __USE_OPENCL__` as the first line.
5. Build it.

### Usage
```bash
guetzli [--c|--cuda|--opencl] [other options] original.png output.jpg
guetzli [--c|--cuda|--opencl] [other options] original.jpg output.jpg
```
Pass `--c` to enable the procedure optimization, `--cuda` to use CUDA acceleration, or `--opencl` to use OpenCL acceleration.
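
To apply one of these flags to a whole directory, a small wrapper along the following lines may help. This is a sketch under stated assumptions: `guetzli_batch` and the `GUETZLI` variable are hypothetical names and not part of guetzli, and only the `--c`, `--cuda` and `--opencl` flags come from the usage above.

```shell
# Hypothetical wrapper, not part of guetzli: convert every PNG in a directory
# with one of the flags described above.
# GUETZLI may name the binary (for example bin/Release/guetzli);
# it defaults to `guetzli` on PATH.
guetzli_batch() {
  mode="$1"
  in_dir="$2"
  out_dir="$3"
  case "$mode" in
    c|cuda|opencl) ;;
    *) echo "mode must be c, cuda, or opencl" >&2; return 2 ;;
  esac
  mkdir -p "$out_dir" || return 1
  for src in "$in_dir"/*.png; do
    [ -e "$src" ] || continue   # the glob matched no PNG files
    name="$(basename "$src" .png)"
    "${GUETZLI:-guetzli}" "--$mode" "$src" "$out_dir/$name.jpg" || return 1
  done
}

# Example use:
#   GUETZLI=bin/Release/guetzli
#   guetzli_batch cuda ./images ./optimized
```

The wrapper stops at the first file that fails, so a missing ptx file or an unsupported device shows up right away.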

If you have any questions about CUDA/OpenCL support, please contact [email protected], [email protected] or [email protected].

Maybe create a dropbox email like [email protected] that would go to all 3 of you (and could be adjusted on your end to add/remove people as necessary without having to update these docs)


## Enable full JPEG format support
### On POSIX systems
1. Install [libjpeg](http://libjpeg.sourceforge.net/).
If using your operating system
package manager, install development versions of the packages if the
distinction exists.
* On Ubuntu, do `apt-get install libjpeg8-dev`.
* On Fedora, do `dnf install libjpeg-devel`.
* On Arch Linux, do `pacman -S libjpeg`.
* On Alpine Linux, do `apk add libjpeg`.
2. Edit `premake5.lua`, add `defines {"__SUPPORT_FULL_JPEG__"}` and `links { "jpeg" }` under `filter "action:gmake"`. Then run `premake5 --os=linux gmake` to update the makefile.
3. Run `make` and expect the binary to be created in `bin/Release/guetzli`.
### On Windows
1. Install `libjpeg-turbo` using vcpkg: `.\vcpkg install libjpeg-turbo`
2. Open the Visual Studio project and add `__SUPPORT_FULL_JPEG__` to preprocessor definitions in the project `Property Pages`.
3. Build it.
2 changes: 1 addition & 1 deletion appveyor.yml
@@ -15,7 +15,7 @@ install:
- premake5.exe %TOOLSET%
- git clone https://github.com/Microsoft/vcpkg
- md vcpkg\downloads\nuget-3.5.0
- appveyor DownloadFile https://dist.nuget.org/win-x86-commandline/latest/nuget.exe -FileName %appveyor_build_folder%\vcpkg\downloads\nuget-3.5.0\nuget.exe
- appveyor DownloadFile https://dist.nuget.org/win-x86-commandline/v3.5.0/nuget.exe -FileName %appveyor_build_folder%\vcpkg\downloads\nuget-3.5.0\nuget.exe
- appveyor DownloadFile https://cmake.org/files/v3.8/cmake-3.8.0-rc1-win32-x86.zip -FileName %appveyor_build_folder%\vcpkg\downloads\cmake-3.8.0-rc1-win32-x86.zip
- 7z x %appveyor_build_folder%\vcpkg\downloads\cmake-3.8.0-rc1-win32-x86.zip
- cd vcpkg