Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GAPI: reuse copy_through_move_t in the gasync.cpp file #1

Open
wants to merge 174 commits into
base: sole_tbb_executor
Choose a base branch
from

Conversation

anton-potapov
Copy link
Owner

GAPI: reuse copy_through_move_t in the gasync.cpp file

opencv#17851 introduced separate header for copy_through_move_t. This PR removes it's local duplicate in gasync.cpp to reuse one from the header

legrosbuffle and others added 30 commits August 21, 2020 14:06
This is useful for non power-of-two sizes when WITH_IPP is not an option.

This shows consistent improvement over openCV benchmarks, and we measure
even larger improvements on our internal workloads.

For example, for 320x480, `32FC*`, we can see a ~5% improvement}, as
`320=2^6*5` and `480=2^5*3*5`, so the improved radix3 version is used.
`64FC*` is flat as expected, as we do not specialize the functors for `double`
in this change.

```
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC1, 0, false)                                1.239  1.153     1.07
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC1, 0, true)                                 0.991  0.926     1.07
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC1, DFT_COMPLEX_OUTPUT, false)               1.367  1.281     1.07
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC1, DFT_COMPLEX_OUTPUT, true)                1.114  1.049     1.06
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC1, DFT_INVERSE, false)                      1.313  1.254     1.05
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC1, DFT_INVERSE, true)                       1.027  0.977     1.05
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false)   1.296  1.217     1.06
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)    1.039  0.963     1.08
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC1, DFT_ROWS, false)                         0.542  0.524     1.04
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC1, DFT_ROWS, true)                          0.293  0.277     1.06
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC1, DFT_SCALE, false)                        1.265  1.175     1.08
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC1, DFT_SCALE, true)                         1.004  0.942     1.07
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 64FC1, 0, false)                                1.292  1.280     1.01
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 64FC1, 0, true)                                 1.038  1.030     1.01
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 64FC1, DFT_COMPLEX_OUTPUT, false)               1.484  1.488     1.00
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 64FC1, DFT_COMPLEX_OUTPUT, true)                1.222  1.224     1.00
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 64FC1, DFT_INVERSE, false)                      1.380  1.355     1.02
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 64FC1, DFT_INVERSE, true)                       1.117  1.133     0.99
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 64FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false)   1.372  1.383     0.99
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 64FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)    1.117  1.127     0.99
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 64FC1, DFT_ROWS, false)                         0.546  0.539     1.01
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 64FC1, DFT_ROWS, true)                          0.293  0.299     0.98
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 64FC1, DFT_SCALE, false)                        1.351  1.339     1.01
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 64FC1, DFT_SCALE, true)                         1.099  1.092     1.01
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC2, 0, false)                                2.235  2.123     1.05
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC2, 0, true)                                 1.843  1.727     1.07
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC2, DFT_COMPLEX_OUTPUT, false)               2.189  2.109     1.04
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC2, DFT_COMPLEX_OUTPUT, true)                1.827  1.754     1.04
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC2, DFT_INVERSE, false)                      2.392  2.309     1.04
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC2, DFT_INVERSE, true)                       1.951  1.865     1.05
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC2, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false)   2.391  2.293     1.04
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC2, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)    1.954  1.882     1.04
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC2, DFT_ROWS, false)                         0.811  0.815     0.99
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC2, DFT_ROWS, true)                          0.426  0.437     0.98
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC2, DFT_SCALE, false)                        2.268  2.152     1.05
dft::Size_MatType_FlagsType_NzeroRows::(320x480, 32FC2, DFT_SCALE, true)                         1.893  1.788     1.06
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC1, 0, false)                                4.546  4.395     1.03
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC1, 0, true)                                 3.616  3.426     1.06
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC1, DFT_COMPLEX_OUTPUT, false)               4.843  4.668     1.04
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC1, DFT_COMPLEX_OUTPUT, true)                3.825  3.748     1.02
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC1, DFT_INVERSE, false)                      4.720  4.525     1.04
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC1, DFT_INVERSE, true)                       3.743  3.601     1.04
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false)   4.755  4.527     1.05
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)    3.744  3.586     1.04
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC1, DFT_ROWS, false)                         1.992  2.012     0.99
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC1, DFT_ROWS, true)                          1.048  1.048     1.00
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC1, DFT_SCALE, false)                        4.625  4.451     1.04
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC1, DFT_SCALE, true)                         3.643  3.491     1.04
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 64FC1, 0, false)                                4.499  4.488     1.00
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 64FC1, 0, true)                                 3.559  3.555     1.00
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 64FC1, DFT_COMPLEX_OUTPUT, false)               5.155  5.165     1.00
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 64FC1, DFT_COMPLEX_OUTPUT, true)                4.103  4.101     1.00
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 64FC1, DFT_INVERSE, false)                      5.484  5.474     1.00
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 64FC1, DFT_INVERSE, true)                       4.617  4.518     1.02
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 64FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false)   5.547  5.509     1.01
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 64FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)    4.553  4.554     1.00
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 64FC1, DFT_ROWS, false)                         2.067  2.018     1.02
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 64FC1, DFT_ROWS, true)                          1.104  1.079     1.02
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 64FC1, DFT_SCALE, false)                        4.665  4.619     1.01
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 64FC1, DFT_SCALE, true)                         3.698  3.681     1.00
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC2, 0, false)                                8.774  8.275     1.06
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC2, 0, true)                                 6.975  6.527     1.07
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC2, DFT_COMPLEX_OUTPUT, false)               8.720  8.270     1.05
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC2, DFT_COMPLEX_OUTPUT, true)                6.928  6.532     1.06
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC2, DFT_INVERSE, false)                      9.272  8.862     1.05
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC2, DFT_INVERSE, true)                       7.323  6.946     1.05
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC2, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false)   9.262  8.768     1.06
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC2, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)    7.298  6.871     1.06
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC2, DFT_ROWS, false)                         3.766  3.639     1.03
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC2, DFT_ROWS, true)                          1.932  1.889     1.02
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC2, DFT_SCALE, false)                        8.865  8.417     1.05
dft::Size_MatType_FlagsType_NzeroRows::(800x600, 32FC2, DFT_SCALE, true)                         7.067  6.643     1.06
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC1, 0, false)                              10.014 10.141    0.99
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC1, 0, true)                               7.600  7.632     1.00
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC1, DFT_COMPLEX_OUTPUT, false)             11.059 11.283    0.98
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC1, DFT_COMPLEX_OUTPUT, true)              8.475  8.552     0.99
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC1, DFT_INVERSE, false)                    12.678 12.789    0.99
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC1, DFT_INVERSE, true)                     10.445 10.359    1.01
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false) 12.626 12.925    0.98
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)  10.538 10.553    1.00
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC1, DFT_ROWS, false)                       5.041  5.084     0.99
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC1, DFT_ROWS, true)                        2.595  2.607     1.00
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC1, DFT_SCALE, false)                      10.231 10.330    0.99
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC1, DFT_SCALE, true)                       7.786  7.815     1.00
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 64FC1, 0, false)                              13.597 13.302    1.02
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 64FC1, 0, true)                               10.377 10.207    1.02
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 64FC1, DFT_COMPLEX_OUTPUT, false)             15.940 15.545    1.03
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 64FC1, DFT_COMPLEX_OUTPUT, true)              12.299 12.230    1.01
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 64FC1, DFT_INVERSE, false)                    15.270 15.181    1.01
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 64FC1, DFT_INVERSE, true)                     12.757 12.339    1.03
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 64FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false) 15.512 15.157    1.02
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 64FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)  12.505 12.635    0.99
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 64FC1, DFT_ROWS, false)                       6.359  6.255     1.02
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 64FC1, DFT_ROWS, true)                        3.314  3.248     1.02
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 64FC1, DFT_SCALE, false)                      13.937 13.733    1.01
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 64FC1, DFT_SCALE, true)                       10.782 10.495    1.03
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC2, 0, false)                              18.985 18.926    1.00
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC2, 0, true)                               14.256 14.509    0.98
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC2, DFT_COMPLEX_OUTPUT, false)             18.696 19.021    0.98
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC2, DFT_COMPLEX_OUTPUT, true)              14.290 14.429    0.99
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC2, DFT_INVERSE, false)                    20.135 20.296    0.99
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC2, DFT_INVERSE, true)                     15.390 15.512    0.99
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC2, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false) 20.121 20.354    0.99
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC2, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)  15.341 15.605    0.98
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC2, DFT_ROWS, false)                       8.932  9.084     0.98
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC2, DFT_ROWS, true)                        4.539  4.649     0.98
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC2, DFT_SCALE, false)                      19.137 19.303    0.99
dft::Size_MatType_FlagsType_NzeroRows::(1280x1024, 32FC2, DFT_SCALE, true)                       14.565 14.808    0.98
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC1, 0, false)                              22.553 21.171    1.07
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC1, 0, true)                               17.850 16.390    1.09
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC1, DFT_COMPLEX_OUTPUT, false)             24.062 22.634    1.06
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC1, DFT_COMPLEX_OUTPUT, true)              19.342 17.932    1.08
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC1, DFT_INVERSE, false)                    28.609 27.326    1.05
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC1, DFT_INVERSE, true)                     24.591 23.289    1.06
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false) 28.667 27.467    1.04
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)  24.671 23.309    1.06
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC1, DFT_ROWS, false)                       9.458  9.077     1.04
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC1, DFT_ROWS, true)                        4.709  4.566     1.03
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC1, DFT_SCALE, false)                      22.791 21.583    1.06
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC1, DFT_SCALE, true)                       18.029 16.691    1.08
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 64FC1, 0, false)                              25.238 24.427    1.03
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 64FC1, 0, true)                               19.636 19.270    1.02
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 64FC1, DFT_COMPLEX_OUTPUT, false)             28.342 27.957    1.01
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 64FC1, DFT_COMPLEX_OUTPUT, true)              22.413 22.477    1.00
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 64FC1, DFT_INVERSE, false)                    26.465 26.085    1.01
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 64FC1, DFT_INVERSE, true)                     21.972 21.704    1.01
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 64FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false) 26.497 26.127    1.01
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 64FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)  22.010 21.523    1.02
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 64FC1, DFT_ROWS, false)                       11.188 10.774    1.04
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 64FC1, DFT_ROWS, true)                        6.094  5.916     1.03
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 64FC1, DFT_SCALE, false)                      25.728 24.934    1.03
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 64FC1, DFT_SCALE, true)                       20.077 19.653    1.02
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC2, 0, false)                              43.834 40.726    1.08
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC2, 0, true)                               35.198 32.218    1.09
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC2, DFT_COMPLEX_OUTPUT, false)             43.743 40.897    1.07
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC2, DFT_COMPLEX_OUTPUT, true)              35.240 32.226    1.09
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC2, DFT_INVERSE, false)                    46.022 42.612    1.08
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC2, DFT_INVERSE, true)                     36.779 33.961    1.08
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC2, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false) 46.396 42.723    1.09
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC2, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)  37.025 33.874    1.09
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC2, DFT_ROWS, false)                       17.334 16.832    1.03
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC2, DFT_ROWS, true)                        9.212  8.970     1.03
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC2, DFT_SCALE, false)                      44.190 41.211    1.07
dft::Size_MatType_FlagsType_NzeroRows::(1920x1080, 32FC2, DFT_SCALE, true)                       35.900 32.888    1.09
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC1, 0, false)                              40.948 38.256    1.07
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC1, 0, true)                               33.825 30.759    1.10
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC1, DFT_COMPLEX_OUTPUT, false)             53.210 53.584    0.99
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC1, DFT_COMPLEX_OUTPUT, true)              46.356 46.712    0.99
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC1, DFT_INVERSE, false)                    47.471 47.213    1.01
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC1, DFT_INVERSE, true)                     40.491 41.363    0.98
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false) 46.724 47.049    0.99
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)  40.834 41.381    0.99
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC1, DFT_ROWS, false)                       14.508 14.490    1.00
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC1, DFT_ROWS, true)                        7.832  7.828     1.00
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC1, DFT_SCALE, false)                      41.491 38.341    1.08
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC1, DFT_SCALE, true)                       34.587 31.208    1.11
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 64FC1, 0, false)                              65.155 63.173    1.03
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 64FC1, 0, true)                               56.091 54.752    1.02
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 64FC1, DFT_COMPLEX_OUTPUT, false)             71.549 70.626    1.01
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 64FC1, DFT_COMPLEX_OUTPUT, true)              62.319 61.437    1.01
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 64FC1, DFT_INVERSE, false)                    61.480 59.540    1.03
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 64FC1, DFT_INVERSE, true)                     54.047 52.650    1.03
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 64FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false) 61.752 61.366    1.01
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 64FC1, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)  54.400 53.665    1.01
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 64FC1, DFT_ROWS, false)                       20.219 19.704    1.03
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 64FC1, DFT_ROWS, true)                        11.145 10.868    1.03
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 64FC1, DFT_SCALE, false)                      66.220 64.525    1.03
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 64FC1, DFT_SCALE, true)                       57.389 56.114    1.02
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC2, 0, false)                              86.761 88.128    0.98
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC2, 0, true)                               75.528 76.725    0.98
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC2, DFT_COMPLEX_OUTPUT, false)             86.750 88.223    0.98
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC2, DFT_COMPLEX_OUTPUT, true)              75.830 76.809    0.99
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC2, DFT_INVERSE, false)                    91.728 92.161    1.00
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC2, DFT_INVERSE, true)                     78.797 79.876    0.99
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC2, DFT_INVERSE|DFT_COMPLEX_OUTPUT, false) 92.163 92.177    1.00
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC2, DFT_INVERSE|DFT_COMPLEX_OUTPUT, true)  78.957 79.863    0.99
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC2, DFT_ROWS, false)                       24.781 25.576    0.97
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC2, DFT_ROWS, true)                        13.226 13.695    0.97
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC2, DFT_SCALE, false)                      87.990 89.324    0.99
dft::Size_MatType_FlagsType_NzeroRows::(2048x2048, 32FC2, DFT_SCALE, true)                       76.732 77.869    0.99
```
A running GMainLoop processes many events on the GLib/GStreamer
world. While some things may work without it, many others wont.
Examples of these are signals, timers and many other source
events. The problem becomes more concerning by the fact that
some GStreamer elements rely on signals to work.

This commit allows the user to specify an OpenCV option to
start a main loop, if needed. Since the loop blocks, this is
done in a separate thread.
Set queue size = 1 to Copy island right after the desync.
In this case, Copy won't read more data from a "last_written"
container than required, while feeding the desynchronized path.

Sometimes Copy don't get fused into an island and behaves
on its own -- in this case, it reads more data in advance
so the slow (desync) part actually processes some data in-sync
(more than actually required)
…ig-for-ie-params

Expand ie::Params to support config

* Add config to IE params

* Add test

* Remove comments from tests

* Rename to pluginConfig

* Add one more overloads for pluginConfig

* Add more tests
* G-API: Introduce ONNX backend for Inference

- Basic operations are implemented (Infer, -ROI, -List, -List2);
- Implemented automatic preprocessing for ONNX models;
- Test suite is extended with `OPENCV_GAPI_ONNX_MODEL_PATH` env for test data
  (test data is an ONNX Model Zoo repo snapshot);
- Fixed kernel lookup logic in core G-API:
  - Lookup NN kernels not in the default package, but in the associated
    backend's aux package. Now two NN backends can work in the same graph.
- Added Infer SSD demo and a combined ONNX/IE demo;

* G-API/ONNX: Fix some of CMake issues

Co-authored-by: Pashchenkov, Maxim <[email protected]>
- for better compatibility with Ceres 2.0.0 CMake scripts
The change is needed due to removing default opset namespace for Unsqueeze
in the scope of this refactoring activity: openvinotoolkit/openvino#2767

Signed-off-by: Roman Kazantsev <[email protected]>
- add opencv/modules
- add opencv_contrib/modules
Code snippets need a section marked with ### above to render properly
alalek and others added 29 commits November 23, 2020 19:16
Support for Pool1d layer for OpenCV and OpenCL targets

* Initial version of Pool1d support

* Fix variable naming

* Fix 1d pooling for OpenCL

* Change support logic, remove unnecessary variable, split the tests

* Remove other depricated variables

* Fix warning. Check tests

* Change support check logic

* Change support check logic, 2
…-tests

G-API: Adding skip for GraphMeta tests

* Added skip for GraphMeta tests

* Removed false
…catalyst-xcframework

Support XCFramework builds, Catalyst

* Early work on xcframework support

* Improve legibility

* Somehow this works

* Specify ABIs in a place where they won't get erased

If you pass in the C/CXX flags from the Python script, they won't be respected. By doing it in the actual toolchain, the options are respected and Catalyst successfully links.

* Clean up and push updates

* Actually use Catalyst ABI

Needed to specify EXE linker flags to get compiler tests to link to the Catalyst ABIs.

* Clean up

* Revert changes to common toolchain that don't matter

* Try some things

* Support Catalyst build in OSX scripts

* Remove unnecessary iOS reference to AssetsLibrary framework

* Getting closer

* Try some things, port to Python 3

* Some additional fixes

* Point Cmake Plist gen to osx directory for Catalyst targets

* Remove dynamic lib references for Catalyst, copy iOS instead of macos

* Add flag for building only specified archs, remove iOS catalyst refs

* Add build-xcframework.sh

* Update build-xcframework.sh

* Add presumptive Apple Silicon support

* Add arm64 iphonesimulator target

* Fix xcframework build

* Working on arm64 iOS simulator

* Support 2.7 (replace run with check_output)

* Correctly check output of uname_m against arch

* Clean up

* Use lipo for intermediate frameworks, add python script

Remove unneeded __init__.py

* Simplify python xcframework build script

* Add --only-64-bit flag

* Add --framework-name flag

* Document

* Commit to f-strings, improve console output

* Add i386 to iphonesimulator platform in xcframework generator

* Enable objc for non-Catalyst frameworks

* Fix xcframework builder for paths with spaces

* Use arch when specifying Catalyst build platform in build command

* Fix incorrect settings for framework_name argparse configuration

* Prefer underscores instead of hyphens in new flags

* Move Catalyst flags to where they'll actually get used

* Use --without=objc on Catalyst target for now

* Remove get_or_create_folder and simplify logic

* Remove unused import

* Tighten up help text

* Document

* Move common functions into cv_build_utils

* Improve documentation

* Remove old build script

* Add readme

* Check for required CMake and Xcode versions

* Clean up TODOs and re-enable `copy_samples()`

Remove TODO

Fixup

* Add missing print_function import

* Clarify CMake dependency documentation

* Revert python2 change in gen_objc

* Remove unnecessary builtins imports

* Remove trailing whitespace

* Avoid building Catalyst unless specified

This makes Catalyst support a non-breaking change, though defaults should be specified when a breaking change is possible.

* Prevent lipoing for the same archs on different platforms before build

* Rename build-xcframework.py to build_xcframework.py

* Check for duplicate archs more carefully

* Prevent sample copying error when directory already exists

This can happen when building multiple architectures for the same platform.

* Simplify code for checking for default archs

* Improve build_xcframework.py header text

* Correctly resolve Python script paths

* Parse only known args in ios/osx build_framework.py

* Pass through uncaptured args in build_xcframework to osx/ios build

* Fix typo

* Fix typo

* Fix unparameterized build path for intermediate frameworks

* Fix dyanmic info.plist path for catalyst

* Fix utf-8 Python 3 issue

* Add dynamic flag to osx script

* Rename platform to platforms, remove armv7s and i386

* Fix creation of dynamic framework on maccatalyst and macos

* Update platforms/apple/readme.md

* Add `macos_archs` flag and deprecate `archs` flag

* Allow specification of archs when generating xcframework from terminal

* Change xcframework platform argument names to match archs flag names

* Remove platforms as a concept and shadow archs flags from ios/osx .py

* Improve documentation

* Fix building of objc module on Catalyst, excluding Swift

* Clean up build folder logic a bit

* Fix framework_name flag

* Drop passthrough_args, use unknown_args instead

* minor: coding style changes

Co-authored-by: Chris Ballinger <[email protected]>
- fix opencv#18906
- unable to add related test cases as there is
  no public access to Context:Impl refcounts
xrange was abandoned and doesn't exist in Python 3. range() works just the same
(4.x) build: Xcode 12 support

* build: xcode 12 support, cmake fixes

* ts: eliminate clang 11 warnigns

* 3rdparty: clang 11 warnings

* features2d: eliminate build warnings

* test: warnings

* gapi: warnings from 18928
[G-API] Wrap GArray

* Wrap GArray for output

* Collect in/out info in graph

* Add imgproc tests

* Add cv::Point2f

* Update test_gapi_imgproc.py

* Fix comments to review
* TBB executor for GAPI

 - the sole executor
 - unit tests for it
 - no usage in the GAPI at the momnet

* TBB executor for GAPI

 - introduced new overload of execute to explicitly accept tbb::arena
   argument
 - added more basic tests
 - moved arena creation code into tests
 -

* TBB executor for GAPI

 - fixed compie errors & warnings

* TBB executor for GAPI

 - split all-in-one execute() function into logicaly independant parts

* TBB executor for GAPI

 - used util::variant in in the tile_node

* TBB executor for GAPI

 - moved copy_through_move to separate header
 - rearranged details staff in proper namespaces
 - moved all implementation into detail namespace

* TBB executor for GAPI

 - fixed build error with TBB 4.4.
 - fixed build warnings

* TBB executor for GAPI

 - aligned strings width
 - fixed spaces in expressions
 - fixed english grammar
 - minor improvements

* TBB executor for GAPI

 - added more comments
 - minor improvements

* TBB executor for GAPI

 - changed ITT_ prefix for macroses to GAPI_ITT

* TBB executor for GAPI

 - no more "unused" warning for GAPI_DbgAssert
 - changed local assert macro to man onto GAPI_DbgAssert

* TBB executor for GAPI

 - file renamings
 - changed local assert macro to man onto GAPI_DbgAsse

* TBB executor for GAPI

 - test file renamed
 - add more comments

* TBB executor for GAPI

 - minor clenups and cosmetic changes

* TBB executor for GAPI

 - minor clenups and cosmetic changes

* TBB executor for GAPI

 - changed spaces and curly braces alignment

* TBB executor for GAPI

 - minor cleanups

* TBB executor for GAPI

 - minor cleanups
[G-API]: kmeans() Standard Kernel Implementation

* cv::gapi::kmeans kernel implementation
 - 4 overloads:
    - standard GMat - for any dimensionality
    - GMat without bestLabels initialization
    - GArray<Point2f> - for 2D
    - GArray<Point3f> - for 3D
 - Accuracy tests:
   - for every input - 2 tests
   1) without initializing. In this case, no comparison with cv::kmeans is done as kmeans uses random auto-initialization
   2) with initialization
   - in both cases, only 1 attempt is done as after first attempt kmeans initializes bestLabels randomly

* Addressing comments
 - bestLabels is returned to its original place among parameters
 - checkVector and isPointsVector functions are merged into one, shared between core.hpp & imgproc.hpp by placing it into gmat.hpp (and implementation - to gmat.cpp)
 - typos corrected

* addressing comments
 - unified names in tests
 - const added
 - typos

* Addressing comments
 - fixed the doc note
 - ddepth -> expectedDepth, `< 0 ` -> `== -1`

* Fix unsupported cases of input Mat
 - supported: multiple channels, reversed width
 - added test cases for those
 - added notes in docs
 - refactored checkVector to return dimentionality along with quantity

* Addressing comments
 - makes chackVector smaller and (maybe) clearer

* Addressing comments

* Addressing comments
 - cv::checkVector -> cv::gapi::detail

* Addressing comments
 - Changed checkVector: returns bool, quantity & dimensionality as references

* Addressing comments
 - Polishing checkVector
 - FIXME added

* Addressing discussion
 - checkVector: added overload, separate two different functionalities
 - depth assert - out of the function

* Addressing comments
 - quantity -> amount, dimensionality -> dim
 - Fix typos

* Addressing comments
 - fix docs
 - use 2 variable's definitions instead of one (for all non-trivial variables)
@anton-potapov anton-potapov force-pushed the reuse_move_through_copy branch from 81a9ed9 to 74b6646 Compare November 30, 2020 14:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.