Open
10 changes: 9 additions & 1 deletion setup.py
@@ -8,12 +8,20 @@

EXTENSION = []

CC = ['52', '53', '60', '61', '62', '70', '72', '75', '80']

if os.getenv('USE_OPENMP', '1') == '1':
    EXTRA_COMPILE_ARGS.append('-fopenmp')

if os.getenv('USE_CUDA', '1') == '1':
    EXTRA_COMPILE_ARGS.append('-DUSE_CUDA')

GENERATE_CODES = []

for cc in CC:
    GENERATE_CODES.append('--generate-code')
    GENERATE_CODES.append(f'arch=compute_{cc},code=compute_{cc}')

EXTENSION.append(
    CUDAExtension(
        name='involution',
@@ -27,7 +35,7 @@
        ],
        extra_compile_args={
            'cxx': EXTRA_COMPILE_ARGS,
-           'nvcc': ['-O3'],
+           'nvcc': ['-O3'] + GENERATE_CODES,
        }
    )
)
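
Note on the flags in this diff: with nvcc, `--generate-code arch=compute_XX,code=compute_XX` embeds only PTX for that virtual architecture, which the driver JIT-compiles when the module is loaded, whereas `code=sm_XX` embeds precompiled SASS for the real architecture. The sketch below shows both variants side by side. `build_gencode_flags` and its parameters are hypothetical names used for illustration and are not part of this PR. Whether either variant changes kernel speed for this extension is not established here.

```python
# Hypothetical helper (not part of the PR): builds nvcc --generate-code flags.
def build_gencode_flags(ccs, embed_sass=True, ptx_for_newest=True):
    """Return a flat list of nvcc flags for the given compute capabilities.

    embed_sass=False reproduces the PR's form (PTX only, code=compute_XX).
    embed_sass=True emits code=sm_XX (SASS) and, optionally, one PTX entry
    for the newest capability so that later GPUs can still JIT-compile it.
    """
    flags = []
    for cc in ccs:
        code = f'sm_{cc}' if embed_sass else f'compute_{cc}'
        flags += ['--generate-code', f'arch=compute_{cc},code={code}']
    if embed_sass and ptx_for_newest and ccs:
        newest = max(ccs, key=int)
        flags += ['--generate-code',
                  f'arch=compute_{newest},code=compute_{newest}']
    return flags


CC = ['52', '53', '60', '61', '62', '70', '72', '75', '80']

ptx_only = build_gencode_flags(CC, embed_sass=False)  # same flags as the PR
sass_and_ptx = build_gencode_flags(CC)                # SASS per arch + PTX for 8.0
```

Either list could be appended to `['-O3']` in the `'nvcc'` entry of `extra_compile_args`, in the same way `GENERATE_CODES` is used above.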


@d-li14
I changed the NVCC arguments to generate code for several GPU architectures (compute capabilities 5.2 through 8.0).
Hopefully this will improve the speed.

Owner


@shikishima-TasakiLab
Thanks for your efforts. However, the new code still does not give the expected speedup on my side.
