Commit d4e5151
Inductor cpp wrapper: add -ffast-math in linking flag (pytorch#104332)
Fix cpp wrapper CPU performance gap on `swsl_resnext101_32x16d` compared with the default python wrapper.
The pre-trained weights of `swsl_resnext101_32x16d` contain denormal numbers (values very close to 0.0).
Linking with `-ffast-math` makes the CPU flush denormals to zero, which avoids the slow denormal arithmetic behind the gap.
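To make "denormal" concrete, here is a minimal sketch that counts values that become denormal when stored as float32. It uses only the Python standard library; the helper names and the sample weights are hypothetical and are not taken from the model or from Inductor.

```python
import struct

# Smallest positive normal float32 value (2**-126). A nonzero value with a
# smaller magnitude is a denormal (subnormal) in single precision.
FLOAT32_MIN_NORMAL = 2.0 ** -126


def to_float32(value: float) -> float:
    """Round a Python float to the nearest float32 and return it as a float."""
    return struct.unpack("f", struct.pack("f", value))[0]


def is_float32_denormal(value: float) -> bool:
    """Return True if value, once stored as float32, is a nonzero denormal."""
    magnitude = abs(to_float32(value))
    return 0.0 < magnitude < FLOAT32_MIN_NORMAL


def count_denormals(weights) -> int:
    """Count how many entries of an iterable of floats are float32 denormals."""
    return sum(is_float32_denormal(w) for w in weights)


# Hypothetical sample values, for illustration only.
sample_weights = [0.25, -1.5e-3, 1e-40, -3e-42, 0.0]
denormal_count = count_denormals(sample_weights)
```

In this sample only `1e-40` and `-3e-42` fall below the float32 normal range, so they are the values a flush-to-zero mode would treat as 0.0.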
With the default python wrapper, compilation and linking happen in a single command, so `-ffast-math` takes effect in both steps.
The cpp wrapper uses cpp_extension, which compiles and links in two separate stages, so `-ffast-math` must be added explicitly as a linking flag.
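The difference between the two build paths can be sketched by constructing the command lines only. This is an illustration under assumptions: the helper names, file names, and exact flag order are hypothetical and do not reproduce what Inductor or cpp_extension actually emit.

```python
def single_command_build(source, output, flags):
    """One g++ invocation compiles and links, so every flag reaches both steps."""
    return [["g++", *flags, "-shared", "-fPIC", source, "-o", output]]


def two_stage_build(source, output, compile_flags, link_flags):
    """Separate compile and link invocations; each sees only its own flags."""
    obj = source.rsplit(".", 1)[0] + ".o"
    compile_cmd = ["g++", *compile_flags, "-fPIC", "-c", source, "-o", obj]
    link_cmd = ["g++", "-shared", obj, "-o", output, *link_flags]
    return [compile_cmd, link_cmd]


# Hypothetical file names, for illustration only.
flags = ["-O3", "-ffast-math"]

# Python-wrapper style: the flag is present in the only command.
one_step = single_command_build("kernel.cpp", "kernel.so", flags)

# Two stages without the fix: the link command never sees -ffast-math.
before_fix = two_stage_build("kernel.cpp", "kernel.so", flags, [])

# Two stages with the fix: the flag is repeated for the linker.
after_fix = two_stage_build("kernel.cpp", "kernel.so", flags, ["-ffast-math"])
```

In `torch.utils.cpp_extension`, `load` and `load_inline` take compile and link flags through the separate `extra_cflags` and `extra_ldflags` parameters, which is why a flag given only for compilation does not reach the link step.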
Single-thread, single-batch results on ICX:
| Model | time (s) default python wrapper | time (s) cpp wrapper before fix | time (s) cpp wrapper after fix |
| -- | -- | -- | -- |
| swsl_resnext101_32x16d | 0.459097836 | 13.82326214 | 0.448116195 |
Pull Request resolved: pytorch#104332
Approved by: https://github.com/jgong5, https://github.com/desertfire, https://github.com/EikanWang
1 parent 732067e commit d4e5151
1 file changed: 6 additions and 1 deletion (original line 928 replaced by new lines 928-933).