[CodeCamp #15] Add sigmoid focal loss cpu impl #2536
Conversation
* Add sigmoid focal loss cpu implementation
* Add focal loss unit test on cpu
* Add ops to EN/ZH documents
Since this op can be implemented with native torch, please provide a benchmark between this op and the torch implementation.
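For reference, a plain-PyTorch version of this loss could look like the minimal sketch below. The function name, the class-index target layout, and the mean reduction are assumptions made for illustration, not the op's actual Python binding.

```python
import torch
import torch.nn.functional as F

def sigmoid_focal_loss_torch(logits, targets, gamma=2.0, alpha=0.25):
    """Plain-PyTorch sigmoid focal loss used only as a reference.

    logits:  float tensor of shape (N, C)
    targets: long tensor of shape (N,) holding class indices in [0, C)
    """
    # Expand class indices to a one-hot matrix so the loss is computed
    # per (sample, class) pair.
    one_hot = F.one_hot(targets, num_classes=logits.size(1)).to(logits.dtype)
    prob = logits.sigmoid()
    pt = one_hot * prob + (1 - one_hot) * (1 - prob)           # p_t
    alpha_t = one_hot * alpha + (1 - one_hot) * (1 - alpha)    # alpha_t
    ce = F.binary_cross_entropy_with_logits(logits, one_hot, reduction='none')
    return (alpha_t * (1 - pt) ** gamma * ce).mean()
```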
Thanks for the reviewer's opinion. I have some questions about the benchmark. Could you please help me figure them out?
Thanks again ^_^
According to the reviewer's opinion:
1. Replace expf with exp.
2. Directly bind the KernelLauncher function to the impl function.
Data cases: batch: [2, 4, 8, ..., 8096]; num_classes: [2, 4, 8, ..., 4096]; beta
Conclusions:
1. For small data sizes, the implemented op is superior.
2. For medium and large data sizes, the implemented op is superior.
Note: the implemented op is compared to torchvision.ops.sigmoid_focal_loss, and the latter differs in its 'targets' argument. The benchmark uses different data for each to work around this.
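To illustrate the 'targets' difference mentioned above: the custom op takes class-index targets, while torchvision.ops.sigmoid_focal_loss takes float targets with the same shape as the logits. A rough sketch of the conversion (shapes and values are made up for illustration):

```python
import torch
from torchvision.ops import sigmoid_focal_loss as tv_sigmoid_focal_loss

N, C = 32, 80
logits = torch.randn(N, C)

# Class-index targets of shape (N,), as used by the custom op.
class_idx = torch.randint(0, C, (N,))

# torchvision expects float targets with the same shape as the logits,
# so the class indices have to be expanded to a one-hot matrix first.
one_hot = torch.zeros(N, C).scatter_(1, class_idx[:, None], 1.0)
loss = tv_sigmoid_focal_loss(logits, one_hot, alpha=0.25, gamma=2.0, reduction='mean')
```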
I have used pytest-benchmark to benchmark the implemented sigmoid focal loss op. The benchmark code is in the latest commit. On my 8-core notebook, the conclusions are:
1. For small data sizes, the implemented op is superior.
2. For medium and large data sizes, the torch implementation is superior.
The processed benchmark data can be seen in benchmark_round10.csv and the raw benchmark data is in …
The next step is to optimize this op for medium and large data sizes; the torchvision implementation may be a good starting point. Any suggestions are welcome ^_^
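A sketch of how such a comparison could be set up with pytest-benchmark is shown below. The parameter grid is illustrative rather than the one used in this PR, and the positional signature assumed for mmcv.ops.sigmoid_focal_loss (input, target, gamma, alpha, weight, reduction) is an assumption based on the existing CUDA op.

```python
import pytest
import torch
from torchvision.ops import sigmoid_focal_loss as tv_sigmoid_focal_loss
from mmcv.ops import sigmoid_focal_loss as mmcv_sigmoid_focal_loss


@pytest.mark.parametrize('batch', [2, 64, 1024])
@pytest.mark.parametrize('num_classes', [2, 64, 1024])
def test_benchmark_torchvision(benchmark, batch, num_classes):
    logits = torch.randn(batch, num_classes)
    # torchvision wants float targets with the same shape as the logits.
    targets = torch.randint(0, 2, (batch, num_classes)).float()
    benchmark(tv_sigmoid_focal_loss, logits, targets, 0.25, 2.0, 'mean')


@pytest.mark.parametrize('batch', [2, 64, 1024])
@pytest.mark.parametrize('num_classes', [2, 64, 1024])
def test_benchmark_custom_op(benchmark, batch, num_classes):
    logits = torch.randn(batch, num_classes)
    # The custom op takes long class-index targets of shape (N,).
    targets = torch.randint(0, num_classes, (batch,))
    benchmark(mmcv_sigmoid_focal_loss, logits, targets, 2.0, 0.25, None, 'mean')
```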
Cool.
Motivation
This PR adds a sigmoid focal loss CPU implementation on the basis of the existing CUDA implementation.
Modification
Add sigmoid focal loss cpu implementation (see the usage sketch after this list)
Add focal loss unit test on cpu
Add ops to EN/ZH documents
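As referenced above, a possible usage sketch of the CPU path, assuming the new implementation is dispatched through the same mmcv.ops.sigmoid_focal_loss entry point as the CUDA op (the positional argument order is an assumption):

```python
import torch
from mmcv.ops import sigmoid_focal_loss

# Illustrative shapes: (N, C) float logits and (N,) long class indices, all on CPU.
logits = torch.randn(8, 4, requires_grad=True)
targets = torch.randint(0, 4, (8,))

# With the CPU implementation, no .cuda() call should be needed here.
loss = sigmoid_focal_loss(logits, targets, 2.0, 0.25, None, 'mean')
loss.backward()
```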
Checklist
Before PR:
After PR: