cmd/compile: intrinsify bits.RotateLeft32 on mipsle #39139

Open
golang/crypto
#294
@assadobaid

Description

What version of Go are you using (go version)?

Build env:
go1.14.3 linux/amd64

Runtime:
GOOS=linux
GOARCH=mipsle 
GOMIPS=softfloat

Does this issue reproduce with the latest release?

Yes

TLS 1.3 performance has dropped significantly with Go 1.14.x, including with the latest x/crypto master branch.

What did you do?

Our application uses TLS 1.3 to stream real-time video data. When we upgraded Go from 1.13 to 1.14.3, CPU usage and latency both increased.
Running the same test on Go 1.13 and 1.14.3, we see that ChaCha20-Poly1305 takes almost twice as long in 1.14 as in 1.13.x.
The problem appears in 1.14 with both the released version of x/crypto and the latest x/crypto master.
We also tried TLS 1.2 and still see the issue.

What did you expect to see?

Same performance across versions.

What did you see instead?

In our 4-minute test, the total time spent in crypto increased from 54 seconds to 96 seconds.

Go 1.13

Link to pprof svg graph

  flat  flat%   sum%        cum   cum%
23.20s 19.59% 19.59%     26.67s 22.52%  golang.org/x/crypto/poly1305.updateGeneric
17.25s 14.57% 34.16%     19.12s 16.15%  syscall.Syscall
16.23s 13.71% 47.87%     16.23s 13.71%  golang.org/x/crypto/internal/chacha20.quarterRound
14.32s 12.09% 59.96%     14.32s 12.09%  runtime.usleep
 5.96s  5.03% 64.99%      5.96s  5.03%  runtime.futex
 5.14s  4.34% 69.34%      5.14s  4.34%  runtime.memmove
 4.09s  3.45% 72.79%      4.09s  3.45%  runtime._LostSIGPROFDuringAtomic64
 3.54s  2.99% 75.78%      3.54s  2.99%  encoding/binary.littleEndian.Uint32
 3.28s  2.77% 78.55%      3.28s  2.77%  golang.org/x/crypto/internal/chacha20.xor
 2.90s  2.45% 81.00%     22.61s 19.09%  golang.org/x/crypto/internal/chacha20.(*Cipher).XORKeyStream
 2.32s  1.96% 82.96%      2.32s  1.96%  runtime.nanotime
 1.59s  1.34% 84.30%      1.59s  1.34%  runtime.epollwait
 1.12s  0.95% 85.25%     20.89s 17.64%  runtime.sysmon
 0.95s   0.8% 86.05%      1.88s  1.59%  runtime.retake
 0.71s   0.6% 86.65%      0.71s   0.6%  runtime.memclrNoHeapPointers
 0.62s  0.52% 87.17%      0.65s  0.55%  runtime.lock
 0.52s  0.44% 87.61%      0.78s  0.66%  runtime.unlock
 0.37s  0.31% 87.92%      1.10s  0.93%  runtime.mallocgc
 0.34s  0.29% 88.21%      6.26s  5.29%  runtime.schedule
 0.33s  0.28% 88.49%      5.54s  4.68%  runtime.findrunnable
 0.27s  0.23% 88.72%     13.44s 11.35%  crypto/tls.(*Conn).write
 0.26s  0.22% 88.94%     57.90s 48.90%  crypto/tls.(*halfConn).encrypt
 0.23s  0.19% 89.13%      1.06s   0.9%  runtime.reentersyscall
 0.20s  0.17% 89.30%      0.90s  0.76%  crypto/tls.(*Conn).SetWriteDeadline
 0.20s  0.17% 89.47%     71.67s 60.53%  crypto/tls.(*Conn).writeRecordLocked
 0.18s  0.15% 89.62%     54.14s 45.72%  golang.org/x/crypto/chacha20poly1305.(*chacha20poly1305).sealGeneric
 0.18s  0.15% 89.77%         1s  0.84%  runtime.exitsyscall
 0.16s  0.14% 89.91%      1.76s  1.49%  runtime.netpoll
 0.13s  0.11% 90.02%     26.80s 22.63%  golang.org/x/crypto/poly1305.(*macGeneric).Write
 0.13s  0.11% 90.13%     13.02s 11.00%  internal/poll.(*FD).Write

Go 1.14.3 and x/crypto master

Link to pprof svg graph

  flat  flat%   sum%        cum   cum%
23.89s 12.98% 12.98%     23.89s 12.98%  runtime.usleep
19.10s 10.38% 23.35%     49.38s 26.83%  golang.org/x/crypto/chacha20.(*Cipher).xorKeyStreamBlocksGeneric
16.10s  8.75% 32.10%     17.50s  9.51%  syscall.Syscall
16.06s  8.72% 40.82%     25.44s 13.82%  golang.org/x/crypto/chacha20.quarterRound
12.19s  6.62% 47.45%     12.19s  6.62%  runtime.futex
11.50s  6.25% 53.69%     11.83s  6.43%  math/bits.Mul64
10.79s  5.86% 59.56%     45.49s 24.71%  golang.org/x/crypto/poly1305.updateGeneric
 8.39s  4.56% 64.11%      8.70s  4.73%  math/bits.RotateLeft32
 7.13s  3.87% 67.99%      7.30s  3.97%  math/bits.Add64
 4.30s  2.34% 70.32%      4.30s  2.34%  runtime._LostSIGPROFDuringAtomic64
 4.29s  2.33% 72.65%      4.37s  2.37%  encoding/binary.littleEndian.Uint64
 4.23s  2.30% 74.95%     20.18s 10.96%  golang.org/x/crypto/poly1305.mul64
 3.97s  2.16% 77.11%      4.12s  2.24%  golang.org/x/crypto/chacha20.addXor
 3.86s  2.10% 79.20%     15.76s  8.56%  golang.org/x/crypto/poly1305.bitsMul64
 3.81s  2.07% 81.27%      3.81s  2.07%  runtime.nanotime1
 3.34s  1.81% 83.09%      3.34s  1.81%  runtime.epollwait
 3.18s  1.73% 84.82%      3.18s  1.73%  runtime.memmove
 3.01s  1.64% 86.45%      3.02s  1.64%  runtime.asyncPreempt
 2.19s  1.19% 87.64%      3.12s  1.69%  runtime.timeSleepUntil
 1.81s  0.98% 88.62%     36.83s 20.01%  runtime.sysmon
 1.76s  0.96% 89.58%      4.62s  2.51%  golang.org/x/crypto/poly1305.add128
 1.47s   0.8% 90.38%      3.10s  1.68%  runtime.retake
 0.94s  0.51% 90.89%      1.04s  0.56%  runtime.lock
 0.63s  0.34% 91.23%     13.08s  7.11%  runtime.findrunnable
 0.46s  0.25% 91.48%      7.77s  4.22%  golang.org/x/crypto/poly1305.bitsAdd64 (partial-inline)
 0.42s  0.23% 91.71%    100.46s 54.57%  crypto/tls.(*halfConn).encrypt
 0.41s  0.22% 91.93%      3.94s  2.14%  runtime.netpoll
 0.31s  0.17% 92.10%     12.78s  6.94%  crypto/tls.(*Conn).write
 0.31s  0.17% 92.27%     49.89s 27.10%  golang.org/x/crypto/chacha20.(*Cipher).XORKeyStream
 0.26s  0.14% 92.41%     45.86s 24.91%  golang.org/x/crypto/poly1305.(*macGeneric).Write
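
A minimal sketch for isolating the rotate cost outside of TLS, assuming part of the regression comes from math/bits.RotateLeft32 not being intrinsified on mipsle (as the issue title and its 8.39s flat time in the 1.14.3 profile suggest). It compares a ChaCha20 quarter round written with bits.RotateLeft32 against the same round written with explicit shifts. The names quarterRoundBits, quarterRoundShift, and bench are hypothetical and are not the x/crypto implementation.

```go
package main

import (
	"fmt"
	"math/bits"
	"testing"
)

// quarterRoundBits is a ChaCha20 quarter round that rotates with
// bits.RotateLeft32, the function visible in the Go 1.14.3 profile.
func quarterRoundBits(a, b, c, d uint32) (uint32, uint32, uint32, uint32) {
	a += b
	d ^= a
	d = bits.RotateLeft32(d, 16)
	c += d
	b ^= c
	b = bits.RotateLeft32(b, 12)
	a += b
	d ^= a
	d = bits.RotateLeft32(d, 8)
	c += d
	b ^= c
	b = bits.RotateLeft32(b, 7)
	return a, b, c, d
}

// quarterRoundShift is the same quarter round with each rotation
// spelled out as a pair of shifts.
func quarterRoundShift(a, b, c, d uint32) (uint32, uint32, uint32, uint32) {
	a += b
	d ^= a
	d = d<<16 | d>>16
	c += d
	b ^= c
	b = b<<12 | b>>20
	a += b
	d ^= a
	d = d<<8 | d>>24
	c += d
	b ^= c
	b = b<<7 | b>>25
	return a, b, c, d
}

// sink keeps the compiler from discarding the benchmark loop.
var sink uint32

// bench times repeated application of one quarter-round variant.
func bench(qr func(a, b, c, d uint32) (uint32, uint32, uint32, uint32)) testing.BenchmarkResult {
	return testing.Benchmark(func(tb *testing.B) {
		x0, x1, x2, x3 := uint32(0x11111111), uint32(0x01020304), uint32(0x9b8d6f43), uint32(0x01234567)
		for i := 0; i < tb.N; i++ {
			x0, x1, x2, x3 = qr(x0, x1, x2, x3)
		}
		sink = x0 ^ x1 ^ x2 ^ x3
	})
}

func main() {
	fmt.Println("bits.RotateLeft32:", bench(quarterRoundBits))
	fmt.Println("explicit shifts:  ", bench(quarterRoundShift))
}
```

Building this with GOOS=linux GOARCH=mipsle GOMIPS=softfloat and running it on the target under both Go versions should show whether the two variants diverge there. On amd64 both are expected to perform about the same, so results from the build host would not be representative.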

Labels: NeedsInvestigation, Performance, compiler/runtime