go/printer: Comment-LineBreak-LineBreak-SelectorExpr-Comment AST modification issue #70978

@TACIXAT

Description

Go version

go version go1.23.4 windows/amd64

Output of go env in your module/workspace:

set GO111MODULE=
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\user\AppData\Local\go-build
set GOENV=C:\Users\user\AppData\Roaming\go\env
set GOEXE=.exe
set GOEXPERIMENT=
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOINSECURE=
set GOMODCACHE=C:\Users\user\go\pkg\mod
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=C:\Users\user\go
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=C:\Program Files\Go
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLCHAIN=auto
set GOTOOLDIR=C:\Program Files\Go\pkg\tool\windows_amd64
set GOVCS=
set GOVERSION=go1.23.4
set GODEBUG=
set GOTELEMETRY=local
set GOTELEMETRYDIR=C:\Users\user\AppData\Roaming\go\telemetry
set GCCGO=gccgo
set GOAMD64=v1
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=0
set GOMOD=NUL
set GOWORK=
set CGO_CFLAGS=-O2 -g
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-O2 -g
set CGO_FFLAGS=-O2 -g
set CGO_LDFLAGS=-O2 -g
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -fno-caret-diagnostics -Qunused-arguments -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=C:\Users\user\AppData\Local\Temp\go-build3559419192=/tmp/go-build -gno-record-gcc-switches

What did you do?

While programmatically refactoring one of our codebases to rename a poorly named package, we ran into this issue. The bug is triggered when you modify the X component of a SelectorExpr so that it extends past the position where a following comment starts, and the selector expression is preceded by a comment followed by a blank line (comment + double newline).

package main

import (
	"bytes"
	"go/ast"
	"go/parser"
	"go/printer"
	"go/token"
	"log"
)

func init() {
	log.SetFlags(log.LstdFlags | log.Lshortfile)
}

var errCase string = `
package main

import (
	"log"
)

func main() {
	// Comment with line break after it

	log.Println() // Comment
}
`

func main() {
	fset := token.NewFileSet()
	root, err := parser.ParseFile(fset, "error_case.go", errCase,
		parser.SkipObjectResolution|parser.ParseComments)
	if err != nil {
		log.Fatal(err)
	}

	// Rename the X identifier of every selector expression to a longer name.
	ast.Inspect(root, func(n ast.Node) bool {
		if se, ok := n.(*ast.SelectorExpr); ok {
			if ident, ok := se.X.(*ast.Ident); ok {
				ident.Name = "123456789012345"
			}
		}

		return true
	})

	ast.Print(fset, root)

	var buf bytes.Buffer
	if err = printer.Fprint(&buf, fset, root); err != nil {
		log.Fatal(err)
	}

	log.Println(buf.String())
}

What did you see happen?

If a selector expression is preceded by a comment + double newline (the second newline is required) and followed by a comment, and you modify selectorExpr.X so that it extends past the position where the following comment starts, go/printer will place the following comment in the middle of the selector expression.

This occurs due to this block in go/printer - https://github.com/golang/go/blob/master/src/go/printer/printer.go#L1018

package main

import (
        "log"
)

func main() {
        // Comment with line break after it

        123456789012345 // Comment
        .Println()
}

There is a field p.impliedSemi (p = printer) that indicates that a newline would imply a semicolon. The print function also has a local variable impliedSemi, and p.impliedSemi is set to its value at the end of the function.

Before a token is printed, we backtrack and flush is called. flush calls intersperseComments to print any comments preceding that token, that is, comments whose commentOffset < next.Offset, provided p.impliedSemi is false (ignoring other parts of the conditional for the purpose of this bug report).

When the IDENT (se.X) token is encountered, we set the local variable impliedSemi to true. The function then calls flush, which prints the preceding comment. Next, the linked block prints any newlines (up to two) that follow that comment, before our token. This block then overrides the local impliedSemi and sets it to false. The function finishes and p.impliedSemi is set to false. Now the conditions for our bug are set.

Aside: I am not 100% sure why this block sets impliedSemi to false. It has printed newlines, but those newlines come before the token (IDENT) that set impliedSemi to true, and we have not yet updated p.impliedSemi.

So, on the next token (the .) we enter flush once again, which enters intersperseComments. Now the comment following the selector expression satisfies the conditional: its start comes before the next token (.) and p.impliedSemi is false. So the comment is printed before the rest of the selector expression.
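To make the sequence above easier to follow, here is a toy model of it. This is only a sketch and not the go/printer code: every type and name in it (toyPrinter, toyToken, toyComment, resetOnGap, simulate) is made up for illustration, the offsets are arbitrary, and it assumes that tokens carrying a source position resync the position estimate while the "." carries none.

```go
package main

import (
	"fmt"
	"strings"
)

// All types below are hypothetical and exist only for this sketch.

type toyComment struct {
	text       string
	offset     int  // start offset in the original source
	blankAfter bool // a blank line follows the comment in the source
}

type toyToken struct {
	text      string
	src       int  // source offset, or -1 if the token carries none
	semiAfter bool // a newline right after this token would imply a semicolon
}

type toyPrinter struct {
	impliedSemi bool // state carried over from the previous token
	pos         int  // estimated position in the source
	comments    []toyComment
	resetOnGap  bool // model the reset of impliedSemi described in the report
	out         strings.Builder
}

func (p *toyPrinter) print(tok toyToken) {
	if tok.src >= 0 {
		p.pos = tok.src // assumption: positioned tokens resync the estimate
	}
	impliedSemi := tok.semiAfter // local value, committed at the end

	// Stand-in for flush: emit comments that start before the current
	// position, but only while no semicolon is implied.
	gap := false
	for len(p.comments) > 0 && p.comments[0].offset < p.pos && !p.impliedSemi {
		c := p.comments[0]
		p.comments = p.comments[1:]
		if s := p.out.String(); s != "" && !strings.HasSuffix(s, "\n") {
			p.out.WriteString(" ")
		}
		p.out.WriteString(c.text + "\n")
		gap = c.blankAfter
	}

	// Stand-in for the linked block: keep the blank line from the source
	// and, if resetOnGap is set, clear the local flag.
	if !p.impliedSemi && gap {
		p.out.WriteString("\n")
		if p.resetOnGap {
			impliedSemi = false
		}
	}

	p.out.WriteString(tok.text)
	p.pos += len(tok.text)
	p.impliedSemi = impliedSemi
}

// simulate prints `<ident>.Println()` with one comment before it and one after.
func simulate(ident string, resetOnGap bool) string {
	const start = 100 // arbitrary source offset of the original `log`
	p := &toyPrinter{
		pos:        start,
		resetOnGap: resetOnGap,
		comments: []toyComment{
			{"// Comment with line break after it", start - 40, true},
			{"// Comment", start + len("log.Println() "), false},
		},
	}
	tokens := []toyToken{
		{ident, start, true},
		{".", -1, false},
		{"Println", start + 4, true},
		{"(", start + 11, false},
		{")", start + 12, true},
	}
	for _, t := range tokens {
		p.print(t)
	}
	for _, c := range p.comments { // trailing comments at end of statement
		p.out.WriteString(" " + c.text)
	}
	return p.out.String()
}

func main() {
	fmt.Printf("%q\n", simulate("123456789012345", true))  // comment lands before the "."
	fmt.Printf("%q\n", simulate("123456789012345", false)) // selector stays intact
	fmt.Printf("%q\n", simulate("log", true))              // short name: comment not yet reached
}
```

In this model the split only appears when both conditions hold: the flag is cleared after the blank line, and the renamed identifier pushes the estimated position past the start of the trailing comment.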

What did you expect to see?

package main

import (
        "log"
)

func main() {
        // Comment with line break after it

        123456789012345.Println() // Comment
}

Outside the very specific case of a comment + double newline preceding the selector expression, this works as intended. If the selector expression is preceded by a statement instead, or there is no blank line after the comment, the selector expression does not get split up.

Depending on the intention behind setting impliedSemi in the linked block, possible solutions include:

  • Do not set impliedSemi in this block
  • Do not set impliedSemi in this block if the newlines are the result of backtracking for comments

Thank you!

Metadata

Labels: FixPending, NeedsInvestigation
