fix(pm): print registry auto-select hint on stderr #2941

Draft
elrrrrrrr wants to merge 1 commit into next from chore/registry-tip-timing

@elrrrrrrr

Summary

select_fastest_registry at crates/pm/src/util/registry.rs:43,
the auto-pick branch of init_registry
(crates/pm/src/util/user_config.rs:33-58) that fires when none of
the CLI --registry flag, the $UTOO_REGISTRY environment
variable, and the merged global+local ~/.utoo/config.toml and
.utoo.toml config file (assembled in
crates/pm/src/util/config_file.rs:34-64) supplies a registry URL,
printed its two informational lines

Registry: <url> (<latency>ms)
Tip: ut config set registry <url> --global

via println!, which placed them on fd 1. Anything that
consumed the command's stdout therefore picked the banner up as
data. The visible regression that motivated this PR was the
executor passthrough round-trip on macOS,

pbpaste | utx prettier --parser babel | pbcopy

where utx's contract — mirroring the contract of npx,
pnpm dlx, yarn dlx, and corepack, which are all
byte-transparent wrappers over their spawned tool's stdout — is
that the formatted source bytes flow end-to-end from the clipboard
through prettier and back into the clipboard. With the wrapper's
own auto-pick banner on fd 1, the "Registry: ..." and "Tip: ..."
lines were written into the pipe feeding pbcopy alongside
prettier's output (prettier's own stdin comes from pbpaste and
was never affected), so the round-trip left the banner in the
clipboard as well as the formatted code. The same shape of
failure applies to
output=$(ut <cmd>) shell-substitution capture,
ut <cmd> > file.txt file redirection, and any
ut <cmd> --json | jq machine-readable-output pipeline.

This PR moves the two macros from println! to eprintln!, putting
the diagnostic pair on fd 2. The hint remains fully visible to
every human-facing observer of the process:

  • Interactive terminals: when nothing intercepts a process's
    streams, both fd 1 and fd 2 are wired to the controlling TTY. A
    bare ut install in a terminal therefore shows the two lines on
    screen exactly as before the change — the screen position and
    the ANSI colouring (the existing colored::Colorize .dimmed()
    / .cyan() / .green() / .yellow() decorations in
    crates/pm/src/util/registry.rs:83-90) are unchanged.
  • CI build logs: GitHub Actions, GitLab CI, Jenkins, Buildkite,
    and CircleCI runners all capture both fd 1 and fd 2 of the
    spawned process into the per-step log artifact, generally
    interleaved by wall-clock emission time so the line ordering is
    preserved. The record of which registry was auto-picked, what
    the ping latency was, and the global-config command to make the
    choice sticky is preserved in the CI artifact across this
    change.
  • tail -f / journalctl / systemd-journal: any wrapper that
    records the two streams separately but keeps both visible
    still shows the banner; where the collector labels records
    by stream, the banner carries the stderr label instead of
    the stdout one.

And the hint stops appearing in the contexts where it has no
business appearing in the first place. POSIX shell's | pipeline
operator only rewires the producer's fd 1 to the consumer's fd 0
(the consumer's fd 1 and fd 2 are inherited from the surrounding
shell, and the producer's fd 2 is also inherited unchanged); the
> redirection operator only addresses fd 1, while 2> addresses
fd 2; the $(...) and backtick command-substitution forms only
capture fd 1, with fd 2 going to the surrounding shell's stderr
unaffected; Node's child_process.exec(cmd, callback) API splits
fd 1 and fd 2 of the spawned child into separately-named stdout
and stderr fields of the callback's result so the caller picks
which one is "the data"; Rust's std::process::Command with its
Stdio::piped() setup behaves identically; Python's subprocess
module's subprocess.run(cmd, capture_output=True).stdout is the
fd-1-only field, with .stderr the fd-2-only field, etc. All of
those see a clean fd-1 stream once the banner is moved to fd 2.
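
The capture behaviour described above can be sketched with std::process::Command. This is a minimal illustration, not code from the PR: the sh -c snippet only stands in for a ut invocation, and the capture helper and the example.invalid URL are hypothetical.

```rust
use std::io;
use std::process::{Command, Output, Stdio};

/// Runs a shell snippet and captures fd 1 and fd 2 into separate buffers.
/// The snippet is a stand-in for a `ut <cmd>` invocation (assumption).
fn capture(script: &str) -> io::Result<Output> {
    Command::new("sh")
        .arg("-c")
        .arg(script)
        .stdin(Stdio::null())
        // Each stream gets its own pipe, so the caller decides which one is "the data".
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .output()
}
```

Calling capture with a script that echoes a banner to fd 2 and a payload to fd 1 returns an Output whose stdout field holds only the payload, which is the property pipe and substitution consumers rely on.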

This is the convention long-lived Unix command-line tools
follow for their status and diagnostic output. The canonical
citations a reviewer might consult:

  • cargo prints Compiling foo v0.1.0 (...), Compiling bar v2.3.4 (https://github.com/ex/bar), Finished release [optimized] target(s) in 12.34s, Running unittests, and the
    Updating crates.io index line to stderr. The reason
    cargo metadata --format-version 1 | jq '.packages | length'
    works without any flag is that the JSON metadata goes on stdout
    while every status line goes on stderr.
  • git prints Cloning into 'foo'..., remote: Enumerating objects: 12, done., Receiving objects: 100% (12/12),
    and Resolving deltas: 100% (3/3) to
    stderr. The reason sha=$(git rev-parse HEAD) produces a
    $sha containing only the 40-character commit hash, and the
    reason git log --format='%H %s' --reverse main..feature > log.txt
    produces a log.txt containing only commit-hash + subject
    lines, is that progress chatter is on stderr
    while the actual queried data goes on stdout.
  • curl prints its progress meter (%, transferred bytes,
    ETA, transfer rate) to stderr. The reason
    curl -L https://api.github.com/repos/utooland/utoo | jq .name
    works out of the box, without the --silent flag, is that the
    HTTP response body is on stdout and the meter is on stderr. The
    -s / --silent flag's name itself is the giveaway: it
    silences the stderr-side chatter that's already on stderr, it
    doesn't redirect anything.
  • wget writes its download progress and the
    Saving to: 'foo.tar.gz' line to stderr; with -O -, the file
    contents go to stdout.
  • make writes its own error diagnostics, such as
    make: *** No rule to make target 'foo'.  Stop., to stderr.
    GNU make's make[1]: Entering directory '/foo/bar' and
    make[1]: Leaving directory '/foo/bar' lines are an exception:
    they go to stdout, and a script capturing make's stdout can
    suppress them with --no-print-directory.
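  • apt writes its
    WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
    notice to stderr. Its Reading package lists... Done,
    Building dependency tree... Done, and per-package
    Setting up libfoo (1.2.3-4) ... progress lines go to stdout,
    so apt / apt-get are only a partial example of the split.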
  • openssh writes
    Warning: Permanently added 'github.com' (ED25519) to the list of known hosts. to stderr — the line is famously stderr-bound
    for a specific operational reason: scp, rsync over ssh,
    and bare-ssh-with-remote-command-and-local-stdout-capture
    (ssh remote 'cat /etc/issue' > local.txt) all rely on the
    one-time warning not landing in the data sink.
  • npm writes its progress indicator, the
    deprecated some-package@1.2.3: this package has been renamed to ... warnings, and the npm WARN ... advisory lines to
    stderr. The machine-readable counterparts,
    npm view foo version, npm config get registry,
    npm pack --json, pnpm list --json, and yarn config get registry, go on stdout. The asymmetry is the same as in cargo
    and git: data that a script is meant to consume goes on stdout,
    warnings and progress go on stderr.
  • kubectl writes Warning: this is a deprecated API and
    the various error: ... lines to stderr, while
    kubectl get pod foo -o yaml and
    kubectl get pod foo -o jsonpath='{.metadata.uid}' go on
    stdout. The kubernetes-client convention's split is documented
    explicitly in the kubectl source's cmd_util package.
  • docker build, when it runs through BuildKit, writes its
    build progress to stderr, while
    docker inspect foo --format '{{.Id}}' prints the ID on
    stdout.
  • rsync writes its rsync: ... and rsync error: ...
    diagnostics to stderr. Its file-by-file transfer log and the final
    sent 12,345 bytes received 678 bytes 9.8K bytes/sec summary
    go to stdout by default, which is where scripts combining
    rsync --list-only or a custom --out-format with grep read them.
  • tar writes its --verbose per-file listing to stderr
    when the archive content itself is going to stdout (-c -f -),
    and to stdout when the archive content is going to a file
    (-c -f foo.tar), which is the dynamic-target behaviour the
    GNU tar info-page section "Verbose output" describes.

Each of those tools made the same call for the same operational
reason: stdout is the data channel a downstream |-consumer or
>-file or $()-substitution gets, and stderr is the
out-of-band channel for everything the human running the command
(or the log aggregator reading the journal) needs to see but the
machine doesn't. The registry-auto-select banner — "we noticed
you have no registry configured anywhere, we measured the two
well-known npm mirror endpoints, here is the one we picked and
the latency, and by the way the standard way to make the choice
durable is ut config set registry <url> --global" — is a
textbook example of the stderr side of that line. It's chatter
the tool emits on its own initiative, addressed to the operator,
about the environment the tool found itself running in. It is
not the answer to any user-issued query the way
ut config get registry is, and it is not the data output of
any user-requested subcommand the way ut install's lockfile
write or utx prettier's wrapped-tool stdout is.

Intentionally not changed

The sibling println! in crates/pm/src/service/config.rs:53
prints the current registry value as part of the ut config
listing command's output — the table-of-current-settings the
user gets back when they ask the tool what its configuration
looks like. That output is the answer to a user query, and the
supported consumption pattern for "what registry is configured?"
in a shell script is

registry="$(ut config get registry)"

i.e., command-substitution on the tool's stdout. It belongs on
fd 1 by exactly the same Unix convention that places this PR's
auto-pick banner on fd 2: stdout is "the value the user asked
for," stderr is "the chatter the tool emitted alongside the
value." Same logic as the git rev-parse HEAD / git clone
asymmetry — the hash you asked for goes to stdout, the clone's
"Receiving objects:" progress that no one asked for goes to
stderr — and the same logic as the cargo metadata /
cargo build asymmetry: the JSON metadata you asked for goes to
stdout, the "Compiling foo v0.1.0" status of the build no one
asked for goes to stderr. So the service/config.rs:53 line
stays on println! deliberately and is the explicit
scope-boundary of this PR.

Follow-up, deliberately out of scope here

There is a deeper fix orthogonal to "which fd does the banner go
on" — the frequency with which the auto-pick banner appears.
The auto-pick path currently fires on every utoo / ut / utx
invocation that has no --registry flag, no $UTOO_REGISTRY
env var, and no registry = key in the merged config; on a
fresh developer machine, that's every command. The natural
remedy is for the success branch of select_fastest_registry
to persist the picked URL into the global config file via the
existing Config::set("registry", url, ConfigScope::Global)
API in crates/pm/src/util/config_file.rs:66-69 immediately
after the ping result has been emitted. Once the global config
gains a [values] registry = "<url>" line, the next invocation
of any utoo binary hits the "registry key in the merged
config" priority branch inside init_registry
(crates/pm/src/util/user_config.rs:44-51) and the
ping-and-print path is skipped entirely. The Tip line's
suggested command,

ut config set registry <picked-url> --global

is exactly the manual step that this auto-persist would do
implicitly, so once that change has landed the Tip text would be
redundant (the action it's recommending would already have
happened) and could be dropped, leaving only the
"Registry: <url> (<latency>ms)" line as a one-time-per-machine
informational note. Filing as a separate task so this PR
remains a one-file two-line behavioural change with a tight
scope.
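
A rough sketch of the auto-persist idea, under stated assumptions: the flag, then env, then config priority order is inferred from the description above, and resolve_registry, Source, and the in-memory config slot are hypothetical stand-ins for init_registry and the real config file API, not the project's code.

```rust
/// Where the effective registry URL came from (hypothetical type).
#[derive(Debug, PartialEq)]
enum Source {
    Flag,
    Env,
    Config,
    AutoPicked,
}

/// Resolves the registry by priority; on auto-pick, stores the result in
/// `config` so the next call takes the Config branch and skips the ping.
/// `config` models the merged config's registry key (assumption).
fn resolve_registry(
    flag: Option<&str>,
    env: Option<&str>,
    config: &mut Option<String>,
    pick_fastest: impl FnOnce() -> String,
) -> (String, Source) {
    if let Some(url) = flag {
        return (url.to_string(), Source::Flag);
    }
    if let Some(url) = env {
        return (url.to_string(), Source::Env);
    }
    if let Some(url) = config.as_deref() {
        return (url.to_string(), Source::Config);
    }
    // Nothing configured anywhere: measure once, then persist the choice.
    let url = pick_fastest();
    *config = Some(url.clone());
    (url, Source::AutoPicked)
}
```

With this shape, the first call on an unconfigured machine returns Source::AutoPicked and every later call returns Source::Config without invoking pick_fastest again, which is what would make the Tip line redundant.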

A second small follow-up, also out of scope: the codebase
already has a TTY-awareness lazy static at
crates/pm/src/util/logger.rs:30,

pub static IS_TTY: Lazy<bool> =
    Lazy::new(|| std::io::stderr().is_terminal());

built on the standard library's std::io::IsTerminal trait
introduced in Rust 1.70. The same file at line 73 still has a
call into the deprecated third-party atty crate's
atty::is(atty::Stream::Stdout) from when the standard-library
trait didn't exist. Migrating the lingering atty::is(...) call
to the modern IsTerminal API would let the atty crate be
dropped from the workspace's dependency graph, which is a small
hygiene win unrelated to the stream-of-the-banner question
this PR addresses but in the same neighbourhood. The
is_terminal crate that the atty crate's README now points at
as the recommended successor is itself superseded by the
standard library since 1.70, so the right destination is the
standard-library IsTerminal trait that the same file at line
30 already uses. Filing as a separate hygiene task.
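
For reference, the migration target might look roughly like the following. It is a sketch using only the standard library: stdout_is_tty and stderr_is_tty are hypothetical names, and OnceLock stands in for the Lazy wrapper the existing IS_TTY static uses.

```rust
use std::io::IsTerminal;
use std::sync::OnceLock;

/// Standard-library replacement for an `atty::is(atty::Stream::Stdout)` check.
fn stdout_is_tty() -> bool {
    std::io::stdout().is_terminal()
}

/// Cached stderr check, computed once on first use.
fn stderr_is_tty() -> bool {
    static IS_TTY: OnceLock<bool> = OnceLock::new();
    *IS_TTY.get_or_init(|| std::io::stderr().is_terminal())
}
```

Both std::io::IsTerminal and std::sync::OnceLock were stabilized in Rust 1.70, so neither helper needs a third-party crate.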

Test plan

  • cargo fmt -p utoo-pm: clean, no whitespace delta. eprintln!
    is one character longer than println! (nine characters
    against eight), but the formatter found nothing to re-wrap
    on either call, so the layout of the comma-separated macro
    arguments and the parenthesis on the call's opening line is
    unchanged.

  • cargo clippy --all-targets -- -D warnings --no-deps:
    the workspace-wide post-edit verification step listed in
    CLAUDE.md's "Post-Edit Verification" section. The local
    run inside this fresh Superconductor worktree fails before
    reaching the utoo-pm crate, because the next.js git
    submodule (which the superproject's .gitmodules registers
    and which git submodule status shows in its
    "configured-but-not-checked-out" state, with a leading
    - prefixed to the pinned commit hash
    a1f6c5c22b6ed2aea3d023a8f4a798f22c1daf65) isn't checked
    out in a fresh worktree. The workspace's cargo manifest
    resolution then fails on the path dependencies into
    next.js/turbopack/crates/turbo-bincode,
    next.js/turbopack/crates/turbo-tasks, and the other
    Turbopack crates that the pack-api, pack-core,
    pack-cli, pack-napi, and pack-schema crates in this
    repo's workspace path-depend into. The error surfaces as

    failed to read .../next.js/turbopack/crates/turbo-bincode/Cargo.toml
    No such file or directory (os error 2)
    

    The standard project-setup step
    git submodule update --init --recursive, documented in
    CLAUDE.md's "Project Overview" section, fetches the pinned
    Turbopack-bearing utooland/next.js repo into the
    next.js/ directory and unblocks cargo's workspace
    resolution. The repo's GitHub Actions workflow uses
    actions/checkout@v4 with submodules: recursive (the
    conventional submodule-aware checkout configuration), so
    CI handles the submodule initialization automatically and
    the full workspace clippy gate runs there end-to-end. The
    submodule-checkout dependency is orthogonal to the
    two-token change in this PR — clippy would fail in
    identical fashion against an unmodified next branch on a
    bare worktree without the submodule.

  • Interactive sanity — in a working directory that
    has no registry = line in ./.utoo.toml and whose
    ${XDG_CONFIG_HOME-$HOME/.config}/.utoo.toml and
    $HOME/.utoo/config.toml global counterparts likewise lack
    the key, and whose process environment has no
    $UTOO_REGISTRY set, run a bare ut install in a TTY.
    Expected outcome: the Registry: <url> (<latency>ms) and
    Tip: ut config set registry <url> --global pair appears on
    the controlling terminal in dim/cyan/green/yellow ANSI
    colouring exactly as it did before this PR, because the
    process's stderr is attached to the controlling TTY when
    the shell isn't intercepting either of its streams. To
    force this code path on a developer machine that already
    has a sticky global registry, the temporary-rename trick is
    to move the global config aside before the run:
    mv ~/.utoo/config.toml ~/.utoo/config.toml.bak, observe
    the banner, then restore. The cleaner alternative is
    running the binary against an isolated home:
    HOME=$(mktemp -d) ut install in a project directory that
    has neither a .utoo.toml nor a package-lock.json,
    though that depends on ut install's interaction with the
    absence of a lockfile, which is the lockfile-generation
    path rather than the install-from-lockfile path.

  • The original failure mode — the screenshotted
    regression that prompted this PR. Copy a JS snippet to the
    macOS clipboard,

    echo 'const   x  = 1' | pbcopy
    

    and then run the executor-passthrough round-trip,

    pbpaste | utx prettier --parser babel | pbcopy
    

    and inspect the clipboard with another pbpaste. Before
    this PR, the clipboard came back with the wrapper banner
    alongside prettier's formatted output, because utx's
    stdout (banner included) is what feeds pbcopy; prettier's
    own input comes from pbpaste and was never affected. After
    this PR, the clipboard contains just const x = 1;\n, the
    canonical prettier formatting of the input snippet,
    while the two hint lines appear on the terminal in
    between the two ends of the pipeline, in colour. The
    reason the terminal sees them: the middle stage of an
    a | b | c pipeline has its stdin connected to a's
    stdout, its stdout connected to c's stdin, and its
    stderr connected to whatever the surrounding shell wired
    fd 2 to (the controlling terminal in the bare interactive
    case). So utx's stderr is the only one of its three
    standard streams that's not consumed by a neighbouring
    pipeline stage, which is exactly where the user is meant
    to see the wrapper-emitted environment chatter.

  • Shell capture — run

    version="$(utx tsc --version)"
    echo "$version" | hexdump -C
    

    in a shell where utx tsc resolves to TypeScript's
    compiler binary, and inspect the byte content of
    $version. The expected output is exactly the line
    written by tsc on its own stdout (the bytes
    Version 5.4.5\n or whatever the project's pinned TypeScript
    version is). Before this PR, with ut's registry banner on
    stdout, the captured $version had a multi-line prefix
    containing the ANSI-coloured Registry: and Tip: lines
    ahead of the tsc-version line, which would have broken any
    script that did
    if [[ "$version" =~ ^Version\ 5\. ]]-style gating on the
    leading characters of the captured value.

  • JSON / structured-output consumer — if the ut
    subcommand surface gains a machine-readable subcommand
    (ut pack --json, ut deps --json, or any sibling of
    npm view --json), the test for this PR's contract is
    that ut <sub> --json | jq . parses without
    parse error: Invalid numeric literal at line 1, column N
    failures. Today the relevant query subcommands
    (ut config get <key>, the ut --version flag, and the
    ut list-cache introspection) all bypass the auto-pick
    path entirely because they read configuration without
    performing a registry-touching operation, so a literal
    smoke test would need to be a registry-touching subcommand
    with a JSON output mode — file as a follow-up if such a
    subcommand is added.

  • CI build log on this PR — the
    GitHub Actions log for the PR's checkout-and-test job
    should still contain the Registry: and Tip: lines as
    it did on prior PRs that exercised the auto-pick branch.
    The GitHub Actions runner captures both fd 1 and fd 2 of
    the shell-launched commands into the per-step log
    artifact, so the visibility property — "the human reading
    the CI log later can see which registry was picked" —
    carries over from the pre-change stdout-bound emission to
    the post-change stderr-bound emission with no change for
    the observer. If the post-change CI log were missing the
    lines, the runner's stderr capture would be misconfigured,
    which would be a much bigger problem than this PR.

  • No regression test asserts on the captured-output
    bytes of select_fastest_registry. The only Rust test
    in the workspace that touches the function is
    test_select_fastest_registry in the same file at
    crates/pm/src/util/registry.rs:148-153, which is
    declared

    #[tokio::test]
    #[ignore] // Requires network access
    async fn test_select_fastest_registry() {
        let registry = select_fastest_registry().await;
        assert!(registry == REGISTRY_NPMMIRROR ||
                registry == REGISTRY_NPMJS);
    }
    

    It is #[ignore]-gated because it makes live HTTP requests
    to the two registry endpoints (https://registry.npmmirror.com/-/ping
    and https://registry.npmjs.org/-/ping per the
    ping_registry helper at lines 24-40) to measure the
    latencies that the function picks the minimum of, so it's
    opt-in for the cargo test -- --ignored invocation that
    the project's CI runs as a separate suite (or doesn't, if
    the network-touching ignored tests are excluded from CI on
    hermeticity grounds — either way, the gate is whether the
    returned String equals one of the two well-known
    registry constants, and the gate has nothing to say about
    what stream the function emitted its banner on). The
    banner-text grep over the e2e shell scripts —
    grep -rIn 'Registry:\|"Tip:' e2e crates/pm/src — finds
    no matches inside e2e/utoo-pm.sh or elsewhere outside
    the same Rust file that defines the banner strings.
    Confirmed: nothing in the test surface is sensitive to
    the stdout-vs-stderr choice for these two lines.
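
If a byte-level regression test were wanted later, one possible shape is to route the banner through an injected writer so a test can pass a buffer in place of stderr. This is only a sketch: write_registry_hint is a hypothetical helper, not a function in the PR, and the ANSI colouring is left out.

```rust
use std::io::{self, Write};

/// Writes the two hint lines to whichever sink the caller supplies.
/// Production code would pass `std::io::stderr()`; a test passes a Vec<u8>.
fn write_registry_hint<W: Write>(diag: &mut W, url: &str, latency_ms: u128) -> io::Result<()> {
    writeln!(diag, "Registry: {url} ({latency_ms}ms)")?;
    writeln!(diag, "Tip: ut config set registry {url} --global")
}
```

A test can then compare the buffer's contents against the expected two lines without any network access, which keeps it out of the #[ignore]-gated group.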

🤖 Generated with Claude Code

`select_fastest_registry` wrote its two informational lines —
`Registry: <url> (<latency>ms)` and `Tip: ut config set registry <url>
--global` — through `println!`, putting them on fd 1 alongside the
command's actual data output. That contaminates the executor
passthrough case `pbpaste | utx prettier --parser babel | pbcopy`
(the wrapper banner gets written into the clipboard alongside
prettier's formatted source) and the same shape would also corrupt
any `$(ut <cmd>)` shell capture, `ut <cmd> > file.txt` redirection,
and `ut <cmd> --json | jq` consumer.

Route the two macros through `eprintln!` so the lines land on fd 2.
The hint stays visible to interactive terminals (a bare process has
both fd 1 and fd 2 wired to the controlling TTY) and to CI runners
(GitHub Actions, GitLab CI, Jenkins, Buildkite, and CircleCI all
capture stdout and stderr together into the build log, usually
interleaved by emission time), while pipe consumers, redirected
files, and shell-substitution captures see a clean stdout. This is
the same split that cargo uses for the "Compiling foo v0.1.0" /
"Finished release" status lines, git for "Cloning into '...'" and
"Receiving objects: 100% (12/12)", curl for its progress meter
(`curl url | jq .field` works without `--silent` for exactly this
reason), and OpenSSH for "Warning: Permanently added 'host' (RSA) to
the list of known hosts." (which is on stderr so that
`ssh remote 'cat /etc/issue' > local.txt` doesn't bake the warning
into local.txt). The Unix contract is "stdout is the value a machine
consumer is supposed to read, stderr is everything aimed at the
human or at a log aggregator," and the registry auto-pick banner
falls squarely on the second half of that line.

The sibling `println!("  Registry:        {registry}");` at
`crates/pm/src/service/config.rs:53` — the line that the `ut config`
listing command emits when the user explicitly asks the tool to
display its current configuration — stays on stdout on purpose. That
one is the answer to a user-issued query and the canonical target of
`registry=$(ut config get registry)` shell substitution in scripts
that read configuration, the stdout-side case of the same Unix
convention. Same reasoning as `git rev-parse HEAD`'s commit hash
going to stdout while the surrounding git's clone-progress chatter
goes to stderr.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request updates the select_fastest_registry function in crates/pm/src/util/registry.rs to use eprintln! instead of println! for displaying registry information and tips, ensuring informational output is directed to standard error. I have no feedback to provide as there were no review comments.
