Skip to content

Failure to grant database privileges in CockroachDB on startup #1304

Open
@bnaecker

Description

@bnaecker

I saw an internal server error installing the control plane this evening. I built and installed Omicron using omicron-package, building from a local commit merging #1298 and the current main, which is at 154a4a6. From a fresh reboot I:

  • pfexec ./tools/create_virtual_hardware.sh
  • cargo build --bin omicron-package --release
  • ./target/release/omicron-package package
  • pfexec ./target/release/omicron-package install

Looking at the sled agent logs, I saw:

[2022-06-29T03:33:29.08339844Z]  INFO: SledAgent/StorageManager/6346 on feldspar: CRDB is online
[2022-06-29T03:33:29.084297094Z]  INFO: SledAgent/StorageManager/6346 on feldspar: Formatting CRDB
[2022-06-29T03:33:35.983551762Z]  INFO: SledAgent/StorageManager/6346 on feldspar: halt_and_remove_logged: Previous zone state: Running
    zone: oxz_cockroachdb_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b
[2022-06-29T03:33:35.986214134Z]  INFO: SledAgent/StorageManager/6346 on feldspar: Stopped and uninstalled zone
    zone: oxz_cockroachdb_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b
[2022-06-29T03:33:35.993636362Z]  WARN: SledAgent/StorageManager/6346 on feldspar: failed to notify nexus about datasets, will retry in 327.794992ms
[2022-06-29T03:33:35.994011147Z]  WARN: SledAgent/StorageManager/6346 on feldspar: failed to notify nexus, will retry in 1.07985318s
[2022-06-29T03:33:35.994340186Z]  WARN: SledAgent/StorageManager/6346 on feldspar: failed to notify nexus about datasets, will retry in 471.061865ms
[2022-06-29T03:33:35.994666535Z]  WARN: SledAgent/StorageManager/6346 on feldspar: failed to notify nexus about datasets, will retry in 488.023497ms
[2022-06-29T03:33:35.995017057Z]  WARN: SledAgent/StorageManager/6346 on feldspar: failed to notify nexus, will retry in 925.598032ms
[2022-06-29T03:33:35.995311707Z]  INFO: SledAgent/dropshot (SledAgent)/6346 on feldspar: request completed (req_id=ed6c07d9-c91a-44ec-aa51-d7948d830391, uri=/filesystem, method=PUT, remote_addr=[fd00:1122:3344:101::1]:37061, local_addr=[fd00:1122:3344:101::1]:12345, error_message_external="Internal Server Error", response_code=500)
    error_message_internal: Error managing storage: Error running command in zone 'oxz_cockroachdb_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b': Command [/usr/sbin/zlogin oxz_cockroachdb_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b /opt/oxide/cockroachdb/bin/cockroach sql --insecure --host [fd00:1122:3344:101::2]:32221 --file /opt/oxide/cockroachdb/sql/dbinit.sql] executed and failed with status: exit status: 1. Stdout: CREATE DATABASE
    CREATE ROLE
    GRANT
    CREATE TABLE
    CREATE TABLE
    CREATE INDEX
    CREATE TYPE
    CREATE TABLE
    CREATE INDEX
    CREATE TABLE
    CREATE TYPE
    CREATE TABLE
    CREATE INDEX
    CREATE TABLE
    CREATE INDEX
    CREATE TABLE
    CREATE TYPE
    CREATE TABLE
    CREATE INDEX
    CREATE TABLE
    CREATE INDEX
    CREATE TYPE
    CREATE TABLE
    CREATE INDEX
    CREATE INDEX
    CREATE TABLE
    , Stderr: NOTICE: GRANT INSERT, SELECT, UPDATE, DELETE ON DATABASE is deprecated.
    This statement was automatically converted to USE omicron; ALTER DEFAULT PRIVILEGES GRANT INSERT, SELECT, UPDATE, DELETE ON TABLES TO omicron;
    Please use ALTER DEFAULT PRIVILEGES going forward
    ERROR: driver: bad connection
    ERROR: driver: bad connection
    Failed running "sql"

[2022-06-29T03:33:35.996129256Z]  WARN: SledAgent/RSS/6346 on feldspar: failed to create filesystem (error="Error Response")
[2022-06-29T03:33:36.64487338Z]  INFO: SledAgent/RSS/6346 on feldspar: creating new filesystem: DatasetEnsureBody { id: 4d08fc19-3d5f-4f6b-9c48-925f8eac7255, zpool_id: d462a7f7-b628-40fe-80ff-4e4189e2d62b, dataset_kind: CockroachDb { all_addresses: [[fd00:1122:3344:101::2]:32221] }, address: [fd00:1122:3344:101::2]:32221 }
[2022-06-29T03:33:36.647081411Z]  INFO: SledAgent/StorageManager/6346 on feldspar: add_dataset: NewFilesystemRequest { zpool_id: d462a7f7-b628-40fe-80ff-4e4189e2d62b, dataset_kind: CockroachDb { all_addresses: [[fd00:1122:3344:101::2]:32221] }, address: [fd00:1122:3344:101::2]:32221, responder: Sender { inner: Some(Inner { state: State { is_complete: false, is_closed: false, is_rx_task_set: true, is_tx_task_set: false } }) } }
[2022-06-29T03:33:36.647918134Z]  INFO: SledAgent/StorageManager/6346 on feldspar: Ensuring dataset oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b/cockroachdb exists
[2022-06-29T03:33:36.66757923Z]  INFO: SledAgent/StorageManager/6346 on feldspar: Ensuring zone for oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b/cockroachdb is running
[2022-06-29T03:33:36.679000179Z]  INFO: SledAgent/StorageManager/6346 on feldspar: Zone for oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b/cockroachdb was not found
[2022-06-29T03:33:36.695774353Z]  INFO: SledAgent/StorageManager/6346 on feldspar: Configuring new Omicron zone: oxz_cockroachdb_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b
[2022-06-29T03:33:36.715740835Z]  INFO: SledAgent/StorageManager/6346 on feldspar: Installing Omicron zone: oxz_cockroachdb_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b
[2022-06-29T03:33:39.927704567Z]  INFO: SledAgent/StorageManager/6346 on feldspar: Zone booting
    zone: oxz_cockroachdb_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b
[2022-06-29T03:33:41.969687215Z]  INFO: SledAgent/StorageManager/6346 on feldspar: Adding address: Static(V6(Ipv6Network { addr: fd00:1122:3344:101::2, prefix: 64 }))
    zone: oxz_cockroachdb_oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b
[2022-06-29T03:33:42.536102317Z]  INFO: SledAgent/StorageManager/6346 on feldspar: start_zone: Loading CRDB manifest
[2022-06-29T03:33:42.905934849Z]  INFO: SledAgent/StorageManager/6346 on feldspar: start_zone: setting CRDB's config/listen_addr: [fd00:1122:3344:101::2]:32221
[2022-06-29T03:33:42.94067554Z]  INFO: SledAgent/StorageManager/6346 on feldspar: start_zone: setting CRDB's config/store
[2022-06-29T03:33:42.971693725Z]  INFO: SledAgent/StorageManager/6346 on feldspar: start_zone: setting CRDB's config/join_addrs
[2022-06-29T03:33:42.998374003Z]  INFO: SledAgent/StorageManager/6346 on feldspar: start_zone: refreshing manifest
[2022-06-29T03:33:43.023893849Z]  INFO: SledAgent/StorageManager/6346 on feldspar: start_zone: enabling CRDB service
[2022-06-29T03:33:43.047899489Z]  INFO: SledAgent/StorageManager/6346 on feldspar: start_zone: awaiting liveness of CRDB
[2022-06-29T03:33:43.05261359Z]  WARN: SledAgent/6346 on feldspar: failed to contact nexus, will retry in 1.316873734s
[2022-06-29T03:33:43.052927673Z]  WARN: SledAgent/StorageManager/6346 on feldspar: cockroachdb not yet alive
[2022-06-29T03:33:43.251247719Z]  WARN: SledAgent/StorageManager/6346 on feldspar: cockroachdb not yet alive
[2022-06-29T03:33:43.520264543Z]  WARN: SledAgent/StorageManager/6346 on feldspar: cockroachdb not yet alive
[2022-06-29T03:33:44.370186624Z]  INFO: SledAgent/6346 on feldspar: contacting server nexus, registering sled: fb0f7546-4d46-40ca-9d56-cbb810684ca7
[2022-06-29T03:33:44.472806569Z]  WARN: SledAgent/StorageManager/6346 on feldspar: cockroachdb not yet alive
[2022-06-29T03:33:45.822259473Z]  INFO: SledAgent/StorageManager/6346 on feldspar: CRDB is online
[2022-06-29T03:33:45.823833209Z]  INFO: SledAgent/StorageManager/6346 on feldspar: Formatting CRDB
[2022-06-29T03:33:58.271956658Z]  INFO: SledAgent/StorageManager/6346 on feldspar: Formatting CRDB - Completed

The relevant bit there is the long list of database permissions. The sled agent starts up CRDB, then runs dbinit.sql to initialize the database and schema. That includes a statement granting permissions to operate on the tables in the database. That syntax appears outdated as of cockroachdb/cockroach#73065. The sled agent apparently gets a non-zero exit code from the subprocess running that SQL file, since it tears down the zone and tries again. The second attempt succeeds. I'm not sure why there would be a difference.

We should probably adopt the recommended action which is to use ALTER DEFAULT PRIVILEGES instead.

Metadata

Metadata

Assignees

Labels

databaseRelated to database access

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions