sqldb+graph/db: add node related tables and implement some node CRUD #9824
base: elle-graphSQL8-prep

Conversation
// information.
//
// NOTE: part of the V1Store interface.
func (s *SQLStore) AddLightningNode(node *models.LightningNode,
Currently working on a benchmark to compare this to the performance of the kvdb version, which uses the batch scheduler. We may end up needing something similar here to batch these txs.
So this method will be updated to make use of the timed scheduler once this PR is merged: #9845
Order of operations, either:
- merge 9845, then update this PR
- merge this, then 9845, then a new PR to update this to use that.
It all depends on what gets merged first.
Very nice!
Did a first pass, will probably need a second one since it's quite a chunk of code. But looks great so far.
node_id BIGINT NOT NULL REFERENCES nodes(id) ON DELETE CASCADE,

-- The id of the feature that this node has.
feature_id BIGINT NOT NULL REFERENCES features(id)
Hmm, similar to today's offline discussion: what's the advantage of having the feature bits defined in a separate table, since we don't store any additional information with each bit that would benefit from de-duplication? IMO this can just be the bit itself.
I know @Roasbeef had different opinions on such enums (or, more generally, numbers that correspond to Golang constants). So I think we should settle this discussion once and then do the same everywhere.
IMO this can just be the bit itself.

Unlike the discussion offline today though, this is an array of feature bits. So I think the idea was that if you normalise this out, you can ask "which nodes advertise feature X?", which is hard to do if you keep it in array form.
I'm not talking about removing the node_features table, but just the features table. The "array" is node_features, which is good. But the features table just maps a database ID to a numeric feature number. So we can just inline the feature number here, in the node_features table (and save quite a bit of space while doing that as well).
gotcha 👍
cool - will update as suggested
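For reference, a rough sketch of what that inlined form could look like (column names here are illustrative, not necessarily what lands in the PR):

```sql
-- Hypothetical inlined form: the feature bit lives directly in
-- node_features, so the separate features lookup table is no longer
-- needed and "which nodes advertise feature X" stays a simple query.
CREATE TABLE IF NOT EXISTS node_features (
    -- The node that advertises this feature bit.
    node_id BIGINT NOT NULL REFERENCES nodes(id) ON DELETE CASCADE,

    -- The feature bit itself, stored inline instead of referencing a
    -- features table.
    feature_bit INT NOT NULL
);

-- One row per (node, bit). Since node_id is the leading column, this
-- index also serves plain lookups by node_id.
CREATE UNIQUE INDEX IF NOT EXISTS node_features_unique
    ON node_features (node_id, feature_bit);
```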
feature_id BIGINT NOT NULL REFERENCES features(id)
);
CREATE INDEX IF NOT EXISTS node_feature_node_id_idx ON node_features(node_id);
CREATE UNIQUE INDEX IF NOT EXISTS node_features_unique ON node_features (
A unique index will create a normal index. So we don't need the first one, as the second one also includes that same field (node_id).
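In other words, something like the following would be sufficient on its own (a sketch, assuming the feature_id column from this diff):

```sql
-- Sketch: the unique index alone also covers plain lookups by node_id
-- (node_id is the leading column), so a separate
-- node_feature_node_id_idx is redundant.
CREATE UNIQUE INDEX IF NOT EXISTS node_features_unique
    ON node_features (node_id, feature_id);
```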
CREATE UNIQUE INDEX IF NOT EXISTS node_addresses_unique ON node_addresses (
    node_id, type, position
);
CREATE INDEX IF NOT EXISTS node_addresses_node_id_idx ON node_addresses(node_id);
Same here re the duplicate index for the node_id field.
-- The gossip protocols are distinct. So often we will only want to
-- query for nodes that are gossiped on a specific protocol version.
CREATE INDEX IF NOT EXISTS nodes_version_idx ON nodes(version);
Index not needed since we have the unique one (see other comments).
-- announcement for a channel that is connected to a node that we
-- have not yet received a node announcement for.
signature BLOB
);
Do we want to add some checks to this table for extra validation? For example, CHECK ((version = 0 OR version = 1) AND length(pub_key) = 33)?
I think the downside is that we can't have named checks. So if we ever wanted to add checks or change them, we'd need to drop the table and re-create it, which is a bit tedious (especially with multiple foreign keys).
So I'm leaning toward no, but let's discuss.
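For concreteness, a sketch of what such checks might look like (hypothetical table name and an abridged column set; not what this PR ships):

```sql
-- Hypothetical variant with the proposed checks. Note the checks are
-- unnamed, so changing them later effectively means dropping and
-- re-creating the table.
CREATE TABLE IF NOT EXISTS nodes_with_checks (
    id BIGINT PRIMARY KEY,
    version SMALLINT NOT NULL,
    pub_key BLOB NOT NULL,
    signature BLOB,

    CHECK (version = 0 OR version = 1),
    CHECK (length(pub_key) = 33)
);
```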
I think for version at least we should not be strict, since we expect future versions and so would need to change the checks.
Even for the pubkey, I'm hesitant to add the length check, given that there might be a version in which we start advertising schnorr (x-only) public keys, which are 32 bytes.
sqldb/sqlc/queries/graph.sql (outdated)
)
ON CONFLICT (pub_key, version)
DO UPDATE SET
    alias = EXCLUDED.alias,
Since these ON CONFLICT ... DO UPDATE statements are a bit hard to read sometimes, I've made it a habit of adding a comment above them stating what the expectation is. For example, the one above on the feature bit is a no-op because the bit field is part of the unique index. But here we update the fields that aren't covered by the unique index.
ok yeah - good idea - will add!
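Something along these lines, as a sketch (the column list is abridged and the comment wording is illustrative):

```sql
INSERT INTO nodes (version, pub_key, alias, last_update, signature)
VALUES ($1, $2, $3, $4, $5)
ON CONFLICT (pub_key, version)
    -- A conflict means we already have a row for this announced node,
    -- so only refresh the fields that are not covered by the unique
    -- index.
    DO UPDATE SET
        alias = EXCLUDED.alias,
        last_update = EXCLUDED.last_update,
        signature = EXCLUDED.signature;
```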
sqldb/sqlc/queries/graph.sql (outdated)
-- name: GetNodeFeaturesByPubKey :many
SELECT f.bit
FROM nodes n
JOIN node_features nf ON nf.node_id = n.id
nit: indentation of JOIN doesn't match above statement.
graph/db/sql_store.go (outdated)
}

if node.HaveNodeAnnouncement {
    params.LastUpdate = sql.NullInt64{
nit: use sqldb.SQLInt64(), SQLStr() and SQLTime().
Oh cool! TIL - thanks.
Just a quick check: I've opted to have LastUpdate as an int64 in the DB since it is the unix timestamp advertised in the node announcement, so we want to be sure it is not changed somehow by timezone handling etc.
Just want to check that reviewers are ok with that part?
Sounds good to me. I assume precision down to one second is enough here? Otherwise we'd need to use UnixNano(), in which case we should make it very clear in the documentation that it's nanoseconds and not seconds.
yeah it is advertised in seconds in the announcement 👍
sqldb/sqlc/queries/graph.sql (outdated)
-- name: GetNodesByLastUpdateRange :many
SELECT *
FROM nodes
WHERE last_update >= sqlc.arg(start_time)
nit: just use @start_time and @end_time.
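For reference, a sketch of the same query using sqlc's @param shorthand (the exact end-of-range comparison in the PR may differ):

```sql
-- name: GetNodesByLastUpdateRange :many
SELECT *
FROM nodes
WHERE last_update >= @start_time
  AND last_update < @end_time;
```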
Updated, thanks @guggero!
As noted offline, I'll continue to iterate on the node_addrs schema to see if we can improve update efficiency there. I'll work on this in parallel with the upcoming PRs - the migrations are not available in the prod build, so this should not be an issue.
Depending on what gets approved first, I'll also update AddLightningNode here to use the batcher: #9845
CREATE TABLE IF NOT EXISTS source_nodes (
    node_id BIGINT PRIMARY KEY REFERENCES nodes (id) ON DELETE CASCADE
Isn't that only the case if we want this to be auto-incrementing? We definitely don't want this to auto-increment.
The reason I added PRIMARY KEY is just to make sure this is unique - but I guess we could also remove that and just use UNIQUE to avoid confusion?
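A sketch of that alternative, purely for illustration:

```sql
-- Hypothetical alternative: node_id is a plain unique, non-null
-- reference rather than the primary key, since it points at nodes(id)
-- and is never meant to auto-increment.
CREATE TABLE IF NOT EXISTS source_nodes (
    node_id BIGINT NOT NULL UNIQUE
        REFERENCES nodes (id) ON DELETE CASCADE
);
```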
OK, I'll update this now to use the generic batch scheduler now that that is merged. Will do that before re-requesting review.
In this commit, the various SQL schemas required to store graph node related data are defined. Specifically, the following tables are defined:
- nodes
- node_extra_types
- node_features
- node_addresses
In this commit, we add the various sqlc queries that we need in order to implement the following V1Store methods:
- AddLightningNode
- FetchLightningNode
- HasLightningNode
- AddrsForNode
- DeleteLightningNode
- FetchNodeFeatures

These are implemented by SQLStore, which then lets us use the SQLStore backend for the following unit tests:
- TestNodeInsertionAndDeletion
- TestLightningNodePersistence
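As a rough illustration of the shape of these queries (the query name and parameters below are assumptions, not copied from the PR):

```sql
-- name: GetNodeByPubKey :one
-- Sketch of the kind of lookup that could back FetchLightningNode and
-- HasLightningNode: a node is addressed by its (pub_key, version)
-- pair.
SELECT *
FROM nodes
WHERE pub_key = $1
  AND version = $2;
```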
In this commit, we let the SQLStore implement LookupAlias. This then lets us run the TestAliasLookup unit test against the SQL backends.
In this commit we add the necessary SQL queries and then implement the SQLStore's NodeUpdatesInHorizon method. This lets us run the TestNodeUpdatesInHorizon unit tests against SQL backends.
In this commit, we add the `source_nodes` table. It points to entries in the `nodes` table. This table will store one entry per protocol version that we are announcing a node_announcement on. With this commit, we can run the TestSourceNode unit test against our SQL backends.
Builds on top of #9853
In this PR, the following schemas are defined for storing graph nodes (NOTE: the block_height column is only shown as an example of how v2 data might fit into this schema; it is not included in the initial schema definition):

- nodes: stores the main info about the node from a node_announcement. Note that we might sometimes insert a "shell node" when we get a channel announcement that points to a node that we don't yet have a record for. In that case, we will only have the public key of the node along with the gossip version it was advertised on, so the (version, pub_key) pair is always unique.
- node_addresses: normalised node addresses.
- node_extra_types: stores the normalised TLV records of a node_announcement for any fields that we don't explicitly store in the nodes table.
- node_features: stores the normalised features advertised in a node announcement.
- source_nodes: stores a pointer to an entry in the nodes table to indicate which node is our own. NOTE: there will be an entry here per gossip protocol that we are aware of.

The following methods are implemented on the SQLStore:
- AddLightningNode
- FetchLightningNode
- HasLightningNode
- AddrsForNode
- DeleteLightningNode
- FetchNodeFeatures
- LookupAlias
- NodeUpdatesInHorizon

This then lets us run a number of existing unit tests against the new SQLStore backend. Namely:
- TestNodeInsertionAndDeletion
- TestLightningNodePersistence
- TestAliasLookup
- TestNodeUpdatesInHorizon
- TestSourceNode

Part of #9795
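To make the description above concrete, here is a minimal sketch of the central nodes table, assuming the columns discussed in the review threads (illustrative only; the authoritative definitions live in this PR's migration files):

```sql
CREATE TABLE IF NOT EXISTS nodes (
    id BIGINT PRIMARY KEY,

    -- The gossip protocol version this announcement was received on.
    version SMALLINT NOT NULL,

    -- The advertised public key of the node.
    pub_key BLOB NOT NULL,

    -- The remaining fields are nullable so that a "shell node" can be
    -- stored with only (pub_key, version).
    alias TEXT,

    -- Unix timestamp (in seconds) taken verbatim from the
    -- node_announcement.
    last_update BIGINT,

    -- May be null if we have only seen a channel announcement that
    -- references this node and not yet a node announcement.
    signature BLOB
);

-- A node is identified by its (pub_key, version) pair, which must be
-- unique; shell nodes carry only these two fields.
CREATE UNIQUE INDEX IF NOT EXISTS nodes_unique ON nodes (pub_key, version);
```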