Improve retryable monitoring: Notion sync + Slack alerts#49
Improve retryable monitoring: Notion sync + Slack alerts#49mahsamoosavi wants to merge 28 commits intomainfrom
Conversation
config.json
Outdated
| "orbitRpcUrl": "https://sepolia-rollup.arbitrum.io/rpc", | ||
| "explorerUrl": "https://sepolia.arbiscan.io", | ||
| "parentExplorerUrl": "https://sepolia.etherscan.io", | ||
| "parentRpcUrl": "https://mainnet.infura.io/v3/ccf7fbdf6b99420ab8e3b072f6c4ed1e", |
There was a problem hiding this comment.
did you mean to commit this?
There was a problem hiding this comment.
NO :((
Will reset the key :'(
packages/retryable-monitor/index.ts
Outdated
| } catch (error) { | ||
| console.error( | ||
| `Error getting the latest block: ${(error as Error).message}` | ||
| const currentBlock = await parentChainProvider.getBlockNumber() |
There was a problem hiding this comment.
should we be removing the error handling?
There was a problem hiding this comment.
Good catch! I put the try/catch back around getBlockNumber() to match what's currently in main. Should help avoid any unexpected crashes if the RPC call fails.
packages/retryable-monitor/index.ts
Outdated
| } | ||
| const status = await retryableMessage.status() | ||
| const notionStatus = 'Untriaged' | ||
|
|
There was a problem hiding this comment.
are we sure that we dont want to check status before doing all of this work and writing it to the database?
do we want to write every retryable to the db?
There was a problem hiding this comment.
We dont! I had left it that way while testing and just forgot to add the check. Will update it so it only writes unredeemed tickets to the db.
| // Continuously check retryables with periodic sweeps | ||
|
|
||
| // Function to continuously check retryable transactions | ||
| const checkRetryablesContinuous = async ( |
There was a problem hiding this comment.
we shouldn't need to change this function.
we run the monitoring as individual processes in action runners. it isn't a long running process.
I might suggest that it would be better to just remove this entirely.
There was a problem hiding this comment.
Hm I’d lean toward keeping it since part of the goal was to let RaaSes or other teams run this as a monitor outside GH Actions too, it might be useful for longer-running setups. What do you think?
There was a problem hiding this comment.
It's simple enough to wrap the process in a cron system of their choice.
I don't have a super strong opinion either way. It is nice to maintain less code. 😄
packages/retryable-monitor/index.ts
Outdated
| parentRpcUrl: childChain.parentRpcUrl, | ||
| })) | ||
| ) | ||
| const checkNotionForExpiringRetryables = async () => { |
There was a problem hiding this comment.
we could move this into our new notion folder
TucksonDev
left a comment
There was a problem hiding this comment.
I'm halfway through the index file, but I prefer to stop the review here to avoid leaving a big amount of comments 😅 .
Besides a few nitpicks and potential issues, in general I feel like this PR does a huge amount of refactoring that could be done in a different PR, since the refactoring is not related to the changes that we want to add to this PR.
Also, many comments have been removed that seem useful for future contributors. I'd commented on some of the removed comments but haven't reviewed all of them, so my feedback is that no comments should be removed unless the logic that they are describing changes.
In general, this PR seems to be adding/removing unrelated comments and refactoring code unrelated to this new functionality. I'd prefer to include only changes that are relevant to the new functionality to focus only on that, and have the refactoring done in a different PR 🙏
Happy to discuss further on Slack if needed!
| erc20.decimals(), | ||
| ]) | ||
|
|
||
| const nativeTokenAmount = ethers.utils.formatUnits(ticket.deposit, decimals) |
There was a problem hiding this comment.
The console.log below suggests that this is the l2callvalue defined in the ticket, but you're using the deposit here (maxSubmission + gasLimit*gasPriceBid)
There was a problem hiding this comment.
I took this from the original monitor but fully agreed the label’s a bit misleading. I’ll update it to say total deposit instead of callvalue.
There was a problem hiding this comment.
Thanks! I'll cc @dewanshparashar here to follow up on the original monitor (which I guess has the same text)
packages/retryable-monitor-slack-notion/notion/alertUntriagedRetryables.ts
Outdated
Show resolved
Hide resolved
packages/retryable-monitor-slack-notion/notion/alertUntriagedRetryables.ts
Show resolved
Hide resolved
packages/retryable-monitor-slack-notion/notion/alertUntriagedRetryables.ts
Outdated
Show resolved
Hide resolved
TucksonDev
left a comment
There was a problem hiding this comment.
Thanks @mahsamoosavi ! I've reviewed only the core logic of the changes related to Notion, I'll leave the rest of the code to the FS team.
Requesting a few changes here and there. Happy to discuss any of those!
TucksonDev
left a comment
There was a problem hiding this comment.
Thanks @mahsamoosavi !
Resolved all old and solved comments, and left follow ups on 2 of them. I'll approve once we solved those 💪
packages/retryable-monitor-slack-notion/notion/alertUntriagedRetryables.ts
Show resolved
Hide resolved
Thanks @TucksonDev! Addresses them above, please let me know if the adjustments look good |
TucksonDev
left a comment
There was a problem hiding this comment.
LGTM, thanks @mahsamoosavi !
This PR introduces a new Notion + Slack flow for improved retryable ticket monitoring, triage preservation, and alerting (closes INT-94)
1. Notion Sync with Status Preservation (--writeToNotion)
Each detected retryable ticket is synced to a Notion database.
If the ticket does not exist, a new row is created with:
If the ticket already exists:
2. Slack Alerts for Expiring Retryables (--enableAlerting)
Every 24h, the monitor sweeps the Notion DB.
For each ticket:
Tickets marked "Resolved" or "Expired" are ignored.
3. Automatic Expiry Marking
During the same 24h sweep, tickets that are:
This helps keep the Notion database clean and avoids stale tickets lingering.