Skip to content

added batch logging#5

Open
AceTheNinja wants to merge 16 commits intobitfinexcom:mainfrom
AceTheNinja:main
Open

added batch logging#5
AceTheNinja wants to merge 16 commits intobitfinexcom:mainfrom
AceTheNinja:main

Conversation

@AceTheNinja
Copy link

No description provided.

index.js Outdated
this._errorBatchTimer = null
}

this._processBatchedErrors().catch(err => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please change linting here

README.md Outdated
"env": "development",
"errorBatching": { // optional
"interval": 60000,
"maxSize": 50,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to compose and send 50x4000 symbols to slack at once?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can create and send a file in that case

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no files

index.js Outdated
* @param {Object} payload - Payload to log
* @param {...any} extra - Additional information to log
*/
async batchLogErrorToSlack (reqChannel, err, functionName, payload, ...extra) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
async batchLogErrorToSlack (reqChannel, err, functionName, payload, ...extra) {
async logErrorEnqueue (reqChannel, err, ...extra) {

index.js Outdated
errorEntry.lastSeen = now
errorEntry.payloads.push(payload)

if (this._errorBatch.size >= this._errorBatchingConfig.maxSize) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think we should flush if buffer is full. We should evict oldest keys. Maybe it would be better to use lru here

index.js Outdated
message += `*Summary:* ${totalErrors} errors across ${errors.length} types (${timeRange})\n\n`

let truncated = false
for (const error of errors.slice(0, 10)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we slice?

Copy link
Author

@AceTheNinja AceTheNinja Jul 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the message is too large the API will fail so have kept it to only 10 different error per function

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens to the rest?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question still stands. I suggest we log only last to slack to fulfill length requirement

for (const error of errors.slice(0, 10)) {
if (truncated) break

message += `• *${error.errorMessage}* (${error.count}x)\n`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about the rest message info? like extra, etc.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that goes in payload

index.js Outdated
if (opts.conf) this.conf = opts.conf

this._errorBatchingConfig = {
interval: this.conf.errorBatching.interval || 60000,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's assume errorBatching may be absent completely cause we're not going to update all existing running places

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added null coalesce

index.js Outdated
* @param {Object} payload - Payload to log
* @param {...any} extra - Additional information to log
*/
async logErrorEnqueue (reqChannel, err, functionName, payload, ...extra) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's make parameter functionName more generalized, i.e. errorOrigin or originName or errorSource or sourceName etc. You can come up with something better.

* @param {Error} err - Error to log
* @param {string} functionName - Name of the function where the error occurred
* @param {Object} payload - Payload to log
* @param {...any} extra - Additional information to log
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is extra a required parameter?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no we're just putting all other params in 1 object. It can be undefined also

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @param {...any} extra - Additional information to log
* @param {...any} [extra] - Additional information to log

let errorEntry = this._errorBatch.get(errorKey)

if (!errorEntry) {
errorEntry = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extra is not used, what is it for then?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added it

index.js Outdated

if (opts.conf) this.conf = opts.conf

this._errorBatchingConfig = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we store it again if we already have it in this.conf?
additionally, if it's omitted (in readme it's commented as optional), you gonna get error trying to access this.conf.errorBatching
if there is no this.conf.errorBatching, this instance is not using batching, we should not setup it. I think additional timer is an overhead cause not all workers need batching

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed extra variable and not setting up batch logging interval if config does not exist

index.js Outdated
maxMessageLength: this.conf.errorBatching?.maxMessageLength || 4000
}

this._errorBatch = new LRU({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if there is no this.conf.errorBatching, we should not setup it

index.js Outdated
max: this._errorBatchingConfig.maxSize
})

this._initErrorBatching()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if there is no this.conf.errorBatching, we should not setup it

* @param {Error} err - Error to log
* @param {string} functionName - Name of the function where the error occurred
* @param {Object} payload - Payload to log
* @param {...any} extra - Additional information to log
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @param {...any} extra - Additional information to log
* @param {...any} [extra] - Additional information to log

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls create folder test and put a file there named facility.test.js

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

package.json Outdated
"lru": "^3.1.0"
},
"devDependencies": {
"@types/jest": "^30.0.0",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you need it?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needed it for local syntax highlighting. Removed it

package.json Outdated
"dependencies": {
"bfx-facs-base": "git+https://github.com/bitfinexcom/bfx-facs-base.git"
"bfx-facs-base": "git+https://github.com/bitfinexcom/bfx-facs-base.git",
"lru": "^3.1.0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's lock dep and add package-lock.json (use node 16.20)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

index.js Outdated
const earliest = new Date(Math.min(...allTimes))
const latest = new Date(Math.max(...allTimes))

const formatTime = (date) => date.toISOString().substring(11, 19)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like a utility function that should not be re-defined inside a function each time

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved to utils

index.js Outdated
await this._sendBatchedErrorMessage(reqChannel, sourceName, errors)
}

this._errorBatch.clear()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if _sendBatchedErrorMessage fails, we never clear?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now it should, added clear to finally block

- batching init is now conf based
- moved test to test/facility.test.js
- clear batch if sending fails so error don't pile up
@AceTheNinja
Copy link
Author

#5 (comment)

cannot have optional parameter with spread operator. If no args are present it will be empty

package.json Outdated
"dependencies": {
"bfx-facs-base": "git+https://github.com/bitfinexcom/bfx-facs-base.git"
"bfx-facs-base": "git+https://github.com/bitfinexcom/bfx-facs-base.git",
"lru": "3.1.0"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use facs-lru ?

Copy link
Author

@AceTheNinja AceTheNinja Oct 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding another fac that implements base and starting it in the constructor felt like an unnecessary complication for something simple like this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AceTheNinja let's add an optional lru fac dependency for slack for the cases when we enable batch logging

}

try {
await this._processBatchedErrors()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this is called on stop? wouldn't it hang for a long time possibly?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it's called to flush the last batch of errors, otherwise it'll be lost
not sure about long time, i think it should be reasonable time since it's an api call. We can limit the call / stop function in time and exit abruptly if we exceed graceful exit limit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use strict missing

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pushed an update


if (opts.conf) this.conf = opts.conf

if (this.conf.errorBatching && opts.lru) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we using lru and not map? with lru we would lose data right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think in order to keep the buffer fixed in length. If a worker starts to spam, we don't want to keep the spam rather than track let's say last 100 messages of a certain type.
for instance, a worker sends error messages for different occasions, i.e. API call failure and processing issue. Let's say it reports 1 api call failure and 1000 processing issues, We want both types to be reported, however we don't want 1000 processing issues (same type issue) being reported, we cut it down to, let's say, 10 only

that was the idea behind this update in general. If a worker starts spamming, this logic would still allow us to know about the problem, however, would cut down api calls and payload significantly.

anyways, if i missed the purppose, @AceTheNinja please clarify

Copy link
Author

@AceTheNinja AceTheNinja Oct 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couldn't have explained it better 🫡 @batrudinych

index.js Outdated
}

_createErrorKey (reqChannel, err, sourceName) {
const errorMsg = err?.message || 'Unknown error'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check this

maybe you compute hash of stringified error instead of using it directly?


errorEntry.count++
errorEntry.lastSeen = now
errorEntry.payloads.push({ payload, extras: extra })
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're adding same stuff twice if you just initialized cache entry.

btw if we spam under same errorKey, aren't we bloating the payloads array? Size should be restricted, otherwise we can run out of memory

index.js Outdated
payloads: [
{ payload, extras: extra }
],
count: 0,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it 0? you just seen it first time

index.js Outdated
}
}

_createErrorKey (reqChannel, err, sourceName) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make a default value for sourceName

index.js Outdated
message += payloadStr
}

if (error.payloads.length > 3) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls no magic numbers

index.js Outdated
}
}

if (errors.length > 10 && !truncated) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no magic numbers

index.js Outdated
}

async _sendBatchedErrorMessage (reqChannel, sourceName, errors) {
const totalErrors = errors.reduce((sum, error) => sum + error.count, 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can keep the counter separately to avoid computing it here, 1 less array pass

{ payload, extras: extra }
],
count: 0,
firstSeen: now,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like you use these dates just to know when you got first and last errors. Use separate properties, do not store inside the object and avoid excessive array passes

package.json Outdated
"bfx-facs-base": "git+https://github.com/bitfinexcom/bfx-facs-base.git"
},
"optionalDependencies": {
"bfx-facs-lru": "git+https://github.com/bitfinexcom/bfx-facs-lru.git"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need it here? it should come from outside as dependency when we instantiate the fac, i think

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants