feat: add reconnect and process old files #99

tamasdomokos wants to merge 1 commit into jitsi:feature-reconnect
Conversation
Force-pushed from 81b8aad to c810e84
andrei-gavrilescu left a comment:
I know this is meant to be a draft; I've just added a few comments to consider when refactoring.
There are a lot of awesome changes here, and we probably need to add at least some basic integration tests for the reconnect scenarios. Please check out
Force-pushed from c14d391 to b97fbae

Force-pushed from b8d3cf6 to 07667e9
src/test/client.js (outdated)
    assert.deepStrictEqual(parsedBody, resultTemplate);
} else {
    // this is a reconnect, dumpInfo is not relevant
    logger.info('[TEST] Handling DONE event after reconnect with statsSessionId %j, body %j %j',
Why is this not relevant? The reconnect test should have the same result as the dump without the disconnect.
After the disconnect it waits for the reconnect before getting processed.
// Subsequent operations will be taken by services in the upper level, like upload to store and persist
// metadata to a db.
case 'close':
    this.log.info('[Demux] sink closed');
In case a client reconnect happens, the client won't send the identity data again, which means the meta object will be missing the identity information used in the app.
The startDate that's set in _sinkCreate will also not correspond to the actual start date of the session; won't this affect the rest of the application?
// we need to wait a little bit before reconnecting.
setTimeout(() => {
    connection.connect();
There is one problem with how reconnect is currently handled: if the client reconnects after the server decided it has waited long enough, the FeatureExtractor will process the resulting dump file as if it were a new session (even though identity information is missing), resulting in an overwritten S3 dump with partial information and a duplicate entry in Redshift. This can also happen if the server restarts and the client lands on another server, because of how HAProxy is currently set up. Ideally the server or the client would identify these cases and not process them; simply logging an error would do.

Is this case handled by the client/server protocol somehow?
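A rough sketch of the "identify and log instead of processing" idea, just to make it concrete (every name here is hypothetical, not code from this PR; it assumes the extractor can detect the missing identity in the meta object):

const logger = require('./logging'); // assumed logger module

// Hypothetical guard in front of feature extraction: an identity-less dump
// is likely an orphaned reconnect whose original session was (or will be)
// processed elsewhere, so log and skip instead of overwriting the S3 entry
// and duplicating the Redshift row.
async function maybeExtractFeatures(dumpPath, meta) {
    if (!meta.identity) {
        logger.error(
            '[FeatureExtractor] Skipping %s: no identity data, likely an orphaned reconnect',
            dumpPath
        );

        return;
    }

    return extractFeatures(dumpPath, meta); // extractFeatures: assumed existing step
}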
@@ -338,6 +399,13 @@ function simulateConnection(dumpPath, resultPath, ua, protocolV) {

function runTest() {
A test that would somehow simulate a server restart would be cool, but I assume that would be a bit convoluted. Maybe an isolated test for the OrphanFileHelper? wdyt?
this._validateSequenceNumber(statsSessionId, requestData.sequenceNumber, this.lastSequenceNumber);

this.lastTimestamp = requestData.timestamp;
this.lastSequenceNumber = requestData.sequenceNumber;
In the case of the connectionInfo stats entry there is no sequenceNumber, so it might be undefined; please check that the sequenceNumber from the request data is valid.
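A minimal sketch of such a check against the lines quoted above (the Number.isInteger guard is an assumption about what counts as a valid sequence number here):

// connectionInfo entries carry no sequenceNumber, so only validate and
// advance the counter when a usable one is actually present.
if (Number.isInteger(requestData.sequenceNumber)) {
    this._validateSequenceNumber(statsSessionId, requestData.sequenceNumber, this.lastSequenceNumber);
    this.lastSequenceNumber = requestData.sequenceNumber;
}

this.lastTimestamp = requestData.timestamp;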
Force-pushed from eae6ed3 to a718b4f
    sequenceNumber = this.demuxSink.lastSequenceNumber;
} else {
    logger.debug('[ClientMessageHandler] Last sequence number from dump ');
    sequenceNumber = await this._getLastSequenceNumberFromDump();
Do we need to read the entire file in the case of a client reconnect (not the server restart case)? Technically we have that information in the previous sink.
@@ -168,15 +189,7 @@ class DemuxSink extends Writable {
// if the entry already exists because some other instance uploaded first, the same incremental approach needs
// to be taken.
while (!fd) {
If the incremental approach was removed, I assume the while loop and the comments need to be removed as well.
let identity;

if (isReconnect) {
    identity = await this._getIdentityFromFile(sinkData.id);
If this is a client reconnect (not the server restart case), the file will be read when we try to get the last sequence number and then again here. We can probably make this more efficient and read the file only once; not insisting for this PR.
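One possible shape for the single-read version (a sketch only; the helper name and the exact entry layout are assumptions, not code from this PR):

const fs = require('fs');
const readline = require('readline');

// Hypothetical single-pass helper: collect both the identity entry and the
// last sequence number in one scan of the dump, so a client reconnect does
// not read the file twice. Assumes newline-delimited JSON entries with the
// sequence number at index 4, matching the jsonData[4] access quoted later
// in this thread.
async function getReconnectStateFromDump(dumpPath) {
    const rl = readline.createInterface({
        input: fs.createReadStream(dumpPath),
        crlfDelay: Infinity
    });

    let identity;
    let lastSequenceNumber = -1;

    for await (const line of rl) {
        const entry = JSON.parse(line);

        if (entry[0] === 'identity') {
            identity = entry;
        }

        if (Number.isInteger(entry[4])) {
            lastSequenceNumber = entry[4];
        }
    }

    return { identity, lastSequenceNumber };
}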
    this._createSequenceNumberBody(sequenceNumber, isInitial)
));

if (this.client.readyState === 1) {
What does this mean? Are there cases where readyState is not 1, and if such a case occurs, what happens? Add a comment for why this is needed.
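For context, readyState 1 is the standard WebSocket OPEN state. A sketch of the guard with the requested comment (the else branch and log message are illustrative, not from this PR):

// readyState 1 === OPEN. During a reconnect the socket can still be
// CONNECTING (0), CLOSING (2) or CLOSED (3), and calling send() in those
// states would throw, so only send when the connection is actually open.
if (this.client.readyState === 1) {
    this.client.send(JSON.stringify(
        this._createSequenceNumberBody(sequenceNumber, isInitial)
    ));
} else {
    logger.warn('[App] Socket not open (readyState %d), skipping send', this.client.readyState);
}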
    return 0;
});

const result = await promis;
This can be done a bit more elegantly, like:

let lastLine = 0;

try {
    const lastLineString = await storeFile.getLastLine(dumpPath, 1);

    lastLine = utils.parseLineForSequenceNumber(lastLineString);
} catch (e) {
    logger.error('[ClientMessageHandler] Error. ', e);
}

return lastLine;

wdyt
.then(
    lastLine => utils.parseLineForSequenceNumber(lastLine))
.catch(() => {
    logger.debug('[ClientMessageHandler] New connection. File doesn\'t exist. file: ', dumpPath);
I'm assuming the error can be any error, not necessarily that the file doesn't exist; maybe we should log the error as well.
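A small sketch of that change, keeping the same catch as the diff above but receiving and logging the error (the return value mirrors the return 0 quoted earlier):

.catch(err => {
    // Any failure lands here (missing file, permission error, bad JSON, ...),
    // so include the error itself instead of assuming the file is new.
    logger.debug('[ClientMessageHandler] Could not read last line of %s:', dumpPath, err);

    return 0;
});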
    return jsonData[4];
}

return -1;
How will the client react on receiving -1? We should add some comments here.
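For example, something along these lines (the described client behaviour is an assumption based on this thread and would need to be confirmed):

// No sequence number could be recovered from the dump; -1 is the sentinel
// for "start fresh". The client is assumed to treat it as a new session and
// resend from its first buffered entry.
return -1;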
const readline = require('readline');
const Stream = require('stream');

exports.getLastLine = (fileName, minLength) => {
I feel a bit insecure about this function. If a server restarts, which means we have about 2000 files, then when clients start reconnecting we're going to read each dump (which can be up to 1 GB per file) line by line; I'm not sure how that's going to affect the server. I wonder if there are any more efficient ways to read the end of a file.
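One option, sketched under the assumption that dump entries are newline-delimited and the last line fits in a fixed-size tail (the 64 KB constant and the exact shape are illustrative, not from this PR):

const fs = require('fs');

const TAIL_BYTES = 64 * 1024;

// Read only the last TAIL_BYTES of the file instead of streaming the whole
// dump, then take the final non-empty newline-delimited line from that tail.
exports.getLastLine = async fileName => {
    const { size } = await fs.promises.stat(fileName);
    const start = Math.max(0, size - TAIL_BYTES);
    const fileHandle = await fs.promises.open(fileName, 'r');

    try {
        const buffer = Buffer.alloc(size - start);

        await fileHandle.read(buffer, 0, buffer.length, start);

        const lines = buffer.toString('utf8').split('\n').filter(Boolean);

        return lines[lines.length - 1];
    } finally {
        await fileHandle.close();
    }
};

This keeps the read cost constant per reconnect regardless of dump size, at the price of assuming a maximum line length.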