Merged
Changes from 5 commits
4 changes: 3 additions & 1 deletion .github/workflows/build-publish-npm.yaml
@@ -27,8 +27,10 @@ jobs:

- run: make patch-upstream lib-build

- run: cp -v README.md docs ./socket.io-serverless/

💡 Codebase verification

⚠️ Potential issue

Remove docs directory from copy command as it's not meant to be published

The files array in socket.io-serverless/package.json explicitly lists what should be included in the published package, and the docs directory is not included. The workflow should only copy README.md:

- - run: cp -v README.md docs ./socket.io-serverless/
+ - run: cp -v README.md ./socket.io-serverless/
🔗 Analysis chain

Verify docs directory inclusion in package files.

The workflow now copies both README.md and the docs directory, but the package.json files array doesn't include the docs directory. This means the docs won't be included in the published package.

Either:

  1. Add "docs" to the files array in package.json if you want to include the documentation in the published package, or
  2. Remove "docs" from the copy command if it's not meant to be published.

Let me gather more information to verify the package.json files configuration.


Let me check the contents of the docs directory and the package.json to better understand the publishing configuration.


Let me check the package.json in the socket.io-serverless directory since that's where the files are being copied to.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify if docs are meant to be published by checking for references
rg -l "docs" package.json README.md

Length of output: 45


Script:

#!/bin/bash
# Check package.json files array and content
cat package.json | jq '.files'

# Check if docs directory exists and its contents
fd -t d "^docs$"

Length of output: 60


Script:

#!/bin/bash
# List contents of docs directory
ls -la docs/

# Check if there's a .npmignore file that might affect publishing
fd -t f "^.npmignore$"

# Show more package.json content to understand publishing configuration
cat package.json | jq '{name, main, files, publishConfig}'

Length of output: 458


Script:

#!/bin/bash
# Check the package configuration in the target directory
cat socket.io-serverless/package.json | jq '{name, main, files, publishConfig}'

# Check if there's a .npmignore in the target directory
ls -la socket.io-serverless/.npmignore 2>/dev/null || echo "No .npmignore found"

Length of output: 357

- uses: JS-DevTools/npm-publish@v3
with:
token: ${{ secrets.NPM_TOKEN }}
dry-run: ${{ !(github.event_name == 'push' && startsWith(github.ref, 'refs/tags/')) }}
- package: ./socket.io-serverless/package.json
+ package: ./socket.io-serverless/package.json
1 change: 1 addition & 0 deletions .gitignore
@@ -28,3 +28,4 @@ dist-ssr
.yarn
*.tgz
/socket.io-serverless/README.md
/socket.io-serverless/docs
24 changes: 0 additions & 24 deletions INTERNAL.md

This file was deleted.

2 changes: 1 addition & 1 deletion README.md
@@ -2,7 +2,7 @@

A custom [socket.io](https://socket.io/) build for serverless environments. Currently [Cloudflare Worker + Durable Objects](https://developers.cloudflare.com/durable-objects/).

- Demo client app: [sio-serverless-demo-client](https://sio-serverless-demo-client.ihate.work) running `demo-client/` `demo-server/` code in this repo.
+ Demo client app: [sio-serverless-demo-client](https://sio-serverless-demo-client.ihate.work) running `demo-client/` `demo-server/` code in [source code repo](https://github.com/jokester/socket.io-serverless)

## Getting started

10 changes: 10 additions & 0 deletions docs/development.md
@@ -0,0 +1,10 @@
### how the code is developed and built

Because socket.io does not publish its original TS code to NPM, I included the `socket.io` repo ([now a monorepo too](https://github.com/socketio/socket.io/issues/3533)) as a git submodule. My monorepo therefore contains packages like `socket.io-serverless`, `socket.io/packages/socket.io`, and `socket.io/packages/engine.io`.

Some socket.io code needs to be patched, including the export map in `package.json`. The patches are kept in the monorepo and applied by the Makefile.

`esbuild` bundles the `socket.io-serverless` code, along with Socket.io and other deps, into a non-minified bundle.

An `esbuild` [build script](https://github.com/jokester/socket.io-serverless/blob/main/socket.io-serverless/build.mjs) customizes the dependency resolution process. Some npm packages are replaced with a CF-compatible implementation (like `debug`) or simply stubbed (like `node:http`).
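
As a rough illustration of the replacement idea (not the actual `build.mjs`), the sketch below registers one `onResolve` hook per replaced package. The plugin name, the `replacements` table, and the stub paths are all hypothetical, and the small interfaces only mirror the part of esbuild's plugin API used here so the example runs without importing esbuild.

```typescript
// Structural types mirroring the subset of esbuild's plugin API used below.
interface OnResolveArgs { path: string }
interface OnResolveResult { path: string }
interface PluginBuild {
  onResolve(
    options: { filter: RegExp },
    callback: (args: OnResolveArgs) => OnResolveResult | undefined,
  ): void;
}
interface Plugin { name: string; setup(build: PluginBuild): void }

// Hypothetical table: module specifier -> absolute path of its replacement.
const replacements: Record<string, string> = {
  "debug": "/abs/stubs/debug.ts",     // assumed CF-compatible implementation
  "node:http": "/abs/stubs/empty.ts", // assumed empty stub
};

// Escape regex metacharacters so a specifier is matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a plugin that redirects exact specifier matches to their replacements.
function replaceDepsPlugin(table: Record<string, string>): Plugin {
  return {
    name: "replace-deps",
    setup(build) {
      Object.keys(table).forEach((specifier) => {
        const filter = new RegExp("^" + escapeRegExp(specifier) + "$");
        build.onResolve({ filter }, () => ({ path: table[specifier] }));
      });
    },
  };
}
```

With real esbuild, a plugin of this shape would go into the `plugins` array of the build options. Only exact specifiers are redirected, so a subpath import such as `debug/src/browser` would still resolve normally.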

50 changes: 50 additions & 0 deletions docs/how-it-works.md
@@ -0,0 +1,50 @@
### how socket.io works

Socket.io (the top level library) have 2 main components: npm packages `socket.io` and `engine.io`.

The `socket.io` packages deals with the high level concepts: namespace / room / clustering / etc. It depends on `engine.io` which holds a `http.Server` instance and deals with the transport-aware logic.

In Node.js the 2 components just run in the same process, communicate with a event emitter API.
Comment on lines +1 to +7
🛠️ Refactor suggestion

Enhance technical accuracy and clarity of Socket.io architecture explanation

The current explanation needs improvement in several areas:

  1. Grammar: "has" instead of "have", "an event emitter" instead of "a event emitter"
  2. Technical accuracy: Add details about the Engine.IO protocol, transport upgrades, and the relationship between Socket.IO server and client

Consider this revision:

-Socket.io (the top level library) have 2 main components: npm packages `socket.io` and `engine.io`.
-
-The `socket.io` packages deals with the high level concepts: namespace / room / clustering / etc. It depends on `engine.io` which holds a `http.Server` instance and deals with the transport-aware logic.
-
-In Node.js the 2 components just run in the same process, communicate with a event emitter API.
+Socket.IO (the top-level library) has two main components: the `socket.io` and `engine.io` packages.
+
+The `socket.io` package handles high-level concepts like namespaces, rooms, and clustering. It builds upon `engine.io`, which manages the underlying transport protocol, including:
+- Transport selection and upgrades (polling → WebSocket)
+- Connection state management
+- Packet encoding/decoding
+
+In a Node.js environment, these components operate within the same process, communicating through an event emitter API. The `engine.io` server attaches to an HTTP server instance to handle both HTTP long-polling and WebSocket connections.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~3-~3: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...### how socket.io works Socket.io (the top level library) have 2 main components: npm pa...

(EN_COMPOUND_ADJECTIVE_INTERNAL)


[uncategorized] ~5-~5: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...The socket.io packages deals with the high level concepts: namespace / room / clustering...

(EN_COMPOUND_ADJECTIVE_INTERNAL)


[misspelling] ~7-~7: Use “an” instead of ‘a’ if the following word starts with a vowel sound, e.g. ‘an article’, ‘an hour’.
Context: ...n in the same process, communicate with a event emitter API. ### develop for CF ...

(EN_A_VS_AN)


### develop for CF worker / DO

In a CF worker / DO, JS runs in a special non-Node.js serverless environment, which I believe is [workerd](https://github.com/cloudflare/workerd/tree/main/src/workerd).

For a JS developer, the biggest difference compared to Node.js or the web is perhaps the volatile state.

In a traditional environment like a Node.js process or a browser tab, the code just runs until the server goes down or the tab closes. In CF's serverless environment, however, the runtime stops your JS code and destroys its in-memory state when it is inactive. Having a pending JS `setTimeout` or `setInterval` timer counts as active. A pending HTTP request counts too. An active WebSocket connection may or may not count (depending on the API used to accept the connection).

Specifically, for DO the destruction of in-memory state is called [hibernation](). Developers can manually persist/revive state using the provided KV store-like API.
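
To make the persist/revive idea concrete, here is a minimal sketch. It assumes a storage object with async `get` / `put` methods, similar in shape to the DO storage API; `KvStorage`, `MemoryStorage`, `ConnectionRegistry`, and the `"connIds"` key are hypothetical names for illustration, not code from this repo.

```typescript
// Assumed subset of a KV store-like storage: async get/put by string key.
interface KvStorage {
  get<T>(key: string): Promise<T | undefined>;
  put<T>(key: string, value: T): Promise<void>;
}

// In-memory stand-in for the storage, for local experiments only.
class MemoryStorage implements KvStorage {
  private data = new Map<string, string>();
  async get<T>(key: string): Promise<T | undefined> {
    const raw = this.data.get(key);
    return raw === undefined ? undefined : (JSON.parse(raw) as T);
  }
  async put<T>(key: string, value: T): Promise<void> {
    this.data.set(key, JSON.stringify(value));
  }
}

// Hypothetical actor state: connection IDs kept in memory, mirrored to storage.
class ConnectionRegistry {
  private storage: KvStorage;
  private connIds: Set<string> | null = null; // null means "not revived yet"

  constructor(storage: KvStorage) {
    this.storage = storage;
  }

  // Lazily reload state after the in-memory copy was destroyed.
  private async revive(): Promise<Set<string>> {
    if (this.connIds === null) {
      const saved = await this.storage.get<string[]>("connIds");
      this.connIds = new Set(saved ?? []);
    }
    return this.connIds;
  }

  // Update the in-memory set and persist it right away.
  async add(id: string): Promise<void> {
    const ids = await this.revive();
    ids.add(id);
    await this.storage.put("connIds", Array.from(ids));
  }

  async list(): Promise<string[]> {
    return Array.from(await this.revive()).sort();
  }
}
```

Creating a second `ConnectionRegistry` on the same storage simulates waking up after the in-memory state was destroyed: the new instance starts empty and reloads the IDs on first use.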

<!--
For DO there is some guarantee like "no 2 instance of the same actor (identified by id) will exist at the same time". For worker I guess almost nothing is guaranteed. S
-->
The available standard library is different too.

Code using only JS language APIs should just work. Code requiring Node.js APIs can use Node.js polyfills, enabled by Node.js compatibility flags.

Since sometime in 2024, the Node.js stdlib polyfill is based on [unenv](), behind the `nodejs_compat_v2` flag. This article gives a fairly complete explanation: [Cloudflare Workersのnodejs\_compat\_v2で何が変わったのか](https://zenn.dev/laiso/articles/8280d026a08de0) (in Japanese; roughly "What changed with nodejs\_compat\_v2 in Cloudflare Workers").

Prior to this, based on my non-authoritative investigation, the `nodejs_compat` flag was ultimately based on `ionic-team/rollup-plugin-node-polyfills`, used by `@esbuild-plugins/node-modules-polyfill`, used by `esbuild`, used by the `wrangler` CLI.

### how socket.io-serverless works

I use 2 DOs to run heavily rewired `socket.io` and `engine.io` code.

`class EngineActor extends DurableObject {...}` is the DO running `engine.io` code. It just accepts WebSocket connections and forwards WS messages in both directions between `SocketActor` and the real WS connection.

`class SocketActor extends DurableObject {...}` is the DO running `socket.io` code. It responds to RPC calls from `EngineActor` and emits messages into objects like `Namespace`. If the application code above sends a message to an engine.io Socket (an abstraction over different transports), the message is forwarded to `EngineActor` and flows to the other end of the WS connection.
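
The message flow between the two actors can be sketched with plain classes. This is only a model under simplifying assumptions: the real actors extend `DurableObject` and talk over RPC, while `EngineActorSketch`, `SocketActorSketch`, and the echo behaviour below are hypothetical stand-ins.

```typescript
// A real WS connection is represented here by its send function.
type SendToClient = (data: string) => void;

class SocketActorSketch {
  readonly received: string[] = [];
  private replyVia: (connId: string, data: string) => void;

  constructor(replyVia: (connId: string, data: string) => void) {
    this.replyVia = replyVia;
  }

  // Invoked for each inbound WS message (an RPC call in the real design).
  onMessage(connId: string, data: string): void {
    this.received.push(connId + ":" + data);
    // Application-level logic would run here; this sketch simply echoes.
    this.replyVia(connId, "echo " + data);
  }
}

class EngineActorSketch {
  private connections = new Map<string, SendToClient>();
  private socketActor: SocketActorSketch | null = null;

  attach(socketActor: SocketActorSketch): void {
    this.socketActor = socketActor;
  }

  // Register an accepted WebSocket connection under its connection ID.
  accept(connId: string, send: SendToClient): void {
    this.connections.set(connId, send);
  }

  // Inbound direction: real WS connection -> SocketActor.
  onWsMessage(connId: string, data: string): void {
    if (this.socketActor) this.socketActor.onMessage(connId, data);
  }

  // Outbound direction: SocketActor -> real WS connection.
  sendToClient(connId: string, data: string): void {
    const send = this.connections.get(connId);
    if (send) send(data);
  }
}

// Wire the two actors together.
const engine = new EngineActorSketch();
const socket = new SocketActorSketch((connId, data) => engine.sendToClient(connId, data));
engine.attach(socket);
```

The engine side never interprets the payload; it only maps connection IDs to connections, which is why all socket.io-level logic can stay on the other actor.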

Therefore application logic based on `sio.Namespace`, `sio.Client`, and `sio.Room` should work as with the original Socket.io, but with [limitations](https://github.com/jokester/socket.io-serverless?tab=readme-ov-file#limitations).

Besides the 2 DOs, there needs to be a worker entrypoint: a simple HTTP handler that forwards requests to `EngineActor`.

While it is possible to prevent the aforementioned hibernation (I did this in a first, simpler version), I decided that a serverless version should instead exploit hibernation to save energy and protect our earth.

The states inside `EngineActor` and `SocketActor`, including connection IDs, possibly dynamically created namespaces, and the conn IDs within them, are now persisted/revived across life cycles.

Most of the `engine.io` and `socket.io` code is already driven by message events. But there was a ping timer driving the [heartbeat check](https://socket.io/docs/v4/engine-io-protocol/#heartbeat). I had to stub the original code to use an [alarm]() instead.
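
A minimal sketch of an alarm-driven heartbeat, under stated assumptions: the storage exposes `setAlarm(timestampMs)`, and the runtime later calls an `alarm()` method on the object. `HeartbeatSketch`, its method names, and the explicit `nowMs` parameters are hypothetical and exist only to keep the example deterministic; this is not the actual stubbed engine.io code.

```typescript
// Assumed subset of an alarm-capable storage: schedule one wake-up time.
interface AlarmStorage {
  setAlarm(scheduledTimeMs: number): Promise<void>;
}

class HeartbeatSketch {
  private storage: AlarmStorage;
  private pingIntervalMs: number;
  private sendPing: (connId: string) => void;
  private connIds = new Set<string>();

  constructor(
    storage: AlarmStorage,
    pingIntervalMs: number,
    sendPing: (connId: string) => void,
  ) {
    this.storage = storage;
    this.pingIntervalMs = pingIntervalMs;
    this.sendPing = sendPing;
  }

  // The first connection schedules the first alarm; no timer stays pending.
  async onConnect(connId: string, nowMs: number): Promise<void> {
    const wasIdle = this.connIds.size === 0;
    this.connIds.add(connId);
    if (wasIdle) await this.storage.setAlarm(nowMs + this.pingIntervalMs);
  }

  onDisconnect(connId: string): void {
    this.connIds.delete(connId);
  }

  // Called when the alarm fires: ping everyone, then re-arm only if needed.
  async alarm(nowMs: number): Promise<void> {
    this.connIds.forEach((id) => this.sendPing(id));
    if (this.connIds.size > 0) {
      await this.storage.setAlarm(nowMs + this.pingIntervalMs);
    }
  }
}
```

Unlike a `setInterval` timer, nothing stays pending between alarms, so the object is free to be evicted in the meantime. Once no connection remains, the alarm is not re-armed.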

Currently socket.io-serverless only creates 1 instance of each DO class. In the future, if performance becomes a problem, it should be possible to split the load across more DOs (similar to the adapter/cluster structure used by Socket.io).


54 changes: 0 additions & 54 deletions refactor-me.mjs

This file was deleted.

11 changes: 9 additions & 2 deletions socket.io-serverless/package.json
@@ -1,8 +1,16 @@
{
"name": "socket.io-serverless",
"description": "A custom socket.io build to run in Cloudflare workers.",
- "version": "0.2.0",
+ "version": "0.2.1",
"type": "module",
"homepage": "https://github.com/jokester/socket.io-serverless",
"repository": {
"type": "git",
"url": "git+https://github.com/jokester/socket.io-serverless.git"
},
"bugs": {
"url": "https://github.com/jokester/socket.io-serverless/issues"
},
"dependencies": {},
"files": [
"dist",
@@ -13,7 +21,6 @@
"tsconfig.json"
],
"scripts": {
"prepack": "node build.mjs && cp -v ../README.md .",
"build": "node build.mjs",
"build:watch": "node build.mjs --watch",
"lint": "eslint src",