Improve test reliability: server console gamemode, retries, NaN guard (#3871)

rom1504 · claude · web-flow · commit 7af5e2ccf2f7 · 2026-03-30T18:26:07.000-07:00
* Use server console for gamemode changes in tests

Replace the fragile triple-chat-command trick in setCreativeMode with wrap.writeServer(), which sends the gamemode command directly to the server console. Wait for the game_state_change packet to confirm. This eliminates the most common source of flaky test timeouts: "Event message did not fire within timeout of 20000ms" in setCreativeMode/becomeCreative.

* Fix game_state_change check for newer MC versions

The reason field is a string ('change_game_mode') on newer versions and a number (3) on older ones. gameMode is a float, so use Math.floor.

* Use server console for clearInventory too

Replace chat-based /give + /clear with wrap.writeServer('clear flatbot'). Wait for the set_slot packet confirming the items were removed. Skip if the inventory is already empty. Eliminates updateSlot timeout flakiness.

* Fix internal test timeOfDay race condition

bot.time is initialized when the bot processes the login packet. Setting bot.time.timeOfDay before the login is processed causes "Cannot set properties of undefined" on fast/loaded runners. Wait for the bot's 'login' event before accessing bot.time.

* Fix clearInventory: wait for updateSlot with empty inventory check

The set_slot packet format differs across MC versions (blockId vs present field). Instead of checking packet fields, wait for the higher-level updateSlot event and verify the inventory is actually empty.

* Fix clearInventory: handle empty inventory + wait for sync

When the inventory is already empty, wait 2 ticks for any pending set_slot packets to arrive. When the inventory has items, wait until all slots are cleared via updateSlot events. The server console command is reliable, but we need to wait for the inventory state to sync.

* Fix: prevent sending NaN position packets after death

After death, bot.entity.position can become NaN (not null). NaN serializes as null in JSON, causing "Invalid move packet received" kicks. Guard all position send functions with Number.isFinite checks. Also simplify clearInventory to not use waitForTicks (which hangs when the physics tick returns early due to a NaN position).

* Fix clearInventory race: don't send /clear when inventory is empty

Sending /clear when the inventory is already empty causes a race on 1.8.x where the server processes the clear after the next test has already set inventory slots, wiping them.

* Fix clearInventory: give stone before clear to guarantee events

The /clear command produces no events when the inventory is already empty (stale local state). Give a stone first via the server console to ensure the server always has something to clear, guaranteeing updateSlot events.

* Fix: use bot.chat for /give, server console for /clear

Server console /give produces admin chat messages that interfere with chat pattern tests. Use bot.chat for /give (as before) and the server console only for /clear (which is the part that was flaky).

* Move resetState from beforeEach hook into test runner

Mocha's --retries only retries tests, not hooks. When resetState was in beforeEach, any timeout there failed the test with no retry. Now resetState runs inside runTest(), so --retries 2 covers both the reset and the actual test together. One change, no test files modified.

* Fix clearInventory: use creative setInventorySlot instead of chat /give

bot.chat('/give') generates chat messages that interfere with chat pattern tests (especially now that resetState runs inside the test, where patterns are still registered). Use creative setInventorySlot, which is silent.

* Fix clearInventory: use raw set_creative_slot instead of creative API

bot.creative.setInventorySlot waits for server confirmation, which conflicts with the subsequent /clear command that removes the same slot. Use a raw set_creative_slot packet write instead: no waiting, no conflict with the clear.

* Fix clearInventory: use bot.chat for /clear, creative packet for /give

Server console commands generate system_chat messages that interfere with chat pattern tests. Use:
- set_creative_slot packet for giving stone (silent, no chat)
- bot.chat('/clear') for clearing (the original reliable approach)

* Log retry attempts for visibility

Print "[retry N]" when a test is being retried, so we can see in CI logs whether --retries 2 is actually catching transient failures.

* Increase retries to 3

* Fix: use this.test._currentRetry for retry logging

this.currentRetry doesn't exist on the mocha Context; the retry count is on this.test._currentRetry.

* Disable retries after 3 distinct test failures

When 3+ different tests fail, it indicates a systemic issue (not transient flakiness). Disable retries for the remaining tests to avoid wasting time on cascading failures.

* Revert clearInventory to original bot.chat approach

The set_creative_slot packet doesn't produce updateSlot events on 1.19.2-1.19.3, causing 5 failures per version. Go back to the original bot.chat('/give @a stone 1') + bot.chat('/clear'), which works across all versions.

* Fix bail-out: count failures on first attempt, not after all retries

The bail-out was counting failures only after a test exhausted all retries, so the first 3 failing tests each wasted 3 retries before triggering the bail-out. Now count on the first attempt: after 3 different tests fail, the remaining tests run without retries.

---------

Co-authored-by: rom1504 <rom1504@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
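
The retry bail-out described in the message above can be modelled without Mocha. The sketch below is a simplified stand-in, not the runner in test/externalTest.js: `createRetryPolicy`, `runWithRetries`, and the sample tests are hypothetical names, and the only things taken from the commit are the threshold of 3 and the rule that a failure is counted when the current retry is 0.

```javascript
// Simplified model of the bail-out; Mocha itself is not involved.
// All names here are hypothetical and exist only for illustration.
const BAIL_THRESHOLD = 3 // same threshold the commit uses

function createRetryPolicy (maxRetries) {
  let distinctFailures = 0
  return {
    // Retries allowed right now: none once enough distinct tests have failed.
    allowedRetries: () => (distinctFailures >= BAIL_THRESHOLD ? 0 : maxRetries),
    // Only a failure on the first attempt counts as a new distinct failure.
    recordFailure: (currentRetry) => { if (currentRetry === 0) distinctFailures++ },
    distinctFailures: () => distinctFailures
  }
}

function runWithRetries (policy, attemptFn) {
  for (let attempt = 0; ; attempt++) {
    const allowed = policy.allowedRetries() // re-read at the start of every attempt
    if (attemptFn(attempt)) return { passed: true, attempts: attempt + 1 }
    policy.recordFailure(attempt)
    if (attempt >= allowed) return { passed: false, attempts: attempt + 1 }
  }
}

const policy = createRetryPolicy(3)
const flaky = (attempt) => attempt >= 1 // fails once, then passes
const broken = () => false // never passes

const results = [flaky, broken, broken, flaky].map((fn) => runWithRetries(policy, fn))
console.log(results, policy.distinctFailures())
```

In this model the last test would have passed on a retry, but it fails after a single attempt because three distinct tests had already failed. That is the trade-off the bail-out accepts in exchange for shorter runs when something is systemically broken.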
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -68,7 +68,7 @@ jobs:
           exit_code=0
           pids=""
           for v in ${{ matrix.versions }}; do
-            npm run mocha_test -- --retries 2 -g "${v}v" > "test-${v}.log" 2>&1 &
+            npm run mocha_test -- --retries 3 -g "${v}v" > "test-${v}.log" 2>&1 &
             pids="$pids $!"
           done
           for pid in $pids; do
diff --git a/lib/plugins/physics.js b/lib/plugins/physics.js
@@ -76,6 +76,7 @@ function inject (bot, { physicsEnabled, maxCatchupTicks }) {
   }
 
   function tickPhysics (now) {
+    if (!bot.entity?.position || !Number.isFinite(bot.entity.position.x)) return // entity not ready
     if (bot.blockAt(bot.entity.position) == null) return // check if chunk is unloaded
     if (bot.physicsEnabled && shouldUsePhysics) {
       physics.simulatePlayer(new PlayerState(bot, controlState), world).apply(bot)
@@ -99,6 +100,7 @@ function inject (bot, { physicsEnabled, maxCatchupTicks }) {
 
   function sendPacketPosition (position, onGround) {
     // sends data, no logic
+    if (!Number.isFinite(position.x) || !Number.isFinite(position.y) || !Number.isFinite(position.z)) return
     const oldPos = new Vec3(lastSent.x, lastSent.y, lastSent.z)
     lastSent.x = position.x
     lastSent.y = position.y
@@ -122,6 +124,7 @@ function inject (bot, { physicsEnabled, maxCatchupTicks }) {
 
   function sendPacketPositionAndLook (position, yaw, pitch, onGround) {
     // sends data, no logic
+    if (!Number.isFinite(position.x) || !Number.isFinite(position.y) || !Number.isFinite(position.z)) return
     const oldPos = new Vec3(lastSent.x, lastSent.y, lastSent.z)
     lastSent.x = position.x
     lastSent.y = position.y
@@ -153,6 +156,8 @@ function inject (bot, { physicsEnabled, maxCatchupTicks }) {
   function updatePosition (now) {
     // Only send updates for 20 ticks after death
     if (isEntityRemoved()) return
+    // Don't send position with invalid coordinates (NaN after death)
+    if (!Number.isFinite(bot.entity.position.x)) return
 
     // Increment the yaw in baby steps so that notchian clients (not the server) can keep up.
     const dYaw = deltaYaw(bot.entity.yaw, lastSentYaw)
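
The guards added to lib/plugins/physics.js above rely on two standard JavaScript behaviours: `JSON.stringify` emits `null` for `NaN`, and `Number.isFinite` rejects `NaN` without coercing its argument. A minimal sketch follows; `isFinitePosition` is a hypothetical helper (the commit inlines the checks instead), and the position object is a plain stand-in rather than a real Vec3.

```javascript
// A position as it might look after death, per the commit message (illustrative values).
const deadPosition = { x: NaN, y: 64, z: NaN }

// NaN has no JSON representation, so it is written out as null.
const serialized = JSON.stringify(deadPosition)
console.log(serialized) // {"x":null,"y":64,"z":null}

// Hypothetical helper combining the three per-axis checks from the diff.
function isFinitePosition (position) {
  return position != null &&
    Number.isFinite(position.x) &&
    Number.isFinite(position.y) &&
    Number.isFinite(position.z)
}

console.log(isFinitePosition(deadPosition)) // false
console.log(isFinitePosition({ x: 0, y: 64, z: 0 })) // true

// Number.isFinite does not coerce, unlike the global isFinite.
console.log(Number.isFinite(null), isFinite(null)) // false true
```

The last line is the reason to prefer `Number.isFinite` here: the global `isFinite` would treat a `null` coordinate as 0 and let it through.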
diff --git a/test/externalTest.js b/test/externalTest.js
@@ -61,7 +61,7 @@ for (const supportedVersion of mineflayer.testedVersions) {
           host: '127.0.0.1',
           version: supportedVersion
         })
-        commonTest(bot)
+        commonTest(bot, wrap)
         bot.test.port = PORT
 
         console.log('starting bot')
@@ -109,12 +109,6 @@ for (const supportedVersion of mineflayer.testedVersions) {
       } else begin()
     })
 
-    beforeEach(async () => {
-      console.log('Resetting state')
-      await bot.test.resetState()
-      console.log('State reset')
-    })
-
     after((done) => {
       if (bot) bot.quit()
       wrap.stopServer((err) => {
@@ -131,6 +125,7 @@ for (const supportedVersion of mineflayer.testedVersions) {
     })
 
     const externalTestsFolder = path.resolve(__dirname, './externalTests')
+    let distinctFailures = 0
     fs.readdirSync(externalTestsFolder)
       .filter(file => fs.statSync(path.join(externalTestsFolder, file)).isFile())
       .forEach((test) => {
@@ -139,8 +134,24 @@ for (const supportedVersion of mineflayer.testedVersions) {
         const runTest = (testName, testFunction) => {
           return function (done) {
             this.timeout(TEST_TIMEOUT_MS)
-            bot.test.sayEverywhere(`### Starting ${testName}`)
-            testFunction(bot, done).then(res => done()).catch(e => done(e))
+            // Disable retries if too many different tests have already failed
+            // on their first attempt (indicates a systemic issue, not flakiness)
+            if (distinctFailures >= 3) this.retries(0)
+            if (this.test._currentRetry > 0) {
+              console.log(`  [retry ${this.test._currentRetry}] ${testName}`)
+            }
+            bot.test.resetState()
+              .then(() => {
+                bot.test.sayEverywhere(`### Starting ${testName}`)
+                return testFunction(bot, done)
+              })
+              .then(res => done())
+              .catch(e => {
+                if (this.test._currentRetry === 0) {
+                  distinctFailures++
+                }
+                done(e)
+              })
           }
         }
         if (excludedTests.indexOf(test) === -1) {
diff --git a/test/externalTests/plugins/testCommon.js b/test/externalTests/plugins/testCommon.js
@@ -9,7 +9,7 @@ const { sleep, onceWithCleanup } = require('../../../lib/promise_utils')
 const timeout = 20000
 module.exports = inject
 
-function inject (bot) {
+function inject (bot, wrap) {
   console.log(bot.version)
 
   bot.test = {}
@@ -103,43 +103,39 @@ function inject (bot) {
     return setCreativeMode(false)
   }
 
-  const gameModeChangedMessages = ['commands.gamemode.success.self', 'gameMode.changed']
-
   async function setCreativeMode (value) {
-    const getGM = val => val ? 'creative' : 'survival'
-    // this function behaves the same whether we start in creative mode or not.
-    // also, creative mode is always allowed for ops, even if server.properties says force-gamemode=true in survival mode.
-    let i = 0
-    const msgProm = onceWithCleanup(bot, 'message', {
+    const mode = value ? 'creative' : 'survival'
+    const modeId = value ? 1 : 0
+    if (bot.game.gameMode === mode) return
+    // Use server console for instant, reliable gamemode change.
+    // The old approach (triple chat command + message parsing) was fragile
+    // and the most common source of flaky test timeouts.
+    const gameModePromise = onceWithCleanup(bot._client, 'game_state_change', {
       timeout,
-      checkCondition: msg => gameModeChangedMessages.includes(msg.translate) && i++ > 0 && bot.game.gameMode === getGM(value)
+      checkCondition: (packet) => {
+        // reason is 3 (number) on old versions, 'change_game_mode' (string) on new
+        const isGameModeChange = packet.reason === 3 || packet.reason === 'change_game_mode'
+        return isGameModeChange && Math.floor(packet.gameMode) === modeId
+      }
     })
-
-    // do it three times to ensure that we get feedback
-    bot.chat(`/gamemode ${getGM(value)}`)
-    bot.chat(`/gamemode ${getGM(!value)}`)
-    bot.chat(`/gamemode ${getGM(value)}`)
-    return msgProm
+    wrap.writeServer(`gamemode ${mode} flatbot\n`)
+    await gameModePromise
   }
 
   async function clearInventory () {
-    const giveStone = onceWithCleanup(bot.inventory, 'updateSlot', { timeout: 1000 * 20, checkCondition: (slot, oldItem, newItem) => newItem?.name === 'stone' })
-    await bot.test.wait(500)
+    // Give a stone then clear — same as the original approach.
+    // The /give ensures the server has something to clear.
     bot.chat('/give @a stone 1')
-    // Removed: was leaking a new listener every call
-    await giveStone
-
-    const clearInv = onceWithCleanup(bot, 'message', {
+    await onceWithCleanup(bot.inventory, 'updateSlot', {
+      timeout,
+      checkCondition: (slot, oldItem, newItem) => newItem?.name === 'stone'
+    })
+    const clearMsg = onceWithCleanup(bot, 'message', {
       timeout,
       checkCondition: msg => msg.translate === 'commands.clear.success.single' || msg.translate === 'commands.clear.success'
     })
-    bot.chat('/clear') // don't rely on the message (as it'll come to early), wait for the result of /clear instead
-    await clearInv
-
-    // Check that the inventory is clear
-    for (const slot of bot.inventory.slots) {
-      if (slot && slot.itemCount <= 0) throw new Error('Inventory was not cleared: ' + JSON.stringify(bot.inventory.slots))
-    }
+    bot.chat('/clear')
+    await clearMsg
   }
 
   // you need to be in creative mode for this to work
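
The `checkCondition` in setCreativeMode above can be exercised on its own. The sketch below lifts that predicate into a hypothetical standalone function, `matchesGameMode`, and feeds it hand-written packet objects; the packet shapes are assumptions based only on the fields the diff reads (`reason` and `gameMode`).

```javascript
// Hypothetical standalone version of the checkCondition from the diff.
function matchesGameMode (packet, modeId) {
  // reason is 3 (number) on old versions, 'change_game_mode' (string) on new ones
  const isGameModeChange = packet.reason === 3 || packet.reason === 'change_game_mode'
  // gameMode arrives as a float, so floor it before comparing with the integer id
  return isGameModeChange && Math.floor(packet.gameMode) === modeId
}

const CREATIVE = 1
const SURVIVAL = 0

// Hand-written sample packets, not captured from a real server.
console.log(matchesGameMode({ reason: 3, gameMode: 1 }, CREATIVE)) // true
console.log(matchesGameMode({ reason: 'change_game_mode', gameMode: 1 }, CREATIVE)) // true
console.log(matchesGameMode({ reason: 'change_game_mode', gameMode: 0 }, CREATIVE)) // false
console.log(matchesGameMode({ reason: 1, gameMode: 0 }, SURVIVAL)) // false
```

The last case shows why the reason check matters: other game_state_change packets can carry a value that happens to equal the target mode id, and they should not resolve the wait.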
diff --git a/test/internalTest.js b/test/internalTest.js
@@ -868,9 +868,10 @@ for (const supportedVersion of mineflayer.testedVersions) {
       })
 
       server.once('playerJoin', (client) => {
-        bot.time.timeOfDay = 18000
         const loginPacket = bot.test.generateLoginPacket()
         client.write('login', loginPacket)
+        // Set timeOfDay after login is processed so bot.time is initialized
+        bot.once('login', () => { bot.time.timeOfDay = 18000 })
 
         const chunk = bot.test.buildChunk()