Server bootstrap deferred shutdown callback fires `shutdown_callback` for channels that failed setup

Most of this investigation was done by GenAI, but has been reviewed:

### Describe the bug

The refactored server bootstrap shutdown path in `channel_bootstrap.c` defers the `shutdown_callback` through an async socket cleanup callback. The refactor removed an `else` guard that previously ensured `shutdown_callback` was only called for channels where `incoming_callback` had been called successfully (i.e. channels that completed setup). As a result, `shutdown_callback` can now fire for channels that failed during setup and may not have the expected handler slots.

This causes a fatal assert crash in downstream consumers (specifically `aws-c-event-stream`'s `s_on_accept_channel_shutdown`) on Windows, where the IOCP-based async socket cleanup creates a timing window that reliably triggers the bug during rapid server lifecycle cycling.

Filed as a follow-up to https://github.com/awslabs/aws-c-event-stream/issues/140.

This is a regression. The old code in `s_on_server_channel_on_shutdown` used an `if/else` that prevented `shutdown_callback` from firing on the error path. The refactored code removed the `else`, and the deferred callback now calls `shutdown_callback` unconditionally when `incoming_called` is true — but `incoming_called` is set to true by the error path itself.

The bug was introduced somewhere in the 54 commits between `aws-c-io` `fcb38c80` (bundled in `aws-crt-java` 0.33.5) and `bfb0819d` (bundled in `aws-crt-java` 0.43.6). The last known working version is `fcb38c80`.

### Regression Issue

- [x] Select this option if this issue appears to be a regression.

### Expected Behavior

`shutdown_callback` should only be called for channels that were successfully set up and had `incoming_callback` called with a non-error channel. This was the behavior of the old code:

```c
// OLD s_on_server_channel_on_shutdown — correct behavior
if (!channel_data->incoming_called) {
    s_server_incoming_callback(channel_data, error_code, NULL);  // error path
} else {
    args->shutdown_callback(server_bootstrap, error_code, channel, ...);  // only for successful setup
}
aws_channel_destroy(channel);
```

The `if/else` made the two paths mutually exclusive: either the channel failed setup (call `incoming_callback` with error) or it succeeded and is now shutting down (call `shutdown_callback`). Never both.

### Current Behavior

The refactored code calls `s_server_incoming_callback` on the error path (which sets `incoming_called = true`), then unconditionally defers to a socket cleanup callback that checks `incoming_called` and calls `shutdown_callback`:

```c
// NEW s_on_server_channel_on_shutdown
if (!channel_data->incoming_called) {
    s_server_incoming_callback(channel_data, error_code, NULL);
    // s_server_incoming_callback sets incoming_called = true
}
// No else — always sets up deferred callback
SETUP_SOCKET_SHUTDOWN_CALLBACKS(..., socket_shutdown_server_channel_shutdown_fn, ...)
aws_socket_clean_up(socket);
```

```c
// socket_shutdown_server_channel_shutdown_fn (fires asynchronously after IOCP socket close)
if (channel_data->incoming_called) {  // always true now, even for error path
    connection_args->shutdown_callback(
        server_bootstrap, error_code, shutdown_args->channel, server_shutdown_user_data);
}
aws_channel_destroy(shutdown_args->channel);
```

This causes `shutdown_callback` to fire for channels that failed setup. The channel passed to `shutdown_callback` may not have the handler slots that the callback expects, leading to crashes in the callback implementation.

Observed crash in `aws-c-event-stream` (`event_stream_rpc_server.c:354`):
```
Fatal error condition occurred in event_stream_rpc_server.c:354:
current_slot && "It should be logically impossible to have a channel in this callback
that doesn't have a slot in it"
Exiting Application
```

Process exit code: `0xC0000409` (`STATUS_STACK_BUFFER_OVERRUN` — Windows fast-fail for `abort()`).

### Reproduction Steps

This reproduces consistently on Windows via the [aws-greengrass-nucleus](https://github.com/aws-greengrass/aws-greengrass-nucleus) integration test suite, which rapidly creates and destroys event-stream RPC servers on local sockets:

```bash
git clone https://github.com/aws-greengrass/aws-greengrass-nucleus.git
cd aws-greengrass-nucleus
git checkout 2b4fb24  # "fix: update SDK and CRT version (#1768)" — the commit that upgraded to aws-crt 0.43.6
# On Windows with JDK 8 (Corretto):
mvn -ntp -U verify -Djava.io.tmpdir=C:\Temp
```

The crash occurs during `DeploymentConfigMergingTest`, which cycles through multiple Kernel instances (each with its own IPC RPC server) within a single JVM process.

Failing CI runs (Windows job fails, Linux job passes):
- https://github.com/aws-greengrass/aws-greengrass-nucleus/actions/runs/22873221124
- https://github.com/aws-greengrass/aws-greengrass-nucleus/actions/runs/22782086012

Passing CI runs with the previous `aws-c-io` version (before the CRT upgrade):
- https://github.com/aws-greengrass/aws-greengrass-nucleus/actions/runs/22082378155
- https://github.com/aws-greengrass/aws-greengrass-nucleus/actions/runs/20288329264

### Possible Solution

Restore the mutual exclusivity between the error path and the `shutdown_callback` path. The deferred callback should track whether `incoming_called` was already true *before* `s_on_server_channel_on_shutdown` ran, rather than checking the flag after the error path has set it.

One approach — add a flag to distinguish the two cases:

```c
// s_on_server_channel_on_shutdown
bool was_incoming_called = channel_data->incoming_called;

if (!channel_data->incoming_called) {
    error_code = (error_code) ? error_code : AWS_ERROR_UNKNOWN;
    s_server_incoming_callback(channel_data, error_code, NULL);
}

// Pass was_incoming_called to the deferred callback args
// ...
```

```c
// socket_shutdown_server_channel_shutdown_fn
if (shutdown_args->was_incoming_called) {  // only true for channels that completed setup
    connection_args->shutdown_callback(...);
}
aws_channel_destroy(shutdown_args->channel);
```

Alternatively, avoid calling `s_server_incoming_callback` on the error path inside `s_on_server_channel_on_shutdown` and instead handle the error notification through the deferred callback as well, preserving the original `incoming_called` state.

### Additional Information/Context

- The crash is 100% reproducible on `windows-latest` GitHub Actions runners after the CRT upgrade
- Linux never crashes — the same test suite passes cleanly (likely because socket cleanup is synchronous on Linux, so the timing window doesn't exist)
- The downstream consumer (`aws-c-event-stream`) also has a pre-existing missing-`return` bug in `s_on_accept_channel_setup` and lacks a defensive NULL check in `s_rpc_connection_from_channel`, both of which are tracked in https://github.com/awslabs/aws-c-event-stream/issues/140
- The `aws-c-io` version bundled in `aws-crt-java` 0.33.5 (`fcb38c80`) does not exhibit this crash
- The `aws-c-io` version bundled in `aws-crt-java` 0.43.6 (`bfb0819d`) reliably crashes

### aws-c-io version used

 `bfb0819d3906502483611ce832a5ec6b897c8421` (bundled in `aws-crt-java` 0.43.6). Last known working: `fcb38c804364dd627c335da752a99a125a88f6e9` (bundled in `aws-crt-java` 0.33.5).

### Compiler and version used

MSVC (as built by the `aws-crt-java` CI pipeline — native DLL path: `C:\Program Files (x86)\Jenkins\workspace\aws-crt-java-build-dll-win64\`)

### Operating System and version

Windows Server 2022 (GitHub Actions `windows-latest` runner), with Corretto JDK 8.482.08-1

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Server bootstrap deferred shutdown callback fires `shutdown_callback` for channels that failed setup #793

Describe the bug

Regression Issue

Expected Behavior

Current Behavior

Reproduction Steps

Possible Solution

Additional Information/Context

aws-c-io version used

Compiler and version used

Operating System and version

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Server bootstrap deferred shutdown callback fires shutdown_callback for channels that failed setup #793

Description

Describe the bug

Regression Issue

Expected Behavior

Current Behavior

Reproduction Steps

Possible Solution

Additional Information/Context

aws-c-io version used

Compiler and version used

Operating System and version

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Server bootstrap deferred shutdown callback fires `shutdown_callback` for channels that failed setup #793