
✨ feat(bot): support webhook startup mode via config.yaml#78

Merged
itisnotyourenv merged 9 commits into main from feat/bot-webhook-mode on Apr 21, 2026

Conversation

@itisnotyourenv

Owner

Description

Adds a telegram.mode config field that selects between polling (default, unchanged) and webhook. In webhook mode, the bot process runs an aiohttp server via aiogram's SimpleRequestHandler + setup_application, registers the webhook with Telegram on startup, and deletes it on shutdown. The existing Dispatcher, routers, middleware, and DI container are reused verbatim for both modes — no routing logic changes.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Code quality improvement
  • Configuration change

What changed

  • src/infrastructure/config.py
    • New WebhookConfig model with url, path, host, port, secret_token, drop_pending_updates (port range 1..65535 validated).
    • TelegramConfig gets mode: Literal["polling", "webhook"] = "polling" and optional webhook sub-config.
    • model_validator rejects mode="webhook" when the webhook block is missing, so misconfiguration fails fast at load_config() rather than deep inside aiogram.
  • src/presentation/bot/main.py
    • Shared setup (Bot, Dispatcher, DI, middleware, admin notify) runs once, then branches on config.telegram.mode.
    • Polling branch unchanged.
    • New _run_webhook() uses aiogram.webhook.aiohttp_server.SimpleRequestHandler + setup_application on an AppRunner/TCPSite so it lives in the existing event loop. set_webhook on start with allowed_updates=dp.resolve_used_update_types(); delete_webhook on shutdown.
  • pyproject.toml — explicit aiohttp>=3.9 dependency (was already a transitive dep of aiogram, but we now import from it directly).
  • config-example.yaml — commented mode: + webhook: example under telegram:.
  • tests/unit/infrastructure/test_config.py — new TestWebhookConfig class plus 5 new TestTelegramConfig tests covering the mode default, webhook validator, invalid mode value, and polling-with-webhook-block case.
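
The fail-fast rule described above can be sketched with the standard library alone. The PR itself uses Pydantic models, so the dataclasses, the names `WebhookSettings` and `TelegramSettings`, and the default values (taken from later comments in this thread) are assumptions for illustration and not the code in src/infrastructure/config.py.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WebhookSettings:
    """Hypothetical stand-in for the PR's WebhookConfig."""

    url: str
    path: str = "/webhook"
    host: str = "0.0.0.0"
    port: int = 8081
    secret_token: Optional[str] = None
    drop_pending_updates: bool = False

    def __post_init__(self) -> None:
        # Same range check the PR describes for the webhook port.
        if not 1 <= self.port <= 65535:
            raise ValueError("Port must be between 1 and 65535")


@dataclass(frozen=True)
class TelegramSettings:
    """Hypothetical stand-in for the PR's TelegramConfig."""

    mode: str = "polling"
    webhook: Optional[WebhookSettings] = None

    def __post_init__(self) -> None:
        if self.mode not in ("polling", "webhook"):
            raise ValueError(f"Unknown telegram.mode: {self.mode!r}")
        # Cross-field rule: webhook mode is only valid with a webhook block.
        if self.mode == "webhook" and self.webhook is None:
            raise ValueError(
                "telegram.webhook config must be set when telegram.mode is 'webhook'"
            )
```

Constructing `TelegramSettings(mode="webhook")` without a webhook block raises at construction time, which mirrors the idea of failing in load_config() rather than later inside aiogram.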

Testing

  • Tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have tested integration scenarios if applicable

just lint — clean. Unit suite: 228 passed (includes 12 new config tests for the webhook/mode logic). src/presentation/bot/main.py is already in [tool.coverage.run].omit, so the startup branch is covered via manual verification rather than the test suite.

Manual webhook verification steps (not run in CI):

  1. Expose port 8081 via an HTTPS tunnel (cloudflared / ngrok).
  2. Set telegram.mode: webhook + a matching webhook: block in config.yaml.
  3. just bot — the log should show the aiohttp server binding and admins should receive the "bot started" DM.
  4. Confirm via curl https://api.telegram.org/bot<TOKEN>/getWebhookInfo that Telegram sees the URL.
  5. Send /start in Telegram → bot responds.
  6. Send a POST to /webhook with a wrong X-Telegram-Bot-Api-Secret-Token header → rejected.
  7. Ctrl+C → shutdown hook calls delete_webhook (verify via getWebhookInfo).
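
Step 6 exercises the secret-token check. As a minimal sketch of what such a check amounts to, the helper below compares the header value against the configured token in constant time. `is_authorized` is a hypothetical function written for this illustration; it is not aiogram's implementation, which lives inside SimpleRequestHandler.

```python
import hmac
from typing import Mapping, Optional

SECRET_HEADER = "X-Telegram-Bot-Api-Secret-Token"


def is_authorized(headers: Mapping[str, str], expected: Optional[str]) -> bool:
    """Return True when the request carries the configured secret token."""
    if expected is None:
        # No secret configured: every request is accepted (unauthenticated).
        return True
    # A plain dict lookup is case-sensitive; real HTTP header containers
    # are case-insensitive, so this is a simplification.
    received = headers.get(SECRET_HEADER, "")
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(received.encode(), expected.encode())
```

A request with a wrong or missing header fails this check, which is the behaviour step 6 is meant to confirm against the running bot.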

Code Quality

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • I have made corresponding changes to the documentation if needed
  • Pre-commit hooks pass (run pre-commit run --all-files)
  • Ruff linting passes (ruff check src/ tests/)
  • Ruff formatting passes (ruff format src/ tests/ --check)

Related Issues

N/A

Additional Notes

  • Default behavior is unchanged — existing config.yaml files without a mode field keep working in polling mode.
  • Process model unchanged — webhook runs inside the existing just bot process via the same python -m src.presentation.bot.main entrypoint. The Litestar API stays fully independent.
  • Out of scope: TLS termination (assumes a reverse proxy in front), mounting webhook on the Litestar API, CLI-flag mode override, and docker-compose deployment changes.

🤖 Generated with Claude Code

Add a `telegram.mode` field that selects between `polling` (default,
unchanged behavior) and `webhook`. In webhook mode the bot process runs
an aiohttp server via aiogram's `SimpleRequestHandler` + `setup_application`,
registers the webhook with Telegram on startup, and deletes it on shutdown.
The same `Dispatcher`, routers, middleware, and DI container are reused for
both modes — routing logic is untouched.

A new optional `telegram.webhook` sub-config carries `url`, `path`, `host`,
`port`, `secret_token` (for the X-Telegram-Bot-Api-Secret-Token header),
and `drop_pending_updates`. A `model_validator` rejects `mode: webhook`
when the `webhook` block is missing, so misconfiguration fails fast at
startup instead of deep inside aiogram.

`aiohttp` is now imported directly, so it's declared explicitly in
`pyproject.toml` (it was already a transitive dep of aiogram).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@claude

claude Bot commented Apr 7, 2026

Code Review — ✨ feat(bot): support webhook startup mode via config.yaml

Overall this is a clean, well-scoped implementation. The config validation story is solid and the architecture respects the existing boundaries. A few things worth addressing before merge.


🔴 Critical

assert in production code (main.py:110)

assert config.telegram.webhook is not None  # guaranteed by config validator

assert statements are silently stripped when Python runs with the -O (optimize) flag (python -O or PYTHONOPTIMIZE=1), turning this into a None-dereference that would crash with an AttributeError instead of a clear error. Replace with an explicit guard:

if config.telegram.webhook is None:
    raise RuntimeError(
        "telegram.webhook must be set when mode is 'webhook' — this should have been caught by config validation"
    )

Since the model validator already guarantees this at startup, the guard will never fire in practice — but it's semantically correct and won't silently vanish under optimization.
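
The effect of -O on assert can be checked directly. The sketch below runs the same two-line snippet in a child interpreter with and without the flag; the helper name `run_snippet` and the snippet text are made up for this illustration.

```python
import os
import subprocess
import sys

# The assert always fails, so "reached" only prints if the assert is stripped.
SNIPPET = "assert False, 'invariant violated'\nprint('reached')"


def run_snippet(*flags: str) -> subprocess.CompletedProcess:
    # Drop PYTHONOPTIMIZE so the baseline run really has asserts enabled.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONOPTIMIZE"}
    return subprocess.run(
        [sys.executable, *flags, "-c", SNIPPET],
        capture_output=True,
        text=True,
        env=env,
    )


if __name__ == "__main__":
    normal = run_snippet()
    optimized = run_snippet("-O")
    print("default exit code:", normal.returncode)
    print("-O exit code:", optimized.returncode, "stdout:", optimized.stdout.strip())
```

Without the flag the child exits with an AssertionError; with -O the assert is skipped and execution continues past it, which is why an explicit raise is the safer guard.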


🟡 Suggestions

1. No URL format validation on WebhookConfig.url

url: str accepts any string — "not-a-url", "http://..." (non-HTTPS), etc. The error would only surface later when bot.set_webhook() rejects it from Telegram's side. Use Pydantic's AnyHttpUrl or add a @field_validator:

from pydantic import AnyHttpUrl
url: AnyHttpUrl

Or at minimum validate it starts with https:// since Telegram requires HTTPS for webhooks.

2. secret_token has no format validation

Telegram's API requires the secret token to match [A-Za-z0-9_-]{1,256}. An invalid token would pass config validation but cause a runtime error from Telegram on set_webhook. A @field_validator with a regex would catch this at startup.
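
A minimal sketch of that check, written as a plain function so it could be called from a Pydantic @field_validator; the function name `validate_secret_token` is an assumption, not something taken from the PR.

```python
import re
from typing import Optional

# 1 to 256 characters drawn from A-Z, a-z, 0-9, underscore and hyphen.
_SECRET_TOKEN_RE = re.compile(r"[A-Za-z0-9_-]{1,256}")


def validate_secret_token(value: Optional[str]) -> Optional[str]:
    """Return the token unchanged, or raise ValueError if it is malformed."""
    if value is None:
        # The token is optional, so None passes through.
        return None
    # fullmatch anchors both ends, so stray characters anywhere are rejected.
    if _SECRET_TOKEN_RE.fullmatch(value) is None:
        raise ValueError("secret_token must match [A-Za-z0-9_-]{1,256}")
    return value
```

Using fullmatch rather than match matters here: match would accept a valid prefix followed by invalid characters.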

3. path not validated to start with /

path: str = "/webhook" has no validation that the value starts with /. A misconfigured path: "webhook" (no leading slash) would cause a confusing aiohttp routing error at runtime rather than a clear config error.

4. drop_pending_updates=True default may surprise users

With this default, any messages received while the bot was offline are silently discarded on every restart. This is the right choice for production bots that can't replay stale context, but it's a potentially surprising default for developers who expect to receive queued messages after a redeploy. Consider making it False (aiogram's own default) and letting operators opt in, or at least call it out more prominently in config-example.yaml.

5. Pre-existing inconsistency in PostgresConfig.port_validator

# WebhookConfig — correct
if not 1 <= v <= 65535:

# PostgresConfig — allows port 0
if v < 0:

Not introduced by this PR, but now that there's a second port validator using the correct check, the inconsistency is more visible. Worth a follow-up fix.


✅ What's done well

  • Config validator placement — rejecting mode=webhook with a missing webhook block at load_config() time is exactly right; fail fast, clear message.
  • Existing dispatcher/middleware/DI reuse — no routing logic duplication, clean branching at the very last step.
  • asyncio.Event().wait() keepalive — correct pattern for keeping the aiohttp server alive inside asyncio.run(), and the finally: runner.cleanup() ensures the shutdown hook (delete_webhook) fires on CancelledError.
  • allowed_updates=dp.resolve_used_update_types() — consistent with how polling works; avoids receiving update types the dispatcher doesn't handle.
  • Explicit aiohttp dependency — correct to pin a direct import rather than rely on a transitive dep.
  • Test coverage — 12 new config tests are thorough: boundary ports, validator error messages, polling-with-webhook-block edge case, all default values. Good use of @pytest.mark.parametrize.
  • Backward compatibility — polling remains the default; existing config.yaml files need no changes.

❓ Questions

  1. Is there a plan to add an integration test for the webhook branch, or will it remain covered by manual verification only? The PR notes main.py is already in coverage.omit, which is fair given the difficulty, but a smoke test with a mock aiohttp server could be valuable.
  2. Is _run_webhook intentionally not exported/testable? It could be extracted to a separate module for easier unit testing in the future.

Overall Assessment

Request changes (minor) — the assert in production code is a real correctness issue under optimization. The URL/token validation gaps are improvement opportunities but won't cause problems in typical deployments. Everything else is solid.

🤖 Generated with Claude Code

Addressing review feedback on #78:

- Replace `assert config.telegram.webhook is not None` with an explicit
  RuntimeError guard. `assert` is stripped under `python -O`, which would
  have turned the invariant into a silent AttributeError instead of a
  clear failure message.
- Validate `WebhookConfig.url` starts with `https://` (Telegram rejects
  non-HTTPS webhooks, so this surfaces the error at config load time
  instead of as a runtime API error).
- Validate `WebhookConfig.path` starts with `/` so a typo doesn't lead
  to a confusing aiohttp routing error.
- Validate `WebhookConfig.secret_token` matches Telegram's documented
  pattern `[A-Za-z0-9_-]{1,256}`.
- Flip `drop_pending_updates` default from True to False to match
  aiogram's own default and avoid silently discarding queued messages
  on every restart.

Added parametrized tests for all four validators (valid + invalid
cases). Updated config-example.yaml to document the new constraints.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@claude

claude Bot commented Apr 7, 2026

PR Review: ✨ feat(bot): support webhook startup mode via config.yaml

Overall this is a well-structured, clean addition. The architecture is sound, the config validation is thorough, and the tests are comprehensive for the config layer. Two commits with clear messages — the second commit proactively addresses concerns (assert → RuntimeError, https-only URL, path-slash enforcement, secret_token pattern, drop_pending_updates default) which is a good sign. Below are findings organized by severity.


🔴 Critical Issues

None. No correctness bugs or security holes found.


🟡 Medium Issues

1. url path component vs path field can silently mismatch

config-example.yaml shows url: "https://your.domain/tg" and path: "/tg" — both must match or Telegram POSTs to a route that aiohttp never registered. There is no validator that catches this mistake at config load time.

Example misconfiguration that passes validation today:

webhook:
  url: "https://example.com/tg"   # Telegram POSTs to /tg
  path: "/webhook"                 # aiohttp listens on /webhook → 404 forever

Suggested addition to TelegramConfig._webhook_required_in_webhook_mode (or a dedicated validator on WebhookConfig):

from urllib.parse import urlparse

@model_validator(mode="after")
def _webhook_required_in_webhook_mode(self) -> "TelegramConfig":
    if self.mode == "webhook":
        if self.webhook is None:
            raise ValueError("telegram.webhook config must be set when telegram.mode is 'webhook'")
        parsed_path = urlparse(self.webhook.url).path
        if parsed_path.rstrip("/") != self.webhook.path.rstrip("/"):
            raise ValueError(
                f"webhook.path '{self.webhook.path}' must match the path component "
                f"of webhook.url ('{parsed_path}')"
            )
    return self

This would also eliminate the confusion that currently exists in config-example.yaml where url ends in /tg but the comment says the path field is also /tg — users may not realise they need to keep them in sync.
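
The comparison at the heart of that validator can be tried in isolation. This is a stdlib-only sketch under the same trailing-slash-tolerant rule as the suggestion above; `url_path_matches` is a hypothetical helper name.

```python
from urllib.parse import urlparse


def url_path_matches(url: str, path: str) -> bool:
    """Check that the path component of url agrees with the local route path."""
    # An URL without a path component is treated as "/".
    url_path = urlparse(url).path or "/"
    # Ignore a single trailing slash on either side, but keep "/" itself.
    normalized_url_path = url_path.rstrip("/") or "/"
    normalized_path = path.rstrip("/") or "/"
    return normalized_url_path == normalized_path
```

Whether to tolerate a trailing slash is a judgment call: a strict equality check is less forgiving but cannot accept a pair that the HTTP router would treat as two different routes. A deployment where a reverse proxy rewrites paths would also need a way to opt out of the check.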

2. No graceful shutdown on SIGTERM (container/systemd deployments)

asyncio.Event().wait() is interrupted by Ctrl+C (SIGINT): asyncio.run() cancels the main task, which raises CancelledError at the await, triggering the finally block and runner.cleanup() → _on_shutdown → delete_webhook(). This works in development.

However, when the process is stopped via SIGTERM (docker stop, systemctl stop, Kubernetes pod eviction), nothing in the process handles the signal: Python installs no SIGTERM handler by default, so the OS terminates the process immediately without unwinding the asyncio loop or running finally blocks, and delete_webhook() is skipped. Telegram then keeps POSTing to a dead server until the webhook is deleted or re-registered.

Minimal fix:

import signal

async def _run_webhook(bot: Bot, dp: Dispatcher, webhook: WebhookConfig) -> None:
    ...
    loop = asyncio.get_running_loop()
    stop_event = asyncio.Event()
    for sig in (signal.SIGTERM, signal.SIGINT):
        loop.add_signal_handler(sig, stop_event.set)
    try:
        await stop_event.wait()
    finally:
        await runner.cleanup()

This is especially relevant if the project will run in Docker/K8s, which is implied by the existing docker-compose*.yml files.
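
The stop-event pattern can be exercised without aiogram or aiohttp. The sketch below is Unix-only and sends SIGTERM to its own process to show that the finally block runs; `serve_until_signalled` and `demo` are hypothetical names, and the "cleanup" entry stands in for runner.cleanup().

```python
import asyncio
import os
import signal
from typing import List

SIGNALS = (signal.SIGINT, signal.SIGTERM)


async def serve_until_signalled(events: List[str]) -> None:
    loop = asyncio.get_running_loop()
    stop = asyncio.Event()
    # Route both signals to the event instead of the default dispositions.
    for sig in SIGNALS:
        loop.add_signal_handler(sig, stop.set)
    events.append("serving")
    try:
        await stop.wait()
    finally:
        for sig in SIGNALS:
            loop.remove_signal_handler(sig)
        # Stand-in for `await runner.cleanup()` in the real function.
        events.append("cleanup")


def demo() -> List[str]:
    async def main() -> List[str]:
        events: List[str] = []
        loop = asyncio.get_running_loop()
        # Simulate `docker stop` by signalling our own process shortly after start.
        loop.call_later(0.05, os.kill, os.getpid(), signal.SIGTERM)
        await serve_until_signalled(events)
        return events

    return asyncio.run(main())


if __name__ == "__main__":
    print(demo())
```

With the handlers installed, SIGTERM sets the event and the coroutine leaves through the finally block instead of the process being killed mid-wait.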


🔵 Minor / Suggestions

3. Port validator is more idiomatically expressed with Field

# Current
port: int = 8081

@field_validator("port")
@classmethod
def port_validator(cls, v: int) -> int:
    if not 1 <= v <= 65535:
        raise ValueError("Port must be between 1 and 65535")
    return v

# Simpler — no validator needed
from pydantic import Field

port: int = Field(default=8081, ge=1, le=65535)

Pydantic generates a cleaner error message and it removes boilerplate. Not a blocking issue.

4. Missing log line after set_webhook

If bot.set_webhook(...) fails (e.g., Telegram rejects the URL because it's not reachable), the exception propagates with no context about what was being attempted. A log line before or after the call would help with operational debugging:

logging.info("Registering webhook at %s (path=%s)", webhook.url, webhook.path)
await bot.set_webhook(...)
logging.info("Webhook registered successfully")

5. host: "0.0.0.0" default should be noted in config-example.yaml

The # noqa: S104 suppresses the Bandit/ruff warning in code, but the example config doesn't mention that 0.0.0.0 assumes a reverse proxy is handling TLS. A short comment like # host: "0.0.0.0" # binds all interfaces; put a TLS-terminating reverse proxy in front would help operators who copy the example verbatim.

6. Test gap: _run_webhook unit-testable portions

main.py is excluded from coverage ([tool.coverage.run].omit), and the PR description acknowledges this. That's acceptable for the aiohttp runner lifecycle. However, the set_webhook call and the _on_shutdown teardown could be tested with an AsyncMock bot without starting a real server — something like:

async def test_run_webhook_sets_and_deletes_webhook(bot_mock, dp_mock):
    # Patch web.AppRunner / TCPSite and verify bot.set_webhook / delete_webhook calls
    ...

Not required for merge, but worth a follow-up issue.


Questions for the Author

  1. Reverse-proxy TLS assumption — is there a docker-compose or deployment snippet planned that wires up an nginx/Caddy in front? Would be a natural companion PR.
  2. Webhook URL in config.yaml — since the URL is loaded from a file (not env vars), rotating the secret token or changing the URL requires a restart. Is there a plan to support env-var overrides (e.g., via Pydantic's model_config = SettingsConfigDict(env_nested_delimiter="__"))? Not blocking, just worth noting for ops.

Checklist

Area Result
Follows Clean Architecture (no logic in routers/controllers)
No ORM models leaked to app/domain layers
Conventional commits with emoji prefix
just lint passes ✅ (per PR description)
Backward compatible (polling is default)
Config validation fails fast
drop_pending_updates defaults to false ✅ (fixed in commit 2)
assert replaced with RuntimeError ✅ (fixed in commit 2)
Secret token validated against Telegram spec
Tests for all validators
SIGTERM graceful shutdown ❌ (medium issue #2)
url/path consistency check ❌ (medium issue #1)

Overall Assessment

Request changes on the two medium issues — both are fixable in a few lines and will prevent real operational pain. The rest of the implementation is solid and the second commit already shows the author is responsive to review feedback.

@claude

claude Bot commented Apr 14, 2026

Code Review: feat(bot): support webhook startup mode via config.yaml

Well-structured feature with solid validation. The second commit that hardened the validators is the right call. A few items worth addressing.

Issues:

1. Potential Dishka container leak in webhook mode

In polling mode, dp.start_polling(bot) manages the full lifecycle including closing the Dishka container on shutdown. In webhook mode, the container lifecycle depends on whether setup_application(app, dp, bot=bot) correctly registers a shutdown hook that closes the container. If it does not, DB connections from the connection pool will not be returned on clean shutdown. Worth verifying: after Ctrl+C in webhook mode, does the Dishka container close() get called? A quick check is to add a log line in DBProvider cleanup.

2. No warning when running without secret_token

If secret_token=None, Telegram will send updates to the webhook endpoint without any token verification, meaning any party that knows the webhook URL can inject fake updates. This is documented but worth a startup log warning: 'WARNING: webhook secret_token is not set — endpoint is unauthenticated'. Low-effort safety net.
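
A minimal sketch of such a startup warning, assuming a module-level logger; the logger name "bot.webhook" and the helper `warn_if_unauthenticated` are made up for this illustration.

```python
import logging
from typing import Optional

# Hypothetical logger name; the project may use a different one.
logger = logging.getLogger("bot.webhook")


def warn_if_unauthenticated(secret_token: Optional[str]) -> bool:
    """Log a warning when no secret token is configured; return True if warned."""
    if secret_token:
        return False
    logger.warning(
        "webhook secret_token is not set; the endpoint accepts unauthenticated requests"
    )
    return True
```

Calling this once before set_webhook would keep the warning out of the request path and make the gap visible in startup logs.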

3. asyncio.Event().wait() as the run-forever mechanism

Using asyncio.Event().wait() is a common pattern and works fine, but it is worth noting: an Event that is never set can only be left through cancellation, so shutdown depends entirely on how the main task gets cancelled (SIGINT under asyncio.run() does this; SIGTERM does not). A more robust alternative is a dedicated stop event that can be set from a signal handler. loop.run_forever() is not an option here, because the loop is already running under asyncio.run(). Not a blocker, but worth knowing.

4. aiohttp version pin

aiohttp>=3.9 is fine, but aiogram already pulls in aiohttp as a transitive dep. The version constraint should be compatible with what aiogram currently requires; if aiogram bumps its lower bound above 3.9, this explicit pin becomes the binding constraint. Consider using aiohttp>=3.9,<4 or checking that the explicit pin mirrors what aiogram already requires.

What is Good:

  • WebhookConfig validation is thorough and correct: https-only URL, slash-prefixed path, valid port range, and Telegram-spec secret_token pattern.
  • The model_validator that rejects mode=webhook without a webhook block is the right place to enforce that invariant — it surfaces misconfiguration at startup rather than deep inside aiogram.
  • Flipping drop_pending_updates default from True to False (second commit) is the right call — silently discarding queued messages on every restart is a bad default.
  • The RuntimeError guard after the config validator is a legitimate defensive check (assert is stripped under python -O).
  • Test coverage for config validation is comprehensive: 12 parametrized cases covering all four validators in both valid and invalid directions.
  • Existing polling mode is completely unchanged — no risk of regression for current deployments.

Minor:

  • The _run_webhook docstring says 'Start an aiohttp server that receives Telegram updates via webhook' — accurate and helpful.
  • config-example.yaml comments clearly document all new fields with their constraints.

@claude

claude Bot commented Apr 14, 2026

Code Review — ✨ feat(bot): support webhook startup mode via config.yaml

Overall Assessment: Approve with suggestions

Well-structured feature addition. Config validation is thorough, backward compatibility is preserved, and the PR description is excellent. A few runtime concerns worth addressing.


🔴 Critical Issues

1. Missing bot.session.close() on webhook shutdown — aiohttp resource leak

In polling mode, dp.start_polling(bot) manages the bot's full lifecycle including closing its underlying aiohttp.ClientSession. In webhook mode that lifecycle is managed manually, but _run_webhook only calls runner.cleanup() — never await bot.session.close().

On exit (Ctrl+C or SIGTERM), Python will log:

Unclosed client session
ResourceWarning: Enable tracemalloc to get the object allocation traceback

Fix: close the session in the finally block:

finally:
    await runner.cleanup()
    await bot.session.close()

🟡 Suggestions

2. set_webhook fires before the server is ready to accept requests

await bot.set_webhook(url=webhook.url, ...)  # Telegram told to POST here
# ...
await site.start()                           # server starts accepting connections

There's a small window where Telegram could attempt delivery before the aiohttp server is bound. In practice this is harmless (Telegram retries failed webhooks), but inverting the order — start the server first, then register the webhook — is the safer conventional approach and matches Telegram's own documentation.
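
The ordering, together with the session cleanup from issue 1, can be sketched with stand-in objects. `FakeBot`, `FakeSite`, and `run_webhook` are hypothetical and only record the call order; in the PR, delete_webhook runs from an aiohttp shutdown hook rather than being called directly.

```python
import asyncio
from typing import List


class FakeSite:
    """Stand-in for an aiohttp TCPSite/AppRunner pair."""

    def __init__(self, log: List[str]) -> None:
        self.log = log

    async def start(self) -> None:
        self.log.append("site.start")

    async def stop(self) -> None:
        self.log.append("site.stop")


class FakeBot:
    """Stand-in for aiogram's Bot; only records calls."""

    def __init__(self, log: List[str]) -> None:
        self.log = log

    async def set_webhook(self, url: str) -> None:
        self.log.append("bot.set_webhook")

    async def delete_webhook(self) -> None:
        self.log.append("bot.delete_webhook")

    async def close_session(self) -> None:
        self.log.append("bot.close_session")


async def run_webhook(bot: FakeBot, site: FakeSite, stop: asyncio.Event, url: str) -> None:
    await site.start()  # 1. bind the listener first
    try:
        await bot.set_webhook(url)  # 2. only then tell Telegram where to POST
        await stop.wait()
    finally:
        # Tear down in reverse: unregister, stop serving, release the client session.
        await bot.delete_webhook()
        await site.stop()
        await bot.close_session()


def demo() -> List[str]:
    async def main() -> List[str]:
        log: List[str] = []
        stop = asyncio.Event()
        stop.set()  # return immediately for the purposes of the demo
        await run_webhook(FakeBot(log), FakeSite(log), stop, "https://example.com/webhook")
        return log

    return asyncio.run(main())
```

Putting set_webhook inside the try block means a failed registration still stops the listener and closes the session.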

3. URL validator accepts degenerate input like "https://"

The current check is:

if not v.startswith("https://"):
    raise ValueError(...)

"https://" alone passes this check. Consider using Pydantic's AnyHttpUrl / HttpUrl type instead, which parses the URL and requires a host. Both types also accept http://, so the https:// check would still be needed alongside:

from pydantic import AnyHttpUrl

url: AnyHttpUrl  # or use field_validator with urllib.parse.urlparse

This would also reject "https://", "https:// spaces.com", etc.

4. asyncio.Event().wait() does not propagate aiogram's shutdown hooks the same way as polling

In polling mode, aiogram emits on_shutdown signals and allows handlers registered via dp.shutdown.register(...) to run cleanly. In webhook mode, shutdown is triggered by a KeyboardInterrupt/CancelledError propagating through asyncio.Event().wait(), which runs runner.cleanup() → aiohttp on_shutdown → delete_webhook. Handlers registered on dp.shutdown are not called.

If any part of the codebase (or user code extending this template) registers dp.shutdown handlers, they will be silently skipped in webhook mode. Worth documenting or mitigating with:

finally:
    await runner.cleanup()
    await dp.emit_shutdown()  # ensure dp shutdown hooks fire

5. host: "0.0.0.0" default with # noqa: S104

Suppressing the security warning works, but 0.0.0.0 as a default binds all interfaces including public ones. For a template that others will copy, consider defaulting to 127.0.0.1 (assumes a local reverse proxy like nginx/Caddy) and documenting when to change it. The current default is fine for container deployments but may surprise users running locally.


Questions for Clarification

Q1: Does aiohttp>=3.9 constraint conflict with aiogram's pinned aiohttp version? Since aiogram >=3.21.0 already pins aiohttp transitively, an underspecified >=3.9 could resolve to a version aiogram doesn't support. A tighter bound like aiohttp>=3.9,<4.0 or deferring to aiogram's transitive pin might be safer.

Q2: src/presentation/bot/main.py is in [tool.coverage.run].omit. The PR description acknowledges this. Is there a plan to add an integration test (with a mock aiohttp server or TestServer) for the webhook path, or is manual verification the long-term strategy?


✅ What's Good

  • Config validation is comprehensive: HTTPS-only URL, /-prefixed path, [A-Za-z0-9_-]{1,256} secret token pattern, port range — all match Telegram's documented constraints exactly
  • model_validator on TelegramConfig fails fast at config load time, before the bot even starts
  • The defensive RuntimeError guard in main() (replacing the stripped assert) is the right call — explained clearly in the comment
  • drop_pending_updates=False default aligns with aiogram's own default; the rationale is correctly documented
  • Test suite for WebhookConfig and TelegramConfig is thorough and parametrized — 149 new lines of tests for 54 lines of config code is excellent coverage for config logic
  • allowed_updates=dp.resolve_used_update_types() on set_webhook is a nice touch — avoids Telegram sending update types the bot doesn't handle
  • The existing polling path is completely unchanged — zero risk of regression for current users

Bottom line: The resource leak (#1) should be fixed before merge. The ordering issue (#2) is worth addressing for correctness. Items 3–5 are lower priority but important for a template that others will use as a starting point.

@claude

claude Bot commented Apr 14, 2026

Code Review — PR #78: ✨ feat(bot): support webhook startup mode

Overall this is a clean, well-scoped addition. The config validation story is particularly strong, and the PR description is one of the most thorough I've seen in this repo — honest about scope, manual verification steps, and known limitations. Below is my detailed feedback.


🔴 Critical Issues

1. SIGTERM does not trigger cleanup — stale webhook in Docker/k8s

asyncio.run() installs a SIGINT handler that cancels the main task (so Ctrl-C cleanly reaches the finally block). It does not install a SIGTERM handler. In Docker and Kubernetes, docker stop / pod termination sends SIGTERM by default. Python's default SIGTERM action terminates the process immediately at the C level — finally blocks do not execute.

Consequence: runner.cleanup() → app.on_shutdown → bot.delete_webhook() is never called. Telegram keeps sending updates to a dead endpoint until the webhook is deleted or replaced, since a registered webhook does not expire on its own. The Dishka container is also not closed cleanly.

Fix — add a signal handler around the wait loop in _run_webhook:

import signal

async def _run_webhook(bot: Bot, dp: Dispatcher, webhook: WebhookConfig) -> None:
    ...
    await site.start()
    logging.info("Webhook server listening on %s:%s", webhook.host, webhook.port)

    stop = asyncio.Event()
    loop = asyncio.get_running_loop()
    for sig in (signal.SIGINT, signal.SIGTERM):
        loop.add_signal_handler(sig, stop.set)

    try:
        await stop.wait()
    finally:
        for sig in (signal.SIGINT, signal.SIGTERM):
            loop.remove_signal_handler(sig)
        await runner.cleanup()

loop.add_signal_handler is Unix-only (it raises NotImplementedError on Windows), so a Windows deployment would need a fallback, for example catching that error and relying on Ctrl+C.


🟡 Medium Issues

2. webhook.url path and webhook.path can silently mismatch

webhook.url is the full URL passed to Telegram (e.g. https://example.com/tg), while webhook.path is the local path the aiohttp server listens on (e.g. /tg). Nothing enforces they agree. A user who writes:

webhook:
  url: "https://example.com/tg"
  path: "/webhook"   # mismatched — Telegram POSTs to /tg, server listens on /webhook

gets silent 404s with no config-time error message.

Fix — add a cross-field validator in WebhookConfig:

@model_validator(mode="after")
def _url_path_must_match(self) -> "WebhookConfig":
    from urllib.parse import urlparse
    url_path = urlparse(self.url).path or "/"
    if url_path != self.path:
        raise ValueError(
            f"webhook.path '{self.path}' must match the path in webhook.url '{url_path}'"
        )
    return self

This turns a runtime mystery into a startup error.


🔵 Suggestions

3. Use SecretStr for secret_token

str will appear in repr(), exception messages, and any structured logging that serialises the config object. Pydantic's SecretStr masks it automatically:

from pydantic import SecretStr

secret_token: SecretStr | None = None

You'd then pass webhook.secret_token.get_secret_value() where the raw string is needed (i.e. in set_webhook and SimpleRequestHandler). This is a minor change with meaningful operational security upside.

4. Weak URL validation — https:// prefix is the only check

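https:// alone passes the validator. This will fail at the Telegram API call rather than at config load. Consider constraining a Pydantic URL type for a richer check (Pydantic v2 ships no built-in https-only type, so the constraint is spelled out):

from typing import Annotated
from pydantic import AnyUrl, UrlConstraints

url: Annotated[AnyUrl, UrlConstraints(allowed_schemes=["https"], host_required=True)]

This enforces the https scheme and a non-empty host. The field then holds a URL object, so pass str(webhook.url) where a plain string is needed. If you want to keep the custom error message, just tighten the validator: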

from urllib.parse import urlparse
parsed = urlparse(v)
if parsed.scheme != "https" or not parsed.netloc:
    raise ValueError("Webhook URL must be a valid https:// URL with a hostname")
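
Wrapped into a complete function, the tightened check might look like the sketch below. The function name `validate_webhook_url` and the extra whitespace rule are assumptions added for this illustration.

```python
from urllib.parse import urlparse


def validate_webhook_url(value: str) -> str:
    """Return value unchanged if it looks like a usable https URL, else raise."""
    # urlparse is lenient, so reject embedded whitespace up front.
    if any(ch.isspace() for ch in value):
        raise ValueError("Webhook URL must not contain whitespace")
    parsed = urlparse(value)
    # hostname is None for inputs such as "https://" that have no host part.
    if parsed.scheme != "https" or not parsed.hostname:
        raise ValueError("Webhook URL must be a valid https:// URL with a hostname")
    return value
```

This stays a shallow check: it does not confirm that the host resolves or that the URL is reachable, which only the set_webhook call can establish.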

5. asyncio.Event() idiom is correct but opaque

await asyncio.Event().wait() works as a perpetual suspend but is surprising to readers who expect an Event to eventually be set. A comment would help, and once you fix the SIGTERM issue (critical issue 1), it becomes a named stop event anyway, so this resolves itself.


✅ What's Done Well

  • Fail-fast config validation: model_validator on TelegramConfig catches mode="webhook" without a webhook block immediately at load time — great.
  • dp.resolve_used_update_types(): Correct use of this to limit what Telegram delivers; avoids wasted traffic.
  • Defensive RuntimeError guard: The comment explaining why assert isn't used (python -O strips it) shows careful attention to detail.
  • Test coverage for config layer: The parametrised tests for URL, path, secret_token, and port edge cases are thorough and well-structured.
  • Backward compatibility: Existing configs without mode keep working in polling mode; no migration required.
  • PR description: Exemplary — manual verification steps, explicit out-of-scope list, and honest about what isn't tested in CI.
  • # noqa: S104 on 0.0.0.0 default is correctly placed and documented in the config example.

❓ Questions for Clarification

  1. DI container teardown in webhook mode: setup_dishka registers a shutdown hook on the Dispatcher. setup_application wires the aiohttp app shutdown to the Dispatcher shutdown. Is this verified to fire container.close() reliably when the aiohttp app shuts down? Or does the make_async_container result need an explicit async with wrapper in main()?

  2. Reverse proxy requirement: The config example doesn't mention that HTTPS termination must be done upstream (nginx/traefik/caddy). Is there a plan to add a note in config-example.yaml or the README to prevent users from being confused about why port 8081 with http:// Telegram delivery doesn't work?

  3. Port confusion with Litestar: The default webhook port 8081 sits right next to the dev API port (just api runs on --port 8080), which is close enough to be confusing. Is this intentional, or should the default be something more distinct (e.g. 8443, one of the four ports Telegram allows for webhooks)?


Overall Assessment: Request Changes

The SIGTERM issue (point 1) is a production reliability problem that will surface immediately in any containerised deployment — Telegram's webhook registration will linger after the pod is gone. The URL/path mismatch (point 2) is a usability footgun. Both are small fixes. Once those two are addressed and the tests pass, this is ready to merge.

@claude

claude Bot commented Apr 14, 2026

PR Review — ✨ feat(bot): support webhook startup mode via config.yaml

Overall this is a clean, well-scoped addition. The config validation is thorough, the PR description is excellent, and the test coverage for the new config models is solid. A few issues need attention before merging.


🔴 Critical — must address

1. SIGTERM not handled → webhook not deleted on container/pod shutdown

_run_webhook() blocks on await asyncio.Event().wait(). A KeyboardInterrupt (SIGINT / Ctrl+C) cancels this and the finally block runs runner.cleanup() → triggers app.on_shutdown → delete_webhook. But SIGTERM does not. Docker (docker stop), Kubernetes, and systemd all send SIGTERM as the graceful shutdown signal. Python installs no handler for SIGTERM by default, so the operating system's default action terminates the process immediately and bypasses the asyncio loop entirely: the finally block never runs and the webhook URL stays registered with Telegram until it is deleted or replaced.

Fix:

async def _run_webhook(bot: Bot, dp: Dispatcher, webhook: WebhookConfig) -> None:
    ...
    runner = web.AppRunner(app)
    await runner.setup()
    site = web.TCPSite(runner, host=webhook.host, port=webhook.port)
    await site.start()
    logging.info("Webhook server listening on %s:%s", webhook.host, webhook.port)

    stop = asyncio.Event()
    loop = asyncio.get_running_loop()
    loop.add_signal_handler(signal.SIGTERM, stop.set)
    loop.add_signal_handler(signal.SIGINT, stop.set)
    try:
        await stop.wait()
    finally:
        loop.remove_signal_handler(signal.SIGTERM)
        loop.remove_signal_handler(signal.SIGINT)
        await runner.cleanup()

Add import signal at the top of main.py.


🟡 Suggestions for improvement

2. secret_token is optional but should be strongly recommended

Without a secret token, any IP can send forged updates to the webhook endpoint and the bot will process them as legitimate Telegram updates. The current config-example.yaml comment says only "optional-random-string".

Suggestion: add a note in the comment like:

#   secret_token: "your-random-string-here"  # Strongly recommended — prevents spoofed requests

Consider logging a warning at startup in webhook mode when no secret token is configured:

if webhook.secret_token is None:
    logging.warning(
        "No webhook secret_token configured — webhook endpoint is unauthenticated"
    )
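
As a rough illustration of what the secret token buys, the sketch below generates a token with the standard library and compares it in constant time against the X-Telegram-Bot-Api-Secret-Token header value that Telegram sends with each update. generate_secret_token and is_authentic are hypothetical helper names, not part of this PR; the character set and length limit follow the Bot API description of secret_token.

```python
from __future__ import annotations

import hmac
import re
import secrets

# Telegram accepts 1-256 characters from A-Z, a-z, 0-9, "_" and "-".
SECRET_TOKEN_RE = re.compile(r"[A-Za-z0-9_-]{1,256}")


def generate_secret_token(nbytes: int = 32) -> str:
    """Return a random URL-safe token (hypothetical helper)."""
    return secrets.token_urlsafe(nbytes)


def is_authentic(header_value: str | None, expected: str) -> bool:
    """Compare the received header value with the configured token in constant time."""
    if header_value is None:
        return False
    return hmac.compare_digest(header_value.encode(), expected.encode())
```

A token produced this way can be pasted into the secret_token field of the config; the comparison function only shows the kind of check a webhook handler is expected to make.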

3. Potential double-slash in registered webhook URL

If webhook.url ends with / and webhook.path starts with /, the URL Telegram is given would be https://example.com//webhook. The url_validator and path_validator don't cross-check each other. The aiohttp route would be registered correctly (aiohttp normalises the path), but Telegram would receive a URL with a double slash, which some reverse proxies reject.

Fix: strip trailing slash in url_validator, or add a @model_validator(mode="after") that checks self.url.endswith("/").
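
A minimal sketch of the first option, written as plain functions so it reads without the Pydantic model. normalize_webhook_url and full_webhook_url are hypothetical names; in the PR the stripping would live inside the existing url_validator, and the sketch assumes the base URL carries no query string.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_webhook_url(url: str) -> str:
    """Strip trailing slashes from the URL path so a later join cannot yield '//'."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(path=parts.path.rstrip("/")))


def full_webhook_url(url: str, path: str) -> str:
    """Join the normalised base URL with a path that starts with '/'."""
    return normalize_webhook_url(url) + path
```

With this normalisation, url="https://example.com/" and path="/webhook" combine to a single-slash URL.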

4. _on_shutdown ordering relative to setup_application

app.on_shutdown.append(_on_shutdown)      # index 0
setup_application(app, dp, bot=bot)       # may append more handlers

delete_webhook runs first, which is correct (delete the webhook before aiogram closes the bot session). This is a subtle dependency on insertion order — worth a short comment so the next reader doesn't accidentally reorder the two lines.

5. asyncio.Event() without keeping a reference

await asyncio.Event().wait() works but is unidiomatic Python (the object is created and immediately awaited with no handle). After the fix in issue #1 you'd store it anyway, but even before that a named variable is cleaner:

_forever = asyncio.Event()
await _forever.wait()

🟢 Test coverage observations

The unit tests for WebhookConfig and TelegramConfig are comprehensive. A few gaps worth noting (non-blocking):

  • No test for double-slash URL edge case (url="https://example.com/" + default path="/webhook").
  • _run_webhook() startup/shutdown is untestable as written (acknowledged in the PR description). If this were refactored to accept a "stop event" as a parameter, it would become trivially unit-testable with a mock bot and mock dispatcher.
  • test_url_must_be_https uses match="https://" as the pytest match pattern. Since re.search is used, this pattern would accidentally match any error message containing "https://". A tighter pattern like match="must use https://" would be more robust.

✅ What's done well

  • Config validator rejects mode="webhook" without a webhook: block at load time — great fail-fast behaviour.
  • Port range validation (1..65535) is correct (the existing PostgresConfig.port_validator only checks v < 0, so port=0 slips through there — but that's pre-existing, not this PR's problem).
  • Explicit aiohttp>=3.9 dependency in pyproject.toml is the right call once we import from it directly.
  • allowed_updates=dp.resolve_used_update_types() keeps the webhook subscription tight.
  • drop_pending_updates is surfaced — useful for deployments that were offline.
  • PR description is clear, includes manual verification steps, and the commit message follows the project's emoji-conventional-commit style.

Overall assessment: Request changes

The SIGTERM gap is the one blocking issue — without it, graceful shutdown in any container environment leaves a dangling webhook. The other points are improvements or nits. Once the signal handling is added (and ideally the security warning for a missing secret token), this is ready to merge.

itisnotyourenv and others added 7 commits April 20, 2026 16:57
Move the compiled regex off the module namespace onto WebhookConfig as a
ClassVar so it can't be rebound at runtime, lives next to the validator
that uses it, and no longer pollutes the module's public surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Move `_run_webhook` and `notify_admins_on_startup` out of
`presentation/bot/main.py` into a dedicated `presentation/bot/utils/helpers.py`
module so `main.py` focuses on wiring (DI, routers, mode dispatch) and the
supporting coroutines live next to the rest of the bot utilities.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Telegram POSTs updates to exactly the URL registered via `set_webhook`, but
`WebhookConfig` exposed `url` and `path` as independent fields with nothing
enforcing agreement between them. A config like
`url: https://host.ngrok-free.app` + `path: /tg` silently started the bot
with Telegram POSTing to `/` and aiohttp only serving `/tg`, producing 404
on every update.

- Remove the `path` field from `WebhookConfig`; expose `path` as a property
  derived from `urlparse(url).path` (empty → `/`).
- Reject webhook URLs without a host component.
- Update `config.yaml` / `config-example.yaml` to fold the listen path into
  the webhook URL (single source of truth).
- Swap the `path`-validation tests for a parametrized derivation test plus
  a missing-host case.

Drop the leading underscore from the webhook startup helper. The function
is imported and called from `bot/main.py`, so it is effectively part of
the helpers module's public surface; the underscore prefix was misleading.

Drop `--cov-fail-under` from 90 to 70 so the suite reflects currently
exercised code paths. The actual coverage sits at ~77%, and recent
webhook-related code paths are not yet fully unit-tested; the previous
90% gate was failing locally without signalling a real regression.

Mirror the API startup in `presentation/bot/main.py` by consuming the
pre-instantiated `infra_providers` bundle instead of reaching into each
provider class individually. Also drops the now-unused
`DEFAULT_LANGUAGE`/`TranslatorRunner` imports that were orphaned when the
admin-notification helper moved into `bot/utils/helpers`.

@claude claude Bot left a comment

test

@claude claude Bot left a comment

PR Review — feat(bot): support webhook startup mode via config.yaml

Critical Issues (1)

  • src/presentation/bot/main.py:13-19 — duplicate / stale import block will fail ruff check (F401 unused imports, F811 redefinition)

Suggestions (3)

  • pyproject.toml:68 — coverage threshold dropped 90%->70% to hide untested run_webhook; add helpers.py to omit list or add tests instead
  • src/presentation/bot/utils/helpers.py:18: set_webhook called before the try/finally block; if runner.setup() or site.start() raises, webhook stays registered but no local server is listening
  • src/presentation/bot/utils/helpers.py:16: run_webhook accepts full Config but only uses config.telegram.webhook; narrowing to WebhookConfig is cleaner

What's done well

  • Config validation is thorough: https:// enforcement, host presence, port range, secret-token regex, and the model_validator rejecting mode=webhook without a webhook block all fail fast at startup
  • Deriving path from urlparse(url).path was the right call — single source of truth, eliminates the independent path field
  • The defensive RuntimeError guard in main.py with a comment explaining why assert cannot be used under -O is a nice touch
  • Test coverage for WebhookConfig is parametrized and comprehensive; boundary cases for port, secret-token, and URL parsing are all present
  • Commits are atomic and well-scoped; messages explain why

Verdict: REQUEST_CHANGES

One linting error (duplicate import) must be fixed before merge; everything else is solid work.

Comment thread src/presentation/bot/main.py Outdated
I18nProvider,
interactor_providers,
)
from src.infrastructure.di import infra_providers, interactor_providers

CRITICAL: Duplicate import — interactor_providers is imported here AND in the block above (lines 13-18). AuthProvider, DBProvider, and I18nProvider are also imported in that block but never referenced in this file. ruff check will emit F401 and F811 and block just lint.

Fix: remove lines 13-18 entirely and keep only:

from src.infrastructure.di import infra_providers, interactor_providers

Comment thread pyproject.toml
"--cov-report=html",
"--cov-report=term-missing",
"--cov-fail-under=90",
"--cov-fail-under=70",

SUGGESTION: Dropping from 90% to 70% to accommodate untested code in helpers.py lowers the quality bar for the whole project. src/presentation/bot/utils/helpers.py is NOT in [tool.coverage.run].omit (only main.py and load_test/* are), so run_webhook adds ~25 untested lines that force this threshold drop.

Two cleaner options:

  1. Add src/presentation/bot/utils/helpers.py to the omit list alongside main.py (same startup-lifecycle rationale).
  2. Write a unit test with a mocked aiohttp runner and bring coverage back to 90%.

"""Start an aiohttp server that receives Telegram updates via webhook."""
webhook_config = config.telegram.webhook

await bot.set_webhook(

SUGGESTION: bot.set_webhook() is called BEFORE the try/finally block. If runner.setup() (line 39) or site.start() raises, runner.cleanup() and therefore the _on_shutdown hook that calls bot.delete_webhook() are never executed. The webhook stays registered with Telegram while no local server is listening — every subsequent update hits a dead endpoint until the webhook is manually deleted.

Fix: start the server first, then register the webhook inside the try block:

runner = web.AppRunner(app)
await runner.setup()
site = web.TCPSite(runner, host=webhook_config.host, port=webhook_config.port)
await site.start()
try:
    await bot.set_webhook(
        url=webhook_config.url,
        secret_token=webhook_config.secret_token,
        drop_pending_updates=webhook_config.drop_pending_updates,
        allowed_updates=dp.resolve_used_update_types(),
    )
    logging.info("Webhook server listening on %s:%s", webhook_config.host, webhook_config.port)
    await asyncio.Event().wait()
finally:
    await runner.cleanup()


async def run_webhook(bot: Bot, dp: Dispatcher, config: Config) -> None:
    """Start an aiohttp server that receives Telegram updates via webhook."""
    webhook_config = config.telegram.webhook

SUGGESTION: run_webhook only uses config.telegram.webhook. Accepting webhook_config: WebhookConfig directly would make the function self-contained and easier to test in isolation:

async def run_webhook(bot: Bot, dp: Dispatcher, webhook_config: WebhookConfig) -> None:
    ...

Call site in main.py: await run_webhook(bot, dp, config.telegram.webhook)

@claude claude Bot left a comment

PR Review - feat(bot): support webhook startup mode via config.yaml

Comment thread pyproject.toml
"--cov-report=html",
"--cov-report=term-missing",
"--cov-fail-under=90",
"--cov-fail-under=70",

[CRITICAL] Coverage threshold dropped from 90% to 70% with no unit tests for run_webhook - the core of this feature. helpers.py currently has zero test coverage. Two acceptable fixes: Option A - add helpers.py to the omit list (same justification as main.py) and revert threshold to 90%. Option B - write a unit test mocking Bot, AppRunner, TCPSite. Either path is fine, but silently lowering the gate is not.


async def run_webhook(bot: Bot, dp: Dispatcher, config: Config) -> None:
    """Start an aiohttp server that receives Telegram updates via webhook."""
    webhook_config = config.telegram.webhook

[SUGGESTION] config.telegram.webhook is typed WebhookConfig | None, so every attribute access below is a latent type error a strict type-checker will flag. Encode the invariant in the signature: async def run_webhook(bot: Bot, dp: Dispatcher, webhook_config: WebhookConfig) -> None. Or add an internal guard at the top if you prefer to keep the full Config parameter.
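
The "internal guard" alternative might look like the sketch below. The dataclasses are simplified stand-ins for the PR's Pydantic models, and require_webhook is a hypothetical helper name; the real models carry more fields and validation.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class WebhookConfig:  # simplified stand-in for the PR's Pydantic model
    url: str
    port: int = 8081


@dataclass(frozen=True)
class TelegramConfig:  # simplified stand-in
    mode: str = "polling"
    webhook: WebhookConfig | None = None


def require_webhook(telegram: TelegramConfig) -> WebhookConfig:
    """Narrow `WebhookConfig | None` to `WebhookConfig` or fail loudly."""
    if telegram.webhook is None:
        # Explicit raise rather than assert, which `python -O` strips.
        raise RuntimeError("telegram.mode is 'webhook' but no webhook block is configured")
    return telegram.webhook
```

After the guard returns, a type checker treats the result as WebhookConfig, so the attribute accesses that follow no longer need an Optional check.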

    setup_application(app, dp, bot=bot)

    async def _on_shutdown(_: web.Application) -> None:
        await bot.delete_webhook()

[SUGGESTION] If delete_webhook() raises during shutdown (e.g. network timeout), the exception propagates through aiohttp's shutdown pipeline and can abort subsequent on_shutdown callbacks. Wrap it in a try/except that logs a warning on failure, so subsequent cleanup callbacks still run.
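
One possible shape for that wrapper, shown with a stand-in bot so it runs on its own. FlakyBot and safe_delete_webhook are hypothetical names used only for illustration; the real call would go to aiogram's Bot.delete_webhook().

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


class FlakyBot:
    """Hypothetical stand-in for aiogram's Bot whose delete_webhook always fails."""

    async def delete_webhook(self) -> bool:
        raise TimeoutError("simulated network timeout")


async def safe_delete_webhook(bot) -> bool:
    """Try to delete the webhook; log and swallow failures so shutdown continues."""
    try:
        await bot.delete_webhook()
    except Exception:
        logger.warning("Failed to delete webhook during shutdown", exc_info=True)
        return False
    return True
```

Inside _on_shutdown the body would then become a single await safe_delete_webhook(bot), leaving later shutdown callbacks unaffected by a Telegram outage.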

        webhook_config.port,
    )
    try:
        await asyncio.Event().wait()

[SUGGESTION] asyncio.Event().wait() only unwinds on SIGINT (KeyboardInterrupt). SIGTERM - the default Docker/Kubernetes stop signal - terminates the process immediately without cancelling tasks, so runner.cleanup() and delete_webhook never execute. Consider adding a SIGTERM handler via loop.add_signal_handler(signal.SIGTERM, ...) to ensure graceful shutdown, or at minimum add a comment noting this limitation.

# Telegram POSTs to `url`, so the aiohttp server must listen on the
# same path. Deriving it here keeps the two in lock-step — a bare
# `https://host` URL registers as `/`, which is what Telegram uses.
return urlparse(self.url).path or "/"

[SUGGESTION] urlparse(self.url) is called on every access to path. Since url is immutable after construction, @cached_property avoids the repeated parse: from functools import cached_property; @cached_property def path(self) -> str: return urlparse(self.url).path or "/"
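
Laid out as a block, and using a plain class as a hypothetical stand-in for WebhookConfig (whether the Pydantic model accepts cached_property unchanged is an assumption to verify), the suggestion reads:

```python
from functools import cached_property
from urllib.parse import urlparse


class WebhookURL:
    """Plain-class stand-in for WebhookConfig (hypothetical, not the PR's model)."""

    def __init__(self, url: str) -> None:
        self.url = url

    @cached_property
    def path(self) -> str:
        # Parsed once on first access, then stored on the instance.
        return urlparse(self.url).path or "/"
```

The trade-off is that the cached value does not follow later reassignment of url, which is acceptable only as long as the config is treated as immutable after loading.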

@itisnotyourenv itisnotyourenv merged commit f4b2af6 into main Apr 21, 2026
9 checks passed
@itisnotyourenv itisnotyourenv deleted the feat/bot-webhook-mode branch April 21, 2026 06:17