⚡ Add request timeout handling #10

@Gilfeather

Description

Implement request timeout handling to prevent long-running requests from blocking the server.

Current Status - Partially Implemented

A model-level timeout exists, but HTTP-level timeout handling is not yet implemented.

Already Implemented

  • Model-level timeout in predict_batch_with_timeout method
  • Timeout logging and monitoring
  • Graceful timeout error handling at model level

Remaining Work

  • Add configurable HTTP request timeout (default 30 seconds)
  • Return HTTP 408 Request Timeout for exceeded requests
  • Add timeout configuration to CLI arguments
  • Add timeout middleware to Axum router
  • Add tests for HTTP timeout behavior
  • Update documentation with timeout information
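
The intended behavior (a configurable limit defaulting to 30 seconds, and a 408 response when it is exceeded) can be sketched with the standard library alone. This is only an illustration of the deadline-to-status mapping, not the Axum middleware itself: `run_with_deadline` and the constants are hypothetical names, and the worker thread stands in for a request handler.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::Duration;

/// HTTP 408 Request Timeout, the status this issue asks for.
const REQUEST_TIMEOUT: u16 = 408;
/// Assumption: a worker that dies without answering maps to 500.
const INTERNAL_SERVER_ERROR: u16 = 500;
/// Default limit taken from the issue text (30 seconds).
const DEFAULT_REQUEST_TIMEOUT: Duration = Duration::from_secs(30);

/// Hypothetical helper: runs `work` on a worker thread and waits at most
/// `limit` for its result. Returns the result, or an HTTP status code.
fn run_with_deadline<T, F>(limit: Duration, work: F) -> Result<T, u16>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone if the deadline passed; ignore that.
        let _ = tx.send(work());
    });
    match rx.recv_timeout(limit) {
        Ok(value) => Ok(value),
        // The limit elapsed before the worker answered.
        Err(RecvTimeoutError::Timeout) => Err(REQUEST_TIMEOUT),
        // The worker panicked or exited without sending a result.
        Err(RecvTimeoutError::Disconnected) => Err(INTERNAL_SERVER_ERROR),
    }
}
```

Calling `run_with_deadline(DEFAULT_REQUEST_TIMEOUT, handler)` yields `Ok(result)` for a handler that finishes in time and `Err(408)` otherwise. The worker thread here keeps running after the deadline. An async timeout layer would normally drop the pending future instead, which is one reason to prefer the middleware approach in the actual server.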

Implementation Guidance

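  • Use tower::timeout::TimeoutLayer for HTTP-level timeouts (in Axum, the layer's timeout error must be converted into a response, for example with axum::error_handling::HandleErrorLayer)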
  • Add --request-timeout CLI argument
  • Consider different timeouts for different endpoints
  • Integrate with existing model timeout for layered timeout handling
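
As a rough sketch of the configuration side, the snippet below parses a `--request-timeout` value in seconds and derives per-endpoint and model-level limits from it. It uses only the standard library. The function names, the `Endpoint` enum, the 5-second health-check cap, and the rule that the model timeout never exceeds the HTTP timeout are all assumptions for illustration, not decisions recorded in this issue. The project may well use an argument-parsing crate instead.

```rust
use std::time::Duration;

/// Default from the issue text: 30 seconds.
const DEFAULT_REQUEST_TIMEOUT_SECS: u64 = 30;

/// Hypothetical parser for `--request-timeout <secs>` and
/// `--request-timeout=<secs>`; other arguments are ignored.
fn parse_request_timeout<I, S>(args: I) -> Result<Duration, String>
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    let mut args = args.into_iter();
    let mut secs = DEFAULT_REQUEST_TIMEOUT_SECS;
    while let Some(arg) = args.next() {
        let arg = arg.as_ref();
        let value = if arg == "--request-timeout" {
            match args.next() {
                Some(v) => v.as_ref().to_string(),
                None => return Err("--request-timeout requires a value".to_string()),
            }
        } else if let Some(v) = arg.strip_prefix("--request-timeout=") {
            v.to_string()
        } else {
            continue;
        };
        secs = value
            .parse::<u64>()
            .map_err(|e| format!("invalid --request-timeout {:?}: {}", value, e))?;
        if secs == 0 {
            return Err("--request-timeout must be greater than zero".to_string());
        }
    }
    Ok(Duration::from_secs(secs))
}

/// Hypothetical endpoint classes, to illustrate per-endpoint limits.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Endpoint {
    Health,
    Predict,
}

/// Assumption: health checks get a short cap, predictions the full limit.
fn endpoint_timeout(endpoint: Endpoint, request_timeout: Duration) -> Duration {
    match endpoint {
        Endpoint::Health => request_timeout.min(Duration::from_secs(5)),
        Endpoint::Predict => request_timeout,
    }
}

/// Assumption: the model-level timeout never exceeds the HTTP-level one,
/// so the outer layer only fires when the inner one has failed to.
fn effective_model_timeout(model_timeout: Duration, request_timeout: Duration) -> Duration {
    model_timeout.min(request_timeout)
}
```

In a binary, `parse_request_timeout(std::env::args().skip(1))` would produce the configured limit. Capping the model timeout at the request timeout is one possible layering. The opposite choice (a slightly longer HTTP timeout as a safety net) is equally defensible and depends on which error the client should see.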

Estimated Difficulty

Easy - Half day (reduced due to existing timeout infrastructure)
