Description
It takes ~7 hours to run the HTTPRoute scale test, which creates 1000 HTTPRoutes sequentially.
This test waits for the previously created HTTPRoute to be configured (available in NGINX) plus 2 seconds before creating the next HTTPRoute. Testing revealed that the longer we wait before creating the next HTTPRoute, the faster NGF processes the HTTPRoute. See this graph for more details.
Some contributing factors to the long processing times are:
- NGF re-queues HTTPRoutes after updating their statuses. See "Update the status of a resource only if the status changes" (#1013).
- NGF writes the status for every HTTPRoute in the graph on every configuration update. See "Check resource generation when processing updates of some resources to skip config regeneration" (#825).
- The NGF status updater is synchronous, which slows down the event loop. See "Make status reporter asynchronous" (#1014).
This means that if you have 99 HTTPRoutes configured and you create 1 more, NGF will update the configuration with the new route and then sequentially update the status of all 100 HTTPRoutes in the graph. NGF will then re-queue all 100 HTTPRoutes and process them again, even though this second pass results in no configuration changes.
This situation can intensify if more HTTPRoutes are created while NGF is processing the last event batch or writing statuses. A new HTTPRoute can end up at the end of a large event batch that's full of no-op status changes.
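To put a rough number on this, here is a back-of-the-envelope sketch. It assumes that every route creation rewrites the status of every HTTPRoute already in the graph, as described above, and it ignores batching and the re-queued no-op events. The function names are illustrative and are not part of NGF.

```go
package main

import "fmt"

// statusWritesForCreate returns how many status writes one HTTPRoute creation
// triggers when `existing` routes are already configured, assuming NGF rewrites
// the status of every route in the graph (the existing ones plus the new one).
func statusWritesForCreate(existing int) int {
	return existing + 1
}

// totalStatusWrites sums the status writes over n sequential creations:
// 1 + 2 + ... + n.
func totalStatusWrites(n int) int {
	total := 0
	for existing := 0; existing < n; existing++ {
		total += statusWritesForCreate(existing)
	}
	return total
}

func main() {
	fmt.Println(statusWritesForCreate(99)) // 100
	fmt.Println(totalStatusWrites(1000))   // 500500
}
```

Under these assumptions the number of status writes grows quadratically with the number of routes: the 1000-route test performs about 500,000 writes where 1,000 would suffice if only the changed route were updated.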
Acceptance Criteria:
- Investigate why the processing times for the HTTPRoute scale test are so long. The contributing factors listed above may not be the only factors.
- Reduce the time it takes to process HTTPRoutes at scale. This can be measured by running the HTTPRoute scale test and comparing the results to the 1.0.0 results.
### Tasks
- [ ] https://github.com/nginxinc/nginx-gateway-fabric/issues/1013
- [ ] https://github.com/nginxinc/nginx-gateway-fabric/issues/825
- [ ] https://github.com/nginxinc/nginx-gateway-fabric/issues/1014
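
As a starting point for #1013, the idea of writing a status only when it changes might look like the minimal sketch below. `RouteStatus` and `StatusUpdater` are hypothetical, simplified types invented for illustration. They are not NGF's actual status types or updater, and the real Kubernetes API write is replaced by a counter.

```go
package main

import (
	"fmt"
	"reflect"
)

// RouteStatus is a hypothetical, simplified stand-in for an HTTPRoute status.
type RouteStatus struct {
	Accepted           bool
	Reason             string
	ObservedGeneration int64
}

// StatusUpdater is a hypothetical updater that remembers the last status it
// wrote for each route and skips writes that would not change anything.
type StatusUpdater struct {
	last   map[string]RouteStatus
	Writes int // counts simulated API writes
}

func NewStatusUpdater() *StatusUpdater {
	return &StatusUpdater{last: make(map[string]RouteStatus)}
}

// Update records the status only if it differs from the last one written for
// the route. It reports whether a write happened.
func (u *StatusUpdater) Update(name string, s RouteStatus) bool {
	if prev, ok := u.last[name]; ok && reflect.DeepEqual(prev, s) {
		return false // unchanged: no write, so nothing to re-queue
	}
	u.last[name] = s
	u.Writes++ // a real implementation would call the Kubernetes API here
	return true
}

func main() {
	u := NewStatusUpdater()
	s := RouteStatus{Accepted: true, Reason: "Accepted", ObservedGeneration: 1}
	fmt.Println(u.Update("route-1", s)) // true
	fmt.Println(u.Update("route-1", s)) // false
	fmt.Println(u.Writes)               // 1
}
```

Skipping unchanged writes would also avoid the re-queue that each status update triggers, which is where most of the no-op events come from.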