Currently we provide filters which can remove (or add) valid routes for requests. However, we don't have a good way today to express "these 3 clusters are valid; score them based on data, metrics, etc., and pick the one that scores best" prior to load balancing.
The purpose of this spike is to analyze good ways to do this, looking at prior art like llm-d (which has filter, scorer, and picker concepts), and then make a plan to add this capability to Praxis as native filters.
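To make the "score, then pick" idea concrete, here is a minimal sketch in Python. It is not Praxis or llm-d code: `Cluster`, the two scorers, the metric names (`queue_depth`, `cache_hit_ratio`), the weights, and `pick_best` are all hypothetical names chosen for illustration. It assumes filtering has already produced the list of valid clusters, and it only shows how weighted scorers could rank them before load balancing.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Cluster:
    """Hypothetical candidate cluster with whatever metrics are available."""
    name: str
    metrics: Dict[str, float]


# A scorer maps a cluster to a number; higher is better.
Scorer = Callable[[Cluster], float]


def queue_depth_scorer(cluster: Cluster) -> float:
    # Assumed metric: fewer queued requests gives a higher score, in (0, 1].
    return 1.0 / (1.0 + cluster.metrics.get("queue_depth", 0.0))


def cache_hit_scorer(cluster: Cluster) -> float:
    # Assumed metric: a ratio in [0, 1], used directly as the score.
    return cluster.metrics.get("cache_hit_ratio", 0.0)


def total_score(cluster: Cluster, weighted: List[Tuple[Scorer, float]]) -> float:
    # Weighted sum of all scorer outputs for one cluster.
    return sum(weight * scorer(cluster) for scorer, weight in weighted)


def pick_best(
    candidates: List[Cluster], weighted: List[Tuple[Scorer, float]]
) -> Optional[Cluster]:
    # Picker: return the highest-scoring candidate, or None if filtering
    # left no valid clusters.
    if not candidates:
        return None
    return max(candidates, key=lambda c: total_score(c, weighted))


# Three clusters that earlier filters already marked as valid.
candidates = [
    Cluster("cluster-a", {"queue_depth": 4, "cache_hit_ratio": 0.2}),
    Cluster("cluster-b", {"queue_depth": 1, "cache_hit_ratio": 0.5}),
    Cluster("cluster-c", {"queue_depth": 9, "cache_hit_ratio": 0.8}),
]
weighted = [(queue_depth_scorer, 1.0), (cache_hit_scorer, 1.0)]

chosen = pick_best(candidates, weighted)
print(chosen.name if chosen else "no valid cluster")
```

Keeping scorers as independent functions with weights is one possible design; whether Praxis should combine scores as a weighted sum, or let the picker choose among the top N, is one of the questions for the spike to answer.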