This could be added to the API-based models, either by interpreting the rate-limit headers in the response or via an option passed to Reranker that caps the number of requests per minute.
For instance, Jina allows 60 rpm when not on premium. Cohere allows 10 rpm on a trial key and 1000 rpm on a production key.
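
To make the option-based idea concrete, here is a rough sketch of a client-side sliding-window limiter. The `RateLimiter` class and its `max_calls`, `period`, `clock` and `sleep` parameters are hypothetical names for illustration, not anything that exists in the library today. The header-based approach is not covered here, since header names differ between providers.

```python
import time
from collections import deque


class RateLimiter:
    """Hypothetical helper: allow at most `max_calls` calls per `period` seconds."""

    def __init__(self, max_calls, period=60.0, clock=time.monotonic, sleep=time.sleep):
        if max_calls < 1:
            raise ValueError("max_calls must be at least 1")
        self.max_calls = max_calls
        self.period = period
        # clock and sleep are injectable so the limiter can be tested without waiting.
        self._clock = clock
        self._sleep = sleep
        self._calls = deque()  # timestamps of calls still inside the window

    def acquire(self):
        """Block until another call is allowed, then record it."""
        while True:
            now = self._clock()
            # Drop timestamps that have aged out of the window.
            while self._calls and now - self._calls[0] >= self.period:
                self._calls.popleft()
            if len(self._calls) < self.max_calls:
                self._calls.append(now)
                return
            # Window is full: wait until the oldest recorded call expires.
            self._sleep(self.period - (now - self._calls[0]))
```

With something like this, `RateLimiter(max_calls=10)` would match the Cohere trial key and `RateLimiter(max_calls=60)` the non-premium Jina limit, with `acquire()` called right before each API request. This only throttles a single process, so several workers sharing one key would still need coordination.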