what about aiohttp.ClientSession? #71

wedobetter · 2022-03-04T17:28:16Z


Unless I am doing something wrong, it won't work with that.
My code looks like this:

async def fetch(something):
    await get(something)

async with limiter:
  await asyncio.gather(*[fetch(t) for t in range(100)])

Can you provide an example?

Thank you

Answered by mjpieters


mjpieters · 2022-03-04T17:53:51Z

Maintainer

You are gating access to the asyncio.gather() call. You haven't said so explicitly, but I'm going to assume that's not what you wanted, and that you instead wanted to limit the get() calls.

asyncio.gather() schedules each entry as a separate task, so they run concurrently and independently of each other. You are only rate limiting access to the asyncio.gather() call, and each time execution reaches that line, 100 tasks are created. Those tasks are themselves not rate limited. If you want to rate limit the get() calls, put your async with block there, inside the task function.

You can share the limiter between tasks; you could pass it in as an argument:

async def fetch(limiter, something):
    async with limiter:
        await get(something)

await asyncio.gather(*[fetch(limiter, t) for t in range(100)])

So when each task runs, it first has to ask the rate limiter whether it can run the get() call.

Note that the Bursting section in the documentation uses tasks too; the limiter is used inside the task coroutine.
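
To make the difference between the two placements concrete, here is a minimal sketch that only counts how often the gate is entered. CountingGate is a hypothetical stand-in written for this illustration (it is not part of aiolimiter and does no rate limiting), and get() is a placeholder for the real request:

```python
import asyncio


class CountingGate:
    """Hypothetical stand-in for a limiter: only counts how often it is entered."""

    def __init__(self):
        self.entered = 0

    async def __aenter__(self):
        self.entered += 1
        return self

    async def __aexit__(self, exc_type, exc, tb):
        return False


async def get(something):
    # Placeholder for the real request; just yields to the event loop once.
    await asyncio.sleep(0)


async def gate_around_gather():
    gate = CountingGate()

    async def fetch(something):
        await get(something)

    async with gate:  # entered once, before any of the 100 tasks start
        await asyncio.gather(*[fetch(t) for t in range(100)])
    return gate.entered


async def gate_inside_task():
    gate = CountingGate()

    async def fetch(gate, something):
        async with gate:  # entered once per task, right before get()
            await get(something)

    await asyncio.gather(*[fetch(gate, t) for t in range(100)])
    return gate.entered


around = asyncio.run(gate_around_gather())
inside = asyncio.run(gate_inside_task())
print(around, inside)  # 1 100
```

With the gate around gather() it is consulted once for all 100 calls; with the gate inside the task function it is consulted once per get() call, which is where a real limiter gets the chance to delay each request.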

6 replies

wedobetter Mar 4, 2022
Author

I have also tried to disable bursting as per docs:

In [13]: async def clientFetch(client, n):
    ...:     async with limiter:
    ...:         async with client.get(f"https://www.google.com/FAKEURL/{n}") as resp:
    ...:             print(datetime.now().strftime("%M:%S"), resp.status)
    ...:             await resp.text()
    ...: 

In [14]: 

In [14]: async def fetcher(reqs):
    ...:     async with ClientSession() as session:
    ...:         await asyncio.gather(*[clientFetch(session, r) for r in reqs])
    ...: 
    ...: 
In [16]: limiter = AsyncLimiter(1, 5)

In [17]: l.run_until_complete(fetcher([n for n in range(10)]))
21:15 404
21:15 404
21:15 404
21:20 404
21:20 404
21:25 404
21:25 404
21:30 404
21:30 404

mjpieters Mar 5, 2022
Maintainer

Your output shows it is working just fine:

In [10]: l.run_until_complete(asyncio.gather(*[fetch(n) for n in range(10)]))
2022-03-04 20:53:16.364388 404
2022-03-04 20:53:16.517303 404
2022-03-04 20:53:21.340143 404
2022-03-04 20:53:26.357350 404
2022-03-04 20:53:31.336756 404
2022-03-04 20:53:36.361057 404
2022-03-04 20:53:41.431073 404
2022-03-04 20:53:46.376688 404
2022-03-04 20:53:51.357404 404
2022-03-04 20:53:56.389631 404

Taking just the seconds values:

So after the initial burst the limiter lets through one request per 5 seconds == 2 requests per 10 seconds.
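
The spacing can also be checked mechanically. A small sketch, using only the timestamps copied from the output above, that computes the gap in seconds between consecutive requests:

```python
from datetime import datetime

# Timestamps copied from the output above.
stamps = [
    "2022-03-04 20:53:16.364388",
    "2022-03-04 20:53:16.517303",
    "2022-03-04 20:53:21.340143",
    "2022-03-04 20:53:26.357350",
    "2022-03-04 20:53:31.336756",
    "2022-03-04 20:53:36.361057",
    "2022-03-04 20:53:41.431073",
    "2022-03-04 20:53:46.376688",
    "2022-03-04 20:53:51.357404",
    "2022-03-04 20:53:56.389631",
]

times = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S.%f") for s in stamps]

# Gap in seconds between each request and the one before it.
gaps = [round((b - a).total_seconds(), 2) for a, b in zip(times, times[1:])]
print(gaps)
```

The first gap is a fraction of a second (the initial burst); every later gap is close to 5 seconds.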

What makes you think it isn't working?

mjpieters Mar 5, 2022
Maintainer

I have also tried to disable bursting as per docs:

This code isn't using the same pattern as your first example. I also can't reproduce your output; your exact code, for me, outputs:

>>> l.run_until_complete(fetcher([n for n in range(10)]))
27:32 404
27:37 404
27:42 404
27:47 404
27:52 404
27:57 404
28:02 404
28:07 404
28:12 404
28:17 404

Your IPython output is missing a step; where is In [15] and its output? What happened at that step?

wedobetter Mar 5, 2022
Author

I think the library should do what it says on the tin: in any given interval, no more than max_requests should be executed. To me, this burst behaviour is not a feature but rather a bug.

mjpieters Mar 5, 2022
Maintainer

The library does what it says on the tin: it implements the leaky bucket algorithm. The documentation explicitly calls out the possibility of bursting and how to handle that.

wedobetter · 2022-03-05T09:01:49Z

Author

Right, for now I am resorting to using my own limiter routine, but I still hope that this package gets fixed

import asyncio
from datetime import datetime, timedelta
from aiohttp import ClientSession


async def limitCalls(tasks: list, max_rate=1, interval=0):
    n = 0
    results = []
    while _tasks := tasks[n:n+max_rate]:
        n += max_rate
        t = datetime.now() + timedelta(seconds=interval)
        results.extend(await asyncio.gather(*_tasks))
        if n < len(tasks):
            block = (t - datetime.now()).total_seconds()
            if block > 0:
                await asyncio.sleep(block)
    return results


async def fetch(n):
    async with ClientSession() as session:
        async with session.get(f"https://www.google.co.uk/SORRY_GOOGLE/{n}") as resp:
            print(datetime.now().strftime("%H:%M:%S"), resp.status)
            return await resp.text()

l = asyncio.get_event_loop()
l.run_until_complete(limitCalls([fetch(n) for n in range(10)], 3, 5))

### EXECUTION
time ./main.py 
08:57:30 404
08:57:30 404
08:57:30 404
08:57:35 404
08:57:35 404
08:57:35 404
08:57:40 404
08:57:40 404
08:57:40 404
08:57:45 404

real	0m15.211s

1 reply

mjpieters Mar 5, 2022
Maintainer

Right, for now I am resorting to using my own limiter routine, but I still hope that this package gets fixed

Sorry, maybe I am missing something, but aiolimiter is not broken.

Please be explicit in how the following script doesn't do what you wanted it to do:

import asyncio
from datetime import datetime

from aiolimiter import AsyncLimiter
from aiohttp import ClientSession

async def fetch(session, limiter, n):
    async with limiter:
        async with session.get(f"https://www.google.co.uk/SORRY_GOOGLE/{n}") as resp:
            print(datetime.now().strftime("%H:%M:%S"), resp.status)
            return await resp.text()

async def main():
    limiter = AsyncLimiter(3, 5)
    async with ClientSession() as session:
        await asyncio.gather(*(fetch(session, limiter, n) for n in range(10)))

asyncio.run(main())

Here the limiter allows for 3 requests across a 5 second window:

time ./main.py
11:03:50 404
11:03:50 404
11:03:50 404
11:03:51 404
11:03:53 404
11:03:55 404
11:03:56 404
11:03:58 404
11:04:00 404
11:04:01 404
main.py  0.27s user 0.06s system 2% cpu 12.044 total

Your script groups 3 requests every 5 seconds, and delays creating tasks. With the limiter, the tasks are all created up front but then the session.get() calls are delayed until there is 'space' in the rate limiter bucket.
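
The difference between the two schedules can be sketched without any network calls. The following is a simplified virtual-time model, not aiolimiter's actual code: it assumes a bucket that holds max_rate units, drains at max_rate / time_period units per second, and has all requests already waiting at t=0. The function names are made up for this illustration.

```python
def leaky_bucket_times(n, max_rate, time_period):
    """Admission time (seconds from start) of n requests that are all
    waiting at t=0, under a simplified leaky bucket model."""
    leak_per_second = max_rate / time_period
    level = 0.0  # how full the bucket currently is
    now = 0.0    # virtual clock, nothing actually sleeps
    times = []
    for _ in range(n):
        if level + 1 > max_rate:
            # Wait until enough has drained to fit one more unit.
            now += (level + 1 - max_rate) / leak_per_second
            level = max_rate - 1
        level += 1
        times.append(now)
    return times


def grouped_times(n, group_size, interval):
    """Start time of n requests run in fixed groups, one group per interval."""
    return [float((i // group_size) * interval) for i in range(n)]


bucket = [round(t, 2) for t in leaky_bucket_times(10, 3, 5)]
grouped = grouped_times(10, 3, 5)
print(bucket)   # [0.0, 0.0, 0.0, 1.67, 3.33, 5.0, 6.67, 8.33, 10.0, 11.67]
print(grouped)  # [0.0, 0.0, 0.0, 5.0, 5.0, 5.0, 10.0, 10.0, 10.0, 15.0]
```

Under this model the bucket admits three requests immediately and then one roughly every 1.67 seconds, which matches the shape of the timestamps above (about 12 seconds in total), while the grouped schedule needs 15 seconds for the same 10 requests.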

If you wanted to see a burst of 3 requests every 5 seconds, then rate-limit creating 3 tasks at a time:

import asyncio
from datetime import datetime

from aiolimiter import AsyncLimiter
from aiohttp import ClientSession

async def fetch(session, n):
    async with session.get(f"https://www.google.co.uk/SORRY_GOOGLE/{n}") as resp:
        print(datetime.now().strftime("%H:%M:%S"), resp.status)
        return await resp.text()

async def main():
    limiter = AsyncLimiter(1, 5)
    async with ClientSession() as session:
        tasks = [fetch(session, n) for n in range(10)]
        while tasks:
            async with limiter:
                to_run, tasks = tasks[:3], tasks[3:]
                await asyncio.gather(*to_run)

asyncio.run(main())

which outputs:

time bin/python main.py
11:13:00 404
11:13:00 404
11:13:00 404
11:13:05 404
11:13:05 404
11:13:05 404
11:13:10 404
11:13:10 404
11:13:10 404
11:13:15 404
main.py  0.26s user 0.06s system 2% cpu 15.510 total
