Use __EVENTS__.context() for main-thread #891

Open
MaxDall wants to merge 1 commit into master from fix-a-bug-with-main-thread-alias

Conversation

MaxDall (Collaborator) commented Mar 4, 2026

This PR fixes a bug that surfaced with #890 (it existed before but had not caused any problems): the alias for the main thread was overwritten whenever the EVENTS dictionary was reset.
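
To make this failure mode concrete, here is a minimal sketch of one way an alias can go missing after a reset. `ToyEventRegistry` and all of its methods are hypothetical stand-ins written for illustration; they are not the fundus implementation, and the real registry may lose or overwrite the alias through a different path.

```python
import threading
from collections import defaultdict


class ToyEventRegistry:
    """Hypothetical alias-aware event registry (illustration only, not fundus code)."""

    def __init__(self):
        self._aliases = {}  # alias -> thread identifier
        self._events = defaultdict(dict)  # thread identifier -> {event name: Event}
        # The main-thread alias is registered exactly once, at construction time.
        self.alias("main-thread", threading.main_thread().ident)

    def alias(self, name, ident):
        self._aliases[name] = ident

    def _resolve(self, key):
        # String keys are treated as aliases; raw identifiers pass through.
        return self._aliases[key] if isinstance(key, str) else key

    def register_event(self, event, key):
        self._events[self._resolve(key)][event] = threading.Event()

    def is_event_set(self, event, key):
        return self._events[self._resolve(key)][event].is_set()

    def reset(self):
        # Clearing all state also drops the alias that __init__ set up.
        self._events.clear()
        self._aliases.clear()


registry = ToyEventRegistry()
registry.register_event("stop", "main-thread")
registry.reset()

try:
    registry.is_event_set("stop", "main-thread")
    lookup_failed = False
except KeyError:
    # The alias no longer exists, so resolving it fails.
    lookup_failed = True
```

In this toy version, any state that is set up once and then wiped by a reset has to be re-established before the next lookup. Whether `__EVENTS__.context()` addresses the problem in that way is not shown in this thread.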

@MaxDall MaxDall requested a review from addie9800 March 4, 2026 20:44
@MaxDall MaxDall added the bug Something isn't working label Mar 4, 2026
addie9800 (Collaborator)

Unfortunately, this still seems to be unstable. When running:

from logging import DEBUG

from fundus import Crawler, PublisherCollection
from fundus.logging import set_log_level
from fundus.scraping.filter import Requires

# Crawl all German publishers with debug logging enabled.
crawler = Crawler(PublisherCollection.de)
set_log_level(DEBUG)

if __name__ == "__main__":
    for article in crawler.crawl(max_articles_per_publisher=10, only_complete=False, error_handling="suppress"):
        print(article)

I end up with a KeyError:

2026-03-05 13:40:26,965 - fundus.utils.events - DEBUG - Set event 'stop' for 7740   (Sportschau)
[...]
2026-03-05 13:40:27,072 - fundus.scraping.crawler - DEBUG - Shutting down 'Crawler' ...
2026-03-05 13:40:27,072 - fundus.utils.events - DEBUG - Set event 'stop' for 20268  (main-thread)
[...]
2026-03-05 13:40:27,404 - fundus.parser.utility - DEBUG - Skipping lazy loading image
2026-03-05 13:40:27,499 - fundus.scraping.session - DEBUG - (200) <GET 'https://www.ndr.de/kultur/plattdeutsch-immer-populaerer-grosser-zulauf-fuer-vhs-kurse-in-sh,platt-106.html'> took 0.383954 second(s)
2026-03-05 13:40:27,499 - fundus.scraping.session - DEBUG - (200) <GET 'https://www.golem.de/news/denza-und-yangwang-byd-praesentiert-blade-batterie-mit-groesserer-reichweite-2603-206131.html'> took 0.39678 second(s)
2026-03-05 13:40:27,514 - fundus.parser.base_parser - INFO - Couldn't parse attribute 'images' for 'https://www.t-online.de/nachrichten/deutschland/innenpolitik/id_101153614/neue-umfrage-fuer-landtagswahl-in-baden-wuerttemberg-cdu-mit-vorsprung.html': ValueError('Bounds could not be determined')
2026-03-05 13:40:27,533 - fundus.parser.utility - DEBUG - Pixel calculation not implemented for rem
2026-03-05 13:40:27,533 - fundus.parser.utility - DEBUG - Pixel calculation not implemented for em
2026-03-05 13:40:27,534 - fundus.parser.utility - DEBUG - Pixel calculation not implemented for em
2026-03-05 13:40:27,534 - fundus.parser.utility - DEBUG - Pixel calculation not implemented for em
2026-03-05 13:40:27,593 - fundus.parser.utility - DEBUG - url data:image/svg+xml;base64,PHN2ZyB4bWxucz0naHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmcnIHZpZXdCb3g9JzAgMCA5NjAgNjQwJyB3aWR0aD0nOTYwJyBoZWlnaHQ9JzY0MCcgPjwvc3ZnPg== is not a valid URL
2026-03-05 13:40:27,593 - fundus.parser.utility - DEBUG - url data:image/svg+xml;base64,PHN2ZyB4bWxucz0naHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmcnIHZpZXdCb3g9JzAgMCA1MjAgMzkxJyB3aWR0aD0nNTIwJyBoZWlnaHQ9JzM5MScgPjwvc3ZnPg== is not a valid URL
2026-03-05 13:40:27,711 - fundus.parser.base_parser - INFO - Couldn't parse attribute 'images' for 'https://winfuture.de/downloadvorschalt,4010.html': ValueError('Bounds could not be determined')
2026-03-05 13:40:27,716 - fundus.utils.events - DEBUG - Cleared event 'stop' for 20268  (main-thread)
2026-03-05 13:40:27,716 - fundus.scraping.crawler - DEBUG - Shutdown done
Traceback (most recent call last):
  File "C:\Users\ad123\AppData\Roaming\JetBrains\PyCharm2025.1\scratches\scratch.py", line 28, in <module>
    for article in crawler.crawl(max_articles_per_publisher=10, only_complete=False, error_handling="suppress"):
  File "D:\Arbeit\SHK\Fundus\fundus\src\fundus\scraping\crawler.py", line 415, in crawl
    if (isinstance(self, Crawler) and self.threading) and not __EVENTS__.is_event_set(
  File "D:\Arbeit\SHK\Fundus\fundus\src\fundus\utils\events.py", line 272, in is_event_set
    return self._events[self._resolve(key)][event].is_set()
  File "D:\Arbeit\SHK\Fundus\fundus\src\fundus\utils\events.py", line 138, in _resolve
    return self._aliases[key]
  File "D:\Arbeit\SHK\Fundus\.fundus-venv\lib\site-packages\bidict\_base.py", line 524, in __getitem__
    return self._fwdm[key]
KeyError: 'Sportschau'
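
The traceback shows the lookup failing in `_resolve`, that is, at the alias table rather than at the event table. The sketch below reproduces that shape with plain dictionaries. It rests on an assumption the log does not confirm: that a worker's alias is removed once the worker finishes, while another thread still queries events by that alias. All names here (`aliases`, `events`, `worker`, `is_event_set_safe`) are hypothetical and not part of fundus.

```python
import threading

aliases = {}  # hypothetical: publisher name -> worker thread identifier
events = {}   # hypothetical: thread identifier -> {event name: Event}


def worker(name, registered, release):
    ident = threading.get_ident()
    events[ident] = {"stop": threading.Event()}
    aliases[name] = ident
    registered.set()
    release.wait()
    # Assumed behaviour: the worker removes its own entries on exit.
    del aliases[name]
    del events[ident]


def is_event_set(event, key):
    # Same shape as the failing call: resolve the alias, then index.
    return events[aliases[key]][event].is_set()


def is_event_set_safe(event, key):
    # Treat an unknown alias or thread as "event not set" instead of raising.
    stop = events.get(aliases.get(key), {}).get(event)
    return stop is not None and stop.is_set()


registered, release = threading.Event(), threading.Event()
thread = threading.Thread(target=worker, args=("Sportschau", registered, release))
thread.start()
registered.wait()

before = is_event_set("stop", "Sportschau")  # alias still registered

release.set()
thread.join()

try:
    is_event_set("stop", "Sportschau")
    error_key = None
except KeyError as error:
    error_key = error.args[0]  # the alias that could no longer be resolved

after_safe = is_event_set_safe("stop", "Sportschau")
```

The tolerant lookup is only one possible design. It hides the missing alias rather than explaining why it disappeared, so it illustrates the symptom and is not a proposed fix for this PR.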

MaxDall (Collaborator, Author) commented Mar 6, 2026

@addie9800 Thanks a lot for catching this. I will look into it.

Labels

bug Something isn't working

2 participants