Tkinter UI for icrawler #123

Open
@Patty-OFurniture

I have a simple Tkinter UI for icrawler, which fixes several issues WITHOUT changing the core library. If you are interested, it shows some interesting things you can do with icrawler. Yes, it might look like a mess, but if you are already using icrawler it should be clear. I can write Python and I am learning Tkinter, so suggestions are welcome on my Issues list. Most things work, and I want to add more.

I forked the whole project in case I needed to make fixes, but the UI is all in /examples/:

  • FileTypes.py
  • FilenameDownloader.py
  • GoogleLanguageOptions.py
  • iCrawlerTK.py
  • iCrawlerTK.yaml
  • logging.conf

https://github.com/Patty-OFurniture/icrawler

#98 - the keep_file() override in FilenameDownloader checks the file type; for example, you can return False if extension != "jpg"
#111 - an example of how to override set_logger() for full control (commented out for me)
#108 - gets the file name (from Content-Disposition or the URL)
#108 - also logs (INFO) the image number, filename, and URL. You can change the formatting, log to a file, or whatever else you want
#117 and #107 - logs (DEBUG) the Google content if no images are found, to help resolve it if it's still a problem
#110 - a similar log could be done for Bing. Not implemented, but easily copied (google.py)
#106 - a keyword separator option, so you can enter, for example, "beans|rice" and search first for "beans" and then for "rice", separately
#103 - the Google language selection fix should help Baidu, since it adds headers to look more like a web browser and avoid getting flagged
#104 - the Google language selection should help. Common languages are in GoogleLanguageOptions.py; add to it if you need to
#61 - sort of fixed: it creates a directory for each keyword. "rice" goes in storage/rice/, "beans" in storage/beans/. Hopefully it is a good example.
#121 - a better, but not perfect, check for disk space errors, in the core library
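
As a rough illustration of the keyword separator (#106), the per-keyword directories (#61), and the extension check (#98), here is a minimal standalone sketch. It is not the code from the fork: the function names, the "storage" root, and the character-replacement rule are all assumptions made for the example, and it uses only the standard library rather than icrawler itself.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def split_keywords(text, separator="|"):
    """Split 'beans|rice' into ['beans', 'rice'], dropping empty entries."""
    return [part.strip() for part in text.split(separator) if part.strip()]


def storage_dir_for(keyword, root="storage"):
    """Return a per-keyword directory such as 'storage/rice'.

    The replacement rule below is a hypothetical policy, not the fork's.
    """
    safe = "".join(c if c.isalnum() or c in "-_ " else "_" for c in keyword)
    return str(PurePosixPath(root) / safe.strip())


def has_allowed_extension(url, allowed=("jpg", "jpeg")):
    """Return True if the URL path ends in one of the allowed extensions.

    A keep_file()-style override could return False when this is False.
    """
    ext = PurePosixPath(urlparse(url).path).suffix.lstrip(".").lower()
    return ext in allowed


if __name__ == "__main__":
    # Each keyword would be searched separately and stored in its own directory.
    for keyword in split_keywords("beans|rice"):
        print(keyword, "->", storage_dir_for(keyword))
```

Checking the extension from the URL alone is cheap but can be wrong when a server mislabels a file, which is why content-based detection is also useful.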

There is also image type detection for #108, to find the correct file extension.
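
For anyone curious how the file name and extension handling for #108 could work, here is a small sketch under stated assumptions. It is not taken from the fork: the function names and the signature table are hypothetical, and it takes the URL, the Content-Disposition header value, and the first bytes of the file as plain arguments so that it does not depend on icrawler or any HTTP library.

```python
from email.message import Message
from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse

# Leading bytes ("magic numbers") of a few common image formats.
SIGNATURES = (
    (b"\xff\xd8\xff", "jpg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"BM", "bmp"),
)


def filename_from_response(url, content_disposition=None):
    """Prefer the Content-Disposition filename, else the last URL path segment."""
    if content_disposition:
        msg = Message()
        msg["Content-Disposition"] = content_disposition
        name = msg.get_filename()
        if name:
            # Keep only the final component so a header cannot inject a path.
            return PurePosixPath(name).name
    return PurePosixPath(unquote(urlparse(url).path)).name


def sniff_extension(data):
    """Guess an extension from the first bytes of the file, or return None."""
    for magic, ext in SIGNATURES:
        if data.startswith(magic):
            return ext
    # WebP is a RIFF container with 'WEBP' at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None


def with_detected_extension(filename, data):
    """Replace (or add) the extension when the content type is recognized."""
    ext = sniff_extension(data)
    if not filename or ext is None:
        return filename
    return str(PurePosixPath(filename).with_suffix("." + ext))
```

For example, a file served as "photo.jpg" whose content starts with the PNG signature would be renamed to "photo.png", while unrecognized content leaves the name unchanged.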

Thanks to hellock for the library, I'm just making it easier for me to use!

Have fun!
Patty
