Releases: Addono/HathiTrust-downloader
Release v1.4.0
This release brings several new features focused on improving download reliability and user control, along with important fixes and documentation updates.
✨ New Features
- Enhanced Download Retries & Control:
- Downloads will now automatically retry up to 8 times by default if an error occurs, an increase from previous versions, making downloads more resilient to temporary network issues.
- You can now customize the maximum number of retries using the new --max-retries flag. For example, hathitrust-downloader --max-retries 10 [BOOK_ID_OR_URL].
- When encountering "Forbidden" (403) errors from HathiTrust, the downloader will now use an exponential backoff strategy. This means it waits progressively longer between retries, increasing the chance of success without overwhelming the HathiTrust servers.
- Custom User-Agent:
- Advanced users can now specify a custom User-Agent string for all requests made by the downloader using the --user-agent flag. This can be useful for mimicking specific browsers or for troubleshooting.
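To illustrate the retry and exponential backoff behavior described under "Enhanced Download Retries & Control", here is a minimal Python sketch. It is not the downloader's actual implementation: the names ForbiddenError, backoff_delays and fetch_with_retries are hypothetical, and the 1-second base delay and 60-second cap are assumptions chosen only for illustration. Only the default of 8 retries comes from the notes above.

```python
import time
from typing import Callable, Iterator


class ForbiddenError(Exception):
    """Hypothetical stand-in for an HTTP 403 (Forbidden) response."""


def backoff_delays(max_retries: int = 8, base: float = 1.0,
                   cap: float = 60.0) -> Iterator[float]:
    """Yield one wait time per retry: base, 2*base, 4*base, ... capped at `cap`.

    The base and cap values are assumptions, not the tool's real settings.
    """
    for attempt in range(max_retries):
        yield min(cap, base * 2 ** attempt)


def fetch_with_retries(fetch: Callable[[], bytes], max_retries: int = 8,
                       sleep: Callable[[float], None] = time.sleep) -> bytes:
    """Call `fetch`, waiting progressively longer after each ForbiddenError."""
    delays = backoff_delays(max_retries)
    while True:
        try:
            return fetch()
        except ForbiddenError:
            delay = next(delays, None)
            if delay is None:
                raise  # all retries used up: surface the last error
            sleep(delay)
```

Passing `sleep` as a parameter keeps the sketch easy to try without real waiting, for example by passing a list's `append` method to record the delays instead of sleeping.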
🐛 Bug Fixes & Reliability
- Correct User-Agent Handling:
- Fixed an issue where the User-Agent string was not being correctly set on all outgoing requests. This improves compatibility and reliability when interacting with HathiTrust servers.
- Updated Example ID:
- The reference book ID used in examples or tests has been updated to ensure it points to a currently accessible item.
📚 Documentation
- Using URLs as Input:
- The README documentation has been updated to clarify that you can now directly pass a full HathiTrust book URL to the downloader, in addition to just the book ID. This offers more flexibility in how you specify the book to download.
🏗️ Build & Packaging
- Standardized Package Name:
- The package name has been updated to hathitrust-downloader to comply with Python packaging standards (PEP 625).
Release v1.3.0
What's Changed
- Adds optional support for giving a complete URL as the ID field by @Addono in #22. The application now tries to parse the ID automatically from the complete URL. This feature is a best-effort enhancement and might not work for all types of URLs or if the URL structure changes. Passing the ID directly remains the primary supported method.
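As a rough illustration of what best-effort ID parsing could look like, the Python sketch below accepts either a bare ID or a full URL. It is not the code from #22. The function name extract_book_id is hypothetical, and the two URL shapes it handles (an id query parameter on a page-viewer URL, and a handle URL with a /2027/ path prefix) are assumptions about how HathiTrust links are commonly structured. The ID in the usage note is a made-up placeholder.

```python
from urllib.parse import parse_qs, urlparse


def extract_book_id(value: str) -> str:
    """Return a book ID from either a bare ID or a full URL (best effort)."""
    value = value.strip()
    if "://" not in value:
        return value  # already a bare ID, use it unchanged
    parsed = urlparse(value)
    # Assumed page-viewer form: .../cgi/pt?id=<ID>&seq=...
    ids = parse_qs(parsed.query).get("id")
    if ids:
        return ids[0]
    # Assumed handle form: https://hdl.handle.net/2027/<ID>
    prefix = "/2027/"
    if parsed.path.startswith(prefix):
        return parsed.path[len(prefix):]
    raise ValueError(f"could not find a book ID in {value!r}")
```

Under these assumptions, `extract_book_id("https://babel.hathitrust.org/cgi/pt?id=xyz.0000000000&seq=7")` and `extract_book_id("xyz.0000000000")` both yield the same placeholder ID, while an unrecognized URL raises ValueError so the caller can fall back to asking for the ID directly.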
Full Changelog: v1.2.1...v1.3.0
Release v1.2.1
New Features and Enhancements:
- Improved Error Handling:
- Added specific error messages for HTTP status codes:
- 404 Not Found: Provides a detailed error when a page is not found.
- 500 Server Error: Prints a message indicating that the server failed to serve a page, which may indicate an invalid book identifier.
- Improved user experience by offering clearer guidance in case of errors during downloads.
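The status-specific messages listed above could be produced by a small mapping like the following Python sketch. The function name describe_http_error and the exact wording of the messages are hypothetical. Only the two status codes and their general meaning are taken from the notes.

```python
def describe_http_error(status_code: int, page: int) -> str:
    """Return a human-readable message for a failed page download."""
    if status_code == 404:
        return f"Page {page} was not found (404)."
    if status_code == 500:
        # A 500 may point at a bad identifier rather than a server outage.
        return (f"The server failed to serve page {page} (500). "
                "This may indicate an invalid book identifier.")
    return f"Unexpected HTTP status {status_code} for page {page}."
```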
Technical Improvements
CI/CD Updates:
- Dependabot Integration:
- Dependabot configuration added to automatically manage and update GitHub Action dependencies on a weekly schedule.
Testing Improvements:
- Bats Tests Integration:
- Implemented Bats tests for the HathiTrust Downloader module and integrated them into the CI pipeline.
- Tests cover scenarios such as:
- Checking CLI availability.
- Downloading single and multiple pages.
Release v1.2.0
Release Notes
New Features:
- Automatic Directory Creation: The downloader now automatically creates the directory path if it doesn't exist when saving files. This prevents errors related to missing directories and ensures a smoother download process (commit ed5e60e).
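A minimal Python sketch of this pattern, assuming a hypothetical helper named save_page (the actual change in commit ed5e60e may be structured differently):

```python
from pathlib import Path


def save_page(data: bytes, target: Path) -> Path:
    """Write `data` to `target`, creating missing parent directories first."""
    # parents=True creates intermediate directories;
    # exist_ok=True makes the call a no-op if they already exist.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target
```

Creating the parent directories just before writing means the same call works whether the output path is new or already exists.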
Dependency Updates:
- requests Updated to v2.32.3: Updated the requests package from v2.32.2 to v2.32.3 to include the latest bug fixes and performance improvements (commit be8b670).
- tqdm Updated to v4.66.5: Updated the tqdm package from v4.66.3 to v4.66.5 to benefit from the latest progress bar enhancements and fixes (commit c96d736).
Target Python Version Update:
- Python 3.10 Support: Updated the project configuration to officially support Python 3.10. This includes updating the GitHub Actions workflow and setup files to target Python 3.10, ensuring compatibility with the latest features and optimizations in the Python ecosystem (commit 4e00042).
Documentation Improvements:
- Help Text Update: Improved the help text in the CLI to reflect more accurate and complete usage information. The description now correctly states "Book downloader for HathiTrust" and clarifies the usage of the --name argument for specifying file paths (commit ad3a7ec).
Full Changelog: v1.1.5...v1.2.0
Release v1.1.5
This release includes a couple of security fixes by including newer versions of our upstream dependencies. In addition, the documentation on how to obtain the document ID has been slightly improved.
What's Changed
- build(deps): bump tqdm from 4.48.2 to 4.66.3 by @dependabot in #9
- build(deps): bump requests from 2.31.0 to 2.32.2 by @dependabot in #11
Full Changelog: v1.1.4...v1.1.5
Release v1.1.4
Note
This is a re-release of 1.1.2 to ensure the Windows-installer files are attached to this release.
Release v1.1.4-rc.1
Full Changelog: v1.1.3...v1.1.4-rc.1
v1.1.3
Release v1.1.2
Warning
Broken CI prevented a release artifact from being created. Use release 1.1.4 instead.
What's Changed
- build(deps): bump requests from 2.24.0 to 2.31.0 by @dependabot in #8
New Contributors
- @dependabot made their first contribution in #8
Full Changelog: v1.1.1...v1.1.2
Release v1.1.1
refactor: adds main function as to have a correct entrypoint