icloud-photos-downloader / icloud_photos_downloader Public

iCloud Photos Downloader Improvement Checklist

Last updated: 2026-03-03

Use this as the source of truth for implementation progress. Every code change should update this file.

0. Project hygiene and tracking

Add or update an architecture note describing the current pipeline and the target pipeline.
Keep this checklist aligned with actual implemented code and tests.
For each completed task, reference the related PR/commit in this file.
Keep changelog/release notes in sync when user-facing flags/behavior change.
Ensure local development/testing uses Python 3.13 in .venv to match project constraints.

1. Unified retry and backoff (metadata + downloads)

1.1 Policy and configuration

Define one retry policy module shared by metadata calls and file downloads.
Add CLI option: --max-retries (default target: 6).
Add CLI option: --backoff-base-seconds.
Add CLI option: --backoff-max-seconds.
Add CLI option: --respect-retry-after/--no-respect-retry-after.
Add CLI option: --throttle-cooldown-seconds.
Ensure defaults preserve safe behavior for existing users.

1.2 Error classification

Classify fatal auth/config errors as no-retry (invalid credentials, MFA unavailable, Advanced Data Protection (ADP) enabled, web access disabled).
Classify session-invalid errors as re-auth-then-retry.
Classify transient errors as retryable (429, 503, timeouts, connection resets, throttling-like denials).
Centralize retry decision logging (attempt, reason, next delay).
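
The three-way classification above could be sketched roughly as follows. This is only an illustration: the error-kind strings, the `RetryAction` enum, and the `classify` function are hypothetical names, not part of the existing code base, and the choice to treat unknown errors as non-retryable is an assumption.

```python
from enum import Enum
from typing import Optional


class RetryAction(Enum):
    NO_RETRY = "no_retry"
    REAUTH_THEN_RETRY = "reauth_then_retry"
    RETRY = "retry"


# Hypothetical error kinds; real code would map its own exceptions onto these.
FATAL_KINDS = {"invalid_credentials", "mfa_unavailable", "adp_web_disabled"}
SESSION_KINDS = {"session_invalid"}
TRANSIENT_KINDS = {"timeout", "connection_reset", "throttled"}
TRANSIENT_STATUSES = {429, 503}


def classify(kind: Optional[str] = None,
             http_status: Optional[int] = None) -> RetryAction:
    """Map an error kind and/or HTTP status onto a retry decision."""
    if kind in FATAL_KINDS:
        return RetryAction.NO_RETRY
    if kind in SESSION_KINDS:
        return RetryAction.REAUTH_THEN_RETRY
    if kind in TRANSIENT_KINDS or http_status in TRANSIENT_STATUSES:
        return RetryAction.RETRY
    # Assumption: anything unrecognized is not retried in this sketch.
    return RetryAction.NO_RETRY
```

Keeping the decision in one pure function makes the "Unit tests for retry classifier" item straightforward to cover.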

1.3 Integration points

Apply shared retry policy to album/asset enumeration calls.
Apply shared retry policy to download calls.
Remove/replace duplicated ad-hoc retry loops in existing code paths.
Add jitter to exponential backoff.
Honor Retry-After when present on retryable responses.
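
As a minimal sketch of exponential backoff with jitter and Retry-After handling: the function name `next_delay`, its parameters, and the default values are assumptions for illustration (the checklist does not fix defaults for the backoff base or maximum). The sketch uses "full jitter" and treats Retry-After as a lower bound on the delay; other jitter strategies would fit the checklist equally well.

```python
import random
from typing import Optional


def next_delay(attempt: int,
               base: float = 1.0,
               cap: float = 60.0,
               retry_after: Optional[float] = None,
               respect_retry_after: bool = True,
               rng: Optional[random.Random] = None) -> float:
    """Return the number of seconds to wait before retry number `attempt` (0-based)."""
    source = rng or random
    # Exponential growth, bounded by the configured maximum.
    ceiling = min(cap, base * (2 ** attempt))
    # Full jitter: pick uniformly between 0 and the ceiling.
    delay = source.uniform(0, ceiling)
    # Assumption: a server-provided Retry-After acts as a floor and is not capped.
    if respect_retry_after and retry_after is not None:
        delay = max(delay, retry_after)
    return delay
```

Passing a seeded `random.Random` as `rng` keeps the jitter-bounds tests deterministic.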

1.4 Verification

Unit tests for retry classifier.
Unit tests for backoff math and jitter bounds.
Unit tests for Retry-After handling.
Integration tests: metadata retry behavior under simulated 429/503.
Integration tests: download retry behavior under simulated 429/503/reset.

2. Persistent state DB and resumable task queue

2.1 Data model

Add --state-db option (or equivalent path option) with sensible default.
Create DB initialization/migration path.
Create assets table.
Create tasks table with status/attempt/error fields.
Create checkpoints table for pagination progress.
Add indexes for task leasing and status filtering.
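
One possible shape for the data model, assuming SQLite as the state store. All table layouts, column names, and index names below are hypothetical placeholders meant to show how the items above fit together, not a proposed final schema.

```python
import sqlite3

# Hypothetical schema: column names are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS assets (
    asset_id   TEXT PRIMARY KEY,
    filename   TEXT,
    created_at TEXT
);
CREATE TABLE IF NOT EXISTS tasks (
    task_id     INTEGER PRIMARY KEY,
    asset_id    TEXT NOT NULL REFERENCES assets(asset_id),
    version     TEXT NOT NULL,
    status      TEXT NOT NULL DEFAULT 'pending',
    attempts    INTEGER NOT NULL DEFAULT 0,
    last_error  TEXT,
    lease_owner TEXT,
    leased_at   REAL,
    UNIQUE (asset_id, version)
);
CREATE TABLE IF NOT EXISTS checkpoints (
    album       TEXT PRIMARY KEY,
    page_offset INTEGER NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_tasks_status ON tasks(status);
CREATE INDEX IF NOT EXISTS idx_tasks_lease ON tasks(status, leased_at);
"""


def init_db(path: str) -> sqlite3.Connection:
    """Open the state DB and create any missing tables and indexes."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

Because every statement uses `IF NOT EXISTS`, calling `init_db` on an existing database is harmless; real migrations would still need a version marker.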

2.2 Enumeration persistence

Persist enumerated assets in batches.
Persist tasks per asset version.
Save checkpoint every page (or configurable page interval).
Resume enumeration from checkpoint after restart.

2.3 Worker/task lifecycle

Add task states: pending, in_progress, done, failed.
Add lease timestamp/owner for in_progress.
Requeue stale leased tasks on startup.
Track per-task attempts and last error.
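
The lease and requeue lifecycle might look like the sketch below, again assuming SQLite. The table definition is a deliberately minimal stand-in, and `lease_next` and `requeue_stale` are hypothetical helper names. The sketch assumes a single writer process; concurrent workers in separate processes would need an explicit transaction around the select and update.

```python
import sqlite3
from typing import Optional


def make_tasks_table(conn: sqlite3.Connection) -> None:
    # Minimal hypothetical table, just enough to show the lifecycle.
    conn.execute(
        "CREATE TABLE tasks ("
        " task_id INTEGER PRIMARY KEY,"
        " status TEXT NOT NULL DEFAULT 'pending',"
        " attempts INTEGER NOT NULL DEFAULT 0,"
        " last_error TEXT,"
        " lease_owner TEXT,"
        " leased_at REAL)"
    )


def lease_next(conn: sqlite3.Connection, owner: str, now: float) -> Optional[int]:
    """Mark the oldest pending task as in_progress and return its id."""
    row = conn.execute(
        "SELECT task_id FROM tasks WHERE status = 'pending' ORDER BY task_id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    conn.execute(
        "UPDATE tasks SET status = 'in_progress', lease_owner = ?, leased_at = ?,"
        " attempts = attempts + 1 WHERE task_id = ?",
        (owner, now, row[0]),
    )
    conn.commit()
    return row[0]


def requeue_stale(conn: sqlite3.Connection, now: float, lease_timeout: float) -> int:
    """Return leased tasks older than lease_timeout seconds to pending."""
    cur = conn.execute(
        "UPDATE tasks SET status = 'pending', lease_owner = NULL, leased_at = NULL"
        " WHERE status = 'in_progress' AND leased_at < ?",
        (now - lease_timeout,),
    )
    conn.commit()
    return cur.rowcount
```

Calling `requeue_stale` once at startup covers the crash case: tasks left in `in_progress` by a dead run become `pending` again while their attempt counts are preserved.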

2.4 Verification

Unit tests for DB schema creation and migrations.
Unit tests for lease/requeue behavior.
Integration test: crash mid-run and resume without redoing completed tasks.
Integration test: checkpoint resume after partial enumeration.

2.5 URL freshness

Detect expired/invalid persisted download URLs and refresh asset version metadata.
Add task/state marker for URL refresh path (e.g., needs_url_refresh) and retry flow.

3. Bounded adaptive concurrency

3.1 CLI and defaults

Add --download-workers option (default target: 4).
Keep metadata enumeration single-threaded by default.
Document deprecation relationship with --threads-num.

3.2 Limiting and adaptation

Implement shared account-level limiter for download workers.
Separate metadata and download request budgets (if needed by code design).
Implement AIMD or equivalent adaptive reduction on throttling events.
Add global cool-down behavior when repeated throttle signals occur.
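
A rough sketch of AIMD (additive increase, multiplicative decrease) applied to the worker limit. The class name `AimdLimit`, its parameters, and the halving factor are illustrative assumptions; how "healthy" periods and throttle events are detected is left to the caller.

```python
import threading


class AimdLimit:
    """Tracks how many download workers may run at once."""

    def __init__(self, max_workers: int = 4, min_workers: int = 1,
                 decrease_factor: float = 0.5) -> None:
        self.max_workers = max_workers
        self.min_workers = min_workers
        self.decrease_factor = decrease_factor
        self.limit = max_workers
        self._lock = threading.Lock()

    def on_healthy_window(self) -> int:
        """Additive increase: allow one more worker, up to the maximum."""
        with self._lock:
            self.limit = min(self.max_workers, self.limit + 1)
            return self.limit

    def on_throttle(self) -> int:
        """Multiplicative decrease: cut the limit, never below the minimum."""
        with self._lock:
            self.limit = max(self.min_workers, int(self.limit * self.decrease_factor))
            return self.limit
```

With `max_workers=8`, three consecutive throttle events reduce the limit to 4, 2, then 1, and each healthy window afterwards raises it by one until it reaches 8 again.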

3.3 Session/cookie safety

Audit all session/cookie writes under concurrent access.
Add locking or redesign to avoid concurrent write races.
Ensure no cookie/session corruption under multithreaded runs.

3.4 Verification

Unit tests for limiter/token bucket behavior.
Concurrency tests for session persistence safety.
Integration tests for worker pool drain/stop/restart behavior.
Benchmark runs at workers = 1, 2, 4, 8 and record throughput + error rate.

4. Download efficiency and integrity

4.1 Throughput improvements

Add --download-chunk-bytes option (default target: 262144, i.e. 256 KiB).
Replace fixed 1 KiB streaming chunk with configurable larger chunk.
Verify memory usage remains bounded by worker count and chunk size.
Benchmark chunk-size/verification combinations for throughput vs CPU tradeoff.

4.2 Integrity checks

Add --verify-size/--no-verify-size option.
Add --verify-checksum/--no-verify-checksum option.
Validate downloaded file size against expected metadata.
Implement optional checksum validation strategy.
Store local checksum/result in state DB when enabled.
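
Size and checksum validation could be sketched as below. The function name `verify_file` and its parameters are hypothetical, and SHA-256 is an assumption chosen for illustration: the checklist leaves the checksum strategy open, and which digest (if any) is available in the remote metadata is not stated here.

```python
import hashlib
import os
from typing import Optional


def verify_file(path: str,
                expected_size: Optional[int] = None,
                expected_sha256: Optional[str] = None,
                chunk_bytes: int = 262144) -> bool:
    """Return True if the file matches every expectation that was supplied."""
    # The size check is cheap, so it runs first.
    if expected_size is not None and os.path.getsize(path) != expected_size:
        return False
    if expected_sha256 is not None:
        digest = hashlib.sha256()
        with open(path, "rb") as handle:
            # Read in chunks so memory use stays bounded for large videos.
            for chunk in iter(lambda: handle.read(chunk_bytes), b""):
                digest.update(chunk)
        if digest.hexdigest() != expected_sha256.lower():
            return False
    return True
```

Passing `None` for either expectation skips that check, which maps naturally onto the `--no-verify-size` and `--no-verify-checksum` toggles.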

4.3 Range resume hardening

Keep .part resume behavior with Range requests.
Detect non-206 response when resuming and safely restart partial file.
Add corruption-safe handling for mismatched range behavior.
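
The resume decision can be isolated in a small pure function, sketched here under stated assumptions. The name `resume_plan` and the action strings are hypothetical. The logic follows standard HTTP range semantics: a 206 response whose Content-Range starts at the current `.part` size can be appended, a 200 response means the server ignored the Range header and sent the whole file, and anything inconsistent leads to discarding the partial file and re-requesting without a Range header.

```python
import re
from typing import Optional

_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def resume_plan(part_size: int, status: int,
                content_range: Optional[str]) -> str:
    """Decide what to do with a response to a ranged resume request.

    Returns one of:
      "append"    - write the body to the end of the .part file
      "overwrite" - truncate the .part file and write the body from offset 0
      "refetch"   - discard the body and the .part file, request again without Range
      "error"     - not a resume-related outcome; leave to the retry policy
    """
    if status == 200:
        # Server ignored the Range header and is sending the full file.
        return "overwrite"
    if status == 206:
        match = _CONTENT_RANGE.match(content_range or "")
        if match and int(match.group(1)) == part_size:
            return "append"
        # Missing or mismatched range: appending could corrupt the file.
        return "refetch"
    if status == 416:
        # Requested range not satisfiable, e.g. the .part file is too large.
        return "refetch"
    return "error"
```

For example, `resume_plan(100, 206, "bytes 100-199/200")` yields `"append"`, while the same call with `"bytes 0-199/200"` yields `"refetch"`.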

4.4 Verification

Unit tests for chunk-size configuration and defaults.
Unit tests for size verification success/failure.
Unit tests for checksum verification success/failure.
Integration tests for resume with partial files and range edge cases.

5. Request volume and enumeration efficiency

Add --album-page-size option (target range: 50-500).
Add --no-remote-count option to skip expensive album count calls.
Reduce redundant metadata queries where possible.
Add or align options for chunked, date-based runs (since/until semantics based on added date).
Document clear behavior differences between added-date and created-date usage.
Add tests for new pagination and remote-count toggles.

6. Observability and operations

6.1 Logging

Add structured JSON log mode.
Include stable fields (run_id, asset_id, attempt, http_status, etc.).
Ensure sensitive data redaction remains enforced.
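
A structured JSON log mode could be built on the standard `logging` module, roughly as sketched here. The `JsonFormatter` class and the `STABLE_FIELDS` tuple are hypothetical names; the field list mirrors the examples given above. Emitting only allow-listed extra fields limits accidental leakage, but the message text itself would still need to pass through the existing redaction.

```python
import json
import logging

# Extra fields copied into every JSON line when present on the record.
STABLE_FIELDS = ("run_id", "asset_id", "attempt", "http_status")


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        for field in STABLE_FIELDS:
            if hasattr(record, field):
                payload[field] = getattr(record, field)
        return json.dumps(payload, sort_keys=True)
```

Callers would attach the fields through the standard `extra` argument, for example `logger.warning("throttled", extra={"run_id": run_id, "http_status": 429})`.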

6.2 Metrics and health

Add metrics endpoint or export path (if compatible with current stack).
Track throughput, retries, throttle events, queue depth, success gap.
Add low-disk-space warning/error classification.
Provide JSON stats snapshot output suitable for GUI wrappers (--metrics-json).

6.3 Alerts and notifications

Add alert condition for repeated throttling.
Keep MFA expiry notification path working with new engine.
Add docs for recommended operational thresholds.

7. Documentation and migration

Update CLI reference docs for all new options.
Add migration guide: stateless mode vs stateful mode.
Document compatibility and unchanged default behavior.
Document concurrency limitations and safe defaults.
Add troubleshooting guide for throttling/session issues.

9. Runtime semantics and operability hardening

9.1 Mode contract

Define explicit legacy/stateless mode contract (no DB required, filesystem skip semantics).
Define explicit stateful engine mode contract (resume guarantees, task-state semantics).
Add integration tests asserting mode-specific behavior and parity expectations.

9.2 Exit and summary semantics

Define process exit code contract (success, partial success, fatal auth/config, cancelled, stalled).
Emit machine-readable end-of-run summary with totals/failures/error location hints.
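
One way to express the exit code contract and the end-of-run summary together. The numeric values, the `ExitCode` enum, the `end_of_run` function, and the summary keys are all placeholders for illustration; the checklist names only the categories, and the precedence between them shown here is an assumption.

```python
import json
from enum import IntEnum
from typing import Tuple


class ExitCode(IntEnum):
    # Placeholder values; the contract itself is still to be defined.
    SUCCESS = 0
    PARTIAL_SUCCESS = 1
    FATAL_AUTH_CONFIG = 2
    CANCELLED = 3
    STALLED = 4


def end_of_run(total: int, failed: int, fatal: bool = False,
               cancelled: bool = False, stalled: bool = False) -> Tuple[ExitCode, str]:
    """Pick an exit code and build a one-line JSON summary."""
    if fatal:
        code = ExitCode.FATAL_AUTH_CONFIG
    elif cancelled:
        code = ExitCode.CANCELLED
    elif stalled:
        code = ExitCode.STALLED
    elif failed > 0:
        code = ExitCode.PARTIAL_SUCCESS
    else:
        code = ExitCode.SUCCESS
    summary = json.dumps({
        "exit_code": int(code),
        "status": code.name.lower(),
        "total": total,
        "succeeded": total - failed,
        "failed": failed,
    }, sort_keys=True)
    return code, summary
```

A wrapper script could then branch on the process exit status and parse the summary line for details.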

9.3 Cancellation and shutdown

Handle SIGINT/SIGTERM with graceful stop (drain or safe requeue of in-flight work).
Ensure clean shutdown is distinguishable from crash and restart behavior is deterministic.

9.4 State DB growth and retention

Add DB retention/pruning policy (completed task cleanup / capped error history).
Document and/or automate WAL checkpointing and vacuum guidance.
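
A maintenance pass along these lines could prune old completed tasks and then compact the database, assuming SQLite in WAL mode. The function name `prune_and_compact`, the `finished_at` column, and the retention window are hypothetical. `PRAGMA wal_checkpoint(TRUNCATE)` and `VACUUM` are standard SQLite statements; `VACUUM` must run outside a transaction, which is why the delete is committed first.

```python
import sqlite3


def prune_and_compact(conn: sqlite3.Connection, now: float,
                      keep_seconds: float) -> int:
    """Delete completed tasks older than keep_seconds, then compact the DB."""
    cur = conn.execute(
        "DELETE FROM tasks WHERE status = 'done' AND finished_at < ?",
        (now - keep_seconds,),
    )
    deleted = cur.rowcount
    conn.commit()
    # Fold the write-ahead log back into the main file and truncate it.
    # fetchall() drains the result row so no statement is left open before VACUUM.
    conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchall()
    # Rebuild the file to reclaim the pages freed by the delete.
    conn.execute("VACUUM")
    return deleted
```

`VACUUM` rewrites the whole file, so on large state databases it is probably better run occasionally (for example at the end of a run) than after every prune.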

8. Final validation before release

Full test suite passes.
New tests added for each new subsystem.
Lint/type checks pass.
Manual end-to-end dry run on small sample library.
Manual end-to-end run with injected transient failures.
Confirm no regressions in naming/dedup/folder behavior.
Confirm watch mode behavior is unchanged unless explicitly modified.
