feat(data): allow Software Heritage IP addresses #1542

Open
pabs3 wants to merge 1 commit into TecharoHQ:main from pabs3:allow-software-heritage-IP-addresses
pabs3 commented Mar 26, 2026

Software Heritage is similar to the Internet Archive (which is already allowed), but it archives source code from forges rather than pages from websites.

Software Heritage does not use generic web crawling. Instead, it uses forge-specific and VCS-specific tools that are designed to use as few resources as possible, in particular by making incremental pulls, using or adding APIs, and skipping repositories that have not changed.

https://github.com/TecharoHQ/anubis/pulls/276
https://www.softwareheritage.org/
https://www.softwareheritage.org/software-heritage-faq/
https://docs.softwareheritage.org/user/faq/
https://gitlab.softwareheritage.org/swh
https://gitlab.softwareheritage.org/swh/infra/add-forge-now-requests/-/merge_requests/4

Disclosure: I am currently a contractor with Software Heritage.

Checklist:

  • Added a description of the changes to the [Unreleased] section of docs/docs/CHANGELOG.md
  • Added test cases to the relevant parts of the codebase
  • Ran integration tests npm run test:integration (unsupported on Windows, please use WSL)
  • All of my commits have verified signatures
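
For readers unfamiliar with address-based allow rules, here is a minimal sketch of the kind of check such a rule implies: a request is let through when its remote address falls inside a listed network. This is an illustration only, not Anubis's implementation. The function name `is_allowed` and the `ALLOWED_NETWORKS` list are hypothetical, and the ranges are the reserved documentation blocks (192.0.2.0/24 and 2001:db8::/32), not Software Heritage's real addresses.

```python
import ipaddress

# Placeholder ranges taken from the reserved documentation blocks.
# They are NOT Software Heritage's real addresses; substitute the
# ranges from the actual change before using anything like this.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_allowed(remote_addr: str) -> bool:
    """Return True if remote_addr lies inside any allowlisted network."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Anything that does not parse as an IP address is never allowed.
        return False
    # An IPv4 address tested against an IPv6 network (or the reverse)
    # is simply not a member, so a mixed list is safe to iterate.
    return any(addr in net for net in ALLOWED_NETWORKS)


if __name__ == "__main__":
    for candidate in ("192.0.2.10", "198.51.100.1", "2001:db8::1"):
        print(candidate, is_allowed(candidate))
```

Matching on networks rather than individual addresses keeps the list short and lets an operator move hosts within a range without a new rule.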

pabs3 force-pushed the allow-software-heritage-IP-addresses branch from c5f4e46 to 79cab4f on March 26, 2026 02:32
pabs3 changed the title from "Allow Software Heritage IP addresses" to "feat(data): allow Software Heritage IP addresses" on Mar 26, 2026