Thank you for your interest in contributing!
We welcome contributions from everyone. This document outlines the guidelines for how to contribute effectively and
respectfully.
We appreciate various kinds of contributions, including but not limited to:
- 🛠️ Code contributions (bug fixes, performance improvements, new features)
- 🐞 Bug reports
- 💡 Feature suggestions
- ❓ Questions and discussions
- 📚 Improving documentation
If you have questions:
- Check the README and existing issues first.
- If your question hasn't been addressed, open a new issue using the Question label.
When reporting a bug, please include the following:
- A clear and descriptive title
- Steps to reproduce the issue
- Expected vs actual behavior
- Environment info (OS, version, etc.)
- Logs or screenshots if available
Use the Bug Report issue template when creating the issue.
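To make the "Environment info" item easier to fill in, here is a minimal sketch of a script that prints the OS and the versions of the tools this project lists as build prerequisites. The function names `tool_version` and `collect_env_info` are made up for this example and are not part of the repository, and the `python3` command name is an assumption (on some systems the interpreter is installed as `python`).

```shell
#!/bin/sh
# Print one line of version info for a tool, or note that it is missing.
# Usage: tool_version <label> <command> [args...]
# NOTE: hypothetical helper for illustration, not part of the project.
tool_version() {
    label=$1
    shift
    if command -v "$1" >/dev/null 2>&1; then
        # Some tools (for example "java -version") write to stderr, so merge
        # both streams and keep only the first line.
        printf '%s: %s\n' "$label" "$("$@" 2>&1 | head -n 1)"
    else
        printf '%s: not found\n' "$label"
    fi
}

# Print the OS plus the version of each prerequisite named in this guide.
collect_env_info() {
    printf 'OS: %s\n' "$(uname -sr)"
    tool_version 'Java' java -version
    tool_version 'Maven' mvn -v
    tool_version 'Python' python3 --version
    tool_version 'uv' uv --version
    tool_version 'Node.js' node --version
    tool_version 'pnpm' pnpm --version
}

collect_env_info
```

You can paste the output into the bug report as-is. Tools that are not installed show up as "not found", which is useful information too.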
To suggest a new feature:
- Search existing issues to avoid duplicates.
- If it's new, open a new issue using the Feature Request template.
- Describe your idea, use cases, and possible alternatives.
- Fork the repository.
- Clone your fork:
      git clone https://github.com/your-username/opendataloader-pdf.git
      cd opendataloader-pdf
- Create a feature branch:
      git checkout -b my-feature
- Build the project:
  Prerequisites: Java 11+, Maven, Python 3.10+, uv, Node.js 20+, pnpm. See the Development Workflow guide for OS-specific install instructions.
      # Build Java packages
      npm run build-java
      # If you changed CLI options in Java, sync bindings (regenerates options.json, Python/Node.js wrappers)
      npm run sync
  Important: If you modified any CLI options in Java, you must run npm run sync before committing. This regenerates options.json and all Python/Node.js bindings. Forgetting this silently breaks the wrappers.
- Make your changes and commit them.
- Push your branch:
      git push origin my-feature
- Open a Pull Request (PR) against the main branch.
- Respond to review comments and update your PR as needed.
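Because a forgotten npm run sync fails silently, some contributors may want a local reminder. The following is only a sketch of one possible heuristic, not something the project ships: it flags a change set that contains Java sources but no options.json. The function name `needs_sync_reminder` and the example paths are hypothetical. The check will also fire for Java changes that do not touch CLI options, so treat it as a nudge and not as a gate.

```shell
#!/bin/sh
# Read a newline-separated list of changed file paths on stdin.
# Succeeds (exit 0) when Java sources changed but options.json did not,
# which is when a reminder to run "npm run sync" seems warranted.
# NOTE: hypothetical helper; the heuristic is an assumption, not project policy.
needs_sync_reminder() {
    files=$(cat)
    # No Java source in the change set: nothing to remind about.
    printf '%s\n' "$files" | grep -q '\.java$' || return 1
    # options.json is already in the change set: sync was probably run.
    if printf '%s\n' "$files" | grep -Eq '(^|/)options\.json$'; then
        return 1
    fi
    return 0
}

# Example with a made-up file list. In a pre-commit hook the list would
# come from: git diff --cached --name-only
if printf '%s\n' 'src/main/java/Cli.java' 'README.md' | needs_sync_reminder; then
    echo 'Java files changed without options.json: did you run "npm run sync"?'
fi
```

Reading the file list from stdin keeps the function independent of Git, so it can be tried out without a repository.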
- Follow existing code conventions.
- Run linters/formatters before committing.
- Write unit tests for any new or changed logic.
- Run ./scripts/bench.sh before submitting a PR; CI will fail if benchmark scores drop below thresholds.
- Keep your changes minimal and focused.
Use the following format:
<type> <short summary>
- Add: New feature
- Fix: Bug fix
- Update: Code update
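As an illustration of the format above, here is a small sketch that checks whether a commit subject starts with one of the three listed types followed by a space and a summary. It assumes the literal reading of `<type> <short summary>` (type word, one space, then text); the function name `valid_commit_subject` and the sample subjects are invented for this example, and the project may enforce the format differently or not at all.

```shell
#!/bin/sh
# Succeeds when the subject starts with Add, Fix, or Update, followed by a
# space and a non-empty summary.
# ASSUMPTION: only these three types exist and no other separator is required.
valid_commit_subject() {
    case $1 in
        'Add '?* | 'Fix '?* | 'Update '?*) return 0 ;;
        *) return 1 ;;
    esac
}

# Try the check on two sample subjects.
for subject in 'Fix null pointer in table parser' 'fixed stuff'; do
    if valid_commit_subject "$subject"; then
        echo "ok:  $subject"
    else
        echo "bad: $subject"
    fi
done
```

The first sample passes and the second does not, because it starts with a lowercase word that is not one of the listed types.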
Depending on your contribution, we may ask you to sign:
- CLA – Contributor License Agreement
- DCO – Developer Certificate of Origin
To sign the DCO, add Signed-off-by to your commit message:
git commit -s -m "your message"
Make sure your Git config (user.name and user.email) contains your real name and email, since the -s flag copies them into the Signed-off-by line.
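If you want to double-check a commit message before pushing, the sketch below tests for a sign-off line of the form `Signed-off-by: Name <email>`, which is what git commit -s appends. The function name `has_signoff` and the identity in the sample message are placeholders, and the email pattern is deliberately loose.

```shell
#!/bin/sh
# Reads a commit message on stdin and succeeds if it contains a line of the
# form "Signed-off-by: Name <email>".
# NOTE: hypothetical helper for illustration, not part of the project.
has_signoff() {
    grep -Eq '^Signed-off-by: .+ <[^<>@ ]+@[^<>@ ]+>$'
}

# Sample message with a placeholder identity.
msg='Fix typo in README

Signed-off-by: Jane Doe <jane@example.com>'

if printf '%s\n' "$msg" | has_signoff; then
    echo 'sign-off present'
else
    echo 'sign-off missing'
fi
```

To check your latest commit, pipe in the output of `git log -1 --format=%B`. In a commit-msg hook, Git passes the path of the message file as the first argument, so the same function can be called as `has_signoff < "$1"`.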
Thank you again for helping us improve this project! 🙌 If you have any questions, open an issue or join the discussion.