CKAN 2.11 migration#17
Merged
Merged
Conversation
Updated Python versions and GitHub Actions versions in the workflow file. Changed container image references for CKAN versions.
- Fixed E721 flake8 error by using isinstance() instead of type comparison - Removed --user root from container options which was causing permission issues Co-authored-by: ChasNelson1990 <7795189+ChasNelson1990@users.noreply.github.com>
- Restored --user root option to fix checkout permissions - Added harvester initdb command to initialize harvest tables Co-authored-by: ChasNelson1990 <7795189+ChasNelson1990@users.noreply.github.com>
- Fixed dependency compatibility issue between pyOpenSSL and cryptography - Added version constraints to ensure compatible library versions Co-authored-by: ChasNelson1990 <7795189+ChasNelson1990@users.noreply.github.com>
- Updated pyOpenSSL from 18.0.0 to >=23.2.0 in requirements.txt - Added cryptography>=41.0.0,<43.0.0 constraint to requirements.txt - Removed redundant version pins from workflow (now in requirements.txt) Co-authored-by: ChasNelson1990 <7795189+ChasNelson1990@users.noreply.github.com>
- Removed urllib3.contrib.pyopenssl import and inject_into_urllib3() call - This fixes compatibility with cryptography>=42.0.0 - Modern Python and urllib3 have built-in SSL/TLS support Co-authored-by: ChasNelson1990 <7795189+ChasNelson1990@users.noreply.github.com>
- Removed @patch decorators for pyopenssl.inject_into_urllib3 in test_ckanharvester.py - These mocks are no longer needed since we removed the pyopenssl dependency Co-authored-by: ChasNelson1990 <7795189+ChasNelson1990@users.noreply.github.com>
Fix lint errors and dependency compatibility issues in test workflow
- Update CI workflow to test CKAN 2.11 with Python 3.10 only - Fix SQLAlchemy table existence checks using Inspector pattern - Add bind=engine parameter to all table and index creation calls - Replace nose test assertions with standard Python assertions in test_timeouts.py This resolves database initialization errors during plugin configuration and allows tests to run successfully under CKAN 2.11.
In CKAN 2.11, group IDs are auto-generated and cannot be specified during creation. Updated the harvester to: - Remove 'id' field from remote group data before calling group_create - Capture the created group with its new auto-generated ID - Use the new ID when adding groups to datasets Updated test to search for groups by name instead of remote ID, since the local group will have a different auto-generated ID. Test: test_remote_groups_create now passes
Fixed three tests that were failing due to CKAN 2.11 not allowing custom group IDs: - test_remote_groups_only_local - test_default_groups - test_default_groups_invalid Changes: - Remove custom id parameter from Group() factory calls - Update assertions to match groups by name instead of ID - Fix syntax bug in test_default_groups assertion - Update test.yml to run all 4 fixed group-related tests All four group-related tests now pass.
- Changed source_type from 'test' to 'test-for-action' (correct registered harvester) - Fixed assertion syntax for proper value comparison - Updated PROGRESS.md with debugging details
- Pass correct package id to package_update auth check
- Changed data_dict to {'id': pkg.id} for proper authorization
- Fix pytest compatibility: setup() to setup_method()
- Fix Flask REMOTE_USER: remove .encode('ascii'), use strings
- Add custom package_create auth to handle CKAN 2.11 flow
- Load user from Flask request.environ when context is empty
- Allow sysadmins to view forms (empty data_dict)
- Delegate to CKAN default auth for other cases
This fixes the 403 Forbidden error when accessing /harvest/new in
CKAN 2.11, where authorization checks happen before the view renders
and before the user is loaded from REMOTE_USER into context.
…ut __module__ - Updated _get_logic_functions to safely check for __module__ attribute before accessing it - Added exception handling for RuntimeError (Flask LocalProxy) and AttributeError - Prevents errors when iterating through module dicts containing request object or imported modules - Fixes test_edit_form_is_rendered
- Added custom package_update in logic/auth/update.py - Handles same Flask request context issue as package_create - Loads user from Flask environ when missing from context - Allows sysadmins to view harvest source edit forms - Updated test workflow to run all tests
- Changed logic_functions={} to logic_functions=None
- Initialize empty dict inside function to avoid sharing state
- Prevents auth functions from corrupting action functions registry
- Fixes 74 test failures caused by function registry corruption
- Remove unused imports (ckan.authz, ckan.model) - Remove trailing whitespace from blank lines - Remove unused variables (group1, group3) - Fix None comparison to use 'is None' instead of '== None'
- Extract shared user loading logic into load_user_from_flask_request() helper - Move Flask import to module-level in auth/__init__.py for consistency - Add explanatory comments to empty except clauses - Improve documentation for fragile conditions in package_create/update - Use getattr() for safe sysadmin attribute access (handles AnonymousUser) Addresses PR review feedback on code quality and maintainability.
Update url_for calls from harvest_* to harvester.* format: - harvest_refresh → harvester.refresh - harvest_job_abort → harvester.job_abort - harvest_clear → harvester.clear
There was a problem hiding this comment.
Pull request overview
This PR updates ckanext-harvest for CKAN 2.11 compatibility, addressing test failures and runtime issues introduced by CKAN’s Flask-based stack and updated dependency/DB initialization behaviors.
Changes:
- Update CI workflow to run against CKAN 2.11 / Python 3.10 and initialize harvester DB tables in CI.
- Adjust auth + plugin logic to handle CKAN 2.11/Flask request-context and auth timing changes.
- Update harvester behavior and tests for CKAN 2.11 differences (eg auto-generated group IDs) and pytest assertions.
Reviewed changes
Copilot reviewed 13 out of 14 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| requirements.txt | Updates OpenSSL-related dependencies (pyOpenSSL/cryptography). |
| ckanext/harvest/tests/test_timeouts.py | Removes nose dependency and converts assertions to pytest-style. |
| ckanext/harvest/tests/test_blueprint.py | Fixes pytest setup hook and Flask REMOTE_USER handling. |
| ckanext/harvest/tests/test_action.py | Fixes harvester type and assertion logic in update test. |
| ckanext/harvest/tests/harvesters/test_ckanharvester.py | Updates tests for CKAN 2.11 group ID behavior and removes pyopenssl mocking. |
| ckanext/harvest/templates/source/admin_base.html | Updates URLs to use Flask blueprint route names. |
| ckanext/harvest/plugin/init.py | Fixes mutable default argument bug and guards _get_logic_functions against Flask LocalProxy issues. |
| ckanext/harvest/model/init.py | Switches table existence checks to SQLAlchemy Inspector and binds engine on create/index operations. |
| ckanext/harvest/logic/auth/update.py | Adds wrapper package_update auth to address CKAN 2.11 form-view authorization timing. |
| ckanext/harvest/logic/auth/create.py | Adds wrapper package_create auth and fixes harvest_job_create auth check input. |
| ckanext/harvest/logic/auth/init.py | Adds Flask-aware helper for loading user from request environ. |
| ckanext/harvest/harvesters/ckanharvester.py | Removes pyopenssl injection and fixes remote group creation to handle CKAN 2.11 ID constraints. |
| PROGRESS.md | Adds migration progress log/notes. |
| .github/workflows/test.yml | Updates CI to CKAN 2.11 / Python 3.10 and adds harvester initdb step. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
The CKAN 2.11 auth overrides used "empty data_dict" / "missing name" as a proxy for "harvest form GET render". Those predicates also fire for any other caller that happens to pass a sparse data_dict, silently bypassing CKAN's normal package_create / package_update auth chain (and any other plugin's hooks) for sysadmins. Replace the proxy with is_harvest_form_view(), which checks the actual Flask request: GET method + path matching /harvest/new or /harvest/edit/<id>. Background jobs, CLI, harvester import_stage calls, and any non-form callers fall through to CKAN's default auth. Also: only inject auth_user_obj for users with state == 'active', so a deleted/blocked account whose name lingers in REMOTE_USER (e.g. from a still-valid session cookie) does not retain auth privileges. Mirrors CKAN's own _get_user contract. Untrack PROGRESS.md.
Evaluating Flask's `request` LocalProxy outside an active request context (background queue, CLI) raises RuntimeError. The previous `if not request:` checks in load_user_from_flask_request() and is_harvest_form_view() sat outside the inner try/except, so the error would propagate and crash the auth call instead of falling through to CKAN's default auth. Switch both checks to flask.has_request_context(), with a no-op stub in the ImportError fallback path.
ChasNelson1990
approved these changes
Apr 21, 2026
ChasNelson1990
left a comment
Member
There was a problem hiding this comment.
Makes sense. Let's merge.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Brings ckanext-harvest to CKAN 2.11 / Python 3.10 compatibility. Builds on top of the work in #16 with additional bug fixes and a tightened auth model for the form-view overrides.
Changes
CKAN 2.11 migration
.exists()with Inspector pattern, bind engine on table/index.create()callsnose, renamesetup()→setup_method(), fix REMOTE_USER encoding (string, not bytes)id(auto-generated in 2.11), match groups by name in testsLocalProxyand modules without__module__in_get_logic_functions; fix mutable-default-arg bug that was sharing the action/auth registry dictsurllib3.contrib.pyopensslinjectionAuth (form-view overrides)
package_create/package_updateauth to handle CKAN 2.11's pre-render auth checks where the user isn't yet loaded into contextload_user_from_flask_request()helper loads the user fromrequest.environ['REMOTE_USER']when the context is empty, and only injectsauth_user_objwhenstate == 'active'is_harvest_form_view()— anchored on the actual Flask request path (GET on/harvest/newor/harvest/edit/<id>) rather than the original "empty/sparsedata_dict" proxy, which was over-broad and could silently skip CKAN's normal auth chain for unrelated callersTest plan
test_blueprint::test_new_form_is_rendered— sysadmin can GET/harvest/newtest_blueprint::test_edit_form_is_rendered— sysadmin can GET/harvest/edit/<id>