Skip to content

CKAN 2.11 migration#17

Merged
A-Souhei merged 25 commits into
developmentfrom
fix/ckan-2.11-bugfixes
Apr 27, 2026
Merged

CKAN 2.11 migration#17
A-Souhei merged 25 commits into
developmentfrom
fix/ckan-2.11-bugfixes

Conversation

@A-Souhei

@A-Souhei A-Souhei commented Apr 17, 2026

Copy link
Copy Markdown

Summary

Brings ckanext-harvest to CKAN 2.11 / Python 3.10 compatibility. Builds on top of the work in #16 with additional bug fixes and a tightened auth model for the form-view overrides.

Changes

CKAN 2.11 migration

  • SQLAlchemy 1.4: replace .exists() with Inspector pattern, bind engine on table/index .create() calls
  • pytest migration: drop nose, rename setup()setup_method(), fix REMOTE_USER encoding (string, not bytes)
  • Group creation: drop manual id (auto-generated in 2.11), match groups by name in tests
  • Templates: update harvest route names to the CKAN 2.11 endpoint format
  • Plugin: handle Flask LocalProxy and modules without __module__ in _get_logic_functions; fix mutable-default-arg bug that was sharing the action/auth registry dicts
  • Dependencies: bump pyOpenSSL/cryptography, drop deprecated urllib3.contrib.pyopenssl injection

Auth (form-view overrides)

  • Custom package_create / package_update auth to handle CKAN 2.11's pre-render auth checks where the user isn't yet loaded into context
  • load_user_from_flask_request() helper loads the user from request.environ['REMOTE_USER'] when the context is empty, and only injects auth_user_obj when state == 'active'
  • Form-view detection via is_harvest_form_view() — anchored on the actual Flask request path (GET on /harvest/new or /harvest/edit/<id>) rather than the original "empty/sparse data_dict" proxy, which was over-broad and could silently skip CKAN's normal auth chain for unrelated callers

Test plan

  • Full test suite passes locally
  • test_blueprint::test_new_form_is_rendered — sysadmin can GET /harvest/new
  • test_blueprint::test_edit_form_is_rendered — sysadmin can GET /harvest/edit/<id>
  • CI green on CKAN 2.11 / Python 3.10
  • Smoke-test in a downstream deployment (DHIS2 harvester import path unchanged — runs without a Flask request, so the form-view bypass never fires; falls through to CKAN's default auth as before)

ChasNelson1990 and others added 23 commits December 8, 2025 15:39
Updated Python versions and GitHub Actions versions in the workflow file. Changed container image references for CKAN versions.
- Fixed E721 flake8 error by using isinstance() instead of type comparison
- Removed --user root from container options which was causing permission issues

Co-authored-by: ChasNelson1990 <7795189+ChasNelson1990@users.noreply.github.com>
- Restored --user root option to fix checkout permissions
- Added harvester initdb command to initialize harvest tables

Co-authored-by: ChasNelson1990 <7795189+ChasNelson1990@users.noreply.github.com>
- Fixed dependency compatibility issue between pyOpenSSL and cryptography
- Added version constraints to ensure compatible library versions

Co-authored-by: ChasNelson1990 <7795189+ChasNelson1990@users.noreply.github.com>
- Updated pyOpenSSL from 18.0.0 to >=23.2.0 in requirements.txt
- Added cryptography>=41.0.0,<43.0.0 constraint to requirements.txt
- Removed redundant version pins from workflow (now in requirements.txt)

Co-authored-by: ChasNelson1990 <7795189+ChasNelson1990@users.noreply.github.com>
- Removed urllib3.contrib.pyopenssl import and inject_into_urllib3() call
- This fixes compatibility with cryptography>=42.0.0
- Modern Python and urllib3 have built-in SSL/TLS support

Co-authored-by: ChasNelson1990 <7795189+ChasNelson1990@users.noreply.github.com>
- Removed @patch decorators for pyopenssl.inject_into_urllib3 in test_ckanharvester.py
- These mocks are no longer needed since we removed the pyopenssl dependency

Co-authored-by: ChasNelson1990 <7795189+ChasNelson1990@users.noreply.github.com>
Fix lint errors and dependency compatibility issues in test workflow
- Update CI workflow to test CKAN 2.11 with Python 3.10 only
- Fix SQLAlchemy table existence checks using Inspector pattern
- Add bind=engine parameter to all table and index creation calls
- Replace nose test assertions with standard Python assertions in test_timeouts.py

This resolves database initialization errors during plugin configuration
and allows tests to run successfully under CKAN 2.11.
In CKAN 2.11, group IDs are auto-generated and cannot be specified
during creation. Updated the harvester to:
- Remove 'id' field from remote group data before calling group_create
- Capture the created group with its new auto-generated ID
- Use the new ID when adding groups to datasets

Updated test to search for groups by name instead of remote ID,
since the local group will have a different auto-generated ID.

Test: test_remote_groups_create now passes
Fixed three tests that were failing due to CKAN 2.11 not allowing
custom group IDs:
- test_remote_groups_only_local
- test_default_groups
- test_default_groups_invalid

Changes:
- Remove custom id parameter from Group() factory calls
- Update assertions to match groups by name instead of ID
- Fix syntax bug in test_default_groups assertion
- Update test.yml to run all 4 fixed group-related tests

All four group-related tests now pass.
- Changed source_type from 'test' to 'test-for-action' (correct registered harvester)
- Fixed assertion syntax for proper value comparison
- Updated PROGRESS.md with debugging details
- Pass correct package id to package_update auth check
- Changed data_dict to {'id': pkg.id} for proper authorization
- Fix pytest compatibility: setup() to setup_method()
- Fix Flask REMOTE_USER: remove .encode('ascii'), use strings
- Add custom package_create auth to handle CKAN 2.11 flow
- Load user from Flask request.environ when context is empty
- Allow sysadmins to view forms (empty data_dict)
- Delegate to CKAN default auth for other cases

This fixes the 403 Forbidden error when accessing /harvest/new in
CKAN 2.11, where authorization checks happen before the view renders
and before the user is loaded from REMOTE_USER into context.
…ut __module__

- Updated _get_logic_functions to safely check for __module__ attribute before accessing it
- Added exception handling for RuntimeError (Flask LocalProxy) and AttributeError
- Prevents errors when iterating through module dicts containing request object or imported modules
- Fixes test_edit_form_is_rendered
- Added custom package_update in logic/auth/update.py
- Handles same Flask request context issue as package_create
- Loads user from Flask environ when missing from context
- Allows sysadmins to view harvest source edit forms
- Updated test workflow to run all tests
- Changed logic_functions={} to logic_functions=None
- Initialize empty dict inside function to avoid sharing state
- Prevents auth functions from corrupting action functions registry
- Fixes 74 test failures caused by function registry corruption
- Remove unused imports (ckan.authz, ckan.model)
- Remove trailing whitespace from blank lines
- Remove unused variables (group1, group3)
- Fix None comparison to use 'is None' instead of '== None'
- Extract shared user loading logic into load_user_from_flask_request() helper
- Move Flask import to module-level in auth/__init__.py for consistency
- Add explanatory comments to empty except clauses
- Improve documentation for fragile conditions in package_create/update
- Use getattr() for safe sysadmin attribute access (handles AnonymousUser)

Addresses PR review feedback on code quality and maintainability.
Update url_for calls from harvest_* to harvester.* format:
- harvest_refresh → harvester.refresh
- harvest_job_abort → harvester.job_abort
- harvest_clear → harvester.clear
Copilot AI review requested due to automatic review settings April 17, 2026 13:02
@A-Souhei A-Souhei changed the title Fix/ckan 2.11 bugfixes CKAN 2.11 migration Apr 17, 2026

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates ckanext-harvest for CKAN 2.11 compatibility, addressing test failures and runtime issues introduced by CKAN’s Flask-based stack and updated dependency/DB initialization behaviors.

Changes:

  • Update CI workflow to run against CKAN 2.11 / Python 3.10 and initialize harvester DB tables in CI.
  • Adjust auth + plugin logic to handle CKAN 2.11/Flask request-context and auth timing changes.
  • Update harvester behavior and tests for CKAN 2.11 differences (eg auto-generated group IDs) and pytest assertions.

Reviewed changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
requirements.txt Updates OpenSSL-related dependencies (pyOpenSSL/cryptography).
ckanext/harvest/tests/test_timeouts.py Removes nose dependency and converts assertions to pytest-style.
ckanext/harvest/tests/test_blueprint.py Fixes pytest setup hook and Flask REMOTE_USER handling.
ckanext/harvest/tests/test_action.py Fixes harvester type and assertion logic in update test.
ckanext/harvest/tests/harvesters/test_ckanharvester.py Updates tests for CKAN 2.11 group ID behavior and removes pyopenssl mocking.
ckanext/harvest/templates/source/admin_base.html Updates URLs to use Flask blueprint route names.
ckanext/harvest/plugin/init.py Fixes mutable default argument bug and guards _get_logic_functions against Flask LocalProxy issues.
ckanext/harvest/model/init.py Switches table existence checks to SQLAlchemy Inspector and binds engine on create/index operations.
ckanext/harvest/logic/auth/update.py Adds wrapper package_update auth to address CKAN 2.11 form-view authorization timing.
ckanext/harvest/logic/auth/create.py Adds wrapper package_create auth and fixes harvest_job_create auth check input.
ckanext/harvest/logic/auth/init.py Adds Flask-aware helper for loading user from request environ.
ckanext/harvest/harvesters/ckanharvester.py Removes pyopenssl injection and fixes remote group creation to handle CKAN 2.11 ID constraints.
PROGRESS.md Adds migration progress log/notes.
.github/workflows/test.yml Updates CI to CKAN 2.11 / Python 3.10 and adds harvester initdb step.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread ckanext/harvest/logic/auth/__init__.py Outdated
Comment thread .github/workflows/test.yml
The CKAN 2.11 auth overrides used "empty data_dict" / "missing name"
as a proxy for "harvest form GET render". Those predicates also fire
for any other caller that happens to pass a sparse data_dict, silently
bypassing CKAN's normal package_create / package_update auth chain
(and any other plugin's hooks) for sysadmins.

Replace the proxy with is_harvest_form_view(), which checks the actual
Flask request: GET method + path matching /harvest/new or
/harvest/edit/<id>. Background jobs, CLI, harvester import_stage calls,
and any non-form callers fall through to CKAN's default auth.

Also: only inject auth_user_obj for users with state == 'active', so
a deleted/blocked account whose name lingers in REMOTE_USER (e.g. from
a still-valid session cookie) does not retain auth privileges. Mirrors
CKAN's own _get_user contract.

Untrack PROGRESS.md.
Evaluating Flask's `request` LocalProxy outside an active request context
(background queue, CLI) raises RuntimeError. The previous `if not request:`
checks in load_user_from_flask_request() and is_harvest_form_view() sat
outside the inner try/except, so the error would propagate and crash the
auth call instead of falling through to CKAN's default auth.

Switch both checks to flask.has_request_context(), with a no-op stub in
the ImportError fallback path.
@A-Souhei A-Souhei requested a review from ChasNelson1990 April 17, 2026 13:38

@ChasNelson1990 ChasNelson1990 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Let's merge.

@A-Souhei A-Souhei merged commit e3da85d into development Apr 27, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants