Skip to content

[AAP-39842] Fix: Validate email addresses from authenticators before persisting#972

Open
bhavenst wants to merge 4 commits intodevelfrom
debuggernaut/AAP-39842
Open

[AAP-39842] Fix: Validate email addresses from authenticators before persisting#972
bhavenst wants to merge 4 commits intodevelfrom
debuggernaut/AAP-39842

Conversation

@bhavenst
Copy link
Copy Markdown
Contributor

@bhavenst bhavenst commented Apr 6, 2026

Summary

Autonomous fix for AAP-39842: Authenticators can create users with invalid emails.

Authenticators (SAML, LDAP, etc.) can provide invalid email addresses (e.g., UIDs without an @ sign) which bypass serializer validation and get stored directly on User objects. This causes 500 errors when downstream services (controller) reject the invalid email via their API on the /me endpoint.

Root Cause Analysis

The normalize_and_get_email() function in ansible_base/authentication/utils/user.py normalizes email input (strips whitespace, lowercases) but does not validate that the email is actually valid. Two code paths allow invalid emails to be persisted:

  1. Non-social auth plugins (LDAP, TACACS, Radius): get_or_create_authenticator_user() copies email from user_details directly into get_or_create() defaults without validation (line 345 of authentication.py).
  2. Social auth pipeline (SAML, OIDC, GitHub): social_core.pipeline.user.user_details updates user email directly from backend details, and capture_oauth_email_pipeline calls normalize_and_get_email which doesn't reject invalid formats.

Evidence

  • ansible_base/authentication/utils/user.py:11-26normalize_and_get_email strips/lowercases but never checks for @ or email validity
  • ansible_base/authentication/utils/authentication.py:345details["email"] passed straight to get_or_create defaults
  • ansible_base/authentication/utils/authentication.py:336email param on AuthenticatorUser also stored without validation
  • Developer comment on JIRA (Joe Shimkus): "Authenticators should perform validation of info retrieved... This can be accomplished by utilizing the USER model serializer to validate the info before storing in the database."

Changes Made

ansible_base/authentication/utils/user.py

  • Added _is_valid_email() helper using Django's built-in EmailValidator
  • Updated normalize_and_get_email() to reject invalid emails (returns None) with a warning log message that includes the invalid address and guidance to check authenticator email attribute mapping

ansible_base/authentication/utils/authentication.py

  • Added early validation of the email parameter in get_or_create_authenticator_user() so invalid emails from authenticators are cleaned before reaching AuthenticatorUser
  • Added validation of user_details["email"] before passing to User.objects.get_or_create() defaults

test_app/tests/authentication/utils/test_user.py

  • Added 16 parametrized test cases covering valid emails, invalid emails (no @, missing parts), empty inputs, non-string types, and list inputs
  • Added test asserting that invalid emails produce a warning log message

Rationale

Using Django's built-in EmailValidator rather than a simple @ check because:

  • It covers edge cases (missing domain, multiple @ signs, invalid characters)
  • It's the same validator used by Django model fields and serializers, ensuring consistency
  • It's well-tested and maintained as part of Django core

The fix is applied at the normalize_and_get_email level (centralized) so both social auth and non-social auth paths benefit from a single change.

Alternative Approaches Considered

  1. Validate in each authenticator plugin separately (LDAP, SAML, etc.): More targeted but requires changes in every plugin and is easy to miss for new plugins. Rejected in favor of centralized validation.
  2. Use the User model serializer (as suggested by Joe Shimkus): Would work but is heavier-weight — serializer validation includes many fields beyond email. The focused EmailValidator approach is simpler and more surgical.
  3. Add a Django model-level validator on the User email field: Would catch all paths but could break migrations or existing data with invalid emails. Too broad for this fix.

Assumptions

  • Invalid emails should result in an empty string on the User model (per the ticket's expected behavior), not a login failure
  • The warning log is sufficient for administrators to diagnose and fix their authenticator email attribute mapping
  • Existing users with invalid emails in the database are not retroactively fixed by this change (that would require a data migration)

Testing

  • Existing tests pass (no behavioral change for valid emails)
  • New parametrized tests cover valid, invalid, empty, and edge-case email inputs
  • Warning log test confirms observability for admins
  • Manual verification: Configure a SAML authenticator with email mapped to UID field, login as SSO user, verify user is created with empty email and no 500 errors

Risk Assessment

  • Confidence: High — the defect and fix are straightforward validation gaps
  • Scope: Narrow — only affects email handling in authenticator user creation
  • Side effects: Users who previously got invalid emails stored will now get empty emails instead. This is the desired behavior per the ticket.

This PR was created autonomously by Debuggernaut (Claude Code).
A human reviewer should verify the fix before merging.
If the approach is wrong but the analysis is useful, feel free to close this PR and use the analysis to inform a manual fix.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes

    • Enforced normalization and format validation of emails so invalid addresses are rejected and normalized values are used consistently for account creation and linking.
    • Improved account linking/return behavior to reliably use the correct linked account.
  • Tests

    • Added tests covering email normalization, invalid/rejection cases, and corresponding warning logs.

…sting

Authenticators (SAML, LDAP, etc.) can provide invalid email addresses
(e.g., UIDs without an @ sign) which bypass serializer validation and
get stored directly on User objects. This causes 500 errors when
downstream services (controller) reject the invalid email via their API.

Add email format validation to normalize_and_get_email() using Django's
built-in EmailValidator. Invalid emails are rejected with a warning log
and the user is created with an empty email instead. Also validate the
email parameter in get_or_create_authenticator_user() before it reaches
the User model defaults.

Debuggernaut autonomous fix. See PR description for full analysis.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 6, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 274818b2-eed0-401f-aee2-a19e2a74ba48

📥 Commits

Reviewing files that changed from the base of the PR and between ba7f7e1 and ef3e464.

📒 Files selected for processing (1)
  • ansible_base/authentication/utils/authentication.py

📝 Walkthrough

Walkthrough

Normalize and validate emails in authentication utilities; invalid emails are logged and coerced to empty. Persisted AuthenticatorUser and created User now use normalized email. Tests added for normalization behavior and logging.

Changes

Cohort / File(s) Summary
Email Validation & Normalization
ansible_base/authentication/utils/user.py
Add normalize_and_get_email validation (uses Django validate_email), return None for invalid/empty inputs, log a WARNING on rejection, and introduce _is_valid_email and module logger.
Authentication Integration
ansible_base/authentication/utils/authentication.py
Normalize email earlier in get_or_create_authenticator_user; use normalized value for AuthenticatorUser.email, get_or_create(..., email=...), and when populating User details; always return auth_user.user.
Tests
test_app/tests/authentication/utils/test_user.py
Add parametrized tests covering valid/invalid normalization inputs (strings, single-item lists, empties, non-strings) and a log-assertion ensuring WARNING is emitted for invalid emails.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: validating email addresses from authenticators before persisting them to the database.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch debuggernaut/AAP-39842

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ansible_base/authentication/utils/authentication.py`:
- Around line 5-8: Reorder the imports so third-party libraries (e.g., Django,
social_core, etc.) appear before first-party ansible_base imports; specifically
ensure any imports from external packages come first and then import
get_authenticator_class, Authenticator, AuthenticatorUser, AuthenticatorStorage,
AuthenticatorStrategy, and normalize_and_get_email from ansible_base, preserving
the same symbols but changing their order to satisfy linting import grouping
rules.

In `@ansible_base/authentication/utils/user.py`:
- Around line 1-11: Reorder the imports in
ansible_base/authentication/utils/user.py to satisfy linting: place Django and
other third-party imports (e.g., django.contrib.auth.models.AbstractUser,
django.core.exceptions.ValidationError, django.core.validators.validate_email)
before the local/first-party ansible_base imports (e.g.,
ansible_base.authentication.authenticator_plugins.utils.get_authenticator_plugin,
ansible_base.authentication.models.AuthenticatorUser,
ansible_base.lib.utils.models.is_system_user); keep the logger definition
(logger) unchanged and ensure typing and logging imports remain appropriately
grouped.

In `@test_app/tests/authentication/utils/test_user.py`:
- Around line 1-5: Reorder the imports in the test module to satisfy isort:
place standard library imports first (unittest.mock), then third-party imports
(pytest), then local application imports (from
ansible_base.authentication.models import AuthenticatorUser and from
ansible_base.authentication.utils.user import can_user_change_password,
normalize_and_get_email); ensure grouping and alphabetical ordering according to
the project's isort configuration so the import block matches other tests.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 26d51652-6cb6-44ce-b56e-8e6f434a929b

📥 Commits

Reviewing files that changed from the base of the PR and between 0f6f6c4 and 9b85062.

📒 Files selected for processing (3)
  • ansible_base/authentication/utils/authentication.py
  • ansible_base/authentication/utils/user.py
  • test_app/tests/authentication/utils/test_user.py

Comment thread ansible_base/authentication/utils/authentication.py Outdated
Comment thread ansible_base/authentication/utils/user.py
Comment thread test_app/tests/authentication/utils/test_user.py
bhavenst and others added 3 commits April 6, 2026 12:06
Separate third-party imports (django, social_core, pytest) from
first-party imports (ansible_base) per isort black profile.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
…r_user

Remove unnecessary ternary (auth_user is always set by Step 4) and
conditional log string to bring function under SonarCloud's limit of 15.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
The test_gocau_auth_user_needs_creation test asserts on the
'attaching to existing user' log substring. Keep the ternary for
the log but remove the unnecessary final_user guard instead.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 6, 2026

DVCS PR Check Results:

PR appears valid (JIRA key(s) found)

@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented Apr 6, 2026

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant