Skip to content

fix: move tickets email search to ch#58692

Merged
veryayskiy merged 11 commits into
masterfrom
fix/tickets-ch-email
May 18, 2026
Merged

fix: move tickets email search to ch#58692
veryayskiy merged 11 commits into
masterfrom
fix/tickets-ch-email

Conversation

@veryayskiy

Copy link
Copy Markdown
Contributor

For raising tickets from the app we need a lookup for a person email when distinct_id is not sent.

Removing the postgres search and using only ch.

@github-actions

github-actions Bot commented May 17, 2026

Copy link
Copy Markdown
Contributor

🎭 Playwright didn't run on this PR — your changes touch code that could affect E2E behavior, but Playwright is opt-in via label now to keep CI cost down.

Add the run-playwright label if you want an E2E sweep before merging — CI will pick it up automatically.

Most PRs don't need this. Real regressions still get caught on master and fix-forward.

@greptile-apps

greptile-apps Bot commented May 17, 2026

Copy link
Copy Markdown
Contributor
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
products/conversations/backend/api/tickets.py:104-110
**Email key used for mapping may not match the queried email**

`matched = prop_email or prop_Email or prop_dollar_email` picks the first non-empty value, but the WHERE clause can fire on any of the three columns. If a person has `properties.email = "personal@gmail.com"` and `properties.Email = "work@corp.com"`, and a ticket has `email_from = "work@corp.com"`, the row is returned by the `properties.Email` branch—but `matched` resolves to `"personal@gmail.com"`, so `email_to_uuid["personal@gmail.com"]` is stored. The final `email_to_person.get("work@corp.com")` call then returns `None`, silently dropping the match. The original Postgres query didn't have this issue because it returned the matching `Person` object directly via an OR filter. A fix would be to register the UUID under every non-empty email property, not just the first one, then intersect with the requested `emails` set.

Reviews (1): Last reviewed commit: "fix: move tickets email search to ch" | Re-trigger Greptile

Comment thread products/conversations/backend/api/tickets.py Outdated
@veria-ai

veria-ai Bot commented May 17, 2026

Copy link
Copy Markdown

PR overview

Moves ticket email person lookup to HogQL

This PR replaces the previous email-property person lookup with a batched ClickHouse/HogQL query for conversations tickets. I checked the changed API paths, query construction, and follow-up person fetches; the lookup is team-scoped and uses HogQL placeholders rather than interpolating email values.

Security review

  • No new security issues were flagged in the latest review.
  • No review issues remain open on this pull request.

Risk: 2/10

Comment thread products/conversations/backend/api/tickets.py
Comment thread products/conversations/backend/api/tickets.py Outdated
@greptile-apps

greptile-apps Bot commented May 17, 2026

Copy link
Copy Markdown
Contributor
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
products/conversations/backend/api/tickets.py:82-95
The emails passed to the ClickHouse `IN` clause are the raw, un-lowercased `email_from` values, but ClickHouse string comparisons via `IN` are case-sensitive. If `email_from` contains any upper-case characters (e.g. `"User@Example.com"` from an RFC-2047 `From:` header) while the person's stored property is lowercase, the `WHERE` clause will not match—even though the Python-side `emails_set` and result-dict keys are already normalised to lowercase. Passing lowercase emails to the query and using `lower()` in the WHERE clause makes the whole lookup consistently case-insensitive.

```suggestion
    emails_lower = [e.lower() for e in emails]
    with tags_context(product=Product.CONVERSATIONS, feature=Feature.QUERY):
        response = execute_hogql_query(
            """
            SELECT
                id,
                properties.email,
                properties.Email,
                properties.$email
            FROM persons
            WHERE lower(toString(properties.email)) IN {emails}
               OR lower(toString(properties.Email)) IN {emails}
               OR lower(toString(properties.$email)) IN {emails}
            """,
            placeholders={"emails": ast.Constant(value=emails_lower)},
```

Reviews (2): Last reviewed commit: "fix" | Re-trigger Greptile

Comment thread products/conversations/backend/api/tickets.py Outdated
@greptile-apps

greptile-apps Bot commented May 17, 2026

Copy link
Copy Markdown
Contributor
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
products/conversations/backend/api/tickets.py:104
`emails_set` is re-derived from `emails` using the same lowercasing that already produced `emails_lower`, violating the OnceAndOnlyOnce simplicity rule. Use `set(emails_lower)` to avoid the redundant comprehension.

```suggestion
    emails_set = set(emails_lower)
```

### Issue 2 of 2
products/conversations/backend/api/tickets.py:488-495
The `if t.email_from` guard inside the loop and the list comprehension are redundant — `unmatched` is already filtered to only include tickets where `t.email_from` is truthy (line 486). Removing the guards reduces superfluous parts.

```suggestion
        if unmatched:
            emails = [t.email_from for t in unmatched]
            email_to_person = _get_persons_by_email(self.team, emails)
            for ticket in unmatched:
                found = email_to_person.get(ticket.email_from.lower())
                if found is not None:
                    ticket.person = found
```

Reviews (3): Last reviewed commit: "fix" | Re-trigger Greptile

@greptile-apps

greptile-apps Bot commented May 17, 2026

Copy link
Copy Markdown
Contributor
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
products/conversations/backend/api/tickets.py:104
`emails_set` recomputes the lower-case operation over `emails` even though `emails_lower` already holds exactly that data. Use `set(emails_lower)` to avoid computing `.lower()` twice on the same input — consistent with the OnceAndOnlyOnce principle.

```suggestion
    emails_set = set(emails_lower)
```

### Issue 2 of 2
products/conversations/backend/api/tickets.py:489
The `if t.email_from` guard here is redundant — `unmatched` is already filtered to only include tickets where `t.email_from` is truthy. The extra check is a superfluous part that adds noise without protecting against anything new.

```suggestion
            emails = [t.email_from for t in unmatched]
```

Reviews (4): Last reviewed commit: "fix" | Re-trigger Greptile

@veryayskiy veryayskiy added the stamphog Request AI review from stamphog label May 17, 2026
github-actions[bot]
github-actions Bot previously approved these changes May 17, 2026

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No showstoppers. The change replaces a per-row ORM-based person-email lookup (which bypassed personhog routing with a nosemgrep comment) with a batched ClickHouse HogQL query, enabling the previously commented-out email fallback. All resolved review concerns (case sensitivity, batching) are addressed in the diff, and comprehensive tests are included.

Comment thread products/conversations/backend/api/tickets.py Outdated
@github-actions github-actions Bot dismissed their stale review May 17, 2026 15:23

New commits pushed (delta classified non_trivial_delta) — stamphog approval dismissed; re-review running automatically.

@greptile-apps

greptile-apps Bot commented May 17, 2026

Copy link
Copy Markdown
Contributor
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
products/conversations/backend/api/tickets.py:484
Redundant guard in `emails` list comprehension — `unmatched` already filters for `t.email_from` being truthy (line 481), so the inner `if t.email_from` is superfluous. The list could also benefit from deduplication to avoid issuing CH-side lookups for repeated addresses.

```suggestion
            emails = list({t.email_from for t in unmatched})
```

Reviews (5): Last reviewed commit: "email only" | Re-trigger Greptile

github-actions[bot]
github-actions Bot previously approved these changes May 17, 2026

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No showstoppers. The change replaces a nosemgrep-bypassing ORM person lookup with a batched ClickHouse HogQL query. The key reviewer concern (robbie-c flagging the 3-variant email property query as missing skip indexes) is addressed in the current diff — only properties.email is queried. Case sensitivity is handled with lower(), and get_persons_by_uuids properly routes through personhog.

@robbie-c

Copy link
Copy Markdown
Member

It's prob worth still adding the test to make sure that _get_persons_by_email is hitting the right index. Try this prompt:

Add a test to make sure that _get_persons_by_email is making use of the skip index on email.

Thread through an extra arg into _get_persons_by_email so that we use HogQLQuerySettings.force_data_skipping_indices in the the clickhouse query. Use with materialized(create_ngram_lower_index=True) in that test to materialize the email property with an ngram skip index, then use get_ngram_lower_index_name to get the index name, to pass in to force_data_skipping_indices.

Wrap the test in snapshot_clickhouse_queries so that we can check actual SQL that was run.

It might be easier to just modify an existing test to do the extra check around indexes.

@github-actions github-actions Bot dismissed their stale review May 18, 2026 06:42

New commits pushed (delta classified non_trivial_delta) — stamphog approval dismissed; re-review running automatically.

@greptile-apps

greptile-apps Bot commented May 18, 2026

Copy link
Copy Markdown
Contributor
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
products/conversations/backend/api/tickets.py:490-497
Superfluous guards — both the `if t.email_from` in the list comprehension and `if ticket.email_from:` in the loop body are always true here because `unmatched` already requires `t.email_from` to be truthy. Removing them makes the intent clearer.

```suggestion
        if unmatched:
            emails = [t.email_from for t in unmatched]
            email_to_person = _get_persons_by_email(self.team, emails)
            for ticket in unmatched:
                found = email_to_person.get(ticket.email_from.lower())
                if found is not None:
                    ticket.person = found
```

Reviews (6): Last reviewed commit: "extra test" | Re-trigger Greptile

github-actions[bot]
github-actions Bot previously approved these changes May 18, 2026

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaces a nosemgrep-bypassing ORM person lookup with a batched ClickHouse HogQL query routed through the proper personhog helper. All substantive reviewer concerns (case sensitivity, 3-variant email properties, skip index performance) are addressed in the current diff, and comprehensive tests are included.

@github-actions github-actions Bot dismissed their stale review May 18, 2026 07:01

New commits pushed (delta classified non_trivial_delta) — stamphog approval dismissed; re-review running automatically.

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removes a nosemgrep-bypassing ORM person lookup and replaces it with a properly batched ClickHouse HogQL query. All substantive reviewer concerns (3-variant email properties and case sensitivity) are resolved in the current diff, get_persons_by_uuids correctly routes through personhog, and comprehensive tests are included.

Comment thread products/conversations/backend/api/tests/test_tickets.py Outdated
@greptile-apps

greptile-apps Bot commented May 18, 2026

Copy link
Copy Markdown
Contributor
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
products/conversations/backend/api/tickets.py:490-492
The `emails` list passed into `_get_persons_by_email` can contain duplicate values when multiple unmatched tickets share the same `email_from`. These duplicates propagate into `emails_lower` and appear verbatim in the ClickHouse `IN` clause. It doesn't break correctness, but deduplicating before calling avoids unnecessary bloat in the query.

```suggestion
        if unmatched:
            emails = list({t.email_from.lower() for t in unmatched if t.email_from})
            email_to_person = _get_persons_by_email(self.team, emails)
```

Reviews (7): Last reviewed commit: "fix test" | Re-trigger Greptile

@stamphog

stamphog Bot commented May 18, 2026

Copy link
Copy Markdown

Retaining stamphog approval — delta since last review classified as trivial_paths.

@stamphog

stamphog Bot commented May 18, 2026

Copy link
Copy Markdown

Retaining stamphog approval — delta since last review classified as trivial_paths.

@stamphog

stamphog Bot commented May 18, 2026

Copy link
Copy Markdown

Retaining stamphog approval — delta since last review classified as trivial_paths.

@tests-posthog

tests-posthog Bot commented May 18, 2026

Copy link
Copy Markdown
Contributor

Query snapshots: Backend query snapshots updated

Changes: 1 snapshots (0 modified, 1 added, 0 deleted)

What this means:

  • Query snapshots have been automatically updated to match current output
  • These changes reflect modifications to database queries or schema

Next steps:

  • Review the query changes to ensure they're intentional
  • If unexpected, investigate what caused the query to change

Review snapshot changes →

@stamphog

stamphog Bot commented May 18, 2026

Copy link
Copy Markdown

Retaining stamphog approval — delta since last review classified as trivial_paths.

@veryayskiy veryayskiy merged commit fa4709f into master May 18, 2026
214 checks passed
@veryayskiy veryayskiy deleted the fix/tickets-ch-email branch May 18, 2026 09:34
@deployment-status-posthog

deployment-status-posthog Bot commented May 18, 2026

Copy link
Copy Markdown

Deploy status

Environment Status Deployed At Workflow
dev ✅ Deployed 2026-05-18 11:46 UTC Run
prod-us ✅ Deployed 2026-05-18 11:57 UTC Run
prod-eu ✅ Deployed 2026-05-18 12:14 UTC Run

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

stamphog Request AI review from stamphog

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants