Skip to content

feat: Emit warning with Diagnostic when doing = Null #15696

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

changsun20
Copy link
Contributor

Which issue does this PR close?

Closes #14434

Rationale for this change

This PR addresses a common SQL anti-pattern where users accidentally use = NULL instead of IS NULL. While syntactically valid, this comparison always returns NULL in SQL and often indicates a developer mistake. The changes help users identify this pitfall through rich warnings while maintaining full query execution capabilities.

What changes are included in this PR?

  1. Added warning detection for = NULL comparisons in predicate contexts
  2. Implemented span-based diagnostics highlighting the problematic expression
  3. Enhanced SQL parser integration with upgraded sqlparser dependency (0.55+)
  4. Warning collection plumbing using non-intrusive RefCell storage
  5. Added help messages suggesting IS NULL alternative
  6. Test coverage for single and multiple occurrences

Are these changes tested?

Yes, this PR includes:

  • Unit tests for single = NULL detection
  • Validation of multiple warnings in complex expressions
  • Span position verification
  • Help message content checks

Are there any user-facing changes?

Yes, but non-breaking. No API changes or behavior modifications - existing queries will still execute normally but may produce additional warnings.

@github-actions github-actions bot added the sql SQL Planner label Apr 13, 2025
@changsun20
Copy link
Contributor Author

Hi @eliaperantoni,

Thank you for your patience and guidance throughout this issue. I've implemented the core functionality per our discussions, but would like to confirm a few implementation details:

  1. Predicate Context Validation
    The warning detection is integrated during BinaryExpr processing, which should naturally limit it to predicate contexts. Statements like UPDATE users SET password = NULL won't trigger false warnings by default. Could you confirm this approach is acceptable?
  2. Span Handling Strategy
    In usual cases, the left operand is an Identifier. The current implementation combines the identifier's left span with NULL's right span for precise highlighting. For rare non-identifier cases (e.g., some complex expressions that I can't immediately come up with one right now), we fall back to using just the NULL span. This balances precision with robustness.
  3. Test Coverage Request
    While I've added tests for basic and multiple = NULL cases, could you suggest any edge cases or additional scenarios that should be validated?

I appreciate your expertise in reviewing these implementation choices. Please let me know if any adjustments are needed.

Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @changsun20 wondering if its possible to test those warnings in integration slt test files?

@changsun20
Copy link
Contributor Author

Thanks @changsun20 wondering if its possible to test those warnings in integration slt test files?

Thank you for the thoughtful question, @comphead. I appreciate your focus on validation through integration tests. The current implementation prioritizes unit testing for the diagnostic infrastructure itself, as the warnings are collected internally via SqlToRel and not yet exposed through user-facing APIs. This approach allows us to validate the core logic while minimizing disruption to existing systems.

I fully agree that end-to-end validation through sqllogictest becomes critical as we evolve toward a stable warning reporting interface. Building on the foundation from #14429, I'm committed to driving the design of a unified API surface for diagnostic propagation that would benefit both this implementation and future error reporting improvements. A follow-up ticket seems ideal to address these aspects systematically, ensuring we maintain both test coverage and architectural clarity.

If the community prefers earlier integration testing, I'm happy to explore interim solutions - perhaps a temporary hook for test validation. Let me know your preference, and I'll adapt accordingly.

@changsun20 changsun20 force-pushed the feat/emit-warning-for-=-null branch from b7c80a2 to 85ecef1 Compare April 13, 2025 21:02
@comphead
Copy link
Contributor

Yeah, that was actually my question having the warnings without being returned to the end user, who is supposed to react on the warnings? 🤔

@changsun20
Copy link
Contributor Author

@comphead I understand your concern. If displaying warnings to end users is what you'd like to see in this PR, could you confirm if @eliaperantoni's proposed solution in #14434 of "replacing Result with DatafusionResult" aligns with what you're thinking?

My concern is that this approach might be too invasive, since we're dealing with warnings that should be passed to the end without interrupting execution, rather than immediate errors. However, as discussed in the issue, this could be more robust long-term.

Please share your thoughts and preference. If this is the direction the community chooses, I'll convert this PR to draft status and implement that approach later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sql SQL Planner
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Emit warning with attached Diagnostic when doing = NULL
2 participants