@sandrawar
Collaborator

@sandrawar sandrawar commented Jan 23, 2026

Motivation:

This PR introduces:

  1. GQL Grammar - ANTLR4 grammar defining syntax for a subset of the GQL (Graph Query Language) standard. This is the foundation for future GQL query support in YouTrackDB.
  2. Command Step for Gremlin - Adds a sqlCommand() method to the Gremlin DSL for executing SQL/DDL commands, with proper support for lazy execution and chaining.

Changes:

  1. GQL Grammar (core/src/main/grammar/antlr/GQL.g4)
    ANTLR4 grammar defining GQL syntax with support for:
  • MATCH statement with graph patterns (nodes, edges, paths)
  • FILTER/WHERE clauses with boolean expressions
  • RETURN statement with projections
  • ORDER BY, SKIP, LIMIT for result ordering and pagination
  • LET statement for variable bindings
  • CALL statement for procedure invocation
  • GROUP BY with aggregate functions (COUNT, SUM, AVG, MIN, MAX)
  • Edge quantifiers for path patterns
  • Case insensitivity for keywords
  • HINTS for query optimization
    Note: This PR only includes the grammar definition and parsing tests. Query execution will be implemented in a future PR.
  2. GQL Grammar Tests (GqlStructureTest.java)
  • Positive tests validating parse tree structure (golden files for regression detection)
  • Negative tests for syntax error handling
  3. Command Step for Gremlin DSL
    YTDBGraphTraversalSourceDSL:
  • command(String) - Eager execution, returns void, auto-iterates (behavior unchanged)
  • sqlCommand(String) - Lazy execution, returns GraphTraversal for chaining
    YTDBCommandService:
  • TinkerPop service with SqlCommandFactory alias for grammar recognition
    Usage:
    // Single command (eager)
    g().command("CREATE CLASS Person EXTENDS V");
    // Chaining (lazy)
    g().sqlCommand("BEGIN").sqlCommand("INSERT INTO...").sqlCommand("COMMIT").iterate();
  4. sqlCommand Test
  • Command.feature - Cucumber tests using sqlCommand for chaining.
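
The difference between the eager command() and the lazy sqlCommand() can be illustrated with a small, self-contained sketch. It does not use the real YTDBGraphTraversalSourceDSL or the TinkerPop API: LazyCommandChain, its executor callback, and the command strings are hypothetical stand-ins that only model the idea that nothing runs until iterate() is called.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a lazy, chainable command step. This is NOT the
// YouTrackDB or TinkerPop API; it only models deferred execution.
final class LazyCommandChain {
  private final List<String> pending = new ArrayList<>();
  private final Consumer<String> executor;

  LazyCommandChain(Consumer<String> executor) {
    this.executor = executor;
  }

  // Lazy: records the command and returns the chain so calls can be chained.
  LazyCommandChain sqlCommand(String command) {
    pending.add(command);
    return this;
  }

  // Eager: runs the command immediately and returns nothing.
  void command(String command) {
    executor.accept(command);
  }

  // Terminal step: runs all recorded commands in order, then clears them.
  void iterate() {
    for (String command : pending) {
      executor.accept(command);
    }
    pending.clear();
  }

  public static void main(String[] args) {
    List<String> executed = new ArrayList<>();
    LazyCommandChain g = new LazyCommandChain(executed::add);

    LazyCommandChain chain = g.sqlCommand("BEGIN").sqlCommand("COMMIT");
    System.out.println(executed.isEmpty()); // nothing has been executed yet
    chain.iterate();
    System.out.println(executed); // both commands have now run, in order
  }
}
```

In this model a chain that is never iterated silently does nothing, which is the usual trade-off of lazy steps and the reason the usage example above ends with iterate().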

@gemini-code-assist
Contributor

Summary of Changes

Hello @sandrawar, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request integrates a new Graph Query Language (GQL) parser using ANTLR4, establishing the foundation for processing GQL queries within the system. Concurrently, it significantly enhances the Gremlin command step to support robust transactional operations and command chaining, thereby improving the flexibility and reliability of direct database interactions. The changes are thoroughly validated with new test suites covering both the GQL parser and the updated command execution logic.

Highlights

  • GQL Parser Integration: Introduced an ANTLR4-based Graph Query Language (GQL) parser by adding the antlr4-maven-plugin to core/pom.xml and defining the GQL grammar in GQL.g4. This enables the system to parse GQL queries.
  • Enhanced Gremlin Command Chaining: The command methods in YTDBGraphTraversalSourceDSL.java now return a YTDBGraphTraversal, allowing for chaining of commands in Gremlin traversals. The YTDBCommandService was updated to support both Start and Streaming service types for this functionality.
  • Improved Transaction Management: The executeCommand method in YTDBGraphImplAbstract.java has been refactored to explicitly handle BEGIN, COMMIT, and ROLLBACK commands. It also now automatically manages transactions for other commands, ensuring atomicity by committing on success or rolling back on failure, and handles schema-modifying commands appropriately.
  • Comprehensive GQL Parser Testing: Added GqlStructureTest.java with parameterized JUnit tests to validate the GQL grammar. This includes a suite of positive and negative test cases, along with golden master files for expected parse tree outputs.
  • New Gremlin Command Feature Tests: A new Cucumber feature file, Command.feature, was added to test the g.command() functionality, including its new transaction management capabilities and schema operations.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a new GQL parser using ANTLR and refactors the Gremlin command step to be chainable. The changes include adding the ANTLR maven plugin, the GQL grammar file, and extensive tests for the new parser. The command step is improved to support chaining, which enhances the DSL's usability for transactional operations. The related services and transaction handling logic have been updated accordingly. The changes are well-structured and include new integration tests for the command chaining feature. My feedback includes suggestions to fix typos in the grammar for better maintainability, improve test robustness by failing fast on missing test data, and centralize dependency version management in the pom.xml.

core/pom.xml Outdated
Comment on lines 132 to 134
<groupId>org.antlr</groupId>
<artifactId>antlr4-maven-plugin</artifactId>
<version>4.13.2</version>
Contributor

medium

The ANTLR version 4.13.2 is hardcoded here. The same version is used for the antlr4-runtime dependency in the root pom.xml. To ensure these versions stay in sync and to make future updates easier, consider defining an antlr.version property in the root pom.xml and using ${antlr.version} in both places.

Collaborator

Hi, @sandrawar, please fix this issue.

Collaborator Author

fixed

Comment on lines 11 to 14
simple_linear_query_statement: (primitive_query_statement)* return_statment?;
primitive_query_statement: call_statment | filter_statment | for_statment | let_statmnet |
limit_statment | match_statment | offset_statment | order_by_statment |
skip_statment | with_statmnet;
Contributor

medium

There are typos in several parser rule names (e.g., return_statment, let_statmnet, with_statmnet). They should be corrected to end with _statement for consistency and to generate more intuitive parser methods. Please apply this correction throughout the file where these rules are defined and used.

simple_linear_query_statement: (primitive_query_statement)* return_statement?;
primitive_query_statement: call_statement | filter_statement | for_statement | let_statement |
                           limit_statement | match_statement | offset_statement | order_by_statement |
                           skip_statement | with_statement;

Collaborator

@sandrawar, please fix this issue too.

Collaborator Author

fixed

Comment on lines +84 to +88
if (!Files.exists(path)) {
System.err.println("WARNING: Directory does not exist: " + path);
return Stream.empty();
}
Contributor

medium

If the directory with test files doesn't exist, the test will silently pass because an empty stream is returned. This could hide issues in the build or test setup. It would be more robust to fail the test explicitly if the directory is not found.

    if (!Files.exists(path)) {
      throw new java.nio.file.NoSuchFileException("Test data directory not found: " + path);
    }
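
A slightly fuller sketch of the fail-fast idea, assuming the helper only needs to list the regular files in one directory. The names TestFiles and listTestFiles are hypothetical and are not taken from GqlStructureTest.java; the sketch returns a List instead of a Stream so that the directory handle opened by Files.list is closed before returning.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not the code from the PR.
final class TestFiles {
  // Lists the regular files in a test data directory, sorted by path.
  // Throws instead of returning an empty result when the directory is missing,
  // so a misconfigured build fails loudly rather than running zero tests.
  static List<Path> listTestFiles(Path dir) throws IOException {
    if (!Files.isDirectory(dir)) {
      throw new NoSuchFileException("Test data directory not found: " + dir);
    }
    try (Stream<Path> entries = Files.list(dir)) {
      return entries.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
    }
  }
}
```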

@andrii0lomakin
Collaborator

Hi @sandrawar
While I am reviewing your PR, please check the requirements for the PR description and fill it in. You can find examples in our closed PRs, such as #610.

Collaborator

@andrii0lomakin andrii0lomakin left a comment

Very good, but requires some changes.

///
/// @param command The SQL command to execute.
/// @return A traversal that can be chained with other steps.
public <S> GraphTraversal<S, S> sqlCommand(@Nonnull String command) {
Collaborator

I do like the quality of the comments.

/// @param command The SQL command to execute.
/// @param arguments The arguments to pass to the command.
/// @return A traversal that can be chained with other steps.
public <S> GraphTraversal<S, S> sqlCommand(@Nonnull String command, @Nonnull Map<?, ?> arguments) {
Collaborator

This method is not covered by the Gherkin tests. I also propose accepting the arguments as varargs of string/value pairs here.
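
One possible shape for such a varargs overload is sketched below. CommandArguments and fromPairs are hypothetical names, not part of the PR; the sketch only shows how alternating key/value varargs could be validated and turned into the Map that the existing overload accepts.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not the code from the PR.
final class CommandArguments {
  // Converts alternating key/value varargs into an insertion-ordered map.
  // Rejects an odd number of elements and keys that are not Strings.
  static Map<String, Object> fromPairs(Object... keyValues) {
    if (keyValues.length % 2 != 0) {
      throw new IllegalArgumentException(
          "Arguments must be key/value pairs, got " + keyValues.length + " elements");
    }
    Map<String, Object> arguments = new LinkedHashMap<>();
    for (int i = 0; i < keyValues.length; i += 2) {
      if (!(keyValues[i] instanceof String key)) {
        throw new IllegalArgumentException("Key at position " + i + " is not a String");
      }
      arguments.put(key, keyValues[i + 1]);
    }
    return arguments;
  }
}
```

A call such as fromPairs("name", "Ann", "age", 30) yields a two-entry map, while fromPairs("name") is rejected because the value is missing.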


try {
session.command(command, params);
if (startedTx) {
Collaborator

We should not auto-commit after each command; committing is controlled by the user.

|| normalized.startsWith("DROP PROPERTY")
|| normalized.startsWith("CREATE INDEX")
|| normalized.startsWith("DROP INDEX")
|| normalized.startsWith("CREATE VERTEX")
Collaborator

Not a schema command but a DML command.

|| normalized.startsWith("CREATE INDEX")
|| normalized.startsWith("DROP INDEX")
|| normalized.startsWith("CREATE VERTEX")
|| normalized.startsWith("CREATE EDGE");
Collaborator

Not DDL command

if ("COMMIT".equals(normalized)) {
if (session.isTxActive()) {
session.commit();
}
Collaborator

We should throw an exception if a commit is requested but no transaction is active.
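
A minimal sketch of that strict behavior, applied to both COMMIT and ROLLBACK. The Session interface here is a hypothetical stand-in: isTxActive() and commit() mirror the calls visible in the snippet above, rollback() is an assumption, and TxCommands and endTransaction are illustrative names only.

```java
// Hypothetical helper, not the code from the PR.
final class TxCommands {
  // Minimal session abstraction for the sketch; not the YouTrackDB session API.
  interface Session {
    boolean isTxActive();

    void commit();

    void rollback();
  }

  // Ends the current transaction, failing if there is none to end.
  static void endTransaction(Session session, String normalizedCommand) {
    boolean commit =
        switch (normalizedCommand) {
          case "COMMIT" -> true;
          case "ROLLBACK" -> false;
          default ->
              throw new IllegalArgumentException(
                  "Not a transaction command: " + normalizedCommand);
        };
    if (!session.isTxActive()) {
      throw new IllegalStateException(
          normalizedCommand + " requested but no transaction is active");
    }
    if (commit) {
      session.commit();
    } else {
      session.rollback();
    }
  }
}
```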

try (var session = acquireSession()) {
session.command(command, params);
var normalized = command == null ? "" : command.trim().toUpperCase();
if ("BEGIN".equals(normalized)) {
Collaborator

This command does not make sense here, because closing the session will roll back the transaction.

@github-actions

github-actions bot commented Jan 29, 2026

Qodana for JVM

It seems all right 👌

No new problems were found according to the checks applied

💡 Qodana analysis was run in the pull request mode: only the changed files were checked


BOOL: T R U E | F A L S E;
DOT : '.' ;
DASH: '-';
ID: [a-zA-Z_][a-zA-Z_0-9]* ;
Collaborator

We use a different ID syntax, namely #\d+:\d+. We should change this rule to properly support value expressions.
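
For illustration only, and assuming the ID syntax meant here is a record ID made of '#', digits, ':' and digits (for example "#12:0"), the shape can be checked in Java as below. RecordIds and isRecordId are hypothetical names; this is a plain regex check, not the ANTLR lexer rule the comment asks for.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not the code from the PR.
final class RecordIds {
  // Assumed record ID shape: '#', one or more digits, ':', one or more digits.
  private static final Pattern RID = Pattern.compile("#\\d+:\\d+");

  // Returns true only if the whole string has the assumed record ID shape.
  static boolean isRecordId(String text) {
    return text != null && RID.matcher(text).matches();
  }
}
```

Under this assumption a string such as "#12:0" is accepted, while an identifier such as "person_1", which the current ID rule matches, is not.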

comparison_operator: EQ | NEQ | GT | GTE | LT | LTE | IN;
sub: DASH;
numeric_literal: (sub)? (NUMBER | INT);
property_reference : ID (DOT ID)* ;
Collaborator

With the definition of ID that I mentioned, this would be incorrect.


private static boolean isSchemaCommand(String normalized) {
return normalized.startsWith("CREATE CLASS")
|| normalized.startsWith("DROP CLASS")
Collaborator

ALTER statements are missing, for example ALTER INDEX. Please replace the manual detection with a call to the SQL parser followed by an instanceof DDLSStatement check.

&& !argsList.isEmpty()) {
if (argsList.getFirst() instanceof String cmd) {
finalCommand = cmd;
if (argsList.size() > 1) {
Collaborator

There is no check that arguments are provided in pairs.

return;
}
case "ROLLBACK" -> {
var tx = tx();
Collaborator

There is no check that a TX is present; this should be handled the same way as for commit.

var path = Paths.get(pathStr).toAbsolutePath();
if (!Files.exists(path)) {
System.err.println("WARNING: Directory does not exist: " + path);
return Stream.empty();
Collaborator

Please fix the issue Gemini pointed out.

}

static Stream<Path> getPositiveGqlFiles() throws IOException {
return getFilesFromPath("src/test/resources/gql-tests/positive");
Collaborator

Please use getClass().getResource() to avoid depending on a path relative to the current working directory.
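
A sketch of classpath-based lookup, under the assumption that the test resources are on the file system rather than packed in a JAR. TestResources and resolve are hypothetical names. Because the methods in the snippet above are static, getClass() is not available there, so the sketch takes a class literal as the anchor instead.

```java
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not the code from the PR.
final class TestResources {
  // Resolves a classpath resource to a Path, independent of the working
  // directory. Fails fast if the resource is not on the classpath.
  static Path resolve(Class<?> anchor, String resourceName) {
    URL url = anchor.getResource(resourceName);
    if (url == null) {
      throw new IllegalStateException(
          "Test resource not found on classpath: " + resourceName);
    }
    try {
      return Paths.get(url.toURI());
    } catch (URISyntaxException e) {
      throw new IllegalStateException("Invalid resource URL: " + url, e);
    }
  }
}
```

With the standard Maven layout, src/test/resources/gql-tests/positive would presumably be addressed as resolve(GqlStructureTest.class, "/gql-tests/positive"); that resource name is inferred from the path in the snippet, not confirmed by the PR.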
