[doc] add short_circuit exec doc #3326
Open
Mryange wants to merge 1 commit into apache:master from Mryange:short_circuit-exec-doc
88 changes: 88 additions & 0 deletions
docs/sql-manual/sql-functions/scalar-functions/conditional-functions/overview.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,88 @@ | ||
| --- | ||
| { | ||
| "title": "Conditional Functions Overview", | ||
| "language": "en", | ||
| "description": "Conditional functions are built-in functions used to perform conditional logic and branching in SQL queries." | ||
| } | ||
| --- | ||
|
|
||
| # Conditional Functions Overview | ||
|
|
||
| Conditional functions are built-in functions used to perform conditional logic and branching in SQL queries. They help execute different operations based on specified conditions, such as selecting values, handling NULL values, and performing case-based logic. | ||
|
|
||
| ## Vectorized Execution and Conditional Functions | ||
|
|
||
| Doris uses a vectorized execution engine, which evaluates expressions over whole columns of rows rather than one row at a time. As a result, conditional functions may behave in ways that seem counterintuitive. | ||
|
|
||
| Consider the following example: | ||
|
|
||
| ```sql | ||
| mysql> set enable_strict_cast = true; | ||
| Query OK, 0 rows affected (0.00 sec) | ||
|
|
||
| mysql> select count( | ||
| -> if(number < 128 , | ||
| -> cast(number as tinyint), | ||
| -> cast(number as String)) | ||
| -> ) from numbers("number" = "300"); | ||
| ERROR 1105 (HY000): errCode = 2, detailMessage = (127.0.0.1)[INVALID_ARGUMENT]Value 128 out of range for type tinyint | ||
| ``` | ||
|
|
||
| In this example, the `if` function casts to `tinyint` only when `number < 128`, yet the query still fails. The cause is the way conditional functions such as `if(cond, colA, colB)` were traditionally executed: | ||
|
|
||
| 1. First, both `colA` and `colB` are fully computed | ||
| 2. Then, based on the value of `cond`, the corresponding result is selected and returned | ||
|
|
||
| So even though the value of `colA` is never used for rows where `cond` is false, `colA` is still computed for every row, and that computation triggers the error. | ||
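The two steps above can be sketched with a small Python model. This is an illustration only, not Doris's implementation: `strict_cast_tinyint` and `eager_if` are hypothetical names, and a Python list stands in for a column.

```python
def strict_cast_tinyint(value):
    """Hypothetical model of a strict cast: raise on overflow instead of returning NULL."""
    if not -128 <= value <= 127:
        raise ValueError(f"Value {value} out of range for type tinyint")
    return value


def eager_if(cond, then_fn, else_fn, column):
    """Eager model of if(cond, colA, colB): compute both branches fully, then select."""
    col_a = [then_fn(v) for v in column]  # step 1: colA for every row
    col_b = [else_fn(v) for v in column]  # step 1: colB for every row
    mask = [cond(v) for v in column]
    # step 2: pick colA or colB row by row according to cond
    return [a if m else b for m, a, b in zip(mask, col_a, col_b)]


numbers = list(range(300))  # stands in for numbers("number" = "300")
try:
    eager_if(lambda n: n < 128, strict_cast_tinyint, str, numbers)
    eager_error = None
except ValueError as exc:
    eager_error = str(exc)

print(eager_error)
```

In this model the cast runs on row 128 before the condition is ever consulted, which is why the error appears even though that row would have selected the other branch.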
|
|
||
| Functions like `if`, `ifnull`, `case`, and `coalesce` have similar behavior. | ||
|
|
||
| Note that functions like `LEAST` do not have this issue because they inherently need to compute all parameters to compare values. | ||
|
|
||
| ## Short-Circuit Evaluation | ||
|
|
||
| In Doris 4.0, we improved the execution logic of conditional functions to allow short-circuit evaluation. | ||
|
|
||
| ```sql | ||
| mysql> set short_circuit_evaluation = true; | ||
| Query OK, 0 rows affected (0.00 sec) | ||
|
|
||
| mysql> select count( | ||
| -> if(number < 128 , | ||
| -> cast(number as tinyint), | ||
| -> cast(number as String)) | ||
| -> ) from numbers("number" = "300"); | ||
| +-------------------------------------------------------------------------+ | ||
| | count(if(number < 128, cast(number as tinyint), cast(number as String)))| | ||
| +-------------------------------------------------------------------------+ | ||
| | 300 | | ||
| +-------------------------------------------------------------------------+ | ||
| ``` | ||
|
|
||
| With short-circuit evaluation enabled, functions like `if`, `ifnull`, `case`, and `coalesce` can avoid unnecessary computations in many scenarios, thus preventing errors and improving performance. | ||
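A minimal Python sketch of the idea, under the assumption that short-circuiting means each branch is evaluated only on the rows whose condition selects it. The names `strict_cast_tinyint` and `short_circuit_if` are hypothetical and do not come from Doris.

```python
def strict_cast_tinyint(value):
    """Hypothetical model of a strict cast: raise on overflow instead of returning NULL."""
    if not -128 <= value <= 127:
        raise ValueError(f"Value {value} out of range for type tinyint")
    return value


def short_circuit_if(cond, then_fn, else_fn, column):
    """Short-circuit model: evaluate each branch only on the rows that select it."""
    mask = [cond(v) for v in column]
    then_rows = [i for i, m in enumerate(mask) if m]
    else_rows = [i for i, m in enumerate(mask) if not m]
    result = [None] * len(column)
    for i in then_rows:  # the cast never sees rows where cond is false
        result[i] = then_fn(column[i])
    for i in else_rows:
        result[i] = else_fn(column[i])
    return result


numbers = list(range(300))  # stands in for numbers("number" = "300")
result = short_circuit_if(lambda n: n < 128, strict_cast_tinyint, str, numbers)
non_null_count = sum(v is not None for v in result)  # models count(...)

print(non_null_count)  # 300
```

The sketch leaves the two branches with different Python types; it does not model how a SQL engine settles on a single result type for the expression.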
|
|
||
| ### Enabling Short-Circuit Evaluation | ||
|
|
||
| To enable short-circuit evaluation, set the session variable: | ||
|
|
||
| ```sql | ||
| SET short_circuit_evaluation = true; | ||
| ``` | ||
|
|
||
| ### Benefits of Short-Circuit Evaluation | ||
|
|
||
| 1. **Error Prevention**: Avoids executing branches that would cause errors when conditions exclude them | ||
| 2. **Performance Improvement**: Reduces unnecessary computations by only evaluating branches that are actually needed | ||
| 3. **More Intuitive Behavior**: Makes conditional functions behave more like traditional programming language conditionals | ||
|
|
||
| ## Common Conditional Functions | ||
|
|
||
| Common conditional functions that benefit from short-circuit evaluation include: | ||
|
|
||
| - `IF`: Returns one of two values based on a condition | ||
| - `IFNULL`: Returns the first argument if it's not NULL, otherwise returns the second argument | ||
| - `CASE`: Provides multiple conditional branches similar to switch-case statements | ||
| - `COALESCE`: Returns the first non-NULL value from a list of arguments | ||
| - `NULLIF`: Returns NULL if two arguments are equal, otherwise returns the first argument | ||
|
|
||
| For detailed information about each function, please refer to their respective documentation pages. |
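For `COALESCE`, short-circuiting can be pictured as evaluating each later argument only on the rows that are still NULL. The following Python sketch models that under stated assumptions: `short_circuit_coalesce` and `failing_arg` are hypothetical names, `None` stands in for NULL, and nothing here reflects Doris internals.

```python
def short_circuit_coalesce(arg_fns, column):
    """Evaluate each argument only on rows that have no value yet."""
    result = [None] * len(column)
    pending = list(range(len(column)))  # row indices still NULL
    evaluations = []                    # rows evaluated per argument, for illustration
    for fn in arg_fns:
        if not pending:
            break                       # every row resolved: skip remaining arguments
        evaluations.append(len(pending))
        still_pending = []
        for i in pending:
            value = fn(column[i])
            if value is None:
                still_pending.append(i)
            else:
                result[i] = value
        pending = still_pending
    return result, evaluations


def failing_arg(value):
    """An argument that would raise if it were ever evaluated."""
    raise ValueError("this argument should never be evaluated")


column = [1, None, 3, None]
result, evaluations = short_circuit_coalesce(
    [lambda v: v, lambda v: 0, failing_arg], column
)

print(result)       # [1, 0, 3, 0]
print(evaluations)  # [4, 2]
```

The third argument is never called because the second one already resolved every remaining row.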
88 changes: 88 additions & 0 deletions
...ent/sql-manual/sql-functions/scalar-functions/conditional-functions/overview.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,88 @@ | ||
| --- | ||
| { | ||
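| "title": "Conditional Functions Overview", | ||
| "language": "zh-CN", | ||
| "description": "Conditional functions are built-in functions used to perform conditional logic and branching in SQL queries." | ||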
| } | ||
| --- | ||
|
|
||
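| # Conditional Functions Overview | ||
|
|
||
| Conditional functions are built-in functions used to perform conditional logic and branching in SQL queries. They help perform different operations based on specified conditions, such as selecting values, handling NULL values, and evaluating condition-based logic. | ||
|
|
||
| ## Vectorized Execution and Conditional Functions | ||
|
|
||
| Doris is a vectorized execution engine. With conditional functions, however, this can lead to some counterintuitive behavior. | ||
|
|
||
| Consider the following example: | ||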
|
|
||
| ```sql | ||
| mysql> set enable_strict_cast = true; | ||
| Query OK, 0 rows affected (0.00 sec) | ||
|
|
||
| mysql> select count( | ||
| -> if(number < 128 , | ||
| -> cast(number as tinyint), | ||
| -> cast(number as String)) | ||
| -> ) from numbers("number" = "300"); | ||
| ERROR 1105 (HY000): errCode = 2, detailMessage = (127.0.0.1)[INVALID_ARGUMENT]Value 128 out of range for type tinyint | ||
| ``` | ||
|
|
||
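| In the example above, the `if` function casts to `tinyint` only in the `number < 128` branch, yet the query still fails. This is because a conditional function such as `if(cond, colA, colB)` was traditionally executed as follows: | ||
|
|
||
| 1. First, `colA` and `colB` are both fully computed | ||
| 2. Then, based on the value of `cond`, the corresponding result is selected and returned | ||
|
|
||
| So even if the value of `colA` is not actually used during execution, `colA` is still fully computed, and that triggers the error. | ||
|
|
||
| `if`, `ifnull`, `case`, `coalesce`, and similar functions all have the same problem. | ||
|
|
||
| Note that a function such as `LEAST` does not have this problem, because it must compute all of its arguments anyway in order to compare them. | ||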
|
|
||
| ## Short-Circuit Evaluation | ||
|
|
||
| In Doris 4.0, we improved the execution logic of conditional functions to allow short-circuit evaluation. | ||
|
|
||
| ```sql | ||
| mysql> set short_circuit_evaluation = true; | ||
| Query OK, 0 rows affected (0.00 sec) | ||
|
|
||
| mysql> select count( | ||
| -> if(number < 128 , | ||
| -> cast(number as tinyint), | ||
| -> cast(number as String)) | ||
| -> ) from numbers("number" = "300"); | ||
| +-------------------------------------------------------------------------+ | ||
| | count(if(number < 128, cast(number as tinyint), cast(number as String)))| | ||
| +-------------------------------------------------------------------------+ | ||
| | 300 | | ||
| +-------------------------------------------------------------------------+ | ||
| ``` | ||
|
|
||
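| With short-circuit evaluation enabled, functions such as `if`, `ifnull`, `case`, and `coalesce` can avoid unnecessary computation in many scenarios, which prevents errors and improves performance. | ||
|
|
||
| ### Enabling Short-Circuit Evaluation | ||
|
|
||
| To enable short-circuit evaluation, set the session variable: | ||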
|
|
||
| ```sql | ||
| SET short_circuit_evaluation = true; | ||
| ``` | ||
|
|
||
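| ### Benefits of Short-Circuit Evaluation | ||
|
|
||
| 1. **Error Prevention**: Avoids executing branches that would cause errors when the conditions exclude them | ||
| 2. **Performance Improvement**: Computes only the branches that are actually needed, reducing unnecessary computation | ||
| 3. **More Intuitive Behavior**: Makes conditional functions behave more like conditional statements in traditional programming languages | ||
|
|
||
| ## Common Conditional Functions | ||
|
|
||
| Common conditional functions that benefit from short-circuit evaluation include: | ||
|
|
||
| - `IF`: Returns one of two values based on a condition | ||
| - `IFNULL`: Returns the first argument if it is not NULL, otherwise returns the second argument | ||
| - `CASE`: Provides multiple conditional branches, similar to a switch-case statement | ||
| - `COALESCE`: Returns the first non-NULL value from a list of arguments | ||
| - `NULLIF`: Returns NULL if the two arguments are equal, otherwise returns the first argument | ||
|
|
||
| For detailed information about each function, please refer to their respective documentation pages. |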
88 changes: 88 additions & 0 deletions
...4.x/sql-manual/sql-functions/scalar-functions/conditional-functions/overview.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,88 @@ | ||
| --- | ||
| { | ||
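| "title": "Conditional Functions Overview", | ||
| "language": "zh-CN", | ||
| "description": "Conditional functions are built-in functions used to perform conditional logic and branching in SQL queries." | ||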
| } | ||
| --- | ||
|
|
||
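| # Conditional Functions Overview | ||
|
|
||
| Conditional functions are built-in functions used to perform conditional logic and branching in SQL queries. They help perform different operations based on specified conditions, such as selecting values, handling NULL values, and evaluating condition-based logic. | ||
|
|
||
| ## Vectorized Execution and Conditional Functions | ||
|
|
||
| Doris is a vectorized execution engine. With conditional functions, however, this can lead to some counterintuitive behavior. | ||
|
|
||
| Consider the following example: | ||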
|
|
||
| ```sql | ||
| mysql> set enable_strict_cast = true; | ||
| Query OK, 0 rows affected (0.00 sec) | ||
|
|
||
| mysql> select count( | ||
| -> if(number < 128 , | ||
| -> cast(number as tinyint), | ||
| -> cast(number as String)) | ||
| -> ) from numbers("number" = "300"); | ||
| ERROR 1105 (HY000): errCode = 2, detailMessage = (127.0.0.1)[INVALID_ARGUMENT]Value 128 out of range for type tinyint | ||
| ``` | ||
|
|
||
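| In the example above, the `if` function casts to `tinyint` only in the `number < 128` branch, yet the query still fails. This is because a conditional function such as `if(cond, colA, colB)` was traditionally executed as follows: | ||
|
|
||
| 1. First, `colA` and `colB` are both fully computed | ||
| 2. Then, based on the value of `cond`, the corresponding result is selected and returned | ||
|
|
||
| So even if the value of `colA` is not actually used during execution, `colA` is still fully computed, and that triggers the error. | ||
|
|
||
| `if`, `ifnull`, `case`, `coalesce`, and similar functions all have the same problem. | ||
|
|
||
| Note that a function such as `LEAST` does not have this problem, because it must compute all of its arguments anyway in order to compare them. | ||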
|
|
||
| ## Short-Circuit Evaluation | ||
|
|
||
| In Doris 4.0.3, we improved the execution logic of conditional functions to allow short-circuit evaluation. | ||
|
|
||
| ```sql | ||
| mysql> set short_circuit_evaluation = true; | ||
| Query OK, 0 rows affected (0.00 sec) | ||
|
|
||
| mysql> select count( | ||
| -> if(number < 128 , | ||
| -> cast(number as tinyint), | ||
| -> cast(number as String)) | ||
| -> ) from numbers("number" = "300"); | ||
| +-------------------------------------------------------------------------+ | ||
| | count(if(number < 128, cast(number as tinyint), cast(number as String)))| | ||
| +-------------------------------------------------------------------------+ | ||
| | 300 | | ||
| +-------------------------------------------------------------------------+ | ||
| ``` | ||
|
|
||
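| With short-circuit evaluation enabled, functions such as `if`, `ifnull`, `case`, and `coalesce` can avoid unnecessary computation in many scenarios, which prevents errors and improves performance. | ||
|
|
||
| ### Enabling Short-Circuit Evaluation | ||
|
|
||
| To enable short-circuit evaluation, set the session variable: | ||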
|
|
||
| ```sql | ||
| SET short_circuit_evaluation = true; | ||
| ``` | ||
|
|
||
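| ### Benefits of Short-Circuit Evaluation | ||
|
|
||
| 1. **Error Prevention**: Avoids executing branches that would cause errors when the conditions exclude them | ||
| 2. **Performance Improvement**: Computes only the branches that are actually needed, reducing unnecessary computation | ||
| 3. **More Intuitive Behavior**: Makes conditional functions behave more like conditional statements in traditional programming languages | ||
|
|
||
| ## Common Conditional Functions | ||
|
|
||
| Common conditional functions that benefit from short-circuit evaluation include: | ||
|
|
||
| - `IF`: Returns one of two values based on a condition | ||
| - `IFNULL`: Returns the first argument if it is not NULL, otherwise returns the second argument | ||
| - `CASE`: Provides multiple conditional branches, similar to a switch-case statement | ||
| - `COALESCE`: Returns the first non-NULL value from a list of arguments | ||
| - `NULLIF`: Returns NULL if the two arguments are equal, otherwise returns the first argument | ||
|
|
||
| For detailed information about each function, please refer to their respective documentation pages. |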
|
Contributor
4.0 sidebar?
88 changes: 88 additions & 0 deletions
...4.x/sql-manual/sql-functions/scalar-functions/conditional-functions/overview.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,88 @@ | ||
| --- | ||
| { | ||
| "title": "Conditional Functions Overview", | ||
| "language": "en", | ||
| "description": "Conditional functions are built-in functions used to perform conditional logic and branching in SQL queries." | ||
| } | ||
| --- | ||
|
|
||
| # Conditional Functions Overview | ||
|
|
||
| Conditional functions are built-in functions used to perform conditional logic and branching in SQL queries. They help execute different operations based on specified conditions, such as selecting values, handling NULL values, and performing case-based logic. | ||
|
|
||
| ## Vectorized Execution and Conditional Functions | ||
|
|
||
| Doris uses a vectorized execution engine, which evaluates expressions over whole columns of rows rather than one row at a time. As a result, conditional functions may behave in ways that seem counterintuitive. | ||
|
|
||
| Consider the following example: | ||
|
|
||
| ```sql | ||
| mysql> set enable_strict_cast = true; | ||
| Query OK, 0 rows affected (0.00 sec) | ||
|
|
||
| mysql> select count( | ||
| -> if(number < 128 , | ||
| -> cast(number as tinyint), | ||
| -> cast(number as String)) | ||
| -> ) from numbers("number" = "300"); | ||
| ERROR 1105 (HY000): errCode = 2, detailMessage = (127.0.0.1)[INVALID_ARGUMENT]Value 128 out of range for type tinyint | ||
| ``` | ||
|
|
||
| In this example, the `if` function casts to `tinyint` only when `number < 128`, yet the query still fails. The cause is the way conditional functions such as `if(cond, colA, colB)` were traditionally executed: | ||
|
|
||
| 1. First, both `colA` and `colB` are fully computed | ||
| 2. Then, based on the value of `cond`, the corresponding result is selected and returned | ||
|
|
||
| So even though the value of `colA` is never used for rows where `cond` is false, `colA` is still computed for every row, and that computation triggers the error. | ||
|
|
||
| Functions like `if`, `ifnull`, `case`, and `coalesce` have similar behavior. | ||
|
|
||
| Note that functions like `LEAST` do not have this issue because they inherently need to compute all parameters to compare values. | ||
|
|
||
| ## Short-Circuit Evaluation | ||
|
|
||
| In Doris 4.0.3, we improved the execution logic of conditional functions to allow short-circuit evaluation. | ||
|
|
||
| ```sql | ||
| mysql> set short_circuit_evaluation = true; | ||
| Query OK, 0 rows affected (0.00 sec) | ||
|
|
||
| mysql> select count( | ||
| -> if(number < 128 , | ||
| -> cast(number as tinyint), | ||
| -> cast(number as String)) | ||
| -> ) from numbers("number" = "300"); | ||
| +-------------------------------------------------------------------------+ | ||
| | count(if(number < 128, cast(number as tinyint), cast(number as String)))| | ||
| +-------------------------------------------------------------------------+ | ||
| | 300 | | ||
| +-------------------------------------------------------------------------+ | ||
| ``` | ||
|
|
||
| With short-circuit evaluation enabled, functions like `if`, `ifnull`, `case`, and `coalesce` can avoid unnecessary computations in many scenarios, thus preventing errors and improving performance. | ||
|
|
||
| ### Enabling Short-Circuit Evaluation | ||
|
|
||
| To enable short-circuit evaluation, set the session variable: | ||
|
|
||
| ```sql | ||
| SET short_circuit_evaluation = true; | ||
| ``` | ||
|
|
||
| ### Benefits of Short-Circuit Evaluation | ||
|
|
||
| 1. **Error Prevention**: Avoids executing branches that would cause errors when conditions exclude them | ||
| 2. **Performance Improvement**: Reduces unnecessary computations by only evaluating branches that are actually needed | ||
| 3. **More Intuitive Behavior**: Makes conditional functions behave more like traditional programming language conditionals | ||
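The error-prevention benefit can also be illustrated for `CASE`, where an earlier `WHEN` can guard a later one. The Python sketch below is a hedged model, not Doris code: `short_circuit_case` is a hypothetical name, and it assumes each `WHEN` condition is tested only on rows that no earlier branch has matched.

```python
def short_circuit_case(branches, else_fn, column):
    """branches: ordered list of (condition, result) function pairs, like WHEN ... THEN ..."""
    result = [None] * len(column)
    pending = list(range(len(column)))  # rows not matched by any WHEN yet
    for cond, then_fn in branches:
        matched = [i for i in pending if cond(column[i])]
        for i in matched:
            result[i] = then_fn(column[i])
        matched_set = set(matched)
        pending = [i for i in pending if i not in matched_set]
    for i in pending:  # ELSE branch for rows that matched nothing
        result[i] = else_fn(column[i])
    return result


column = [0, 5, 10]
labels = short_circuit_case(
    [
        (lambda v: v == 0, lambda v: "zero"),
        # Safe only because rows with v == 0 were already matched above.
        (lambda v: 100 // v > 15, lambda v: "small"),
    ],
    lambda v: "large",
    column,
)

print(labels)  # ['zero', 'small', 'large']
```

If the second condition were evaluated on every row, the row holding `0` would raise a division error; in this model that row never reaches it.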
|
|
||
| ## Common Conditional Functions | ||
|
|
||
| Common conditional functions that benefit from short-circuit evaluation include: | ||
|
|
||
| - `IF`: Returns one of two values based on a condition | ||
| - `IFNULL`: Returns the first argument if it's not NULL, otherwise returns the second argument | ||
| - `CASE`: Provides multiple conditional branches similar to switch-case statements | ||
| - `COALESCE`: Returns the first non-NULL value from a list of arguments | ||
| - `NULLIF`: Returns NULL if two arguments are equal, otherwise returns the first argument | ||
|
|
||
| For detailed information about each function, please refer to their respective documentation pages. |
Non-dev docs must state the specific minor version in which the feature was introduced.