-
Notifications
You must be signed in to change notification settings - Fork 3.7k
feat: Implement date_part scalar function
#27005
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master-1.x
Are you sure you want to change the base?
Conversation
Remove a lot of code that wasn't needed for date_part including iterator creation. We can just map values similar to simple math functions.
|
|
||
| const ( | ||
| DatePartString = "date_part" | ||
| DatePartTimeString = "date_part_time" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is date_part_time? I don't see any tests for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's used to create a reference to time since time is an auxiliary field https://github.com/influxdata/influxdb/pull/27005/files#diff-609a7e16be956ed6386e1a4a4efadf600b7d4de7dcfea27330dc692d1e901dc8R930-R944 I'm going to create some ValueMapper tests for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gwossum I can add tests for this but it would likely require exporting
Line 881 in 362217b
| type valueMapper struct { |
davidby-influx
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some changes from the first pass. Will review again after changes.
| // Multiple selectors WITH date_part should also error | ||
| {s: `SELECT value, first(value), last(value), date_part('dow', time) FROM cpu`, err: `mixing multiple selector functions with tags or fields is not supported`}, | ||
| // date_part subquery validation - cannot be sole field | ||
| {s: `SELECT date_part('dow', value) FROM (SELECT value FROM cpu)`, err: `date_part: second argument must be time VarRef`}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the spec, it says:
expression: Time expression to operate on. Can be a constant, column, or function.
Here it looks like you are prohibiting anything not the time column. Which is correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to modify our version of the spec. I think it should only operate on the time column. For v3 they can decide if they would like to add more features and allow it to work on constants.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just changed the spec in the PR description to match: #27001 I recall we had a conversation yesterday during standup about the usefulness of constant timestamps. I verified that for other functions in influxdb 1.x (math functions predominately) we don't allow constants, so implementing a constant here would be non-trivial.
davidby-influx
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unify pseudo-constants
davidby-influx
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not clear on how one part works, but if you have any other changes, also add an explanatory comment there, Not blocking.
| DatePartTimeString = "date_part_time" | ||
|
|
||
| // DatePartArgCount is the amount of arguments required for date_part function | ||
| DatePartArgCount = 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice
|
|
||
| exprStr, ok := args[0].(*influxql.StringLiteral) | ||
| if !ok { | ||
| return errors.New("date_part: first argument must be a string") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a blocker, but this could be a fmt.Errorf with a %T to print the type of arg[0] to help debug. See how that looks before implementing; it may not be useful or readable.
fmt.Errorf("date_part: first argument must be a string literal, not a %T", arg[0])
davidby-influx
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some refactoring, and a question about which IteratorOptions we should be looking at.
davidby-influx
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Does date_part work in a GROUP BY clause? That's a common usage in SQL (e.g., are Mondays busier than Wednesdays?)
It does not currently. Since we only support I would expect to spend a few more days to get it working. |
|
I think it would be important to spend the time making SELECT
mean(temp)
WHERE
time >= '2025-01-01T00:00:00Z'
time < '2026-01-01T00:00:00Z'
GROUP BY
date_part('month', time),
date_part('year', time) AS year,
locationOR SELECT
date_part('month', time) AS month,
date_part('year', time) AS year,
location,
mean(temp)
WHERE
time >= '2025-01-01T00:00:00Z'
time < '2026-01-01T00:00:00Z'
GROUP BY
month,
year,
locationOr something similar. I realize these both somewhat break InfluxQL conventions, but I think filtering and grouping by these values are the two most common use cases for this function. This will also make filtering/grouping by calendar units (week/month/year/etc.) possible in InfluxQL; things that have been requested for a long time. |
|
@sanderson I agree that it would be useful in the GROUP BY clause. |
Implement
date_part(part, expression)Arguments
Example usage:
Please see #27001 (comment) for additional information on date_part limitations in 1.x.