-
Notifications
You must be signed in to change notification settings - Fork 19
[draft]Use Apache Doris as the data connection source #1460
base: main
Are you sure you want to change the base?
Conversation
jakozaur
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazing proof of concept. Thank you for starting it.
Though to merge it to main we would probably need:
- Work on new backend connector. Start by copying
platform/clickhousetoplatform/dorisand make changes there, while avoid changes that would break ClickHouse. - That would probably need also some changes to design.
- Similarly with SQL function changes. Ideally we should build abstract syntax tree and render the SQL dialects in
expr_string_renderer.go - Make sure you will be able to sign CLA.
| } | ||
|
|
||
| func (p *BasicSqlBackendConnector) Exec(ctx context.Context, query string, args ...interface{}) error { | ||
| logger.Info().Msgf("Exec sql: %s", query) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All three above make a huge sense for debugging. You may also leverage our internal debug console http://localhost:9999.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thx
592a6f8 to
357342d
Compare
| "strings" | ||
| ) | ||
|
|
||
| type SchemaTypeAdapter struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typeAdapter is already an interface https://github.com/QuesmaOrg/quesma/blob/main/platform/schema/registry.go#L54
I would suggest to implement it and then pass to NewSchemaRegistry https://github.com/QuesmaOrg/quesma/blob/main/cmd/main.go#L94
| package clickhouse | ||
|
|
||
| import ( | ||
| "crypto/tls" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest to create a new directory apache_doris and implement below functionality there. Then proper initDBConnection can be invoked here https://github.com/QuesmaOrg/quesma/blob/main/cmd/main.go#L87
|
Hey @pdelewski , the two problems you mentioned above are not difficult. |
@DongLiang-0 True, but there are at least few options to solve this problem. I will add some comments in different places. |
| origFunc := origExprTyped | ||
| switch origFunc.Name { | ||
| case "sum", "sumOrNull", "min", "minOrNull", "max", "maxOrNull": | ||
| case "sum", "sumOrNull", "min", "max": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to remove minOrNull and maxOrNull ?. I think that it should be orthogonal
| return model.NewFunction("APPROX_COUNT_DISTINCT", model.NewColumnRef(columnName)), "", nil | ||
| } | ||
|
|
||
| if strings.HasPrefix(origFunc.Name, "quantiles") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above, Do we need to remove that?
| } | ||
|
|
||
| func (p *pancakeSqlQueryGenerator) generatePartitionBy(groupByColumns []model.AliasedExpr) []model.Expr { | ||
| func (p *pancakeSqlQueryGenerator) generatePartitionBy(groupByColumns []model.GroupByExpr, useGroupByColumn bool) []model.Expr { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest to rename old function to something like generatePartitionByViaAliasExpr and add a new function generatePartitionByViaGroupByExpr
| } | ||
|
|
||
| func (p *pancakeSqlQueryGenerator) generateMetricSelects(metric *pancakeModelMetricAggregation, groupByColumns []model.AliasedExpr, hasMoreBucketAggregations bool) (addSelectColumns []model.AliasedExpr, err error) { | ||
| func (p *pancakeSqlQueryGenerator) generateMetricSelects(metric *pancakeModelMetricAggregation, groupByColumns []model.GroupByExpr, hasMoreBucketAggregations bool) (addSelectColumns []model.AliasedExpr, err error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above. I think we can have two functions
| } | ||
|
|
||
| func (p *pancakeSqlQueryGenerator) isPartOf(column model.Expr, aliasedColumns []model.AliasedExpr) *model.AliasedExpr { | ||
| func (p *pancakeSqlQueryGenerator) isPartOf(column model.Expr, aliasedColumns []model.GroupByExpr) *model.AliasedExpr { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above
| } | ||
|
|
||
| func (p *pancakeSqlQueryGenerator) isPartOfOrderBy(alias model.AliasedExpr, orderByColumns []model.OrderByExpr) bool { | ||
| func (p *pancakeSqlQueryGenerator) isPartOfOrderBy(alias model.GroupByExpr, orderByColumns []model.OrderByExpr) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above
| } | ||
|
|
||
| func (p *pancakeSqlQueryGenerator) addPotentialParentCount(bucketAggregation *pancakeModelBucketAggregation, groupByColumns []model.AliasedExpr) []model.AliasedExpr { | ||
| func (p *pancakeSqlQueryGenerator) addPotentialParentCount(bucketAggregation *pancakeModelBucketAggregation, groupByColumns []model.GroupByExpr) []model.AliasedExpr { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above
| return v.VisitTableRef(t) | ||
| } | ||
|
|
||
| type GroupByExpr struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the difference between the new GroupByExpr and the existing AliasExpr types — their contents appear to be the same. Introducing this new type triggers many changes in function signatures, and I'm not sure why it's needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In doris syntax, cannot use an alias after group by. For example, cannot writegroup by `col1` as `a`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that, however now GroupByExpr and AliasExpr are the same. Do you plan to change that?
|
|
||
| const array = "array" | ||
|
|
||
| type functionWithCombinator struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think functionWithCombinator can be parametrized with array prefix
| matches := wantedRegex.FindStringSubmatch(source) | ||
| if len(matches) == 2 { | ||
| return model.NewFunction("toHour", model.NewColumnRef(matches[1])), true | ||
| return model.NewFunction("HOUR", model.NewColumnRef(matches[1])), true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be abstract name, that will be later resolved by additional transformation depending on backend
|
Overall, great work — thank you for your contribution! A few comments:
|
A draft PR using quesma to use doris as the data connection source.
This connection source is modified based on clickhouse, and the discover page can be used normally for the kibana_sample_data_logs table.