-
Notifications
You must be signed in to change notification settings - Fork 642
CASSGO-43: externally-defined type registration #1855
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Here's an example of a JSONB type: package mypackage
import "github.com/gocql/gocql"
var typeJSONB gocql.Type = 0x0080
func init() {
gocql.RegisterType(typeJSONB, "jsonb", gocql.SimpleCQLType{jsonbCQLType{}})
}
// jsonbCQLType implements the gocql.CQLType interface
type jsonbCQLType struct {}
// Type implements the gocql.CQLType interface
func (jsonbCQLType) Type() gocql.Type {
return typeJSONB
}
// Zero implements the gocql.CQLType interface
func (jsonbCQLType) Zero() interface{} {
return []byte(nil)
}
// Marshal implements the gocql.CQLType interface
func (j jsonbCQLType) Marshal(value interface{}) ([]byte, error) {
switch v := value.(type) {
case json.RawMessage:
return v, nil
case string:
return []byte(v), nil
case []byte:
return v, nil
}
if value == nil {
return nil, nil
}
rv := reflect.ValueOf(value)
t := rv.Type()
k := t.Kind()
switch {
case k == reflect.String:
return []byte(rv.String()), nil
case k == reflect.Slice && t.Elem().Kind() == reflect.Uint8:
return rv.Bytes(), nil
}
return nil, gocql.MarshalError(fmt.Sprintf("can not marshal %T into jsonb", value))
}
// Unmarshal implements the gocql.CQLType interface
func (j jsonbCQLType) Unmarshal(data []byte, value interface{}) error {
switch v := value.(type) {
case *json.RawMessage:
if data != nil {
*v = append((*v)[:0], data...)
} else {
*v = nil
}
return nil
case *string:
*v = string(data)
return nil
case *[]byte:
if data != nil {
*v = append((*v)[:0], data...)
} else {
*v = nil
}
return nil
case *interface{}:
if data != nil {
*v = append(json.RawMessage{}, data...)
} else {
*v = nil
}
return nil
}
rv := reflect.ValueOf(value)
if rv.Kind() != reflect.Ptr {
return gocql.UnmarshalError(fmt.Sprintf("can not unmarshal jsonb into non-pointer %T", value))
}
rv = rv.Elem()
t := rv.Type()
k := t.Kind()
switch {
case k == reflect.String:
rv.SetString(string(data))
return nil
case k == reflect.Slice && t.Elem().Kind() == reflect.Uint8:
var dataCopy []byte
if data != nil {
dataCopy = make([]byte, len(data))
copy(dataCopy, data)
}
rv.SetBytes(dataCopy)
return nil
}
return gocql.UnmarshalError(fmt.Sprintf("can not unmarshal jsonb into %T", value))
}Or if they don't care about the var typeJSONB gocql.Type = 0x0080
func init() {
gocql.RegisterAlias(typeJSONB, "blob")
} |
6b5029b to
71ccb42
Compare
61387c3 to
0abeec5
Compare
0abeec5 to
963a6f6
Compare
|
I could use some tests of custom types but given that all of the native types are using most of the same code, I feel confident in the guts of the code. |
1c53a5b to
0203176
Compare
That makes sense. Going to switch this back to draft so I can resolve those conflicts and take a stab at cleaning up the UDT case we talked about in Slack. |
|
I simplified the |
0203176 to
6b5029b
Compare
|
Initially, some of the unmarshal benchmarks were worse so I inlined some of the type switching which is a bit gross but massively improved the benchmarks. |
88ed1af to
33fab21
Compare
I do think we can roll some of those back when Go 1.24 comes out and there's improved Swiss table maps. I'll need to rerun the benchmarks but it would be nice to get rid of all of that boilerplate. |
We probably want to be careful with requiring such a recent go version, if it's not crucial then we should delay it as much as possible |
|
Is this ready for review btw? |
Yes, but I would like a chance to write some more tests. I think it's in a good place though for review besides the tests.
Yeah, my comment was meant for the distant future. |
|
I'm much more in favor of this new path. I think for the "merging" we should probably just go for a simple approach where the global registrations just aren't used at all if the user provides a struct in the cluster config. Or you overwrite the global registrations by checking them one by one and you log a warning when an overwrite occurs. Are you completely against adding the protocol version as a parameter to the |
I split out the |
f0cded4 to
1482d14
Compare
I was mostly thinking about future protocol versions that might change the format of existing types similar to what was done to collections in v3 but IIUC we will be able to do it via Also there's #1890 and I'm probably going to merge it after this one which will remove the need to check the protocol version even for collections |
|
I can't find the thread discussing the error vs panic but I got a notification of it so I'll reply here.
I think changing it to |
1482d14 to
8ebe6e7
Compare
Yeah, I wouldn't say it's perfect but given that now nothing needs to know the protocol version I think it's a good tradeoff between breaking the existing API and it still being possible.
Yeah I went back and forth on it but ended with giving the option of getting an error if you wanted one. |
joao-r-reis
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a few comments. Note that some comments from my previous review are still relevant and the threads are still open.
As mentioned in a separate comment, UDT support looks very basic and not really useful for now, users still need to implement the old udt marshaler interface to do anything useful with UDTs if I'm understanding correctly. Due to this I think we should limit what's exported in this API so that we don't leak stuff related to UDTs as this is very likely to change when we add proper UDT support in the future.
marshal.go
Outdated
| } | ||
| var key interface{} | ||
| // this relies on Unmarshal marshalling default values when presented with nil | ||
| if err := Unmarshal(c.Key, []byte(nil), &key); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks a bit odd to me, in my mind nil should result in a nil value as output from unmarshal but we don't need to invoke the unmarshal function to do it so it could just be a part of the "contract" of this interface that you must return the zero value / default as output if nil is the input. On the other hand having an explicit NewDefault or NewZeroValue or something like this would make it more clear and we could just invoke that method here and in RowData() and other places that rely on getting zero values from types.
db93d67 to
39ae518
Compare
|
I think I addressed all outstanding comments and removed most of the panic calls. I also managed to move around some things and speed up almost every single benchmark. I'll give it another once over tomorrow. |
|
@joao-r-reis I just realized the Unmarshal docs say:
We might want to clarify that because that could be read as the "zero value" for the column and not the "zero value" for the Go type. Thoughts? |
39ae518 to
9404721
Compare
marshal.go
Outdated
| valueRef.Elem().Set(reflect.Zero(valueElemRef.Type())) | ||
| return nil | ||
| } | ||
| // TODO: can we do this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joao-r-reis we technically don't need to be allocating a new value if the existing pointer is non-nil but this would be considered a breaking change. The original commit doesn't mention this as a goal and there was a related issue filed regarding reuse of byte slices that was closed as working-as-intended. json's unmarshaller does not allocate a new value. But I also found at least one case of someone in another driver saying this is confusing. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
discussed offline but yeah I think we should probably not change that behavior just to be safe
cd36ff2 to
5533055
Compare
|
I made the metadata methods not error on unknown types and instead return the unknownTypeInfo since it didn't feel right to be erroring just because some return type on a function isn't known. I also changed |
5533055 to
83550bc
Compare
joao-r-reis
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazing work, this is a significant improvement of both internal and public APIs related to marshaling and CQL types.
| writeTimeout = cfg.WriteTimeout | ||
| } | ||
|
|
||
| logger := cfg.Logger |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not cfg.logger()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made it fallback to the session's logger if the connection didn't have one. If I called logger() it would set a default logger without looking at the session.
The new RegisteredTypes struct can be used to register externally-defined
types. You'll need to define your own marshalling and unmarshalling code
as well as a TypeInfo implementation. The name and id MUST not collide
with existing and future native CQL types.
A lot of the type handling was refactored to use the new format for
types.
inet columns are now unmarshaled as net.IP which is a breaking change.
goos: linux
goarch: amd64
pkg: github.com/gocql/gocql
cpu: AMD EPYC 7B13
│ old.txt │ new3.txt │
│ sec/op │ sec/op vs base │
SingleConn-16 25.87µ ± 2% 26.49µ ± 1% +2.40% (p=0.000 n=10)
ParseRowsFrame-16 870.5n ± 3% 683.1n ± 4% -21.52% (p=0.000 n=10)
UnmarshalVarchar-16 47.29n ± 2%
UnmarshalUUID-16 3.161n ± 2%
Unmarshal_BigInt-16 17.61n ± 9% 13.47n ± 0% -23.48% (p=0.000 n=10)
Unmarshal_Blob-16 19.19n ± 0% 14.79n ± 2% -22.88% (p=0.000 n=10)
Unmarshal_Boolean-16 15.71n ± 0% 12.20n ± 1% -22.32% (p=0.000 n=10)
Unmarshal_Date-16 18.53n ± 1% 15.98n ± 0% -13.73% (p=0.000 n=10)
Unmarshal_Decimal-16 172.15n ± 2% 96.55n ± 1% -43.92% (p=0.000 n=10)
Unmarshal_Double-16 16.33n ± 1% 12.21n ± 0% -25.20% (p=0.000 n=10)
Unmarshal_Duration-16 26.03n ± 1% 22.30n ± 1% -14.33% (p=0.000 n=10)
Unmarshal_Float-16 15.70n ± 0% 12.21n ± 0% -22.17% (p=0.000 n=10)
Unmarshal_Int-16 17.80n ± 1% 14.43n ± 0% -18.96% (p=0.000 n=10)
Unmarshal_Inet-16 28.68n ± 1% 25.25n ± 1% -11.96% (p=0.000 n=10)
Unmarshal_SmallInt-16 17.92n ± 1% 14.45n ± 1% -19.37% (p=0.000 n=10)
Unmarshal_Time-16 16.00n ± 1% 12.25n ± 3% -23.47% (p=0.000 n=10)
Unmarshal_Timestamp-16 16.00n ± 2% 12.23n ± 0% -23.62% (p=0.000 n=10)
Unmarshal_TinyInt-16 16.38n ± 3% 14.40n ± 1% -12.09% (p=0.000 n=10)
Unmarshal_UUID-16 16.05n ± 1% 12.56n ± 1% -21.77% (p=0.000 n=10)
Unmarshal_Varchar-16 18.86n ± 1% 14.81n ± 1% -21.43% (p=0.000 n=10)
Unmarshal_List-16 176.2n ± 4% 172.9n ± 1% -1.90% (p=0.002 n=10)
Unmarshal_Set-16 177.7n ± 1% 172.9n ± 0% -2.67% (p=0.000 n=10)
Unmarshal_Map-16 374.0n ± 5% 348.0n ± 2% -6.94% (p=0.000 n=10)
Unmarshal_TupleStrings-16 387.9n ± 1% 206.1n ± 1% -46.85% (p=0.000 n=10)
Unmarshal_TupleInterfaces-16 485.4n ± 2% 289.6n ± 3% -40.34% (p=0.000 n=10)
FramerReadTypeInfo-16 207.8n ± 3% 168.6n ± 5% -18.89% (p=0.000 n=10)
ConnStress-16 14.10µ ± 16% 16.05µ ± 19% ~ (p=0.363 n=10)
ConnRoutingKey-16 275.3n ± 4% 276.9n ± 2% ~ (p=0.869 n=10)
WikiCreateSchema-16 515.9m ± 3% 484.9m ± 3% -6.02% (p=0.000 n=10)
WikiCreatePages-16 1.534m ± 1% 1.469m ± 1% -4.21% (p=0.000 n=10)
WikiSelectAllPages-16 1.982m ± 3% 1.900m ± 1% -4.13% (p=0.000 n=10)
WikiSelectSinglePage-16 1.529m ± 2% 1.472m ± 0% -3.72% (p=0.000 n=10)
WikiSelectPageCount-16 1.691m ± 2% 1.629m ± 2% -3.67% (p=0.000 n=10)
FramerReadCol-16 107.7n ± 3% 124.6n ± 1% +15.79% (p=0.000 n=10)
geomean 375.5n 391.1n -15.90% ¹
¹ benchmark set differs from baseline; geomeans may not be comparable
│ old.txt │ new3.txt │
│ B/op │ B/op vs base │
SingleConn-16 3.155Ki ± 0% 3.157Ki ± 0% +0.06% (p=0.000 n=10)
ParseRowsFrame-16 1128.0 ± 0% 872.0 ± 0% -22.70% (p=0.000 n=10)
UnmarshalVarchar-16 32.00 ± 0%
UnmarshalUUID-16 0.000 ± 0%
Unmarshal_BigInt-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_Blob-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_Boolean-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_Date-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_Decimal-16 96.00 ± 0% 48.00 ± 0% -50.00% (p=0.000 n=10)
Unmarshal_Double-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_Duration-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_Float-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_Int-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_Inet-16 4.000 ± 0% 4.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_SmallInt-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_Time-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_Timestamp-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_TinyInt-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_UUID-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_Varchar-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_List-16 32.00 ± 0% 32.00 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_Set-16 32.00 ± 0% 32.00 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_Map-16 280.0 ± 0% 280.0 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_TupleStrings-16 158.00 ± 0% 62.00 ± 0% -60.76% (p=0.000 n=10)
Unmarshal_TupleInterfaces-16 190.00 ± 0% 94.00 ± 0% -50.53% (p=0.000 n=10)
FramerReadTypeInfo-16 160.00 ± 0% 96.00 ± 0% -40.00% (p=0.000 n=10)
ConnStress-16 2.956Ki ± 0% 2.958Ki ± 0% ~ (p=0.238 n=10)
ConnRoutingKey-16 96.00 ± 0% 96.00 ± 0% ~ (p=1.000 n=10) ¹
WikiCreateSchema-16 59.90Ki ± 3% 57.90Ki ± 1% -3.34% (p=0.004 n=10)
WikiCreatePages-16 4.357Ki ± 0% 4.355Ki ± 0% ~ (p=0.236 n=10)
WikiSelectAllPages-16 38.32Ki ± 0% 38.32Ki ± 0% ~ (p=0.322 n=10)
WikiSelectSinglePage-16 3.783Ki ± 0% 3.781Ki ± 0% ~ (p=0.239 n=10)
WikiSelectPageCount-16 3.215Ki ± 0% 3.212Ki ± 0% ~ (p=0.232 n=10)
FramerReadCol-16 99.00 ± 0% 67.00 ± 0% -32.32% (p=0.000 n=10)
geomean ² -10.43% ³ ²
¹ all samples are equal
² summaries must be >0 to compute geomean
³ benchmark set differs from baseline; geomeans may not be comparable
│ old.txt │ new3.txt │
│ allocs/op │ allocs/op vs base │
SingleConn-16 37.00 ± 0% 37.00 ± 0% ~ (p=1.000 n=10) ¹
ParseRowsFrame-16 21.00 ± 0% 13.00 ± 0% -38.10% (p=0.000 n=10)
UnmarshalVarchar-16 1.000 ± 0%
UnmarshalUUID-16 0.000 ± 0%
Unmarshal_BigInt-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_Blob-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_Boolean-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_Date-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_Decimal-16 4.000 ± 0% 3.000 ± 0% -25.00% (p=0.000 n=10)
Unmarshal_Double-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_Duration-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_Float-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_Int-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_Inet-16 1.000 ± 0% 1.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_SmallInt-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_Time-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_Timestamp-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_TinyInt-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_UUID-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_Varchar-16 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_List-16 2.000 ± 0% 2.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_Set-16 2.000 ± 0% 2.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_Map-16 5.000 ± 0% 5.000 ± 0% ~ (p=1.000 n=10) ¹
Unmarshal_TupleStrings-16 8.000 ± 0% 4.000 ± 0% -50.00% (p=0.000 n=10)
Unmarshal_TupleInterfaces-16 10.000 ± 0% 6.000 ± 0% -40.00% (p=0.000 n=10)
FramerReadTypeInfo-16 4.000 ± 0% 4.000 ± 0% ~ (p=1.000 n=10) ¹
ConnStress-16 35.00 ± 0% 35.00 ± 0% ~ (p=1.000 n=10)
ConnRoutingKey-16 4.000 ± 0% 4.000 ± 0% ~ (p=1.000 n=10) ¹
WikiCreateSchema-16 858.0 ± 3% 746.0 ± 1% -13.05% (p=0.000 n=10)
WikiCreatePages-16 56.00 ± 2% 56.00 ± 2% ~ (p=1.000 n=10)
WikiSelectAllPages-16 343.0 ± 0% 343.0 ± 0% ~ (p=1.000 n=10) ¹
WikiSelectSinglePage-16 50.00 ± 0% 50.00 ± 0% ~ (p=1.000 n=10) ¹
WikiSelectPageCount-16 45.00 ± 0% 45.00 ± 0% ~ (p=1.000 n=10) ¹
FramerReadCol-16 3.000 ± 0% 3.000 ± 0% ~ (p=1.000 n=10) ¹
geomean ² -6.38% ³ ²
¹ all samples are equal
² summaries must be >0 to compute geomean
³ benchmark set differs from baseline; geomeans may not be comparable
Patch by James Hartig for CASSGO-43; reviewed by João Reis for CASSGO-43
83550bc to
7d8664a
Compare
CASSGO-43: externally-defined type registration
The new RegisteredTypes struct can be used to register externally-defined
types. You'll need to define your own marshalling and unmarshalling code
as well as a TypeInfo implementation. The name and id MUST not collide
with existing and future native CQL types.
A lot of the type handling was refactored to use the new format for
types.
inet columns are now unmarshaled as net.IP which is a breaking change.