Add MariaDB migration code written by Claude #278
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Conversation
build fail, fix it first

    // ScyllaDBConnection Implementation
    // ============================================================================

    ScyllaDBConnection::ScyllaDBConnection(const ScyllaDBConfig& config)
eh?
Scylla is supported in Spark; we don't need a C++ connection to it.
Not to mention that the C++ driver for Scylla is obsolete and will be replaced by its cpp-over-rust variant.
So this code is completely useless and suboptimal.
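As a sketch of the Spark-native route suggested here: Scylla speaks the Cassandra wire protocol, so Spark can read it through the spark-cassandra-connector data source with no C++ driver involved. This is a hedged illustration, not code from the PR; the keyspace and table names are placeholders.

```python
# Hedged sketch: reading a Scylla table from Spark via the
# spark-cassandra-connector data source (Scylla is Cassandra-wire-compatible).
# Keyspace/table names are placeholders, not taken from this PR.

def scylla_read_options(keyspace, table):
    """Options for Spark's Cassandra data source, which also works with Scylla."""
    return {"keyspace": keyspace, "table": table}

def read_scylla_table(spark, keyspace, table):
    """Return a DataFrame backed by a Scylla table.

    `spark` is a SparkSession whose `spark.cassandra.connection.host`
    config points at a Scylla node, with the connector jar on the classpath.
    """
    reader = spark.read.format("org.apache.spark.sql.cassandra")
    for key, value in scylla_read_options(keyspace, table).items():
        reader = reader.option(key, value)
    return reader.load()
```

The point of the sketch is that the entire read path stays inside the JVM/Spark stack, so there is nothing for a native C++ connection class to do.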
     * Licensed under Apache License 2.0
     */

    #include "mariadb_scylla_migrator.h"
I find it hard to believe MariaDB doesn't have a Spark DataFrame connector.
Seems it does: https://mariadb.com/ja/resources/blog/hands-on-mariadb-columnstore-spark-connector/
Maybe that would be more useful than a native C++ call?
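Independent of the Columnstore connector question, plain MariaDB is already reachable from Spark through the generic JDBC data source plus MariaDB Connector/J, with no native code. A hedged sketch follows; host, database, table, and credentials are placeholders.

```python
# Hedged sketch: reading plain (non-Columnstore) MariaDB into Spark via the
# generic JDBC data source and MariaDB Connector/J, avoiding native C++ calls.
# Host, database, table, and credentials below are placeholders.

def mariadb_jdbc_url(host, port, database):
    """Build the JDBC URL that Spark's JDBC source expects for MariaDB."""
    return f"jdbc:mariadb://{host}:{port}/{database}"

def read_mariadb_table(spark, host, port, database, table, user, password):
    """Return a DataFrame backed by a MariaDB table.

    `spark` is a SparkSession with the MariaDB Connector/J jar on its
    classpath (e.g. via --jars or spark.jars.packages).
    """
    return (spark.read
            .format("jdbc")
            .option("url", mariadb_jdbc_url(host, port, database))
            .option("dbtable", table)
            .option("user", user)
            .option("password", password)
            .option("driver", "org.mariadb.jdbc.Driver")
            .load())
```

This covers bulk table reads only; it deliberately says nothing about binlog streaming, which the JDBC source does not expose.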
Hi @tarzanek ,
MariaDB Columnstore is different from native MariaDB. That blog post is also from 2018, which is ancient history in terms of MariaDB Columnstore maturity. That was back when Columnstore was just rebranded InfiniDB.
I chose MariaDB Connector/C for this because it is the only MariaDB connector that exposes an API for the binlog.
Perhaps the binlog streaming/applying functionality should be separate from the Spark functionality. That would allow us to use something like MariaDB Connector/J for the Spark side while still using MariaDB Connector/C for binlog streaming/applying. The binlog work probably has to happen on one node at a time anyway; it can't be divided among multiple workers, because commit ordering is very important.
I'd love to meet with you sometime and discuss the best way to implement all of this. Let me know if you're down.
Thanks!
tarzanek left a comment
Compiling a native library for Spark seems like overkill.
(Executors can be heterogeneous, so the only thing guaranteed to be the same is the JDK version; ideally, build against that rather than relying on the underlying OS and its libraries.)
This was entirely written by Claude. We might need to fix some stuff. It's more of an intellectual exercise than a production-ready feature.