Hello!
I ran across #196 when first investigating batch inserts, but the performance difference between Model.insert_all and an INSERT with FORMAT JSONEachRow looks to be significant (at least locally for me).
With simple & small records (<10 Strings/UInt8/Float64 fields), 20k inserts sit at ~325ms with FORMAT JSONEachRow, while INSERT INTO {} VALUES () (stock insert_all behavior) sits at almost 5x this (~1550ms).
I was able to hack this into my codebase as a POC by defining a new method on my abstract base class and digging into connection internals, but would love to understand if there's a desire to support this in a more official capacity.
Example:
class ClickhouseRecord < ActiveRecord::Base
  self.abstract_class = true
  establish_connection :clickhouse

  def self.insert_all_json_each_row(rows)
    return if rows.blank?

    # Reach into the adapter's private state (POC only; these are not public API).
    conn = connection.instance_variable_get(:@connection)
    connection_config = connection.instance_variable_get(:@connection_config).merge({
      query: "INSERT INTO #{table_name} FORMAT JSONEachRow"
    }).to_param

    # JSONEachRow expects one JSON object per line.
    body = rows.map(&:to_json).join("\n")
    res = conn.post("/?#{connection_config}", body)

    # Reuse the adapter's private response handling.
    connection.send(
      :process_response,
      res,
      ActiveRecord::ConnectionAdapters::Clickhouse::SchemaStatements::DEFAULT_RESPONSE_FORMAT
    )
  end
end
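For anyone who wants to see what actually goes over the wire without a Rails app or a ClickHouse server, here is a minimal standalone sketch of the two pieces the method above builds: the newline-delimited body and the query string. The helper names `json_each_row_body` and `insert_query_string`, the `events` table, and the `database` param are all made up for illustration, and `URI.encode_www_form` stands in for ActiveSupport's `to_param`, so the exact encoding may differ slightly from what the adapter produces.

```ruby
require "json"
require "uri"

# Hypothetical helper: builds the request body for FORMAT JSONEachRow,
# i.e. one JSON object per line, separated by newlines.
def json_each_row_body(rows)
  rows.map { |row| JSON.generate(row) }.join("\n")
end

# Hypothetical helper: merges the INSERT statement into the connection
# params and URL-encodes the result (stand-in for ActiveSupport's to_param).
def insert_query_string(table_name, params = {})
  URI.encode_www_form(
    params.merge(query: "INSERT INTO #{table_name} FORMAT JSONEachRow")
  )
end

# Illustrative rows and table name only.
rows = [
  { id: 1, name: "a", score: 1.5 },
  { id: 2, name: "b", score: 2.0 }
]

puts json_each_row_body(rows)
puts insert_query_string("events", database: "default")
```

The point is that the whole batch is serialized once into a single POST body and the SQL itself stays a short constant string, rather than building one large VALUES statement.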