For example, we have a table:
CREATE TABLE IF NOT EXISTS gedaqtests.byteuint8e0m
(
m_id Int32,
m_value UInt8,
m_value_second UInt8,
m_value_another UInt8,
)
ENGINE = MergeTree
PARTITION BY (intHash32(m_id) % 5)
ORDER BY m_id
SETTINGS index_granularity = 8192;
Let's say we need to insert data into Id + Value.
To insert data into it, we currently have two options:
Option 1:
var rows = new List<object[]>
{
new object[] { (int)12, (byte)5 }, // row1
new object[] { (int)12, (byte)5 }, // row2
//...
};
long rowsInserted = await client.InsertBinaryAsync("byteuint8e0m", ["m_id", "m_value"], rows);
object[] is a problem, we will have to pack int and byte into object[] and this is get us boxing.
Since CLickhouse typically inserts very large batches of data, this can cause performance issues.
Option 2:
public class Row1
{
[ClickHouseColumn(Type = "Int32", Name = "m_id")]
public Int32 Id { get; set; }
[ClickHouseColumn(Type = "UInt8", Name = "m_value")]
public System.Byte Value { get; set; }
[ClickHouseNotMapped()]
public System.Byte Second { get; set; }
[ClickHouseNotMapped()]
public System.Byte Another { get; set; }
}
client.RegisterBinaryInsertType<Row>();
var list = new List<Row1>()
{
new Row1
{
Id = 12,
Value = 5
},
new Row1
{
Id = 15,
Value = 6
}
};
long rowsInserted = await client.InsertBinaryAsync("byteuint8e0m", list);
There are even more problems here:
-
In each case, we need to write Row1(m_id + m_value), Row2(m_id + m_value_second), Row3... . In applications, the table has many columns, and it's not always clear in advance which data will be in a row: "m_id + m_value" or "m_id + m_value_second" or another combination of columns. Writing POCO classes for all combinations is impossible.
-
Attributes don't work when building AOT - the application simply crashes, apparently because reflection occurs. I haven't tested it in detail, but it looks like AOT isn't supported because of the BuildMapping method and how it's implemented, but that might not be the case.
A third method is needed that will avoid these limitations. Perhaps it will be more manual. Something like this:
var import = await client.StartInsertAsync("byteuint8e0m", ["m_id", "m_value"]);
using var batch1 = import.StartNewBatch();
// fill batch 1
batch1.WriteData(0, 12);
batch1.WriteData(1, 5);
batch2.CompleteWrite();
var sendTask = import.SendBatchAsync(batch1);
using var batch2 = import.StartNewBatch();
// fill batch 2
batch2.WriteData(0, 15);
batch2.WriteData(1, 6);
batch2.CompleteWrite();
var sendTask2 = import.SendBatchAsync(batch2);
await Task.WhenAll(sendTask, sendTask2);
In this case, the ORM, not the driver developer, will handle the automation. Am I missing something? Are there existing ways to avoid the problems I encountered above?
For example, we have a table:
Let's say we need to insert data into Id + Value.
To insert data into it, we currently have two options:
Option 1:
object[] is a problem, we will have to pack
intandbyteinto object[] and this is get us boxing.Since CLickhouse typically inserts very large batches of data, this can cause performance issues.
Option 2:
There are even more problems here:
In each case, we need to write Row1(m_id + m_value), Row2(m_id + m_value_second), Row3... . In applications, the table has many columns, and it's not always clear in advance which data will be in a row: "m_id + m_value" or "m_id + m_value_second" or another combination of columns. Writing POCO classes for all combinations is impossible.
Attributes don't work when building AOT - the application simply crashes, apparently because reflection occurs. I haven't tested it in detail, but it looks like AOT isn't supported because of the BuildMapping method and how it's implemented, but that might not be the case.
A third method is needed that will avoid these limitations. Perhaps it will be more manual. Something like this:
In this case, the ORM, not the driver developer, will handle the automation. Am I missing something? Are there existing ways to avoid the problems I encountered above?