Skip to content

Proper way to insert arrays #163

Open
@blacelle

Description

@blacelle

I'm being provided a bunch of String[] and double[] and I'm looking for the proper way to insert them efficiently into a DuckDB table.

I succeed inserting (primitive) arrays only by generating equivalent concatenated String. Am I missing a way to do so?

Relates with aggregating arrays


Appender: #110 confirms this is not supported by the Appender yet. The only way seems to going through a concatenated String:

DuckDBConnection duckDbConnection = (DuckDBConnection) c;

Statement s = duckDbConnection.createStatement();

s.execute("CREATE TABLE someTableName (color VARCHAR, array_keys STRING[%s], array_values DOUBLE[%s]);"
		.formatted(arrayLength, arrayLength));

try (var appender = duckDbConnection.createAppender(DuckDBConnection.DEFAULT_SCHEMA, tableName)) {
	for (int rowIndex = 0; rowIndex < 16; rowIndex++) {

		double rowFactor = Math.sqrt(rowIndex);

		appender.beginRow();
		appender.append("red");
		appender.append(IntStream.range(0, arrayLength).mapToObj(i -> "d" + i).collect(Collectors.joining(", ", "[", "]")));
		appender.append(IntStream.range(0, arrayLength).mapToDouble(i -> rowFactor * Math.sqrt(i)).mapToObj(d -> Double.toString(d)).collect(Collectors.joining(", ", "[", "]")));
		appender.endRow();
	}
}

BatchWriter: https://duckdb.org/docs/stable/clients/java.html#batch-writer I can not get it working:

DuckDBConnection duckDbConnection = (DuckDBConnection) c;

Statement s = duckDbConnection.createStatement();

s.execute("CREATE TABLE someTableName (color VARCHAR, array_keys STRING[%s], array_values DOUBLE[%s]);"
		.formatted(arrayLength, arrayLength));

PreparedStatement stmt = duckDbConnection
		.prepareStatement("INSERT INTO someTableName (color, array_keys, array_values) VALUES (?, ?, ?);");

stmt.setObject(1, "red");
stmt.setObject(2, IntStream.range(0, 255).mapToObj(i -> "d" + i).toArray(String[]::new));
stmt.setObject(3, IntStream.range(0, 255).mapToDouble(i -> Math.sqrt(i)).toArray());
stmt.addBatch();

stmt.executeBatch();
stmt.close();

I get:

org.jooq.exception.DataAccessException: Error while running ConnectionCallable
	at org.jooq.impl.DefaultDSLContext.connectionResult(DefaultDSLContext.java:663)
	at org.jooq.impl.DefaultDSLContext.connection(DefaultDSLContext.java:695)
	at eu.solven.adhoc.table.duckdb.quantile.TestTableQuery_DuckDb_VaR.feedTable(TestTableQuery_DuckDb_VaR.java:84)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
Caused by: java.sql.SQLException: Invalid Input Error: Unsupported parameter type
	at org.duckdb.DuckDBNative.duckdb_jdbc_execute(Native Method)
	at org.duckdb.DuckDBPreparedStatement.execute(DuckDBPreparedStatement.java:148)
	at org.duckdb.DuckDBPreparedStatement.executeBatchedPreparedStatement(DuckDBPreparedStatement.java:489)
	at org.duckdb.DuckDBPreparedStatement.executeBatch(DuckDBPreparedStatement.java:474)
	at eu.solven.adhoc.table.duckdb.quantile.TestTableQuery_DuckDb_VaR.lambda$1(TestTableQuery_DuckDb_VaR.java:141)
	at org.jooq.impl.DefaultDSLContext.lambda$connection$12(DefaultDSLContext.java:696)
	at org.jooq.impl.DefaultDSLContext.connectionResult(DefaultDSLContext.java:657)
	... 5 more

Arrow:

I tried getting through Arrow and ADBC, but I did not undertand how to insert data into DuckDB through any Arrow option.

https://duckdb.org/docs/stable/clients/adbc.html
https://duckdb.org/docs/stable/clients/java.html#arrow-import

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions