Allow to reuse batch if append failed #1223

Open
@JILeXanDR

Description

Describe the bug

Steps to reproduce

  1. Prepare valid and bad data in any order
  2. Prepare batch
  3. Call AppendStruct in loop
  4. See logs

Expected behaviour

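Currently a single invalid row poisons the whole batch: once AppendStruct fails, every later AppendStruct call on the same batch fails too, even for valid rows, and Send fails as well. If 1 row out of 10000 has an invalid value, it is hard to tell which rows are actually corrupted, and all the data appended after it in the current batch is lost.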

There are 2 possible ways to add some flexibility:

  1. When AppendStruct detects a problem with the current struct, it returns an error and does not append the corrupted values to the batch. The developer decides what to do with that error, but it should be possible to skip such rows and continue.

  2. When AppendStruct detects a problem with the current struct, it returns an error and appends the row as it does now, but subsequent AppendStruct calls with valid data succeed.
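
To make option 1 concrete, here is a minimal sketch of the semantics I have in mind. `toyBatch`, `appendAll` and `sampleRows` are hypothetical names made up for illustration; this is not the driver's batch type, and the bounds are simply copied from the error message (treating them as UTC is my assumption).

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// row mirrors the struct used in the code example of this issue.
type row struct {
	Col1 string
	Col2 time.Time
}

// Bounds taken from the driver's error message; UTC is an assumption.
var (
	minDateTime = time.Date(1970, 1, 1, 0, 0, 0, 0, time.UTC)
	maxDateTime = time.Date(2105, 12, 31, 23, 59, 59, 0, time.UTC)
)

var errDateTimeOverflow = errors.New("dateTime overflow")

// toyBatch is a hypothetical in-memory stand-in, NOT the real driver batch.
type toyBatch struct {
	rows []row
}

// Append shows option 1: an invalid row is reported and left out,
// and the batch stays usable for the rows that follow.
func (b *toyBatch) Append(r row) error {
	if r.Col2.Before(minDateTime) || r.Col2.After(maxDateTime) {
		return fmt.Errorf("Col2: %w", errDateTimeOverflow)
	}
	b.rows = append(b.rows, r)
	return nil
}

// Rows reports how many rows were accepted.
func (b *toyBatch) Rows() int { return len(b.rows) }

// appendAll appends every row and returns the indexes that were rejected.
func appendAll(b *toyBatch, data []row) []int {
	var rejected []int
	for i, r := range data {
		if err := b.Append(r); err != nil {
			rejected = append(rejected, i)
		}
	}
	return rejected
}

// sampleRows returns valid, bad, valid, like the data in the report.
func sampleRows() []row {
	ok := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	return []row{
		{Col1: "valid data", Col2: ok},
		{Col1: "bad data", Col2: time.Time{}}, // zero time is before 1970
		{Col1: "valid data", Col2: ok},
	}
}

func main() {
	b := &toyBatch{}
	rejected := appendAll(b, sampleRows())
	fmt.Printf("rows=%d rejected=%v\n", b.Rows(), rejected) // rows=2 rejected=[1]
}
```

With this behaviour the caller knows exactly which indexes were rejected and can still send the remaining rows.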

Code example

func AppendStructWithBadData() error {
	conn, err := GetNativeConnection(nil, nil, nil)
	if err != nil {
		return err
	}
	ctx := context.Background()
	defer func() {
		conn.Exec(ctx, "DROP TABLE example")
	}()
	if err := conn.Exec(ctx, `DROP TABLE IF EXISTS example`); err != nil {
		return err
	}
	if err := conn.Exec(ctx, `
		CREATE TABLE example (
			  Col1 String
			, Col2 DateTime
		) Engine = Memory
		`); err != nil {
		return err
	}

	batch, err := conn.PrepareBatch(ctx, "INSERT INTO example")
	if err != nil {
		return err
	}

	data := []struct {
		Col1 string
		Col2 time.Time
	}{
		{
			Col1: "valid data", // no error
			Col2: time.Now(),
		},
		{
			Col1: "bad data", // error=clickhouse: dateTime overflow. Col2 must be between 1970-01-01 00:00:00 and 2105-12-31 23:59:59
			Col2: time.Time{},
		},
		{
			Col1: "valid data", // error=clickhouse: dateTime overflow. Col2 must be between 1970-01-01 00:00:00 and 2105-12-31 23:59:59: clickhouse: batch is invalid. check appended data is correct
			Col2: time.Now(),
		},
	}

	for i, r := range data {
		err := batch.AppendStruct(&r)
		if err != nil {
			fmt.Printf("AppendStruct failed: index=%d, error=%+v\n", i, err.Error())
		} else {
			fmt.Printf("AppendStruct succed: index=%d\n", i)
		}
	}

	fmt.Printf("send batch: rows=%d\n", batch.Rows())

	return batch.Send()
}

Error log

AppendStruct succed: index=0
AppendStruct failed: index=1, error=clickhouse: dateTime overflow. Col2 must be between 1970-01-01 00:00:00 and 2105-12-31 23:59:59
AppendStruct failed: index=2, error=clickhouse: dateTime overflow. Col2 must be between 1970-01-01 00:00:00 and 2105-12-31 23:59:59: clickhouse: batch is invalid. check appended data is correct

send batch: rows=2

        	Error:      	Received unexpected error:
        	            	clickhouse: batch is invalid. check appended data is correct
        	            	clickhouse: dateTime overflow. Col2 must be between 1970-01-01 00:00:00 and 2105-12-31 23:59:59
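
As a stopgap on the caller side, rows can be checked before they reach the batch, so the batch never enters the invalid state. The sketch below is only an illustration under assumptions: `stickyBatch` is a hypothetical fake that imitates the behaviour shown in the log above, `appendChecked`, `inRange` and `demo` are made-up helper names, and the range check only covers the DateTime case from this report (bounds copied from the error message, assumed UTC).

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

type event struct {
	Col1 string
	Col2 time.Time
}

// structAppender is the single method this sketch needs from a batch.
type structAppender interface {
	AppendStruct(v any) error
}

// stickyBatch is a hypothetical fake: after the first failure,
// every later AppendStruct call fails too, like in the log above.
type stickyBatch struct {
	rows int
	err  error
}

func (b *stickyBatch) AppendStruct(v any) error {
	if b.err != nil {
		return b.err
	}
	e, ok := v.(*event)
	if !ok || !inRange(e.Col2) {
		b.err = errors.New("batch is invalid")
		return b.err
	}
	b.rows++
	return nil
}

// inRange checks the bounds quoted in the error message (UTC assumed).
func inRange(t time.Time) bool {
	lo := time.Date(1970, 1, 1, 0, 0, 0, 0, time.UTC)
	hi := time.Date(2105, 12, 31, 23, 59, 59, 0, time.UTC)
	return !t.Before(lo) && !t.After(hi)
}

// appendChecked skips rows that would fail and reports their indexes,
// so a bad row is never passed to the batch.
func appendChecked(b structAppender, events []event) (skipped []int, err error) {
	for i := range events {
		if !inRange(events[i].Col2) {
			skipped = append(skipped, i)
			continue
		}
		if err := b.AppendStruct(&events[i]); err != nil {
			return skipped, fmt.Errorf("row %d: %w", i, err)
		}
	}
	return skipped, nil
}

// demo runs valid, bad, valid rows through the guard.
func demo() (rows int, skipped []int, err error) {
	ok := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	b := &stickyBatch{}
	skipped, err = appendChecked(b, []event{
		{Col1: "valid data", Col2: ok},
		{Col1: "bad data", Col2: time.Time{}},
		{Col1: "valid data", Col2: ok},
	})
	return b.rows, skipped, err
}

func main() {
	rows, skipped, err := demo()
	fmt.Println(rows, skipped, err) // 2 [1] <nil>
}
```

This duplicates validation the driver already performs and has to be kept in sync per column type, which is why handling it inside AppendStruct would be preferable.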

Configuration

Environment

  • Client version:
  • Language version:
  • OS:
  • Interface: ClickHouse API / database/sql compatible driver

ClickHouse server

  • ClickHouse Server version:
  • ClickHouse Server non-default settings, if any:
  • CREATE TABLE statements for tables involved:
  • Sample data for all these tables, use clickhouse-obfuscator if necessary
