[BUG][connector-file-base] Error reading int32 field value in parquet file #9141

Closed
@JeremyXin

Description

Search before asking

  • I searched the existing issues and found no similar ones.

What happened

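When reading a Parquet file, a column whose primitive type is INT32 fails to convert if its original type annotation (OriginalType) is not null and marks a 32-bit integer, for example INTEGER(32,true). The switch over the original type has no 'case INT32' branch for that annotation, so the job fails with an unsupported-type exception while converting the field to a SeaTunnel type.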
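
The missing branch can be illustrated with a minimal, self-contained sketch. It does not reproduce the actual code in `ParquetReadStrategy`: the enums `Int32Annotation` and `TargetType`, the class name, and the existing branches shown here are hypothetical stand-ins, assumed only to demonstrate the shape of the problem and of a possible fix.

```java
public class ParquetInt32Mapping {

    // Hypothetical stand-in for the Parquet annotations that can sit on an int32 column.
    enum Int32Annotation { INT_8, INT_16, INT_32, DATE }

    // Hypothetical stand-in for the SeaTunnel data types the reader maps to.
    enum TargetType { BYTE, SHORT, INT, DATE }

    // Mapping as described in the report: an annotated int32 column is only
    // convertible when its annotation has a dedicated branch.
    static TargetType convertBefore(Int32Annotation annotation) {
        if (annotation == null) {
            return TargetType.INT; // unannotated int32 already works
        }
        switch (annotation) {
            case INT_8:
                return TargetType.BYTE;
            case INT_16:
                return TargetType.SHORT;
            case DATE:
                return TargetType.DATE;
            default:
                // INT_32 ends up here, which corresponds to the reported failure.
                throw new UnsupportedOperationException(
                        "Unsupported int32 annotation: " + annotation);
        }
    }

    // Possible fix: handle the 32-bit annotation explicitly, then fall back
    // to the existing mapping for everything else.
    static TargetType convertAfter(Int32Annotation annotation) {
        if (annotation == Int32Annotation.INT_32) {
            return TargetType.INT;
        }
        return convertBefore(annotation);
    }

    public static void main(String[] args) {
        try {
            convertBefore(Int32Annotation.INT_32);
        } catch (UnsupportedOperationException e) {
            System.out.println("before: " + e.getMessage());
        }
        System.out.println("after: " + convertAfter(Int32Annotation.INT_32));
    }
}
```

In the real connector the equivalent change would presumably be one extra case in the existing switch rather than a wrapper method; the wrapper is used here only to keep both behaviours side by side.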

SeaTunnel Version

2.3.8

SeaTunnel Config

env {
  parallelism = 1
  job.mode = "BATCH"
}
source {
  LocalFile {
    path = "xxx/2747ac57674b2061-aa6925dc00000039_1464883239_data.0.parq"
    file_format_type = "parquet"
  }
}

sink {
  Console {
  }
}

Running Command

sh bin/seatunnel.sh --config config/v2.batch.hdfs.template -m local

Error Exception

Caused by: org.apache.seatunnel.common.exception.SeaTunnelRuntimeException: ErrorCode:[COMMON-20], ErrorDescription:['Parquet' table 'default.default.default' unsupported get catalog table with field data types '{"sec_inner":"optional int32 sec_inner (INTEGER(32,true))"}']
	at org.apache.seatunnel.common.exception.CommonError.getCatalogTableWithUnsupportedType(CommonError.java:174)
	at org.apache.seatunnel.connectors.seatunnel.file.source.reader.ReadStrategy.buildColumnsWithErrorCheck(ReadStrategy.java:86)
	at org.apache.seatunnel.connectors.seatunnel.file.source.reader.ParquetReadStrategy.getSeaTunnelRowTypeInfoWithUserConfigRowType(ParquetReadStrategy.java:327)
	at org.apache.seatunnel.connectors.seatunnel.file.config.BaseFileSourceConfig.parseCatalogTable(BaseFileSourceConfig.java:105)
	at org.apache.seatunnel.connectors.seatunnel.file.config.BaseFileSourceConfig.<init>(BaseFileSourceConfig.java:66)
	at org.apache.seatunnel.connectors.seatunnel.file.local.source.config.LocalFileSourceConfig.<init>(LocalFileSourceConfig.java:44)
	at org.apache.seatunnel.connectors.seatunnel.file.local.source.config.MultipleTableLocalFileSourceConfig.getBaseSourceConfig(MultipleTableLocalFileSourceConfig.java:32)
	at org.apache.seatunnel.connectors.seatunnel.file.config.BaseMultipleTableFileSourceConfig.parseFromFileSourceConfig(BaseMultipleTableFileSourceConfig.java:54)
	at org.apache.seatunnel.connectors.seatunnel.file.config.BaseMultipleTableFileSourceConfig.<init>(BaseMultipleTableFileSourceConfig.java:39)
	at org.apache.seatunnel.connectors.seatunnel.file.local.source.config.MultipleTableLocalFileSourceConfig.<init>(MultipleTableLocalFileSourceConfig.java:27)
	at org.apache.seatunnel.connectors.seatunnel.file.local.source.LocalFileSource.<init>(LocalFileSource.java:28)
	at org.apache.seatunnel.connectors.seatunnel.file.local.source.LocalFileSourceFactory.lambda$createSource$0(LocalFileSourceFactory.java:47)
	at org.apache.seatunnel.api.table.factory.FactoryUtil.createAndPrepareSource(FactoryUtil.java:113)
	at org.apache.seatunnel.api.table.factory.FactoryUtil.createAndPrepareSource(FactoryUtil.java:74)
	... 7 more
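
To read the field description in the error message, it helps to split it into its parts: repetition (`optional`), primitive type (`int32`), field name (`sec_inner`), and the annotation in parentheses (`INTEGER(32,true)`, that is, a 32-bit signed integer). The sketch below does this with a regular expression; the class `ParquetFieldDescription` and its `parse` method are hypothetical helpers written for this illustration, not part of SeaTunnel or Parquet, and the pattern only covers simple primitive fields like the one in the log.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ParquetFieldDescription {

    // repetition, primitive type, field name, then an optional annotation in parentheses
    private static final Pattern FIELD = Pattern.compile(
            "^(required|optional|repeated)\\s+(\\w+)\\s+(\\w+)(?:\\s+\\((.+)\\))?$");

    // Returns {repetition, primitiveType, fieldName, annotation};
    // the annotation entry is null when the field carries none.
    static String[] parse(String description) {
        Matcher m = FIELD.matcher(description.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized field description: " + description);
        }
        return new String[] {m.group(1), m.group(2), m.group(3), m.group(4)};
    }

    public static void main(String[] args) {
        String[] parts = parse("optional int32 sec_inner (INTEGER(32,true))");
        System.out.println("repetition = " + parts[0]);
        System.out.println("primitive  = " + parts[1]);
        System.out.println("name       = " + parts[2]);
        System.out.println("annotation = " + parts[3]);
    }
}
```

A field whose annotation part is non-null and whose primitive type is `int32` is the combination that triggers the exception above.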

Zeta or Flink or Spark Version

No response

Java or Scala Version

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
