forked from ZJONSSON/parquetjs
Hi, after hours of trying I decided to open an issue here.
I'm trying to use this library to convert data from JSON to Parquet. This is my minimal reproduction:
import {ParquetSchema, ParquetWriter} from "@dsnp/parquetjs";
const schema = new ParquetSchema({
  some: {type: 'UTF8'},
  test: {type: 'UTF8'},
});
async function run() {
  const writer = await ParquetWriter.openFile(schema, 'test.parquet');
  await writer.appendRow({
    some: 'data',
    test: 'this'
  });
  await writer.close();
}
run();
I saved the file as test.ts and ran it via npx tsx .\test.ts on my Windows 11 machine with Node v22.
It only gives me this output:
> npx tsx .\test.ts
node_modules\thrift\lib\nodejs\lib\thrift\compact_protocol.js:553
throw new Thrift.TProtocolException(Thrift.TProtocolExceptionType.INVALID_DATA, "Expected Int64 or Number, found: " + l);
^
TProtocolException: Expected Int64 or Number, found: 0
at TCompactProtocol.i64ToZigzag (node_modules\thrift\lib\nodejs\lib\thrift\compact_protocol.js:553:11)
at TCompactProtocol.writeI64 (node_modules\thrift\lib\nodejs\lib\thrift\compact_protocol.js:365:27)
at Statistics.write (node_modules\@dsnp\parquetjs\dist\gen-nodejs\parquet_types.js:192:16)
at DataPageHeaderV2.write (node_modules\@dsnp\parquetjs\dist\gen-nodejs\parquet_types.js:1730:25)
at PageHeader.write (node_modules\@dsnp\parquetjs\dist\gen-nodejs\parquet_types.js:2239:34)
at Object.serializeThrift (node_modules\@dsnp\parquetjs\dist\lib\util.js:85:9)
at encodeDataPageV2 (node_modules\@dsnp\parquetjs\dist\lib\writer.js:520:40)
at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
at async encodePages (node_modules\@dsnp\parquetjs\dist\lib\writer.js:415:20)
at async ParquetWriter.close (node_modules\@dsnp\parquetjs\dist\lib\writer.js:151:17) {
type: 1
}
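The message is confusing because 0 looks like a perfectly valid number. One way this kind of error can occur (I have not confirmed that this is what happens here) is that the value is neither a primitive number nor an instance of the exact Int64 class the checker knows about, for example a bigint or an object created by a second loaded copy of the same class. The sketch below reproduces that pattern in isolation. `Int64A`, `Int64B`, and `checkI64` are hypothetical stand-ins, not the actual thrift or parquetjs code.

```typescript
// Two structurally identical classes, standing in for two separately
// loaded copies of the same Int64 module (an assumption for illustration).
class Int64A {
  constructor(private readonly value: number) {}
  toString(): string {
    return String(this.value);
  }
}

class Int64B {
  constructor(private readonly value: number) {}
  toString(): string {
    return String(this.value);
  }
}

// Hypothetical validator: accepts primitive numbers and Int64A instances only.
function checkI64(l: unknown): Int64A {
  if (typeof l === "number") {
    return new Int64A(l);
  }
  if (!(l instanceof Int64A)) {
    // Both a foreign Int64B(0) and a bigint zero stringify to "0" here.
    throw new Error("Expected Int64 or Number, found: " + String(l));
  }
  return l;
}
```

With this sketch, `checkI64(0)` succeeds, while `checkI64(new Int64B(0))` and `checkI64(BigInt(0))` both throw even though the value prints as 0.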
The file test.parquet gets created, but it contains only the string PAR1 (the Parquet magic bytes) and nothing else.
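To double-check what ended up on disk, the file can be compared against the Parquet layout: a complete file starts with the 4-byte magic PAR1 and ends with a 4-byte little-endian footer length followed by PAR1 again. The following is a minimal sketch of such a check. `inspectParquetBytes` and `inspectParquetFile` are hypothetical helper names, not part of @dsnp/parquetjs.

```typescript
import { readFileSync } from "node:fs";

const MAGIC = "PAR1";

interface ParquetFileCheck {
  size: number;                // total file size in bytes
  hasLeadingMagic: boolean;    // file starts with "PAR1"
  hasTrailingMagic: boolean;   // file ends with "PAR1" (footer was written)
  footerLength: number | null; // footer metadata length, if readable
}

// Inspect raw bytes; kept separate from file I/O so it is easy to test.
function inspectParquetBytes(buf: Buffer): ParquetFileCheck {
  const size = buf.length;
  const hasLeadingMagic =
    size >= 4 && buf.subarray(0, 4).toString("latin1") === MAGIC;
  // A complete file needs at least: magic (4) + footer length (4) + magic (4).
  const hasTrailingMagic =
    size >= 12 && buf.subarray(size - 4).toString("latin1") === MAGIC;
  // The footer length is stored little-endian just before the trailing magic.
  const footerLength = hasTrailingMagic ? buf.readUInt32LE(size - 8) : null;
  return { size, hasLeadingMagic, hasTrailingMagic, footerLength };
}

function inspectParquetFile(path: string): ParquetFileCheck {
  return inspectParquetBytes(readFileSync(path));
}
```

Calling `inspectParquetFile("test.parquet")` on a file that holds only PAR1 reports a size of 4, a leading magic, and no trailing magic, which would mean the writer failed before any row group or footer was flushed.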
Am I doing something wrong? Is the library only meant to append to existing Parquet files rather than create new ones? Or is this a bug?