Data must be less than or equal to 1MB in size #39

Open
@jamesamurr-bind

Description

This is the same as, or similar to, the closed issue #14. I can't reopen it myself.

In my use case I have a transaction that is larger than 1MB, and this causes pg2k4j to get stuck reading the same WAL message over and over. My disk then fills up (because the WAL is never successfully read) and the database crashes.

The recommendation in that issue was to set wal_writer_flush_after lower than 1MB and to set wal_writer_delay to something short.

I have set wal_writer_flush_after to 20 and set wal_writer_delay to 200. This does not resolve the problem. Do you have other suggestions?

A 1MB transaction is large, but it is not an unreasonable use case. For example, if I add a column to an existing table with a million records and then backfill that column in a single transaction, it would fail. Bulk inserting thousands of new members for my site would cause the same error. Because of this issue I can't use this software.

Here is my error:

2020-01-03T21:57:45.924263600Z [main] ERROR com.disneystreaming.pg2k4j.SlotReaderKinesisWriter - Received exception of type class java.lang.IllegalArgumentException
2020-01-03T21:57:45.924932900Z java.lang.IllegalArgumentException: Data must be less than or equal to 1MB in size, got 2346047 bytes
2020-01-03T21:57:45.924944900Z  at com.amazonaws.services.kinesis.producer.KinesisProducer.addUserRecord(KinesisProducer.java:517)
2020-01-03T21:57:45.924949500Z  at com.amazonaws.services.kinesis.producer.KinesisProducer.addUserRecord(KinesisProducer.java:406)
2020-01-03T21:57:45.924952800Z  at com.disneystreaming.pg2k4j.SlotReaderKinesisWriter.lambda$processByteBuffer$0(SlotReaderKinesisWriter.java:242)
2020-01-03T21:57:45.924956100Z  at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
2020-01-03T21:57:45.924959100Z  at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
2020-01-03T21:57:45.924970900Z  at java.base/java.util.stream.Streams$StreamBuilderImpl.forEachRemaining(Streams.java:411)
2020-01-03T21:57:45.924974400Z  at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
2020-01-03T21:57:45.924977500Z  at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
2020-01-03T21:57:45.924980500Z  at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
2020-01-03T21:57:45.924983600Z  at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
2020-01-03T21:57:45.924986600Z  at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
2020-01-03T21:57:45.924989600Z  at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
2020-01-03T21:57:45.924992700Z  at com.disneystreaming.pg2k4j.SlotReaderKinesisWriter.processByteBuffer(SlotReaderKinesisWriter.java:234)
2020-01-03T21:57:45.924995900Z  at com.disneystreaming.pg2k4j.SlotReaderKinesisWriter.readSlotWriteToKinesisHelper(SlotReaderKinesisWriter.java:195)
2020-01-03T21:57:45.924999700Z  at com.disneystreaming.pg2k4j.SlotReaderKinesisWriter.readSlotWriteToKinesis(SlotReaderKinesisWriter.java:131)
2020-01-03T21:57:45.925003500Z  at com.disneystreaming.pg2k4j.SlotReaderKinesisWriter.runLoop(SlotReaderKinesisWriter.java:86)
2020-01-03T21:57:45.925007300Z  at com.disneystreaming.pg2k4j.CommandLineRunner.run(CommandLineRunner.java:30)
2020-01-03T21:57:45.925011800Z  at java.base/java.util.Optional.ifPresent(Optional.java:183)
2020-01-03T21:57:45.925041500Z  at com.disneystreaming.pg2k4j.CommandLineRunner.main(CommandLineRunner.java:45)
2020-01-03T21:57:47.223098700Z [main] INFO com.disneystreaming.pg2k4j.PostgresConnector - Attempting to create replication slot pg2k4j
2020-01-03T21:57:47.271197900Z [main] INFO com.disneystreaming.pg2k4j.PostgresConnector - Slot pg2k4j already exists
2020-01-03T21:57:47.326846300Z [main] INFO com.amazonaws.services.kinesis.producer.KinesisProducer - Extracting binaries to /tmp/amazon-kinesis-producer-native-binaries
2020-01-03T21:57:47.819365800Z [main] INFO com.amazonaws.services.kinesis.producer.HashedFileCopier - '/tmp/amazon-kinesis-producer-native-binaries/kinesis_producer_489FA9AC71B1CD61A4002E9F16A279556D581D9D' already exists, and matches.  Not overwriting.
2020-01-03T21:57:47.829184100Z [main] INFO com.disneystreaming.pg2k4j.SlotReaderKinesisWriter - Consuming from slot pg2k4j
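
For illustration, the size check behind the exception above, and one possible way around it, can be sketched as follows. This is not pg2k4j's code. It assumes the limit is 1024 * 1024 bytes (inferred from the "1MB" in the error message). `RecordSizeGuard`, `exceedsLimit`, and `split` are hypothetical names. Splitting a change record into several pieces would also require the consumer to reassemble them, which this sketch does not cover.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class RecordSizeGuard {
    // Assumed per-record limit, inferred from the "1MB" in the error message.
    public static final int MAX_RECORD_BYTES = 1024 * 1024;

    /** Returns true if a payload of this size would be rejected. */
    public static boolean exceedsLimit(ByteBuffer data) {
        return data.remaining() > MAX_RECORD_BYTES;
    }

    /** Splits a payload into read-only slices of at most maxBytes each, without copying. */
    public static List<ByteBuffer> split(ByteBuffer data, int maxBytes) {
        if (maxBytes <= 0) {
            throw new IllegalArgumentException("maxBytes must be positive");
        }
        List<ByteBuffer> chunks = new ArrayList<>();
        // Work on a view so the caller's position and limit are left untouched.
        ByteBuffer view = data.asReadOnlyBuffer();
        while (view.hasRemaining()) {
            int len = Math.min(maxBytes, view.remaining());
            ByteBuffer chunk = view.slice(); // starts at the current position
            chunk.limit(len);                // cap the slice at the chunk size
            chunks.add(chunk);
            view.position(view.position() + len);
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Same size as the payload reported in the log above.
        ByteBuffer payload = ByteBuffer.allocate(2_346_047);
        List<ByteBuffer> chunks = split(payload, MAX_RECORD_BYTES);
        System.out.println(exceedsLimit(payload) + " " + chunks.size()); // prints "true 3"
    }
}
```

With the 2346047-byte payload from the log, this gives two full 1048576-byte chunks and one 248895-byte remainder.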

To reproduce:

  1. Run pg2k4j against a database
  2. Create a table with the following DDL
CREATE TABLE public.my_test_table (
	id serial,
	"name" varchar(100) NOT NULL,
	property_1 varchar(200) not null,
	property_2 varchar(200) not null,
	property_3 varchar(200) not null,
	property_4 varchar(200) not null,
	property_5 varchar(200) not null,
	property_6 varchar(200) not null,
	property_7 varchar(200) not null,
	property_8 varchar(200) not null,
	property_9 varchar(200) not null,
	CONSTRAINT my_test_table_pkey PRIMARY KEY (id)
);
  3. Create an insert script similar to the following with 3500 inserts.
BEGIN;
INSERT INTO public.my_test_table
("name", property_1, property_2, property_3, property_4, property_5, property_6, property_7, property_8, property_9)
VALUES(concat(md5(random()::text), md5(random()::text)), concat(md5(random()::text), md5(random()::text)), concat(md5(random()::text), md5(random()::text)), concat(md5(random()::text), md5(random()::text)), concat(md5(random()::text), md5(random()::text)), concat(md5(random()::text), md5(random()::text)), concat(md5(random()::text), md5(random()::text)), concat(md5(random()::text), md5(random()::text)), concat(md5(random()::text), md5(random()::text)), concat(md5(random()::text), md5(random()::text)));
...
< add 3500 more of the previous insert statement here>
...
COMMIT;
  4. Run the insert script against the postgres database that is running pg2k4j
psql --host=my.ip.address --port=5447 --username=test_user --dbname=test_database --file "C:\dev\sql_scripts\my_temp_table.sql"
  5. Watch the logs on your pg2k4j container and wait for the transaction to finish. Once it is finished you will see the error.
docker logs reverent_pike --since 10m -t --follow
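
The insert script described in the steps above does not need to be written by hand. Below is a minimal sketch of a generator. It assumes the table and column names from the DDL above. `InsertScriptGenerator` and `build` are hypothetical names, and the row count is a parameter.

```java
import java.util.Collections;

public class InsertScriptGenerator {
    // One value expression per column: two concatenated md5 hashes (64 characters).
    static final String VALUE = "concat(md5(random()::text), md5(random()::text))";
    // The ten non-serial columns from the DDL above.
    static final String COLUMNS = "\"name\", property_1, property_2, property_3, property_4, "
            + "property_5, property_6, property_7, property_8, property_9";

    /** Builds a single-transaction script containing `rows` identical INSERT statements. */
    public static String build(String table, int rows) {
        String values = String.join(", ", Collections.nCopies(10, VALUE));
        String insert = "INSERT INTO " + table + "\n(" + COLUMNS + ")\nVALUES(" + values + ");\n";
        StringBuilder sb = new StringBuilder("BEGIN;\n");
        for (int i = 0; i < rows; i++) {
            sb.append(insert);
        }
        return sb.append("COMMIT;\n").toString();
    }

    public static void main(String[] args) {
        // Writes the script to stdout; redirect it to a .sql file for psql --file.
        System.out.print(build("public.my_test_table", 3500));
    }
}
```

On Java 11 or later, this can be run directly as a single source file, for example `java InsertScriptGenerator.java > my_temp_table.sql`.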
