[BUG] Error in Backfill migration from ES_6.7 to OS_2.17 #1180

@rudney-souza

What is the bug?

When running the backfill migration from ES_6.7 to OS_2.17, I got an error like com.fasterxml.jackson.core.JsonParseException: Unrecognized token 'Too': was expecting (JSON String, Number, Array, Object or token 'null', 'true' or 'false'). The migration was also stuck at the same number of remaining shards even after running all night. It started with 50 workers, and I then added 130 to see if anything would change, but none of the remaining shards has finished yet.

How can one reproduce the bug?

Run the backfill migration from ES_6.7 to OS_2.17.

What is the expected behavior?

The migration runs to completion without errors.

What is your host/environment?

ES_6.7 to OS_2.17

Do you have any additional context?

2024-12-05 14:00:51,114 WARN o.o.m.b.c.OpenSearchClient [reactor-http-epoll-2] Unable to process bulk request for success
com.fasterxml.jackson.core.JsonParseException: Unrecognized token 'Too': was expecting (JSON String, Number, Array, Object or token 'null', 'true' or 'false')
 at [Source: REDACTED (StreamReadFeature.INCLUDE_SOURCE_IN_LOCATION disabled); line: 1, column: 4]
    at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:2481) ~[jackson-core-2.16.2.jar:2.16.2]
    at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:762) ~[jackson-core-2.16.2.jar:2.16.2]
    at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._reportInvalidToken(ReaderBasedJsonParser.java:3042) ~[jackson-core-2.16.2.jar:2.16.2]
    at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._handleOddValue(ReaderBasedJsonParser.java:2085) ~[jackson-core-2.16.2.jar:2.16.2]
    at com.fasterxml.jackson.core.json.ReaderBasedJsonParser.nextToken(ReaderBasedJsonParser.java:812) ~[jackson-core-2.16.2.jar:2.16.2]
    at org.opensearch.migrations.parsing.BulkResponseParser.findSuccessDocs(BulkResponseParser.java:39) ~[RFS-0.1.0-SNAPSHOT.jar:?]
    at org.opensearch.migrations.bulkload.common.OpenSearchClient$BulkResponse.getSuccessfulDocs(OpenSearchClient.java:531) ~[RFS-0.1.0-SNAPSHOT.jar:?]
    at org.opensearch.migrations.bulkload.common.OpenSearchClient.lambda$sendBulkRequest$28(OpenSearchClient.java:474) ~[RFS-0.1.0-SNAPSHOT.jar:?]
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:132) [reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.MonoPeekTerminal$MonoTerminalPeekSubscriber.onNext(MonoPeekTerminal.java:180) [reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.MonoPeekTerminal$MonoTerminalPeekSubscriber.onNext(MonoPeekTerminal.java:180) [reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.secondComplete(MonoFlatMap.java:245) [reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.MonoFlatMap$FlatMapInner.onNext(MonoFlatMap.java:305) [reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.secondComplete(MonoFlatMap.java:245) [reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.MonoFlatMap$FlatMapInner.onNext(MonoFlatMap.java:305) [reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.FluxContextWrite$ContextWriteSubscriber.onNext(FluxContextWrite.java:107) [reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.FluxDoFinally$DoFinallySubscriber.onNext(FluxDoFinally.java:113) [reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.FluxMap$MapConditionalSubscriber.onNext(FluxMap.java:224) [reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.Operators$MonoInnerProducerBase.complete(Operators.java:2812) [reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.MonoSingleOptional$SingleOptionalSubscriber.onNext(MonoSingleOptional.java:101) [reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.FluxHandle$HandleSubscriber.onNext(FluxHandle.java:129) [reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.FluxMap$MapConditionalSubscriber.onNext(FluxMap.java:224) [reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.FluxDoFinally$DoFinallySubscriber.onNext(FluxDoFinally.java:113) [reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.FluxHandleFuseable$HandleFuseableSubscriber.onNext(FluxHandleFuseable.java:194) [reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.FluxContextWrite$ContextWriteSubscriber.onNext(FluxContextWrite.java:107) [reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.Operators$BaseFluxToMonoOperator.completePossiblyEmpty(Operators.java:2097) [reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.MonoCollectList$MonoCollectListSubscriber.onComplete(MonoCollectList.java:118) [reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.FluxPeek$PeekSubscriber.onComplete(FluxPeek.java:260) [reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.FluxMap$MapSubscriber.onComplete(FluxMap.java:144) [reactor-core-3.6.5.jar:3.6.5]
    at reactor.netty.channel.FluxReceive.onInboundComplete(FluxReceive.java:415) [reactor-netty-core-1.1.18.jar:1.1.18]
    at reactor.netty.channel.ChannelOperations.onInboundComplete(ChannelOperations.java:446) [reactor-netty-core-1.1.18.jar:1.1.18]
    at reactor.netty.channel.ChannelOperations.terminate(ChannelOperations.java:500) [reactor-netty-core-1.1.18.jar:1.1.18]
    at reactor.netty.http.client.HttpClientOperations.onInboundNext(HttpClientOperations.java:793) [reactor-netty-http-1.1.18.jar:1.1.18]
    at reactor.netty.channel.ChannelOperationsHandler.channelRead(ChannelOperationsHandler.java:114) [reactor-netty-core-1.1.18.jar:1.1.18]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) [netty-transport-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) [netty-transport-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) [netty-transport-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436) [netty-transport-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346) [netty-codec-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:318) [netty-codec-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251) [netty-transport-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442) [netty-transport-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) [netty-transport-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) [netty-transport-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1475) [netty-handler-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1338) [netty-handler-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1387) [netty-handler-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:530) [netty-codec-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:469) [netty-codec-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290) [netty-codec-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) [netty-transport-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) [netty-transport-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) [netty-transport-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:93) [netty-transport-4.1.108.Final.jar:4.1.108.Final]
    at org.opensearch.migrations.bulkload.netty.ReadMeteringHandler.channelRead(ReadMeteringHandler.java:26) [RFS-0.1.0-SNAPSHOT.jar:?]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) [netty-transport-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) [netty-transport-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) [netty-transport-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [netty-transport-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440) [netty-transport-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) [netty-transport-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [netty-transport-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:801) [netty-transport-classes-epoll-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:509) [netty-transport-classes-epoll-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:407) [netty-transport-classes-epoll-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) [netty-common-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.108.Final.jar:4.1.108.Final]
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.108.Final.jar:4.1.108.Final]
    at java.base/java.lang.Thread.run(Thread.java:829) [?:?]


2024-12-05 14:00:48,025 ERROR r.c.s.Schedulers [parallel-3] Scheduler worker in group main failed with an uncaught exception
java.lang.OutOfMemoryError: Java heap space
    at java.base/java.util.Arrays.copyOfRange(Arrays.java:4030) ~[?:?]
    at java.base/java.lang.StringLatin1.newString(StringLatin1.java:715) ~[?:?]
    at java.base/java.lang.StringBuilder.toString(StringBuilder.java:452) ~[?:?]
    at com.fasterxml.jackson.core.util.TextBuffer.contentsAsString(TextBuffer.java:498) ~[jackson-core-2.16.2.jar:2.16.2]
    at com.fasterxml.jackson.core.io.SegmentedStringWriter.getAndClear(SegmentedStringWriter.java:99) ~[jackson-core-2.16.2.jar:2.16.2]
    at org.opensearch.migrations.bulkload.common.BulkDocSection.convertToBulkRequestBody(BulkDocSection.java:72) ~[RFS-0.1.0-SNAPSHOT.jar:?]
    at org.opensearch.migrations.bulkload.common.OpenSearchClient.lambda$sendBulkRequest$29(OpenSearchClient.java:457) ~[RFS-0.1.0-SNAPSHOT.jar:?]
    at org.opensearch.migrations.bulkload.common.OpenSearchClient$$Lambda$1253/0x00000008007f5040.get(Unknown Source) ~[?:?]
    at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:45) ~[reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.FluxRetryWhen$RetryWhenMainSubscriber.resubscribe(FluxRetryWhen.java:220) ~[reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.FluxRetryWhen$RetryWhenOtherSubscriber.onNext(FluxRetryWhen.java:274) ~[reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.FluxContextWrite$ContextWriteSubscriber.onNext(FluxContextWrite.java:107) ~[reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.FluxConcatMapNoPrefetch$FluxConcatMapNoPrefetchSubscriber.innerNext(FluxConcatMapNoPrefetch.java:259) ~[reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.FluxConcatMap$ConcatMapInner.onNext(FluxConcatMap.java:865) ~[reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.FluxContextWrite$ContextWriteSubscriber.onNext(FluxContextWrite.java:107) ~[reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.secondComplete(MonoFlatMap.java:245) ~[reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.MonoFlatMap$FlatMapInner.onNext(MonoFlatMap.java:305) ~[reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.complete(MonoIgnoreThen.java:294) ~[reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.onNext(MonoIgnoreThen.java:188) ~[reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.subscribeNext(MonoIgnoreThen.java:237) ~[reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.MonoIgnoreThen.subscribe(MonoIgnoreThen.java:51) ~[reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:165) ~[reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.complete(MonoIgnoreThen.java:294) ~[reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.onNext(MonoIgnoreThen.java:188) ~[reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.MonoDelay$MonoDelayRunnable.propagateDelay(MonoDelay.java:270) ~[reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.publisher.MonoDelay$MonoDelayRunnable.run(MonoDelay.java:285) ~[reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:68) ~[reactor-core-3.6.5.jar:3.6.5]
    at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:28) ~[reactor-core-3.6.5.jar:3.6.5]
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) ~[?:?]
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]


2024-12-05 14:00:47,618 WARN o.o.m.b.c.OpenSearchClient [reactor-http-epoll-4] After bulk request on index 'vindex', 0 more documents have succeed, 3614 remain
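
The parser failing on the token 'Too' at line 1, column 4 suggests the bulk response body was plain text rather than JSON. One possible reading, which the logs above do not confirm, is an HTTP 429 "Too Many Requests" reply from the target cluster or something in front of it. The sketch below shows how a client might classify a response before handing it to a JSON parser. `BulkResponseGuard`, `Outcome`, `classify`, and `backoff` are hypothetical names for illustration and are not part of the RFS code base; treating 429 and 503 as retryable and the 500 ms / 30 s backoff bounds are assumptions as well.

```java
import java.time.Duration;

// Hypothetical helper for illustration only; not taken from the RFS sources.
final class BulkResponseGuard {

    enum Outcome { PARSE_AS_JSON, RETRY_LATER, FAIL }

    // Decide what to do with a bulk response before attempting to parse it.
    static Outcome classify(int statusCode, String body) {
        // Assumption: 429 and 503 signal transient overload and are worth retrying.
        if (statusCode == 429 || statusCode == 503) {
            return Outcome.RETRY_LATER;
        }
        String trimmed = (body == null) ? "" : body.trim();
        // A _bulk response is expected to be a JSON object, so it should start with '{'.
        if (trimmed.startsWith("{")) {
            return Outcome.PARSE_AS_JSON;
        }
        // Anything else (for example a plain-text error page) is not parseable.
        return Outcome.FAIL;
    }

    // Exponential backoff: 500 ms doubled per attempt, capped at 30 s (assumed values).
    static Duration backoff(int attempt) {
        long baseMillis = 500L;
        long capMillis = 30_000L;
        int shift = Math.min(Math.max(attempt, 0), 10); // bound the shift to avoid overflow
        return Duration.ofMillis(Math.min(capMillis, baseMillis << shift));
    }

    public static void main(String[] args) {
        System.out.println(classify(429, "Too Many Requests"));      // RETRY_LATER
        System.out.println(classify(200, "{\"errors\":false}"));     // PARSE_AS_JSON
        System.out.println(classify(200, "Too Many Requests"));      // FAIL
        System.out.println(backoff(1).toMillis());                   // 1000
    }
}
```

A guard like this would not address the OutOfMemoryError in the second trace; that one is raised while building the bulk request body, so it may call for a larger heap or smaller bulk batches, which is a separate question.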
