Description
If a publication request is rejected by the remote node then today we always log it at DEBUG
. Mostly that makes sense, because a rejection in this situation is typically because of some other hopefully-transient issue (e.g. a network disconnect or near to a contended election). However, it could also indicate some more serious problem if it's not one of these expected/transient cases. For instance I saw a cluster persistently failing to write the cluster state because of a fundamental internal error, and this deserves much more noisy logging:
Click to expand
[instance-0000000046] PublishResponseHandler: [REDACTED] failed
org.elasticsearch.transport.RemoteTransportException: [REDACTED][internal:cluster/coordination/publish_state]
Caused by: java.lang.IllegalArgumentException: Nul character not allowed: _2zc1_Lucene90FieldsIndex�_0.tmp
at sun.nio.fs.UnixPath.checkNotNul(UnixPath.java:98) ~[?:?]
at sun.nio.fs.UnixPath.normalizeAndCheck(UnixPath.java:88) ~[?:?]
at sun.nio.fs.UnixPath.<init>(UnixPath.java:76) ~[?:?]
at sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:312) ~[?:?]
at java.nio.file.Path.resolve(Path.java:516) ~[?:?]
at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:392) ~[lucene-core-9.7.0.jar:?]
at org.apache.lucene.store.FSDirectory.createTempOutput(FSDirectory.java:234) ~[lucene-core-9.7.0.jar:?]
at org.apache.lucene.store.FilterDirectory.createTempOutput(FilterDirectory.java:81) ~[lucene-core-9.7.0.jar:?]
at org.apache.lucene.store.TrackingDirectoryWrapper.createTempOutput(TrackingDirectoryWrapper.java:49) ~[lucene-core-9.7.0.jar:?]
at org.apache.lucene.codecs.lucene90.compressing.FieldsIndexWriter.<init>(FieldsIndexWriter.java:83) ~[lucene-core-9.7.0.jar:?]
at org.apache.lucene.codecs.lucene90.compressing.Lucene90CompressingStoredFieldsWriter.<init>(Lucene90CompressingStoredFieldsWriter.java:148) ~[lucene-core-9.7.0.jar:?]
at org.apache.lucene.codecs.lucene90.compressing.Lucene90CompressingStoredFieldsFormat.fieldsWriter(Lucene90CompressingStoredFieldsFormat.java:140) ~[lucene-core-9.7.0.jar:?]
at org.apache.lucene.codecs.lucene90.Lucene90StoredFieldsFormat.fieldsWriter(Lucene90StoredFieldsFormat.java:154) ~[lucene-core-9.7.0.jar:?]
at org.apache.lucene.index.StoredFieldsConsumer.initStoredFieldsWriter(StoredFieldsConsumer.java:50) ~[lucene-core-9.7.0.jar:?]
at org.apache.lucene.index.StoredFieldsConsumer.startDocument(StoredFieldsConsumer.java:57) ~[lucene-core-9.7.0.jar:?]
at org.apache.lucene.index.IndexingChain.startStoredFields(IndexingChain.java:512) ~[lucene-core-9.7.0.jar:?]
at org.apache.lucene.index.IndexingChain.processDocument(IndexingChain.java:543) ~[lucene-core-9.7.0.jar:?]
at org.apache.lucene.index.DocumentsWriterPerThread.updateDocuments(DocumentsWriterPerThread.java:242) ~[lucene-core-9.7.0.jar:?]
at org.apache.lucene.index.DocumentsWriter.updateDocuments(DocumentsWriter.java:432) ~[lucene-core-9.7.0.jar:?]
at org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1545) ~[lucene-core-9.7.0.jar:?]
at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1830) ~[lucene-core-9.7.0.jar:?]
at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1470) ~[lucene-core-9.7.0.jar:?]
at org.elasticsearch.gateway.PersistedClusterStateService$Writer.lambda$addGlobalMetadataDocuments$6(PersistedClusterStateService.java:1162) ~[elasticsearch-8.10.4.jar:?]
at org.elasticsearch.gateway.PersistedClusterStateService$PageWriterOutputStream.flushPage(PersistedClusterStateService.java:1439) ~[elasticsearch-8.10.4.jar:?]
at org.elasticsearch.gateway.PersistedClusterStateService$PageWriterOutputStream.close(PersistedClusterStateService.java:1426) ~[elasticsearch-8.10.4.jar:?]
at java.util.zip.DeflaterOutputStream.close(DeflaterOutputStream.java:249) ~[?:?]
at org.elasticsearch.common.compress.DeflateCompressor$2.close(DeflateCompressor.java:186) ~[elasticsearch-8.10.4.jar:?]
at java.io.FilterOutputStream.close(FilterOutputStream.java:190) ~[?:?]
at com.fasterxml.jackson.dataformat.smile.SmileGenerator.close(SmileGenerator.java:1894) ~[?:?]
at org.elasticsearch.xcontent.provider.json.JsonXContentGenerator.close(JsonXContentGenerator.java:567) ~[?:?]
at org.elasticsearch.xcontent.XContentBuilder.close(XContentBuilder.java:1252) ~[elasticsearch-x-content-8.10.4.jar:?]
at org.elasticsearch.gateway.PersistedClusterStateService$Writer.writePages(PersistedClusterStateService.java:1176) ~[elasticsearch-8.10.4.jar:?]
at org.elasticsearch.gateway.PersistedClusterStateService$Writer.addGlobalMetadataDocuments(PersistedClusterStateService.java:1155) ~[elasticsearch-8.10.4.jar:?]
at org.elasticsearch.gateway.PersistedClusterStateService$Writer.addMetadata(PersistedClusterStateService.java:1193) ~[elasticsearch-8.10.4.jar:?]
at org.elasticsearch.gateway.PersistedClusterStateService$Writer.overwriteMetadata(PersistedClusterStateService.java:1186) ~[elasticsearch-8.10.4.jar:?]
at org.elasticsearch.gateway.PersistedClusterStateService$Writer.writeFullStateAndCommit(PersistedClusterStateService.java:933) ~[elasticsearch-8.10.4.jar:?]
at org.elasticsearch.gateway.GatewayMetaState$LucenePersistedState.writeClusterStateToDisk(GatewayMetaState.java:569) ~[elasticsearch-8.10.4.jar:?]
at org.elasticsearch.gateway.GatewayMetaState$LucenePersistedState.setLastAcceptedState(GatewayMetaState.java:562) ~[elasticsearch-8.10.4.jar:?]
at org.elasticsearch.cluster.coordination.CoordinationState.handlePublishRequest(CoordinationState.java:392) ~[elasticsearch-8.10.4.jar:?]
at org.elasticsearch.cluster.coordination.Coordinator.handlePublishRequest(Coordinator.java:469) ~[elasticsearch-8.10.4.jar:?]
at org.elasticsearch.cluster.coordination.PublicationTransportHandler.acceptState(PublicationTransportHandler.java:213) ~[elasticsearch-8.10.4.jar:?]
at org.elasticsearch.cluster.coordination.PublicationTransportHandler.handleIncomingPublishRequest(PublicationTransportHandler.java:148) ~[elasticsearch-8.10.4.jar:?]
at org.elasticsearch.cluster.coordination.PublicationTransportHandler.lambda$new$0(PublicationTransportHandler.java:112) ~[elasticsearch-8.10.4.jar:?]
at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$1.doRun(SecurityServerTransportInterceptor.java:563) ~[?:?]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) ~[elasticsearch-8.10.4.jar:?]
at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$3.onResponse(SecurityServerTransportInterceptor.java:625) ~[?:?]
at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$3.onResponse(SecurityServerTransportInterceptor.java:614) ~[?:?]
at org.elasticsearch.xpack.security.authz.AuthorizationService.authorizeSystemUser(AuthorizationService.java:668) ~[?:?]
at org.elasticsearch.xpack.security.authz.AuthorizationService.authorize(AuthorizationService.java:310) ~[?:?]
at org.elasticsearch.xpack.security.transport.ServerTransportFilter.lambda$inbound$1(ServerTransportFilter.java:113) ~[?:?]
at org.elasticsearch.action.ActionListenerImplementations$ResponseWrappingActionListener.onResponse(ActionListenerImplementations.java:236) ~[elasticsearch-8.10.4.jar:?]
at org.elasticsearch.action.ActionListenerImplementations$MappedActionListener.onResponse(ActionListenerImplementations.java:95) ~[elasticsearch-8.10.4.jar:?]
at org.elasticsearch.xpack.security.authc.AuthenticatorChain.authenticateAsync(AuthenticatorChain.java:94) ~[?:?]
at org.elasticsearch.xpack.security.authc.AuthenticationService.authenticate(AuthenticationService.java:261) ~[?:?]
at org.elasticsearch.xpack.security.authc.AuthenticationService.authenticate(AuthenticationService.java:199) ~[?:?]
at org.elasticsearch.xpack.security.transport.ServerTransportFilter.authenticate(ServerTransportFilter.java:126) ~[?:?]
at org.elasticsearch.xpack.security.transport.ServerTransportFilter.inbound(ServerTransportFilter.java:104) ~[?:?]
at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler.messageReceived(SecurityServerTransportInterceptor.java:636) ~[?:?]
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:74) ~[elasticsearch-8.10.4.jar:?]
at org.elasticsearch.transport.InboundHandler$1.doRun(InboundHandler.java:294) ~[elasticsearch-8.10.4.jar:?]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:983) ~[elasticsearch-8.10.4.jar:?]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) ~[elasticsearch-8.10.4.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
at java.lang.Thread.run(Thread.java:1583) ~[?:?]
I think we should check the exception type and choose DEBUG
only if it's a org.elasticsearch.transport.ConnectTransportException
or a org.elasticsearch.transport.RemoteTransportException
wrapped around a org.elasticsearch.cluster.coordination.CoordinationStateRejectedException
, and use WARN
otherwise.