
K8s examples break with long running builds #860

Closed
@aaronmondal

Description

When building LLVM with the K8s setup, the connection drops after ~6000 targets. This does not appear to be recoverable, i.e. rerunning the build doesn't work, likely because the default CAS sizes are too small. We should adjust the values to make the setup more robust for a "naive import". Another cause could be the HTTP/2 gateway configuration, which we should switch to gRPC anyway.

See:

ERROR: /home/aaron/.cache/bazel/_bazel_aaron/7532c8de286b66ea1aaee4ff51952e7c/external/llvm-project-overlay~~llvm_project_overlay~llvm-project/clang/BUILD.bazel:1003:11: Compiling clang/lib/Sema/SemaTemplate.cpp failed: (Exit 34): UNAVAILABLE: RST_STREAM closed stream. HTTP/2 error code: NO_ERROR
java.io.IOException: io.grpc.StatusRuntimeException: UNAVAILABLE: RST_STREAM closed stream. HTTP/2 error code: NO_ERROR
        at com.google.devtools.build.lib.remote.GrpcRemoteExecutor.executeRemotely(GrpcRemoteExecutor.java:241)
        at com.google.devtools.build.lib.remote.RemoteExecutionService.executeRemotely(RemoteExecutionService.java:1516)
        at com.google.devtools.build.lib.remote.RemoteSpawnRunner.lambda$exec$2(RemoteSpawnRunner.java:293)
        at com.google.devtools.build.lib.remote.Retrier.execute(Retrier.java:245)
        at com.google.devtools.build.lib.remote.RemoteRetrier.execute(RemoteRetrier.java:127)
        at com.google.devtools.build.lib.remote.RemoteRetrier.execute(RemoteRetrier.java:116)
        at com.google.devtools.build.lib.remote.RemoteSpawnRunner.exec(RemoteSpawnRunner.java:266)
        at com.google.devtools.build.lib.exec.AbstractSpawnStrategy.exec(AbstractSpawnStrategy.java:159)
        at com.google.devtools.build.lib.exec.AbstractSpawnStrategy.exec(AbstractSpawnStrategy.java:119)
        at com.google.devtools.build.lib.exec.SpawnStrategyResolver.exec(SpawnStrategyResolver.java:45)
        at com.google.devtools.build.lib.rules.cpp.CppCompileAction.execute(CppCompileAction.java:1402)
        at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$ActionRunner.executeAction(SkyframeActionExecutor.java:1144)
        at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$ActionRunner.run(SkyframeActionExecutor.java:1061)
        at com.google.devtools.build.lib.skyframe.ActionExecutionState.runStateMachine(ActionExecutionState.java:165)
        at com.google.devtools.build.lib.skyframe.ActionExecutionState.getResultOrDependOnFuture(ActionExecutionState.java:94)
        at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor.executeAction(SkyframeActionExecutor.java:558)
        at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.checkCacheAndExecuteIfNeeded(ActionExecutionFunction.java:859)
        at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.computeInternal(ActionExecutionFunction.java:333)
        at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.compute(ActionExecutionFunction.java:171)
        at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:461)
        at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:414)
        at java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
Caused by: io.grpc.StatusRuntimeException: UNAVAILABLE: RST_STREAM closed stream. HTTP/2 error code: NO_ERROR
        at io.grpc.Status.asRuntimeException(Status.java:535)
        at io.grpc.stub.ClientCalls$BlockingResponseStream.hasNext(ClientCalls.java:660)
        at com.google.devtools.build.lib.remote.GrpcRemoteExecutor.lambda$executeRemotely$2(GrpcRemoteExecutor.java:175)
        at com.google.devtools.build.lib.remote.Retrier.execute(Retrier.java:245)
        at com.google.devtools.build.lib.remote.RemoteRetrier.execute(RemoteRetrier.java:127)
        at com.google.devtools.build.lib.remote.RemoteRetrier.execute(RemoteRetrier.java:116)
        at com.google.devtools.build.lib.remote.GrpcRemoteExecutor.lambda$executeRemotely$3(GrpcRemoteExecutor.java:146)
        at com.google.devtools.build.lib.remote.util.Utils.refreshIfUnauthenticated(Utils.java:528)
        at com.google.devtools.build.lib.remote.GrpcRemoteExecutor.executeRemotely(GrpcRemoteExecutor.java:144)
        ... 26 more
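
To check how often this failure shows up over a long build, a saved Bazel log can be scanned for the status text from the trace above. This is a rough triage sketch, not part of the K8s setup: the helper name `count_rst_stream_errors` and the sample string are made up for illustration, and the pattern only matches the exact message shown in this report.

```python
import re
from collections import Counter

# Matches the gRPC status text seen in the trace above and captures the
# HTTP/2 error code that follows it (NO_ERROR in this report).
RST_STREAM_RE = re.compile(
    r"UNAVAILABLE: RST_STREAM closed stream\. HTTP/2 error code: (\w+)"
)


def count_rst_stream_errors(log_text: str) -> Counter:
    """Count RST_STREAM failures per HTTP/2 error code in a build log.

    Hypothetical helper for triage; it counts every occurrence of the
    message, so one failed action may be counted more than once if the
    log repeats the message in a "Caused by" line.
    """
    return Counter(m.group(1) for m in RST_STREAM_RE.finditer(log_text))


# Two lines taken from the log in this report, shortened for the example.
sample = (
    "Compiling clang/lib/Sema/SemaTemplate.cpp failed: (Exit 34): "
    "UNAVAILABLE: RST_STREAM closed stream. HTTP/2 error code: NO_ERROR\n"
    "java.io.IOException: io.grpc.StatusRuntimeException: "
    "UNAVAILABLE: RST_STREAM closed stream. HTTP/2 error code: NO_ERROR\n"
)
counts = count_rst_stream_errors(sample)
```

If the count stays at zero on a rerun but the build still fails, that would point away from the gateway and toward the CAS size theory.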


Labels: bug
