Skip to content

MultipartReader performs badly for large reads #8664

Open
@yschimke

Description

@yschimke
@Test
  public void testParserBenchmarkOkHttp() throws ParsingException, IOException {
    String boundary = "foo";
    RequestBody multipartBody =
        new MultipartBuilder(boundary)
            .addPart(
                Headers.of("header-name", "header-value"),
                new RequestBody() {
                  @Override
                  public MediaType contentType() {
                    return MediaType.parse("application/octet-stream");
                  }

                  @Override
                  public long contentLength() throws IOException {
                    return 1024 * 1024 * 100;
                  }

                  @Override
                  @SuppressWarnings("KotlinInternal")
                  public void writeTo(BufferedSink sink) throws IOException {
                    Thread thread =
                        new Thread(
                            () -> {
                              for (int i = 1; i <= 100; i++) {
                                Buffer buffer = (Buffer) sink;
                                if (buffer.size() > 4 * 1024 * 1024) {
                                  Log.i("yschimke", "Sink has more than 4MB of data, waiting");
                                  while (buffer.size() > 4 * 1024 * 1024) {}
                                }
                                Log.i("yschimke", "Writing 1MB string");
                                buffer.writeUtf8(Strings.repeat("a", 1024 * 1024));
                                Log.i("yschimke", String.valueOf(buffer.size()));
                              }
                            });
                    thread.start();
                    try {
                      thread.join();
                    } catch (InterruptedException e) {
                      throw new RuntimeException(e);
                    }
                  }
                })
            .build();
    Buffer buffer = new Buffer();
    new Thread(
            () -> {
              try {
                multipartBody.writeTo(buffer);
              } catch (IOException e) {
                throw new RuntimeException(e);
              }
            })
        .start();

    MultipartReader multipartReader = new MultipartReader(buffer, boundary);
    while (true) {
      MultipartReader.Part part = multipartReader.nextPart();

      if (part == null) break;

      assertThat(part.headers().get("header-name")).isEqualTo("header-value");
      while (true) {
        Buffer readBuff = new Buffer();
        long red = part.body().read(readBuff, 1024 * 1024);
        if (red == -1) {
          Log.i("yschimke", "no more data to read");
          break;
        } else {
          Log.i("yschimke", "read " + red + " leftover = " + buffer.size());
          assertThat(readBuff.readUtf8()).isEqualTo(Strings.repeat("a", (int) red));
        }
      }
    }

Specifically

        long red = part.body().read(readBuff, 1024 * 1024);

May scan whole 1mb, before only returning 8kb, and then repeat.

Workaround is sticking with @swankjesse's magic number https://publicobject.com/2020/09/14/many-correct-answers/

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugBug in existing code

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions