MultipartFeature does not support UTF-8 encoding of Content-disposition header #6046

@akbertram

Given an HTML form with UTF-8 encoding specified:

<form method="POST" enctype='multipart/form-data' accept-charset="UTF-8">
    <input id="file-input" type="file" name="annex" required>
    <button type="submit">Submit</button>
</form>

Chrome (and other browsers) will send the following request to the server:

Content-type: multipart/form-data; boundary=----WebKitFormBoundaryU6khB3JchXEPIZEA

------WebKitFormBoundaryU6khB3JchXEPIZEA
Content-Disposition: form-data; name="annex"; filename="LETTRE_Transfert de la gestion des données.txt"
Content-Type: text/plain

Bla blah

------WebKitFormBoundaryU6khB3JchXEPIZEA--

However, Jersey 3.11 (and 4.0, from what I can tell) will interpret the file name as ISO-8859-1 encoded text.

@POST
@Produces(MediaType.TEXT_HTML)
@Consumes(MediaType.MULTIPART_FORM_DATA)
public Response upload(@FormDataParam("annex") FormDataBodyPart annexFile) throws IOException {
    // The following will fail: the UTF-8 bytes of "é" are decoded as ISO-8859-1,
    // so the text will be "LETTRE_Transfert de la gestion des donnÃ©es.txt"
    assert annexFile.getFormDataContentDisposition().getFileName().equals("LETTRE_Transfert de la gestion des données.txt");
    return Response.ok().build();
}

This is because the multipart feature relies on the MIMEParser class from the metro-mimepull library, which hardcodes the character encoding of headers as ISO-8859-1.
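
The mismatch can be reproduced with the JDK alone, without Jersey or mimepull. The sketch below is only an illustration: the class name `HeaderCharsetDemo` and its methods are hypothetical and not part of either library. `decodeAsLatin1` simulates UTF-8 header bytes being read as ISO-8859-1, and `repair` shows a commonly used application-side workaround that reverses the mistake.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Jersey or metro-mimepull.
public class HeaderCharsetDemo {

    // Simulates the reported behaviour: the browser puts UTF-8 bytes on the wire,
    // and the parser decodes them as ISO-8859-1.
    public static String decodeAsLatin1(String original) {
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    // Workaround: ISO-8859-1 maps every byte to exactly one char, so re-encoding
    // recovers the original bytes, which can then be decoded as UTF-8.
    public static String repair(String misdecoded) {
        byte[] wire = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(wire, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String sent = "donn\u00e9es.txt";    // "données.txt"
        String seen = decodeAsLatin1(sent);  // what the resource method receives
        System.out.println(seen);
        System.out.println(repair(seen));
    }
}
```

The repair step is only safe when the client really did send UTF-8. A file name that was sent as ISO-8859-1 would be corrupted by it, which is one reason a fix in the parser itself would be preferable.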

Since browsers don't actually include the character set as a parameter of the Content-Type header, I think the only solution would be to make this configurable.
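
To make the idea concrete, here is a rough sketch of what a configurable header charset could look like. Everything in it is an assumption for discussion: the class `ConfigurableHeaderDecoder` and the property name `example.multipart.headerCharset` are invented here and do not exist in Jersey or metro-mimepull. The default stays ISO-8859-1, so existing behaviour would not change unless the property is set.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; not an existing Jersey or metro-mimepull class.
public class ConfigurableHeaderDecoder {

    // Hypothetical property name, chosen only for this example.
    public static final String CHARSET_PROPERTY = "example.multipart.headerCharset";

    private final Charset charset;

    public ConfigurableHeaderDecoder(Charset charset) {
        this.charset = charset;
    }

    // Reads the charset from a system property, falling back to the current
    // hardcoded behaviour (ISO-8859-1) when the property is not set.
    public static ConfigurableHeaderDecoder fromSystemProperty() {
        String name = System.getProperty(CHARSET_PROPERTY);
        Charset cs = (name == null) ? StandardCharsets.ISO_8859_1 : Charset.forName(name);
        return new ConfigurableHeaderDecoder(cs);
    }

    // Decodes the raw bytes of one header line using the configured charset.
    public String decode(byte[] rawHeaderLine) {
        return new String(rawHeaderLine, charset);
    }
}
```

A system property is just the simplest thing to show; a per-application setting on the multipart feature might fit Jersey better, which is exactly the question for the maintainers.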

If someone on the project can indicate the most appropriate method of configuration, I am happy to submit a pull request to both metro-mimepull and this repo.
