This repository was archived by the owner on Jan 22, 2019. It is now read-only.

Two double quotes in a column cause Unexpected character exception #151

Open
@youribonnaffe

I have a CSV file with the following content (just a limited extract here):

route_id,agency_id,route_short_name,route_long_name,route_desc,route_type,route_url,route_color,route_text_color
OCE669711,OCESN,"",""Cars Réguliers ""L 11""  (Nantes - St Gilles Croix de Vie)"",,3,,,

Parsing this CSV content with CsvMapper causes the following error:

com.fasterxml.jackson.core.JsonParseException: Unexpected character ('C' (code 67)): Expected separator ('"' (code 34)) or end-of-line
 at [Source: java.io.StringReader@279ad2e3; line: 2, column: 23]

	at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1702)
	at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:558)
	at com.fasterxml.jackson.core.base.ParserMinimalBase._reportUnexpectedChar(ParserMinimalBase.java:456)
	at com.fasterxml.jackson.dataformat.csv.CsvParser._reportUnexpectedCsvChar(CsvParser.java:1089)
	at com.fasterxml.jackson.dataformat.csv.impl.CsvDecoder._nextQuotedString(CsvDecoder.java:838)
	at com.fasterxml.jackson.dataformat.csv.impl.CsvDecoder.nextString(CsvDecoder.java:601)
	at com.fasterxml.jackson.dataformat.csv.CsvParser._handleNextEntry(CsvParser.java:678)
	at com.fasterxml.jackson.dataformat.csv.CsvParser.nextFieldName(CsvParser.java:575)
	at com.fasterxml.jackson.databind.deser.std.MapDeserializer._readAndBindStringKeyMap(MapDeserializer.java:505)
	at com.fasterxml.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:362)
	at com.fasterxml.jackson.databind.deser.std.MapDeserializer.deserialize(MapDeserializer.java:27)
	at com.fasterxml.jackson.databind.MappingIterator.nextValue(MappingIterator.java:277)
	at com.fasterxml.jackson.databind.MappingIterator.readAll(MappingIterator.java:317)
	at com.fasterxml.jackson.databind.MappingIterator.readAll(MappingIterator.java:303)
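
The message and the `_nextQuotedString` frame are consistent with strict quoting rules, where `""` inside a quoted field is an escaped quote and a single `"` closes the field. Under those rules the fourth column opens with `"`, the next `"` closes it immediately because `C` (not another quote) follows, and `C` is then neither a separator nor an end-of-line. The sketch below is a minimal plain-Java illustration of that reading. The class `StrictCsvSketch` and the method `splitStrict` are hypothetical names, this is not Jackson's actual decoder, and its 0-based index does not necessarily line up with the column Jackson reports.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is NOT Jackson's CsvDecoder.
class StrictCsvSketch {

    // Splits one line on ',' with strict quoting: inside a quoted field "" is an
    // escaped quote, and a single " closes the field. After the closing quote only
    // a separator or the end of the line is accepted.
    static List<String> splitStrict(String line) {
        List<String> fields = new ArrayList<>();
        int n = line.length();
        int i = 0;
        while (true) {
            StringBuilder field = new StringBuilder();
            if (i < n && line.charAt(i) == '"') {
                i++; // consume the opening quote
                while (true) {
                    if (i >= n) {
                        throw new IllegalArgumentException("Unterminated quoted field");
                    }
                    char c = line.charAt(i++);
                    if (c != '"') {
                        field.append(c);          // ordinary character
                    } else if (i < n && line.charAt(i) == '"') {
                        field.append('"');        // "" is an escaped quote
                        i++;
                    } else {
                        break;                    // single " closes the field
                    }
                }
                if (i < n && line.charAt(i) != ',') {
                    throw new IllegalArgumentException("Unexpected character '"
                            + line.charAt(i) + "' at index " + i
                            + ": expected separator or end-of-line");
                }
            } else {
                // Unquoted field: read up to the next separator.
                while (i < n && line.charAt(i) != ',') {
                    field.append(line.charAt(i++));
                }
            }
            fields.add(field.toString());
            if (i >= n) {
                break;
            }
            i++; // consume the separator
        }
        return fields;
    }

    public static void main(String[] args) {
        // Data row from this report; \u00e9 stands for the accented e in "Réguliers".
        String row = "OCE669711,OCESN,\"\",\"\"Cars R\u00e9guliers \"\"L 11\"\""
                + "  (Nantes - St Gilles Croix de Vie)\"\",,3,,,";
        try {
            System.out.println(splitStrict(row));
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

With the row from this report the sketch stops at the `C` of `Cars`. If the field were written with three quotes at each end (`"""Cars ... Vie)"""`), the same rules would accept it and yield a value wrapped in literal quotes.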

Here is a unit test to reproduce the issue:

    @Test
    public void doubleQuotes() throws Exception {
        String content =
                "route_id,agency_id,route_short_name,route_long_name,route_desc,route_type,route_url,route_color,route_text_color\n" +
                        "OCE669711,OCESN,\"\",\"\"Cars Réguliers \"\"L 11\"\"  (Nantes - St Gilles Croix de Vie)\"\",,3,,,";

        CsvSchema schema = CsvSchema.emptySchema().withHeader();
        MappingIterator<Map<String, String>> it = new CsvMapper().readerFor(Map.class)
                .with(schema)
                .readValues(content);

        assertEquals(1, it.readAll().size());
    }

Is there a way to configure the parser to be more lenient about this use of quotes?
Unfortunately, the CSV file is not under my control and I won't be able to change its format.

Parsing this file with OpenCSV worked, but I was hoping to switch to Jackson for better performance.
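
Independently of whether the parser offers a setting for this, one possible stopgap is to repair each line before parsing it. The sketch below is a heuristic under stated assumptions, not a general solution: it assumes that a field wrapped as `""...""` was meant to be a quoted value with literal quotes around it, and it rewrites that wrapping to `"""..."""`. The class `QuoteRepairSketch` and the method `repairLine` are hypothetical names and only `java.util.regex` is used.

```java
import java.util.regex.Pattern;

// Hypothetical helper; not part of Jackson or OpenCSV.
class QuoteRepairSketch {

    // "" at the start of a field (line start or right after a comma),
    // directly followed by content rather than another quote or a comma.
    private static final Pattern OPENING = Pattern.compile("(^|,)\"\"(?=[^\",])");

    // "" directly after content, followed by a comma or the end of the line.
    private static final Pattern CLOSING = Pattern.compile("(?<=[^\",])\"\"(?=,|$)");

    // Rewrites ""value"" to """value""" so that strict quoting rules accept it.
    // Empty quoted fields ("") and doubled quotes in the middle of a value are untouched.
    static String repairLine(String line) {
        String opened = OPENING.matcher(line).replaceAll("$1\"\"\"");
        return CLOSING.matcher(opened).replaceAll("\"\"\"");
    }

    public static void main(String[] args) {
        // Shortened version of the row from this report.
        String row = "OCE669711,OCESN,\"\",\"\"Cars \"\"L 11\"\" (Nantes)\"\",,3,,,";
        System.out.println(repairLine(row));
    }
}
```

The heuristic has a known blind spot: a well-formed quoted field that legitimately contains an escaped quote right before a comma (for example `"say ""hi"",ok"`) would be altered, so it only makes sense for files known to follow the pattern shown above.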
