Gotchas in using COPY command with JSON #62

@peterlitvak

I've compiled a list of gotchas that I discovered while trying to make the COPY command work.

  1. Command keywords must be in all capital letters (e.g. COPY, FROM), while parameter values keep the casing used in the AWS documentation (e.g. aws_access_key_id). Note that AWS itself accepts the entire command in lowercase.
  2. JSON, FORMAT JSON, or FORMAT AS JSON must come directly after CREDENTIALS and before any other parameters (e.g. TIMEFORMAT); otherwise the JSON format is ignored and the data is treated as CSV. AWS does not impose this; there the order is mostly irrelevant.
  3. The UTF-16 encoding is not recognized and not supported.
  4. Your JSON data must be placed in the file as one JSON object per line, e.g.:
{"key":val}
{"key":val}

You cannot use the pretty-printed form of JSON that AWS supports, as in:

{
  "key":val
}
{
  "key":val
}
  5. The JSONPaths file must be written entirely on one line, as in:
{ "jsonpaths": [ "$['p1']", "$['p2']", ... ]}
  6. Your data and your jsonpaths file cannot be located under the same prefix of the S3 path, e.g.:
s3://some/table/data/file.json
s3://some/table/data/file.paths.json

With this layout, your jsonpaths file will be read as data.
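
For anyone scripting around these gotchas, here is a minimal Python sketch of gotchas 1, 2, 4, 5, and 6. It is an illustration only, not part of the driver: the helper names (`to_json_lines`, `jsonpaths_one_line`, `build_copy`) are hypothetical, and the table name, S3 paths, and credentials shown in the usage note are placeholders.

```python
import json


def to_json_lines(text):
    """Re-serialize concatenated (possibly pretty-printed) JSON objects,
    one compact object per line (gotcha 4)."""
    decoder = json.JSONDecoder()
    lines = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        lines.append(json.dumps(obj, separators=(",", ":")))
    return "".join(line + "\n" for line in lines)


def jsonpaths_one_line(columns):
    """Render a JSONPaths document on a single line (gotcha 5)."""
    paths = ["$['{}']".format(name) for name in columns]
    # json.dumps without indent= never emits newlines.
    return json.dumps({"jsonpaths": paths})


def build_copy(table, data_prefix, credentials, jsonpaths_url, extra=()):
    """Assemble a COPY statement with upper-case keywords (gotcha 1) and
    JSON placed directly after CREDENTIALS (gotcha 2)."""
    if jsonpaths_url.startswith(data_prefix):
        # Gotcha 6: the jsonpaths file would be picked up as data.
        raise ValueError("jsonpaths file must not live under the data prefix")
    parts = [
        "COPY {}".format(table),
        "FROM '{}'".format(data_prefix),
        "CREDENTIALS '{}'".format(credentials),
        "JSON '{}'".format(jsonpaths_url),
    ]
    parts.extend(extra)  # e.g. TIMEFORMAT goes only after JSON
    return " ".join(parts)
```

For example, `build_copy("my_table", "s3://some/table/data/", "aws_access_key_id=...;aws_secret_access_key=...", "s3://some/table/paths/file.paths.json", ["TIMEFORMAT 'auto'"])` puts the JSON clause before TIMEFORMAT, while passing a jsonpaths URL under `s3://some/table/data/` raises `ValueError`. The values here are made up for illustration.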

I hope this helps folks using this JDBC driver, and helps the author make some improvements.
