-
Notifications
You must be signed in to change notification settings - Fork 16
Open
Description
Description
I'm trying to load data like this into a TIMESTAMP column:
2024-09-01 07:00:00
2024-09-01 07:00:00
2024-09-01 07:00:00
2024-09-01 07:00:00
2024-09-01 07:00:00
2024-09-01 07:00:00
2024-09-01 07:00:00
This is a common use case, because when you unload data from Redshift, sometimes it automatically truncates timestamps to remove fractions of seconds.
Reproduce
Create a table with a TIMESTAMP column and try load timestamp without milliseconds with a copy command
// Prepare the COPY command
val bucket = "<your bucket name>"
val accessKeyId = "<your access key id>"
val secretAccessKey = "<your secret key>"
val sessionToken = "your session token"
val dataset = "a dataset folder inside the bucket"
stmt.execute(
s"""
|CREATE TABLE $dataset (
| my_timestamp TIMESTAMP
|)
|""".stripMargin
)
val copySql =
s"""
|COPY $dataset (start_time) FROM 's3://$bucket/$dataset/manifest'
|ACCESS_KEY_ID '$accessKeyId'
|SECRET_ACCESS_KEY '$secretAccessKey'
|SESSION_TOKEN '$sessionToken'
|TIMEFORMAT 'YYYY-MM-DD HH24:MI:SS'
|MANIFEST GZIP
|""".stripMarginIt will raise a ParseException even if you declare TIMEFORMAT explictly.
Error Logs
java.text.ParseException: Unparseable date: "2024-09-01 07:00:00"
at java.base/java.text.DateFormat.parse(DateFormat.java:399)
at jp.ne.opt.redshiftfake.RedshiftTimeFormat$FixedTimeFormat.parseSqlTimestamp(RedshiftDateFormat.scala:56)
at jp.ne.opt.redshiftfake.ParameterBinder$Timestamp.bind(ParameterBinder.scala:90)
at jp.ne.opt.redshiftfake.CopyInterceptor.$anonfun$executeCopy$5(Interceptor.scala:38)
at jp.ne.opt.redshiftfake.CopyInterceptor.$anonfun$executeCopy$5$adapted(Interceptor.scala:36)
at scala.collection.immutable.List.foreach(List.scala:431)
at jp.ne.opt.redshiftfake.CopyInterceptor.$anonfun$executeCopy$4(Interceptor.scala:36)
at jp.ne.opt.redshiftfake.CopyInterceptor.$anonfun$executeCopy$4$adapted(Interceptor.scala:35)
at jp.ne.opt.redshiftfake.util.Loan$.using(Loan.scala:13)
at jp.ne.opt.redshiftfake.CopyInterceptor.$anonfun$executeCopy$3(Interceptor.scala:35)
at jp.ne.opt.redshiftfake.CopyInterceptor.$anonfun$executeCopy$3$adapted(Interceptor.scala:31)
at scala.collection.Iterator.foreach(Iterator.scala:943)
at scala.collection.Iterator.foreach$(Iterator.scala:943)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
at scala.collection.IterableLike.foreach(IterableLike.scala:74)
at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
at jp.ne.opt.redshiftfake.CopyInterceptor.executeCopy(Interceptor.scala:31)
at jp.ne.opt.redshiftfake.CopyInterceptor.executeCopy$(Interceptor.scala:23)
at jp.ne.opt.redshiftfake.FakeStatement.executeCopy(FakeStatement.scala:22)
at jp.ne.opt.redshiftfake.FakeStatement.$anonfun$switchExecute$1(FakeStatement.scala:121)
at scala.Option.map(Option.scala:230)
at jp.ne.opt.redshiftfake.FakeStatement.switchExecute(FakeStatement.scala:120)
at jp.ne.opt.redshiftfake.FakeStatement.execute(FakeStatement.scala:72)
at RedshiftTest$.main(RedshiftTest.scala:57)
at RedshiftTest.main(RedshiftTest.scala)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at sbt.Run.invokeMain(Run.scala:144)
at sbt.Run.execute$1(Run.scala:94)
at sbt.Run.$anonfun$runWithLoader$5(Run.scala:121)
at sbt.Run$.executeSuccess(Run.scala:187)
at sbt.Run.runWithLoader(Run.scala:121)
at sbt.Defaults$.$anonfun$bgRunTask$6(Defaults.scala:2030)
at sbt.Defaults$.$anonfun$termWrapper$2(Defaults.scala:1969)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at scala.util.Try$.apply(Try.scala:213)
at sbt.internal.BackgroundThreadPool$BackgroundRunnable.run(DefaultBackgroundJobService.scala:367)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base/java.lang.Thread.run(Thread.java:1583)
Current Workaround
I found two workarounds for that:
- Use
TIMEFORMAT 'auto' - Add milliseconds to the data and copy as normal
Probable root cause
I think the issue comes from two lines in RedshiftDateFormat.scala
First in RedshiftDateFormat.forType on line 65
case TimeFormatType.Default => FixedTimeFormat("yyyy-MM-dd HH:mm:ss.SSS")This uses milliseconds by default.
Then on ParseUtil.toStandardDatePart:
def toStandardDatePart(pattern: String): String = {
val partMappings = Seq(
("YYYY", "yyyy"),
("YY", "yy"),
("MM", "MM"),
("MON", "MMMM"),
("DD", "dd"),
("HH24", "HH"),
("HH12", "hh"),
("HH", "HH"),
("MI", "mm"),
("SS", "ss.SSS"), // Here
("OF", "'Z'")
)
def replaceLoop(current: String, mappings: Seq[(String, String)]): String = mappings match {
case (before, after) +: rest => replaceLoop(current.replace(before, after), rest)
case _ => current
}
replaceLoop(pattern, partMappings)
}This makes so it uses milliseconds even if the COPY command explicitly declare SS
Metadata
Metadata
Assignees
Labels
No labels