Environment
- Deploy method: latest docker image from docker hub running on local
- Olake version: 0.0.17
- OS: macOS 15.5
- Cloud Provider: none
- Docker Params (if deployed with docker): a list of env variables with values (please, hide sensitive credentials!), mapped volumes and ports
Description
The Olake sync process is failing with a fatal error when processing records from MongoDB. The error is:
json: error calling MarshalJSON for type time.Time: Time.MarshalJSON: year outside of range [0,9999]
This error occurs when a time.Time field with a year of 0 is passed to Go's encoding/json library for marshaling.
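The error message can be reproduced outside Olake with a few lines of Go. This is a minimal sketch, not Olake code: `marshalYear` is a hypothetical helper, and the sample years -1 and 10000 are chosen only because they are certain to fall outside the range [0,9999] named in the message.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// marshalYear (hypothetical helper) builds a UTC timestamp in the given
// year and attempts to marshal it with encoding/json.
func marshalYear(year int) (string, error) {
	t := time.Date(year, time.November, 30, 0, 0, 0, 0, time.UTC)
	b, err := json.Marshal(t)
	return string(b), err
}

func main() {
	// -1 and 10000 lie outside the range named in the error message;
	// 2024 is included as a control value.
	for _, year := range []int{-1, 2024, 10000} {
		out, err := marshalYear(year)
		if err != nil {
			fmt.Printf("year %d: error: %v\n", year, err)
			continue
		}
		fmt.Printf("year %d: %s\n", year, out)
	}
}
```

Running the same helper against the year actually stored on the failing record is a quick way to confirm which input triggers the failure.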
Steps to reproduce
Steps to reproduce the behavior:
- Set up a MongoDB instance: Ensure you have access to a MongoDB database (local or remote).
- Insert a document with a malformed timestamp: In a collection within your MongoDB instance, insert a document containing a timestamp field with a year of 0. For example:
  db.your_collection_name.insertOne({ "timestampField": ISODate("00-1-11-30T00:00:00Z") });
  Replace your_collection_name with the actual name of your collection and timestampField with the name of your timestamp field.
- Configure Olake to sync from this MongoDB collection: Ensure your Olake configuration is set up to read from the MongoDB collection created in the previous step.
- Run the Olake sync process: Execute the command to start the data synchronization with Olake.
- Observe the error: The Olake sync process should terminate with a fatal error similar to:
json: error calling MarshalJSON for type time.Time: Time.MarshalJSON: year outside of range [0,9999]
Expected behavior
The Olake pipeline should gracefully handle timestamps with a year of 0. Instead of failing, the ReformatDate function should adjust the year (e.g., to 1) so that the timestamp can be marshaled to JSON without error. The record should then be processed and sent to the destination (e.g., Iceberg) with the corrected timestamp.
Actual behavior
When the Olake sync process encounters a MongoDB record with a timestamp field where the year is 0 (e.g., ISODate("00-1-11-30T00:00:00Z")), the pipeline fails with the following error:
json: error calling MarshalJSON for type time.Time: Time.MarshalJSON: year outside of range [0,9999]
This indicates that the ReformatDate function's current logic does not correctly handle year 0, leading to a marshaling failure.
Root Cause
The root cause is a bug in the year correction logic within the ReformatDate function in olake/utils/typeutils/reformat.go.
The problematic code is:
if parsed.Year() < 0 {
    parsed = parsed.AddDate(0-parsed.Year(), 0, 0)
}
When parsed.Year() is 0, the condition parsed.Year() < 0 is false, so the block is skipped and no correction is applied. For negative years, the adjustment 0-parsed.Year() only raises the year to 0. The timestamp with year 0 is then passed to the JSON marshaler, which causes the failure.
The invalid timestamp, ISODate("00-1-11-30T00:00:00Z"), is being ingested from a MongoDB database.
Proposed Solution
To fix this, the year correction logic in olake/utils/typeutils/reformat.go needs to be updated to correctly handle year 0. The proposed change is to adjust the year to 1 if it is less than or equal to 0.
if parsed.Year() <= 0 {
    parsed = parsed.AddDate(1-parsed.Year(), 0, 0) // This would set year 0 to year 1
} else if parsed.Year() > 9999 {
    parsed = parsed.AddDate(-(parsed.Year() - 9999), 0, 0)
}
This change will ensure that timestamps with year 0 are corrected before being passed to the JSON marshaler, preventing the pipeline from failing.