Skip to content

added location field to DataError#7637

Open
psg-19 wants to merge 1 commit intounicode-org:mainfrom
psg-19:DataError_location
Open

added location field to DataError#7637
psg-19 wants to merge 1 commit intounicode-org:mainfrom
psg-19:DataError_location

Conversation

@psg-19
Copy link
Copy Markdown
Contributor

@psg-19 psg-19 commented Feb 11, 2026

Fixes #4048

Summary :-

  • Added a location field to DataError that stores the caller file, line and column
  • Added #[track_caller] to all error constructors and from impls so Location::caller() captures the real call site
  • Updated display to include the location in the error message

@sffc
Copy link
Copy Markdown
Member

sffc commented Feb 11, 2026

hmm

error: the `Err`-variant returned from this function is very large
   --> components/datetime/src/neo.rs:501:10
    |
501 |     ) -> Result<Self, DateTimeFormatterLoadError>
    |          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
   ::: components/datetime/src/error.rs:16:5
    |
 16 |     Names(PatternLoadError),
    |     ----------------------- the largest variant contains at least 128 bytes
    |

core::panic::Location is 24 bytes, and it's all Copy. Seems fine? But DataError is already 80 bytes, so this seems to push things that contain a DataError over the edge.

DataError doesn't really contain anything super big, just a lot of pointers, and enum discriminants that probably use 8 bytes due to alignment.

@Manishearth
Copy link
Copy Markdown
Member

It is a public field so adding it is a commitment, and we may not be able to feature-gate it.

@sffc
Copy link
Copy Markdown
Member

sffc commented Feb 11, 2026

I do think the lint is telling us something useful (we don't want error types to be so big that they start causing stack frames to not fit in registers or l1 caches, etc), but the problem is not Location, it is the inefficient layout of DataError, which is not something we can easily fix since the type is public and used widely.

So I think my preferred short-term fix is to suppress this clippy lint on the errors it is impacting, and open an issue to fix in 3.0 to reduce the stack size of DataError.

@Manishearth
Copy link
Copy Markdown
Member

Given that the type is non_exhaustive, can we add this as a private field that gets CFG'd into ()? Then only the people with the logging feature enabled pay the cost.

@sffc
Copy link
Copy Markdown
Member

sffc commented Feb 11, 2026

Given that the type is non_exhaustive, can we add this as a private field that gets CFG'd into ()? Then only the people with the logging feature enabled pay the cost.

I don't know what problem that solves?

We could cfg the whole field, but we will still get the clippy warnings for the error being too big since we run clippy with --all-features

@Manishearth
Copy link
Copy Markdown
Member

I don't know what problem that solves?

Users who have not enabled logging

I don't consider the fact that clippy is annoyed to be a "problem", I consider the underlying size bloat caused adding 24 more bytes to an error type to be a bit of a problem, and I would prefer to avoid those costs where possible.

We can add an attach_location(&mut self) method to DataError that would allow people to attach the current Location, and encapsulate the Location in the Debug impl.

@sffc
Copy link
Copy Markdown
Member

sffc commented Feb 12, 2026

I don't really consider increasing DataError from 80 to 104 bytes to be a major issue, since the Location field is probably more important than all the other context information fields it already has. But, if we really want to avoid adding the field, we can just print out the location as we do in .with_debug_context and not persist it in the DataError.

@Manishearth
Copy link
Copy Markdown
Member

Yeah, though one benefit of persisting a location is that it works well with things like forking providers, where errors are returned through the normal course of business.

@robertbastian
Copy link
Copy Markdown
Member

We can add an attach_location(&mut self) method to DataError that would allow people to attach the current Location, and encapsulate the Location in the Debug impl.

the whole point here is this being automatic. people can already attach arbitrarily detailed messages to the errors, but they don't.

I don't really consider increasing DataError from 80 to 104 bytes to be a major issue

On my machine it's currently 96 bytes. PatternLoadError, the other variant in DateTimeFormatterLoadError is 104 bytes, so it's not like we're currently well clear of the (arbitrary) 128 bytes boundary anyway.

@Manishearth
Copy link
Copy Markdown
Member

the whole point here is this being automatic. people can already attach arbitrarily detailed messages to the errors, but they don't.

Yes, we can continue to be automatic, but if people wish to reset the location we have a path for that. I was mostly saying that we can make this private without losing any of the benefits of a public field.

On my machine it's currently 96 bytes. PatternLoadError, the other variant in DateTimeFormatterLoadError is 104 bytes, so it's not like we're currently well clear of the (arbitrary) 128 bytes boundary anyway.

Sure, it's arbitrary, and we should improve this in 3.0, but I want to try and not make the problem worse.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds location tracking to DataError using core::panic::Location to improve debugging ergonomics. Previously, errors only showed where they were unwrapped, requiring developers to chase down error messages. Now, errors capture and display the source location where they were created.

Changes:

  • Added location field to DataError struct to store caller location
  • Added #[track_caller] attribute to all error constructors and From implementations to capture real call sites
  • Updated Display implementation to include location in error messages

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
provider/core/src/error.rs Added location field to DataError, added #[track_caller] to constructors (into_error, custom, for_type) and helper methods (with_marker, with_str_context, with_type_context, with_req), updated Display to show location, added #[track_caller] to From<std::io::Error> impl
provider/core/src/buf/serde.rs Added #[track_caller] to From impls for serde_json::Error, bincode::Error, and postcard::Error

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@sffc sffc added the discuss Discuss at a future ICU4X-SC meeting label Feb 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

discuss Discuss at a future ICU4X-SC meeting

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Tracking caller on DataError construction

5 participants