Default to printing floats with decimal instead of scientific notation #22971

MasonRemaley · 2025-02-22T04:35:09Z

Currently, std.fmt defaults to formatting floats with scientific notation. IIUC this was originally due to some limitations in the float formatting code that have since been resolved.

This results in some silly output, such as @as(f32, 1) being formatted as 1e0. Outside of being a bit odd, it makes it easy to misinterpret output if you miss the e at the end of a number. You can just pass d to the formatter when formatting a single number, but this doesn't work if you're formatting e.g. a struct that contains fields with numbers. As such it's worth it to have a good default here.

This PR changes the default to decimal, and updates the corresponding tests.

castholm · 2025-02-22T11:31:27Z

Obviously formatting 1 as 1e0 looks dumb, but have you considered the impact this change will have on very small or very large values? One upside of scientific notation is that there's a reasonable limit to the maximum length of the formatted string. Formatting std.math.floatTrueMin(f64) as a decimal requires 326 characters. Even if most values don't get close to the lower or upper limits, you don't need to get that far away from 0 before the output from printing larger structs/arrays starts to get overwhelmingly noisy and difficult for humans to parse.
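The length concern can be checked outside Zig. The following is a minimal Python sketch, not the std.fmt implementation: it assumes CPython's `repr` as a stand-in for a shortest round-trip formatter, and `positional` is a hypothetical helper name.

```python
import sys
from decimal import Decimal


def positional(x: float) -> str:
    """Shortest round-trip digits of a finite float, written without an exponent."""
    # repr() yields the shortest digits that round-trip; Decimal's "f"
    # format then writes them out positionally, padding with zeros.
    return format(Decimal(repr(x)), "f")


true_min = float.fromhex("0x1p-1074")  # smallest positive subnormal f64
f64_max = sys.float_info.max           # largest finite f64

for value in (1.0, true_min, f64_max):
    text = positional(value)
    print(repr(value), "->", len(text), "characters in positional form")
```

On a typical CPython build the smallest subnormal comes out at 326 characters and the largest finite value at 309, even though both have very few significant digits; the rest is zero padding.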

Many other programming languages and shells seem to default to using decimal notation when the number is within a relatively small "human friendly" range and scientific notation otherwise. E.g. Python seems to prefer decimal for values >= 1e-4 and < 1e+16, and for JavaScript it appears to be >= 1e-6 and < 1e+21. Perhaps something like that would be a better default?
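A range-based default of this kind could look roughly like the Python sketch below. It is only an illustration: `format_hybrid` and its `low`/`high` parameters are hypothetical names, and the default cutoffs are chosen to mirror where CPython's own `repr` switches to exponent form (below 1e-4 and from 1e16 upward), not a proposal for Zig.

```python
import math
from decimal import Decimal


def format_hybrid(x: float, low: float = 1e-4, high: float = 1e16) -> str:
    """Positional notation inside [low, high), scientific notation outside it."""
    if not math.isfinite(x):
        return repr(x)  # 'inf', '-inf' or 'nan'
    digits = Decimal(repr(x))  # shortest round-trip digits (CPython's repr)
    if x == 0 or low <= abs(x) < high:
        return format(digits, "f")  # no exponent
    return format(digits, "e")      # significand plus exponent


for value in (1.0, 0.001, 123456.789, 1e-9, 6.02e23):
    print(format_hybrid(value))
```

The same digits are used in both branches; only the layout changes, so the output still parses back to the original value either way.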

mrjbq7 · 2025-02-23T03:54:08Z

It shouldn't require 326 characters, I think, if it uses a smart algorithm like Dragonbox.

https://github.com/jk-jeon/dragonbox

tiehuis · 2025-02-23T05:31:08Z

No, the worst case decimal outputs are going to require a lot of characters regardless of algorithm, as they are generally aiming to have round-trippable output.

mrjbq7 · 2025-02-23T18:37:22Z

Dragonbox has these three properties:

https://github.com/jk-jeon/dragonbox?tab=readme-ov-file#introduction

> The algorithm guarantees three things:
>
> 1. It has the roundtrip guarantee; that is, a correct parser interprets the generated output string as the original input floating-point number. (See here for some explanation on this.)
> 2. The output is of the shortest length; that is, no other output strings that are interpreted as the input number can contain less number of significand digits than the output of Dragonbox.
> 3. The output is correctly rounded: the number generated by Dragonbox is the closest to the actual value of the input number among possible outputs of minimum number of digits.

It is quite worth implementing here instead of decimal output.
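The roundtrip and shortest-length guarantees can be observed with any shortest round-trip formatter. This Python sketch uses CPython's `repr` as a stand-in (an assumption: it is neither Dragonbox nor Zig's Ryu implementation), and the helper names are hypothetical.

```python
from decimal import Decimal


def exact_value(x: float) -> str:
    """Every digit of the binary value actually stored in x."""
    return format(Decimal(x), "f")


def shortest_roundtrip(x: float) -> str:
    """Shortest digit string that parses back to exactly x (CPython's repr)."""
    return repr(x)


x = 0.1 + 0.2
print(exact_value(x))                     # the full stored value, many digits long
print(shortest_roundtrip(x))              # 0.30000000000000004
print(float(shortest_roundtrip(x)) == x)  # True: the output round-trips
print(float("0.3") == x)                  # False: a shorter candidate is a different float
```

Note that these guarantees concern which digits are produced, not whether they are laid out with or without an exponent.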

tiehuis · 2025-02-23T22:15:50Z

Please read more about the algorithm you are suggesting. You are failing to understand what it is providing and what this MR is attempting to improve. Dragonbox is similar to Ryu (which Zig implements) in that both are based on generating a significand and exponent in shortest form. The first sentence in your link explicitly specifies its purpose.

Happy to talk about this elsewhere but this is not related to this MR.
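The distinction drawn here, between producing a shortest significand and exponent and choosing how to print them, can be sketched in Python as two separate stages. The function names are hypothetical and CPython's `repr` stands in for Ryu or Dragonbox as the digit generator; only finite inputs are handled.

```python
from decimal import Decimal


def shortest_parts(x: float):
    """Return (sign, digits, exponent) with x == (-1)**sign * int(digits) * 10**exponent."""
    # Stage 1: digit generation. This is the job Ryu and Dragonbox do.
    sign, digits, exponent = Decimal(repr(x)).normalize().as_tuple()
    return sign, "".join(map(str, digits)), exponent


def render_scientific(sign: int, digits: str, exponent: int) -> str:
    """Stage 2a: lay the digits out as d.ddd followed by an exponent."""
    mantissa = digits[0] + ("." + digits[1:] if len(digits) > 1 else "")
    return ("-" if sign else "") + f"{mantissa}e{exponent + len(digits) - 1}"


def render_decimal(sign: int, digits: str, exponent: int) -> str:
    """Stage 2b: lay the same digits out positionally, padding with zeros."""
    if exponent >= 0:
        body = digits + "0" * exponent
    elif -exponent < len(digits):
        point = len(digits) + exponent
        body = digits[:point] + "." + digits[point:]
    else:
        body = "0." + "0" * (-exponent - len(digits)) + digits
    return ("-" if sign else "") + body


for value in (1.0, 123.456, -0.5):
    parts = shortest_parts(value)
    print(render_scientific(*parts), "|", render_decimal(*parts))
```

Swapping the digit generator changes stage 1 only; the default discussed in this PR is a choice between the two stage 2 layouts.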

mrjbq7 · 2025-02-24T01:25:17Z

> You are failing to understand what it is providing and what this MR is attempting to improve.

I re-read the PR, and I see what you mean. Separately, Dragonbox is objectively better than Ryu. We switched to it and have been quite happy in the Factor programming language.

andrewrk · 2025-02-24T01:28:26Z

It's extremely easy to evaluate these things objectively. There's no reason to assert such things without evidence. Please don't assert performance claims without pointing to some reproducible benchmark. It's just noise on the issue tracker. Data points or gtfo

mrjbq7 · 2025-02-24T01:30:24Z

> Please don't assert performance claims without pointing to some reproducible benchmark.

Sorry, I'm not asserting performance claims. I'm making an assertion around the satisfying human-readability and correctness of the algorithm. It is also pretty fast.

MasonRemaley · 2025-02-24T02:52:32Z

@tiehuis let me know if you have any thoughts for or against merging this--I'm interested in your take since you provided the ryu implementation.

andrewrk · 2025-02-24T03:42:08Z

you might have missed #22971 (comment)

MasonRemaley · 2025-02-24T04:01:40Z

Ah that's a good point--I'll look into how other languages decide what the cutoff is here.

MasonRemaley added 2 commits February 21, 2025 20:29

Default to printing floats with decimal instead of scientific notation

393dde8

Fixes missed test

3cc5c0f
