Design document for percent formatting #1068

Open
wants to merge 15 commits into
base: main
from
229 changes: 229 additions & 0 deletions exploration/percent-format.md
@@ -0,0 +1,229 @@
# Formatting Percent Values

Status: **Proposed**

<details>
<summary>Metadata</summary>
<dl>
<dt>Contributors</dt>
<dd>@aphillips</dd>
<dt>First proposed</dt>
<dd>2025-04-07</dd>
<dt>Pull Requests</dt>
<dd>#1068</dd>
</dl>
</details>

## Objective

_What is this proposal trying to achieve?_

One of the capabilities present in ICU MessageFormat is the ability to format a number as a percentage.
This design enumerates the approaches considered for adding this ability as a _default function_
in Unicode MessageFormat.

## Background

_What context is helpful to understand this proposal?_

> [!NOTE]
> This design is an outgrowth of discussions in #956 and various teleconferences.

Developers and translators often need to insert a numeric value into a formatted message as a percentage.
The format of a percentage can vary by locale, including
the symbol used,
the presence or absence of spaces,
the shaping of digits,
the position of the symbol,
and other variations.

One of the key problems is whether the value should be "scaled".
That is, does the value `0.5` format as `50%` or `0.5%`?
Developers need to know which behavior will occur so that they can adjust the value passed appropriately.

> [!NOTE]
> In ICU4J:
> - MessageFormat (MF1) scales.
> - MeasureFormat does not scale.
>
> In JavaScript:
> - `Intl.NumberFormat(locale, { style: 'percent' })` scales
> - `Intl.NumberFormat(locale, { style: 'unit', unit: 'percent' })` does not scale

It is also possible for Unicode MessageFormat to provide support for scaling in the message itself,
perhaps by extending the `:math` function.
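
To make the scaling difference concrete, the two JavaScript behaviors listed in the note above can be compared directly. This is a minimal sketch, assuming a runtime whose `Intl.NumberFormat` supports `style: 'unit'` (ES2020 or later); the variable names are illustrative only.

```typescript
// Same numeric input, two percent styles in Intl.NumberFormat.
const value = 0.5;

// style: "percent" multiplies the value by 100 before formatting.
const scaled = new Intl.NumberFormat("en-US", { style: "percent" }).format(value);

// style: "unit" with unit: "percent" formats the value as given.
const unscaled = new Intl.NumberFormat("en-US", {
  style: "unit",
  unit: "percent",
}).format(value);

console.log(scaled);   // "50%"
console.log(unscaled); // "0.5%"
```

A developer passing `0.5` therefore has to know which of the two conventions the formatter follows, which is the hazard described above.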

An additional concern is whether to add a dedicated `:percent` function,
use one of the existing number-formatting functions `:number` and `:integer` with an option `style=percent`,
or use the proposed _optional_ function `:unit` with an option `unit=percent`.
Combinations of these approaches might also be used.

## Use-Cases

_What use-cases do we see? Ideally, quote concrete examples._

Developers wish to write messages that format a numeric value as a percentage in a locale-sensitive manner.

The numeric value is not scaled because it is the result of a computation, e.g. `var savings = discount / price`.

The numeric value is scaled, e.g. `var savingsPercent = 50`

Users need control over most formatting details, identical to general number formatting:
- negative number sign display
- digit shaping
- minimum number of fractional digits
- maximum number of fractional digits
- minimum number of decimal digits
- grouping used (for very large percentages, i.e. > 999%)
- etc.
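
As a point of comparison, several of these controls already exist for percentages in ECMAScript's `Intl.NumberFormat`. The sketch below uses `Intl` option names, which are not options defined by this proposal, and assumes the `en-US` locale.

```typescript
// Percent formatting with the usual number-formatting controls (en-US).
const detailed = new Intl.NumberFormat("en-US", {
  style: "percent",
  signDisplay: "always", // show a sign on positive values too
  minimumFractionDigits: 1,
  maximumFractionDigits: 1,
});

// Grouping separators appear once the percentage reaches four integer digits.
const grouped = new Intl.NumberFormat("en-US", { style: "percent" });

console.log(detailed.format(0.1234)); // "+12.3%"
console.log(detailed.format(-0.05));  // "-5.0%"
console.log(grouped.format(12.5));    // "1,250%"
```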

## Requirements

_What properties does the solution have to manifest to enable the use-cases above?_



## Constraints

_What prior decisions and existing conditions limit the possible design?_

## Proposed Design

_Describe the proposed solution. Consider syntax, formatting, errors, registry, tooling, interchange._

- Use a dedicated function `:percent` that scales by default.
- Provide an option `scaling` with values `true` and `false` and defaulting to `true`.
- Provide all options identical to `:number` _except_ that `select` does not provide the `ordinal` value.
- Allow `unit=percent` in `:unit` that is identical to `:percent` in formatting behavior,
for compatibility with CLDR units,
but document that this usage is not preferred.
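
The strawman above could be approximated as follows. This is only a sketch: `formatPercent` and its `scaling` parameter are hypothetical stand-ins for the proposed `:percent` function and option, built on `Intl.NumberFormat` for illustration rather than on any MessageFormat implementation.

```typescript
// Hypothetical stand-in for the proposed `:percent` function.
// `scaling` mirrors the proposed option and defaults to true.
function formatPercent(
  value: number,
  locale: string,
  options: { scaling?: boolean } = {},
): string {
  const scaling = options.scaling ?? true;
  // Note: the two Intl styles also differ in default fraction digits
  // (0 for "percent", up to 3 for "unit"); a real design would align them.
  const formatter = scaling
    ? new Intl.NumberFormat(locale, { style: "percent" })
    : new Intl.NumberFormat(locale, { style: "unit", unit: "percent" });
  return formatter.format(value);
}

console.log(formatPercent(0.5, "en-US"));                     // "50%"
console.log(formatPercent(50, "en-US", { scaling: false }));  // "50%"
console.log(formatPercent(0.5, "en-US", { scaling: false })); // "0.5%"
```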

Collaborator

Would this mean that :unit unit=percent would or would not apply scaling? And why do we need or benefit from compatibility with CLDR units here?

Member Author

> why do we need or benefit from compatibility with CLDR units here?

We don't have to be compatible, except that currently the definition of the unit option values is completely delegated to the unit identifiers found here in TR35. It would be unfortunate to say "unit identifiers except this one specific one"

> Would this mean that :unit unit=percent would or would not apply scaling?

That's a good question. In the most recent WG discussion, there was a sentiment that we should make them behave identically to avoid confusion. There's an equal sentiment that they should be opposite each other (for utility). Here I'm trying to express equivalent performance without binding to a specific scaling/not-scaling choice (since that is separate).

Contributor

Currently, my agreement with the proposal is dependent on :unit unit=percent not scaling. So, I see specifying that here one way or another as important.

Member Author

> So, I see specifying that here one way or another as important.

👍 It is imperative that we specify one or the other.

> Currently, my agreement with the proposal is dependent on :unit unit=percent not scaling.

The proposal is to make :percent and :unit unit=percent perform identically, so both would scale by default. Is your opposition to :unit scaling so that message writers could get access to both behaviors without having to use a scale option? Articulating your reasoning will help me improve the design doc to include that as a design we considered (and perhaps sway consensus).

Member

There is a misunderstanding of unit scaling.

For unit formatting, CLDR has both an input unit and an output unit, where the output unit typically depends on the unit preferences. For example, <3.5, meter> input with foot output formats as "11.5 feet" (in English). There is scaling involved, in the conversion of 3.5 to 11.5. If there is no specified output unit, or the output unit is explicitly the same as the input unit, then there is no scaling. Thus:

<3.5, meter> input with meter output doesn't scale.

  • If I supply <0.35 percent> as the input and the output unit were percent, it would format as 0.35%. Just like meter ==> meter doesn't scale.

However, if I supply the right input unit, then percent does scale (just like meter ==> foot). And the base unit for such dimensionless units is 'part'.

With <0.35 part> as the input and the output unit of percent, the format is "35%".

Here are sample conversions that I just generated (no formatting)

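| Input value | Input unit | Output value | Output unit |
|---|---|---|---|
| 0.35 | part | 0.35 | part |
| 0.35 | part | 35.0 | percent |
| 0.35 | part | 350.0 | permille |
| 0.35 | part | 3500.0 | permyriad |
| 0.35 | part | 350000.0 | part-per-1e6 |
| 0.35 | part | 3.5E8 | part-per-1e9 |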

Member Author

I understood that to be the case.

:unit can override the unit, in which case scaling occurs. The question is what happens when there is no other unit? Using MeasureFormat in ICU4J can only be an approximation, since the only way to call it is with a Measure object. Presumably a bare number operand in MF would, behind the scenes, be packaged with the unit.

I'm not suggesting that :unit does not convert. Only that the default behavior of unit=percent is unscaled given a numeric operand. This is different from MF1's handling of operand,number,percent formatting and the proposed performance of :percent. Do you disagree?

Contributor

@bearfriend Apr 22, 2025

> Is your opposition to :unit scaling so that message writers could get access to both behaviors without having to use a scale option?

Sorry I wasn't clear on that. Yes, I want them to act differently, so I guess that just isn't this proposal but an alternative. The reason, though, is not just to have access to both behaviors (though it's an excellent side benefit) but because it makes semantic/intuitive sense to me.

I see them as for different purposes, where the input value to :percent is for (or from) some computation which results in a ~number, and the input to :unit would be roughly a string (or semantically equivalent in its static intent, if that makes sense).

1/10 = .1 -> format via :percent -> "You've completed 10% of your tasks"
vs
A user inputs into a marketing tool a discount value of "10" and selects "%" (as opposed to "$", "lbs", "items" etc.), and that uses :unit to render things like "10% off", "$10 off", "Get 10 lbs free...", "Buy 10 get 1 free" or similar.

This is how I, presumptuously, think most people would expect each to work. Happy to be wrong about that, though.

I'm not terribly familiar with the input -> output scaling Mark mentioned, so I'll try to digest that a bit more and see if it changes my perspective. It doesn't initially seem problematic, though.

Member Author

@bearfriend I think those are both great use cases and will add them to the document.

Note that the "proposed solution" is a strawman. The alternatives considered are what is important. We'll see if a consensus emerges--or vote on which technical decisions to make.

Member Author

@macchiati

> There is a misunderstanding of unit scaling.

I added your example (suitably edited and expanded) at length to the document. Check for veracity.


## Alternatives Considered

_What other solutions are available?_
_How do they compare against the requirements?_
_What other properties they have?_

### Combinations of Functions and Scaling

Any proposed design needs to choose one or more functions,
each of which has a scaling approach,
or a combination of both.
It is possible to have separate functions, one that scales and one that does not.
However, the working group suspects that this would represent a hazard,
since users would be forced to look up which function has which behavior.

### Function Alternatives

#### Use `:unit`

Leverage the `:unit` function by using the existing unit option value `percent`.
The ICU implementation of `MeasureFormat` does **_not_** scale the percentage,
although this does not have to be the default behavior of UMF's percent unit format.

```
You saved {$savings :unit unit=percent} on your order today!
```

**Pros**
- Uses an existing set of functionality
- Removes percentages from `:number` and `:integer`, making those functions more "pure"

**Cons**
- `:unit` won't be REQUIRED, so percentage format will not be guaranteed across implementations.
Requiring `:unit unit=percent` would be complicated at best.
- More verbose placeholder
- Could require a scaling mechanism

#### Use `:number`/`:integer` with `style=percent`

Use the existing functions for number formatting with a separate `style` option for `percent`.
(This was previously the design)

```
You saved {$savings :number style=percent} on your order today!
```

**Pros**
- Consistent with ICU MessageFormat

Member

@macchiati Apr 21, 2025

Suggested change
- Widely viewed as a number format option in spreadsheets and other contexts, so many people are familiar with it as a type of number format.
- Consistent with compact number formats, which _also_ scale; e.g. "3.5 M" for 3500000.

Member Author

Perhaps debatable?

It's certainly proximate to numeric formats, at least in some spreadsheets. FWIW, we do group it into the number functions and it certainly takes a numeric operand. But I think a case can be made that :number type=percent or :percent are both intuitive--and the latter becomes maybe a bit more obvious given :currency.

The meta debate we're having is a classic in the I18N space: split or lump? Should we prefer functions that do many things with lots of options? Or should we prefer functions that do roughly one thing with minimal options (and lots and lots of functions)?


Note that the "123" button is "More Formats" in Google Sheets:

image

Excel puts percent after date/time:

image

Member

A fair number of the other Pros/Cons are debatable... But I'll tweak my suggested change.

I'm not wild about :percent as a separate function; nor wild about :scientific or :engineering or :compact or even :integer. Just the sheer volume of duplicated options gets to be very daunting.

Member Author

> Just the sheer volume of duplicated options gets to be very daunting.

The worst of all worlds would be lots of functions each of which has lots of options and where some functions are general purpose and overlap with special purpose ones. With support for custom functions, that will sometimes be unavoidable. But for the default function set we should have a clear policy/design philosophy. The meta debate is, in many ways, more important than the concrete decision of what to name the percent formatting function (but percent is as good a trial horse, I think, as we're going to get). Note that the discussion about semantic skeletons also is considering the problem of function packaging.

**Cons**
- It's the only special case remaining in these functions. Why?

#### Use a dedicated `:percent` function

Use a new function `:percent` dedicated to percentages.

Member

Should we consider names besides :percent?

The function could apply to all dimensionless units including permille, permillion, perbillion, etc.

For example: {$var :dimensionless unit=permillion}

Member Author

We should consider other names. I'll add that option.

I'm not wild about unit=percent (mille, billion, etc. etc.). It's verbose and the other uses seem rare. Really only percent and permille are backed by CLDR data. The others strike me as special uses for unit or number formatting.


```
You saved {$savings :percent} on your order today!
```

> [!NOTE]
> @sffc suggested that we should consider other names for `:percent`.
> The name shown here could be considered a placeholder pending other suggestions.

**Pros**
- Least verbose placeholder
- Clear what the placeholder does; self-documenting?

**Cons**
- Adds to a (growing) list of functions
- Not "special enough" to warrant its own formatter?

#### Use a generic scaling function

Use a new function with a more generic name so that it can be used to format other scaled values.
For example, it might use an option `unit` to select `percent`/`permille`/etc.

```
You saved {$savings :dimensionless unit=percent} on your order today!
You saved {$savings :scaled per=100} on your order today!
```

**Pros**
- Could be used to support non-percent/non-permille scales that might exist in other cultures
- Somewhat generic
- Unlike currency or unit values, "per" units do not have to be stored with the value to prevent loss of fidelity,
since the scaling is done to a plain old number.
This would not apply if the values are not scaled.

**Cons**
- Only percent and permille are backed with CLDR data and symbols.
Other scales would impose an implementation burden.
Comment on lines +259 to +260
Member

CLDR has data for other scales, too, via portion units.

Member Author

That's not really percent/per mille type scaling though, is it?

- More verbose. Might be harder for users to understand and use.

### Scaling Alternatives

#### No Scaling
The user has to scale the number. The value `0.5` formats as `0.5%`.

#### Always Scale
The implementation always scales the number. The value `0.5` formats as `50%`.

#### Optional Scaling
Implementation automatically does (or does not) scale.
There is an option to switch to the other behavior.
Comment on lines +288 to +289
Collaborator

Suggested change
Implementation automatically does (or does not) scale.
There is an option to switch to the other behavior.
Formatter automatically does (or does not) scale.
There is an option to switch to the other behavior.
The option here may be:
- An option `scaling` with boolean values `true` and `false`.
- An option `scale` with a small set of supported integer values, possibly only `1` and `100`.


#### Use `:math exp` to scale
Provide functionality to scale numbers with integer powers of 10 using the `:math` function.

Examples using `:unit`, each of which would format as "Completion: 50%.":
```
.local $n = {50}
{{Completion: {$n :unit unit=percent}.}}

.local $n = {0.5 :math exp=2}
{{Completion: {$n :unit unit=percent}.}}
```
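
One reason a power-of-ten operation might be attractive is that it can be done as a decimal shift rather than a binary floating-point multiplication. The sketch below illustrates the idea in TypeScript; `shiftDecimal` is a hypothetical helper, not part of any `:math` specification, and it assumes finite inputs.

```typescript
// Hypothetical helper mirroring `:math exp=N`: shift the decimal point by
// `exp` places using the number's decimal string form, not multiplication.
function shiftDecimal(value: number, exp: number): number {
  // String(value) may already carry an exponent (e.g. "1e-7"), so split it off.
  const [mantissa, exponent = "0"] = String(value).split("e");
  return Number(`${mantissa}e${Number(exponent) + exp}`);
}

const unitPercent = new Intl.NumberFormat("en-US", { style: "unit", unit: "percent" });

console.log(shiftDecimal(0.5, 2));                     // 50
console.log(unitPercent.format(shiftDecimal(0.5, 2))); // "50%"
console.log(shiftDecimal(0.07, 2));                    // 7
console.log(0.07 * 100);                               // 7.000000000000001
```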

#### Use `:math multiply` to scale
Member

Please note my concern about implementation burden due to having to support a more general function than we actually need.

Member Author

The :math function, currently in draft and originally included to support plural offsets from MF1, is certainly a potential "slippery slope".

Many programming languages have math-related classes or function sets with many different operators in them. The existence of a :math function in MF would certainly invite proposals for many of these to migrate into messages, regardless of utility. This in a specification that is strongly **UN**typed.

If we go down the :math route, I would suggest that we write a full design document, including considerations for what our policies would be about future expansion. We should also consider whether math is the right name or different design strategies, such as unbundling functionality into separate functions. (Is it a better imposition of burden to have separate required :add and :subtract functions than a required function that has addition, subtraction, scaling, etc., into which we might add hard-to-achieve functionality? There is also the question of versioning the :math function if we add new operations to it over time, creating a portability hazard.)

Provide arbitrary integer multiplication functionality using the `:math` function.

Examples using `:unit`, each of which would format as "Completion: 50%.":
```
.local $n = {50}
{{Completion: {$n :unit unit=percent}.}}

.local $n = {0.5 :math multiply=100}
{{Completion: {$n :unit unit=percent}.}}
```
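
For comparison, plain multiplication in binary floating point can produce values that are slightly off, which a formatter's fraction-digit limits usually hide. A minimal sketch, assuming JavaScript number semantics; `multiplyBy` is a hypothetical stand-in for `:math multiply`, not a specified behavior.

```typescript
// Hypothetical helper mirroring `:math multiply=N` with binary floating point.
function multiplyBy(value: number, factor: number): number {
  return value * factor;
}

const percentUnit = new Intl.NumberFormat("en-US", { style: "unit", unit: "percent" });

console.log(percentUnit.format(multiplyBy(0.5, 100)));  // "50%"
// 0.07 * 100 is 7.000000000000001 as a double; the formatter's default
// limit of three fraction digits rounds it back when formatting.
console.log(percentUnit.format(multiplyBy(0.07, 100))); // "7%"
```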