fix(everything): replace Greek 'μ' with micro 'µ' #17331

Merged
merged 5 commits into from
Jan 23, 2025

@ddcc4 ddcc4 commented Jan 22, 2025

Overview

There are two different characters that look like the "mu" symbol:

  • µ (U+00B5 MICRO SIGN)
  • μ (U+03BC GREEK SMALL LETTER MU)

In some fonts they look different, but in many fonts they look the same.
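
To see that these really are two distinct code points even when they render identically, here is a minimal Python sketch. It is not part of this PR; the names `MICRO_SIGN`, `GREEK_MU`, and `describe` are made up for illustration.

```python
import unicodedata

MICRO_SIGN = "\u00b5"  # the "micro" symbol
GREEK_MU = "\u03bc"    # the Greek letter

def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character (hypothetical helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

if __name__ == "__main__":
    for ch in (MICRO_SIGN, GREEK_MU):
        print(describe(ch))
    # The two strings compare unequal even if a font draws them the same.
    print(MICRO_SIGN == GREEK_MU)
```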

The problem is that in the Public Sans font that we use for Protocol Designer, U+03BC GREEK SMALL LETTER MU is not available at all, because the font only supports Latin characters plus some scientific symbols. So whenever we use U+03BC GREEK SMALL LETTER MU in text, the browser has to render it with a fallback font, which can look ugly.

For consistency, we should just use U+00B5 MICRO SIGN whenever we have a µ that means 1/1,000,000, and only use U+03BC GREEK SMALL LETTER MU when we're writing Greek text (like in a Greek-language user manual).

Test Plan and Hands on Testing

I did a global search and replace on all the text in our codebase. I'm relying on the CI tests to catch any issues.
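
For reference, a global replacement like this could be sketched in Python as follows. This is an assumption about one way to do it, not the tool actually used for this PR; `replace_greek_mu`, `rewrite_file`, and `rewrite_tree` are hypothetical names. Note that a blanket replacement would also rewrite any genuine Greek-language text, so the resulting diff still needs a manual review.

```python
from pathlib import Path
from typing import Iterable, List

GREEK_MU = "\u03bc"    # U+03BC GREEK SMALL LETTER MU
MICRO_SIGN = "\u00b5"  # U+00B5 MICRO SIGN

def replace_greek_mu(text: str) -> str:
    """Replace every U+03BC with U+00B5."""
    return text.replace(GREEK_MU, MICRO_SIGN)

def rewrite_file(path: Path) -> bool:
    """Rewrite one UTF-8 file in place; return True if it changed."""
    raw = path.read_bytes()  # bytes, so line endings are left untouched
    try:
        original = raw.decode("utf-8")
    except UnicodeDecodeError:
        return False  # skip binary or non-UTF-8 files
    updated = replace_greek_mu(original)
    if updated == original:
        return False
    path.write_bytes(updated.encode("utf-8"))
    return True

def rewrite_tree(root: Path, suffixes: Iterable[str]) -> List[Path]:
    """Rewrite all files under root with one of the given suffixes."""
    wanted = set(suffixes)
    changed = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in wanted and rewrite_file(path):
            changed.append(path)
    return changed
```

A call such as `rewrite_tree(Path("."), [".json", ".ts", ".md"])` would return the list of files it modified; the suffix list here is only an example.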

Risk assessment

This could potentially break external dependencies that expect the Greek-text mu.

@ddcc4 ddcc4 requested review from a team as code owners January 22, 2025 21:27
@ddcc4 ddcc4 requested review from ncdiehl11 and removed request for a team January 22, 2025 21:27

jwwojak commented Jan 22, 2025

I think microliter should be abbreviated as "µL" not "µl" (use capital L). I know our API docs are inconsistent here. All our text and online manuals use µL.

ddcc4 commented Jan 22, 2025

> I think microliter should be abbreviated as "µL" not "µl" (use capital L). I know our API docs are inconsistent here. All our text and online manuals use µL.

Agreed! µL is the correct SI abbreviation. But I want PRs to do only one thing at a time, so editorial changes like µl -> µL should go into a separate PR.

@ddcc4 ddcc4 requested a review from koji January 22, 2025 21:52

mjhuff commented Jan 22, 2025

Apologies for the failing components CI, I'll put a fix up!

EDIT: Fix merged into edge.

@SyntaxColoring SyntaxColoring left a comment

Changes requested for some stray files, but otherwise looks good.

I'll have spotty availability, so after fixing the strays, feel free to dismiss this change request and merge without waiting for an approval from me.


Separately from those change requests:

This seems like the right direction to normalize in if we need to normalize, but I'm a little skeptical that it's a sufficient solution to the problem.

Like, custom labware definitions can have arbitrary user-defined text in their display names, and protocols can have arbitrary user-defined text in comments. Who's to say that they won't contain U+03BC instead of U+00B5? Also, what's to stop us from adding more U+03BCs in the future?
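
One way to guard against new U+03BCs creeping in would be a small check that reports every occurrence so a CI step or pre-commit hook could fail on it. The sketch below is only an illustration of that idea, not something this PR adds; `find_greek_mu` is a hypothetical helper, and it does nothing about user-supplied text at runtime.

```python
from typing import Iterator, Tuple

GREEK_MU = "\u03bc"  # U+03BC GREEK SMALL LETTER MU

def find_greek_mu(text: str) -> Iterator[Tuple[int, int]]:
    """Yield 1-based (line, column) positions of each U+03BC in text."""
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch == GREEK_MU:
                yield line_no, col_no

if __name__ == "__main__":
    sample = "volume: 10 \u03bcL\nvolume: 5 \u00b5L\n"
    for line_no, col_no in find_greek_mu(sample):
        print(f"line {line_no}, column {col_no}: found U+03BC")
```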

Instead of, or in addition to all of this, can we (technically and legally) modify Public Sans to have an entry for U+03BC?

codecov bot commented Jan 23, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 73.84%. Comparing base (0e7b516) to head (b9a0767).
Report is 16 commits behind head on edge.

@@           Coverage Diff           @@
##             edge   #17331   +/-   ##
=======================================
  Coverage   73.84%   73.84%           
=======================================
  Files          43       43           
  Lines        3304     3304           
=======================================
  Hits         2440     2440           
  Misses        864      864           
Flag Coverage Δ
shared-data 73.84% <ø> (ø)

@ddcc4 ddcc4 left a comment

> Like, custom labware definitions can have arbitrary user-defined text in their display names, and protocols can have arbitrary user-defined text in comments. Who's to say that they won't contain U+03BC instead of U+00B5?

Excellent point. Users can name their labware whatever they want, including, say, Japanese characters or cuneiform -- they would just get ugly font-substitution rendering if their browser supports it, or the unrenderable glyph marker if their browser doesn't. But we should try to use the correct character in any text WE publish.

> Also, what's to stop us from adding more U+03BCs in the future?

We should know better :) But more seriously, there are tons of characters that look similar/the same, and it's always the writer's responsibility to pick the correct one. Like, if you were writing English text and you used the Cyrillic а instead of the Latin a, that's just a mistake.

> Instead of, or in addition to all of this, can we (technically and legally) modify Public Sans to have an entry for U+03BC?

Hm. The more common way to solve this is that the branding/design team for the company picks a font that has the character repertoire they need for their intended market. So for example, if your market is the US and Turkey, you would pick a font that has both English and Turkish letters, rather than trying to modify an English font to add the few Turkish letters that are missing.

But that doesn't matter in our case. U+00B5 is the correct letter for "micro" and U+03BC is the wrong one, so we should just use the correct letter. It's like finding English text that mistakenly uses the Cyrillic а and isn't rendering correctly: you should just change it to the Latin a rather than trying to add the Cyrillic а to your font file.

ddcc4 commented Jan 23, 2025

> Also, what's to stop us from adding more U+03BCs in the future?

FWIW, before this change, we had 609 text files that were using the correct micro µ and 66 text files that were using the Greek μ, so the vast majority of the time, we were using the correct one.
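
A tally like this can be reproduced with a few lines of Python. The sketch below shows the counting logic only and is an assumption about how such numbers could be gathered, not the command actually used; `tally_mu_usage` is a hypothetical name, and a file that contains both characters counts toward both totals.

```python
from typing import Dict, Iterable

MICRO_SIGN = "\u00b5"  # U+00B5 MICRO SIGN
GREEK_MU = "\u03bc"    # U+03BC GREEK SMALL LETTER MU

def tally_mu_usage(texts: Iterable[str]) -> Dict[str, int]:
    """Count how many texts contain each of the two mu-like characters."""
    counts = {"micro_sign": 0, "greek_mu": 0}
    for text in texts:
        if MICRO_SIGN in text:
            counts["micro_sign"] += 1
        if GREEK_MU in text:
            counts["greek_mu"] += 1
    return counts
```

Each element of `texts` would be the decoded contents of one file.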

@SyntaxColoring SyntaxColoring left a comment

OK!

> U+00B5 is the correct letter for "micro" and U+03BC is the wrong one

I don't think it's that cut-and-dried, unfortunately. See the thoughts in #5721 and the link to the Unicode technical report.

ddcc4 commented Jan 23, 2025

> I don't think it's that cut-and-dried, unfortunately. See the thoughts in #5721 and the link to the Unicode technical report.

Yes, I am aware of that TR. It's more aspirational than descriptive of how text and fonts actually handle the characters:

Unicode specifies the decomposition U+00B5 MICRO SIGN -> U+03BC GREEK SMALL LETTER MU, similar to how it defines the decomposition U+FF21 FULLWIDTH LATIN CAPITAL LETTER A -> U+0041 LATIN CAPITAL LETTER A. But the two characters definitely look different ("Ａ" vs "A"), and you can't substitute one for the other just because Unicode says they semantically mean the same thing.
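
These decompositions can be inspected with Python's standard `unicodedata` module. The snippet below is just an illustration. Both mappings are compatibility decompositions, so NFC leaves the micro sign alone while NFKC turns it into the Greek mu, which means any code path that applies NFKC or NFKD to our strings would reintroduce U+03BC.

```python
import unicodedata

MICRO_SIGN = "\u00b5"   # U+00B5 MICRO SIGN
GREEK_MU = "\u03bc"     # U+03BC GREEK SMALL LETTER MU
FULLWIDTH_A = "\uff21"  # U+FF21 FULLWIDTH LATIN CAPITAL LETTER A

# Raw decomposition mappings from the Unicode Character Database.
print(unicodedata.decomposition(MICRO_SIGN))   # <compat> 03BC
print(unicodedata.decomposition(FULLWIDTH_A))  # <wide> 0041

# Canonical normalization (NFC) does not apply compatibility mappings.
print(unicodedata.normalize("NFC", MICRO_SIGN) == MICRO_SIGN)  # True

# Compatibility normalization (NFKC) replaces the micro sign with Greek mu.
print(unicodedata.normalize("NFKC", MICRO_SIGN) == GREEK_MU)   # True
print(unicodedata.normalize("NFKC", FULLWIDTH_A) == "A")       # True
```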

I'd also point out that, outside of the Unicode report, standards like HTML define &micro; = U+00B5 MICRO SIGN and &mu; = U+03BC GREEK SMALL LETTER MU. So if you're writing strings to be displayed in a browser, you'd want to choose the character that means "micro."
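
The entity mapping is easy to confirm with Python's standard `html` module, which carries the HTML named character references. This is only an illustrative check; `entity_codepoint` is a hypothetical helper.

```python
import html
from html.entities import name2codepoint

def entity_codepoint(name: str) -> str:
    """Return the code point of an HTML named entity as 'U+XXXX'."""
    return f"U+{name2codepoint[name]:04X}"

print(entity_codepoint("micro"))  # U+00B5
print(entity_codepoint("mu"))     # U+03BC

# Unescaping the entities yields two different characters.
print(html.unescape("5 &micro;L") == "5 \u00b5L")  # True
print(html.unescape("&mu;") == "\u03bc")           # True
```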

@ddcc4 ddcc4 merged commit 558eca0 into edge Jan 23, 2025
65 checks passed
@ddcc4 ddcc4 deleted the dc-muuuuuu branch January 23, 2025 22:30