Msa #50

vincerubinetti · 2025-03-06T02:28:07Z

switch to html-to-image (better maintained fork of dom-to-image than dom-to-image-more)
make basic MSA component
- takes multiple string sequences, with labels
- assigns characters in sequence a particular color based on a "type" (provided by consumer or auto-assigned by unique character)
- combined summary row at top that shows percentage breakdown of chars in column
- ability to wrap to new rows at N number of chars
- print and raw export options
debounce network component resizing that was slightly slowing down page
when clicking number box inc/dec buttons, keep button under mouse
split up testbed page code. split each section into its own component, making it easier to re-arrange and comment/uncomment them. split fake data generation functions into separate file.
change color palette yet again. go back to hand-picked tailwind colors.
tweak some util funcs
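
For readers skimming the list above, the "auto-assigned by unique character" coloring could look roughly like the sketch below. This is not the component's actual code: `assignColors` and the fallback color are hypothetical names and values chosen for illustration.

```typescript
/**
 * Assign each unique character a color, cycling through the palette.
 * Hypothetical helper, not the component's real API.
 */
function assignColors(
  sequences: string[],
  colors: string[],
): Map<string, string> {
  const assigned = new Map<string, string>();
  for (const sequence of sequences) {
    for (const char of sequence.split("")) {
      if (!assigned.has(char)) {
        // Fallback gray is an assumption, used only when the palette is empty.
        const color = colors[assigned.size % colors.length] ?? "#888888";
        assigned.set(char, color);
      }
    }
  }
  return assigned;
}
```

Characters are colored in order of first appearance, so the same input always yields the same mapping.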

I'll shortly be adding JPEG/PNG export to the MSA viz as well. I won't be adding SVG export, because of the way it's currently constructed, using several separate SVGs. I could reconstruct the whole viz as one large SVG, which would have the following tradeoffs: allow single SVG download, more straightforward but much more verbose positioning of elements, less flexible and harder to make changes to layout, not easily made responsive to screen size width.

netlify · 2025-03-06T02:29:11Z

✅ Deploy Preview for molevolvr ready!

Name	Link
🔨 Latest commit	`2823b68`
🔍 Latest deploy log	https://app.netlify.com/sites/molevolvr/deploys/67feea60ed317800086c80c9
😎 Deploy Preview	https://deploy-preview-50--molevolvr.netlify.app

falquaddoomi

The graphic looks great IMO! I tried out the following features:

Keyboard navigation: I can tab through each panel and scroll them with the arrow keys as expected.
Wrapping: seems to work as expected, with additional rows being added as the wrap width is decreased. The window does indeed remain scrolled so that the mouse is under the button you're clicking, even as the height of the page is changing, which is pretty cool. 🙂
Width Toggle: works as expected, and IMO a nice quality-of-life feature.
Exports:
- the PNG and JPEG exports appear to export the image as it appears in the page. One thing I noticed is that the PNG and JPEG previews appear to show the current view, but the panel in the resulting image is scrolled to 0, rather than showing the portion of the panel where the user had scrolled. (More thoughts in [1].)
- The TSV export works well, although if it's not a standard format you might consider adding a header column.
- The PDF export works, and IMO the result looks great.

[1] I don't know how big of a problem this is, if at all; I also don't know if there's a way to get the package you're using to manipulate the scroll position in the exported image. If there isn't, you could consider setting the wrap prior to export to a value for which the contents of the panel won't overflow and require scrolling.

vincerubinetti · 2025-03-12T19:05:44Z

> although if it's not a standard format you might consider adding a header column.

I think in this case column headers aren't super useful? After the first column, it'd just be the sequence position. I guess I'd leave that call up to Janani & Co. It would just look something like:

Name,1,2,3,4,5,6,7,8,9,10,
Firmucutes A,A,G,T,C,C,T,A,T,G,
Firmucutes B,A,G,T,C,C,T,A,T,G,

Maybe it could be more useful if we include the top "combined" row to the tsv export. Not sure how to format that... just choose the most common character in each column? show all the %s like A 70%\nG 30% (or comma separated).
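
To make the two options for the "combined" row concrete, here is a rough sketch that computes both the per-column percentage breakdown and a most-common-character consensus. The function names are hypothetical, and treating missing positions as `-` is an assumption rather than something decided in this thread.

```typescript
type ColumnSummary = { char: string; percent: number }[];

/** Percentage breakdown of characters in each column, most common first. */
function summarizeColumns(sequences: string[]): ColumnSummary[] {
  const length = Math.max(0, ...sequences.map((s) => s.length));
  const summaries: ColumnSummary[] = [];
  for (let col = 0; col < length; col++) {
    const counts = new Map<string, number>();
    for (const sequence of sequences) {
      // Assumption: sequences shorter than the longest are padded with gaps.
      const char = sequence.charAt(col) || "-";
      counts.set(char, (counts.get(char) ?? 0) + 1);
    }
    const summary = Array.from(counts.entries())
      .map(([char, count]) => ({
        char,
        percent: (100 * count) / sequences.length,
      }))
      .sort((a, b) => b.percent - a.percent);
    summaries.push(summary);
  }
  return summaries;
}

/** Consensus string made of the most common character in each column. */
function consensus(sequences: string[]): string {
  return summarizeColumns(sequences)
    .map((column) => column[0]?.char ?? "-")
    .join("");
}
```

Ties are broken by whichever character appears first in the input, which may or may not be the desired behavior.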

vincerubinetti · 2025-03-12T19:15:40Z

Regarding the jpg/png export issue:

I didn't notice that scroll issue. I would've assumed that the library would preserve whatever was on screen. It looks like this is a known issue: https://github.com/bubkoo/html-to-image/issues?q=is%3Aissue%20state%3Aopen%20scroll%20position I'm not sure the library could even fix the issue though, as I think it would lie in the browser's internal painting of the elements to canvas. One workaround is instead of centering a particular element in the view using the browser's normal overflow, you'd (temporarily) hard code a fixed offset position of everything in the container with CSS, as described in some of those issues. That's very dirty though and I'd like to avoid it.

Also, my thought is that if a user wants something to be visible in the png and was going to manually scroll to it anyway, they could also manually set the wrap point to ensure it's visible without a scrollbar at all.

Another option is to temporarily automatically set the wrap point ourselves right before the user exports as png, then reset it afterward. This is what I'm doing with the print option. But in the case of printing, we can usually expect the page size to be a standard US letter, pretty wide. But with the png, it depends on the user's window width. I'll try experimenting with this. An exact calculation of the wrap point would be complex, but I could either loosely base it on window width, or iteratively reduce it until there's no scrollbars.

vincerubinetti · 2025-03-12T20:53:06Z

> could either loosely base it on window width, or iteratively reduce it until there's no scrollbars.

The iterative approach can work, but causes a visible lag because you need to decrease the wrap, wait for it to render, check if a scrollbar is there, decrease again, etc. The rough tuning seems to work well enough, though when CSS on the page changes (widths of stuff) in the future, a maintainer will have to remember to re-tune it.
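
As a rough illustration of the two strategies compared above (not the code in this PR), the sketch below shows a width-based estimate next to an iterative shrink. All names, the step size, and the minimum wrap are hypothetical. The overflow check is passed in as a plain synchronous function, whereas in the browser each check would have to wait for a re-render, which is where the visible lag comes from.

```typescript
/** Rough tuning: estimate how many characters fit in the available width. */
function estimateWrap(
  availableWidth: number,
  labelWidth: number,
  charWidth: number,
  minWrap = 10,
): number {
  const fit = Math.floor((availableWidth - labelWidth) / charWidth);
  return Math.max(minWrap, fit);
}

/** Iterative tuning: reduce the wrap until the overflow check passes. */
function shrinkWrapUntilFits(
  startWrap: number,
  overflows: (wrap: number) => boolean,
  step = 5,
  minWrap = 10,
): number {
  let wrap = startWrap;
  while (wrap > minWrap && overflows(wrap)) {
    wrap = Math.max(minWrap, wrap - step);
  }
  return wrap;
}
```

The estimate depends on hard-coded label and character widths, which is exactly the part a maintainer would need to re-tune if the page CSS changes.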

vincerubinetti · 2025-03-12T21:45:28Z

@falquaddoomi good catch, the latest commit should improve the spacing there (but still not perfect because it's just a rough tuning).

epbrenner · 2025-03-12T21:50:16Z

> if the version without the header is an already-accepted format

> I have no idea about this. @jananiravi @epbrenner, is there a particular format that's most standard or helpful when downloading as tsv? See my comment above

People going to download MSAs are usually expecting one of a few possible plaintext formats. These are usually Clustal, FASTA/Pearson, and sometimes NEXUS. If somebody wants to download the MSA not in a graphical format, they'll probably want it in a format like that so they can use it in other software downstream (like building their own custom phylogenetic trees).

The raw input MSA that goes into this is the simplest plaintext output to provide a user, but if they want the Combined consensus sequence, that's something not present in that original. Maybe the options for plaintext/tsv download could be "Original MSA" in the original upstream format MolEvolvR produces before visualization, and then "Consensus sequence" just as a FASTA file like:

>ConsensusSeq_$QueryName_$FiltersApplied
MWYFAWILTSTLWARE[...]

And this separate file would contain only the combined consensus sequence as a regular FASTA format sequence and no other sequences. At least that's what I think may be the most broadly applicable, but I'd like thoughts from @jananiravi too!

falquaddoomi · 2025-03-12T22:03:27Z

> @falquaddoomi good catch, the latest commit should improve the spacing there (but still not perfect because it's just a rough tuning).

@vincerubinetti: Your latest change seems to have fixed it for me!

I haven't noticed much roughness myself, but IMO whoever's downloading the PNG/JPEG format can edit to their liking, if their intent is to use it for a publication or something. If they want a more predictable output format they can download it as a PDF, and for a more flexible format there's the TSV.

vincerubinetti · 2025-03-14T01:28:10Z

@falquaddoomi After you approved, I noticed some critical bugs whose fixes I wanted to include, and also ended up including a legend and the clustal coloring scheme that was requested. I'm also still waiting to hear back about what TSV download format is desired, which I'll also add to this PR.

@epbrenner Could you take a look at the code in msa-clustal.ts to verify that I've interpreted the table correctly? And please spot check that the coloring is right on the testbed, where I've hooked up the clustal theme by default.

jananiravi · 2025-04-07T19:54:19Z

> I think in this case column headers aren't super useful? After the first column, it'd just be the sequence position. I guess I'd leave that call up to Janani & Co. It would just look something like:
>
> Name,1,2,3,4,5,6,7,8,9,10,
> Firmucutes A,A,G,T,C,C,T,A,T,G,
> Firmucutes B,A,G,T,C,C,T,A,T,G,
>
> Maybe it could be more useful if we include the top "combined" row to the tsv export. Not sure how to format that... just choose the most common character in each column? show all the %s like A 70%\nG 30% (or comma separated).

The MSA would be in a fasta format -- it will work later w/ any of these tools: https://www.ebi.ac.uk/jdispatcher/msa

Reg. @falquaddoomi's comment:

> Well, I was just thinking it'd be "Name\tSequence", just so it's identified somehow; no need to create columns per position in my opinion. That said, if the version without the header is an already-accepted format, let's stick with that.

I agree that for a simple tsv, this should suffice
Name\tAccNum\tSequence

Reg. @epbrenner's comment:
yes, the most common use case is exporting the MSA in FASTA format so they can use it with a different tool downstream.

Was the MSA coloring part of a different PR or resolved comments above? I know Evan had some thoughts on those -- can't find that anymore.

vincerubinetti · 2025-04-07T20:36:08Z

> Was the MSA coloring part of a different PR or resolved comments above? I know Evan had some thoughts on those -- can't find that anymore.

If you're referring to the clustal coloring, that is implemented and in the preview.

> I agree that for a simple tsv, this should suffice
> Name\tAccNum\tSequence

I've just made changes to download in this format. You can put label/name, sequence, and any additional field (not just accnumber) as a column in the download.
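
For reference, a minimal sketch of what a `Name\t<extra fields>\tSequence` export could look like. The `Track` shape and `toTsv` name are hypothetical and not taken from the PR's code. Using `-` for missing values is an assumption, loosely based on the "consistently use dash" commit.

```typescript
type Track = {
  label: string;
  sequence: string;
  /** Arbitrary additional fields, e.g. an accession number. */
  extra?: Record<string, string>;
};

/** Strip characters that would break the tab-separated layout. */
function cleanCell(cell: string): string {
  return cell.replace(/[\t\r\n]+/g, " ");
}

/** Serialize tracks as TSV with a header row. */
function toTsv(tracks: Track[], extraFields: string[]): string {
  const header = ["Name", ...extraFields, "Sequence"];
  const rows = tracks.map((track) => [
    track.label,
    // Assumption: missing extra values are written as a dash.
    ...extraFields.map((field) => track.extra?.[field] ?? "-"),
    track.sequence,
  ]);
  return [header, ...rows]
    .map((cells) => cells.map(cleanCell).join("\t"))
    .join("\n");
}
```

Calling `toTsv(tracks, ["AccNum"])` would then produce the `Name\tAccNum\tSequence` layout suggested above.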

falquaddoomi · 2025-04-07T20:55:26Z

Hey @vincerubinetti, as @jananiravi mentioned above, in our prior meeting @epbrenner mentioned that having FASTA as an additional download type could be useful for downstream analyses. Sorry that I'm just mentioning this now, and hopefully it's not too hard to add.

FASTA is a text format in which each sequence is preceded by a "header" line. The header starts with ">", and the lines that follow it are the sequence for that header. Multiple sequences can be included in one file, each initiated by a header line. AFAIK it's common for a FASTA header to start with the accession number, but I'm unaware of specifics beyond that; maybe @jananiravi and @epbrenner can provide some advice.

In our case, it might look like

> esse velit nulla deserunt eu ut eu laborum nostrud
APLTMWOCAIUOTATCIRDOUEXEMFZRKDNKOLPGEYMBLSGNTIKQPFIADONYWUHFXJKQOZHXMDJVPTJZCKYW
PICOMLUFMBNJATTRIHMTRCWXSLBUUYFQFQOKXPVKOPQKALGWRFOFOFQWKJEHXIGENMNAIXYQAVELZXDU
WDZLTKXINRUNDVCBMXNAUIVGFVFZCHTSKSFVTQWUUMSKNPXUSSKLHOXYIZCTXOTZRJKVINROJKQDCOWQ
TPLBVOZOIRJJJRPVLOJBQMYXKRXGMOLRQNGFZCHCCOPJDIMAEJLJVHZHESWVLEHVFQEDOJOLXGEWPCIG
OPOELDZNTPATYUDIIYBAIRIMYMUQMDXKKCGVOPFFBQHCOCEAAPPAASCNNCCMCJQMYVGGZCTCZMRUNFIF
BGCPWXPTONMBXKJZOORFVZVBJKZFOBPSLMERPWXYWHLMQJOLQKSNKDOCJMPSPFPELJCCRNYYEEMSOSZR
JHKWZOFKZLGIYOMNVNDLEPCTCXDRRTBVUSMAFGWIINPZUSNZEFECWQESCBQKPLETADUTHMWVNUJFTYPJ
KMQTNGCGNHSJQBUMIZUVHTMQYZDMFRPXDNMIKHQHKTCLBINOLXLXRHVJQMKULDEVSEBJOOGSSSTUBDMP
RALSPAYWXFMYEPLOQLAKFNHSTPGKLKZCXGZIRANMAKRWYATIKBYVPAZXZRZITHOEMXJZYEQFKDUPWPBB
NNHNTMEESPFV

> nostrud nisi tempor amet incididunt qui est est est in
APLJMWOCAOUUTWRCIYDLUEXEMVZRGDNKWLBGESMBLSGNTIKQPFIADONYWUHFXUKQOZHXMDJVPTJACJYW
PICOMLULMXNJATTRIHMMRCEXSLBUUYFQFZOKXPVKOPQKALGWRFOFOFQWKJEHXIGENMPAIXYQASELZXDU
WDZLTKXINPUNDVCOBMXNAUIMGAVFZCHTSHSFVTQWUUMSKNPXUSSNLHOXYWECTXQTZZJKVINROIAQDCOW
QTPIBVOGOIPOJJRPVLOJBQMYXKEXGMOVRQNGFNCHCCPJZIMAEJLJVHZCEOWVLESVFQEDOJOLXGEWCIGO
POELDZNTPBTYUDIIYBAIRIMYMUQGBXGKCGVOPFFBQHCMCEAAPPAASCNNCSRMCJQMYVGBZCTCEMPGNFIF
BGPBOPTFVMBXKJZOORFVZVBJKZFOBPSLWERPWXPIHLMQJOLOKSZKDOCFMJSPFTELJCCRNYYEEMQOSZOJ
QKWZOFKZLGYYOMNJNONEPCTCXDRRTBVUSMEFGWLINPZUSNFEFECWEINBQMPLITADUTHMOVNUJZTYPJKM
QTNGCGNHSJRBULIZUVHTHYQYZDMFRPXDIMIKHQHKOCLBDNOLNLXRHNJMMQULDEVSEBJLOGSSTTUBDMQR
LSOYWXFMYEPLOWQLAKFNHSTYGKLKZCXMZICAIMABRWYATIKFYVPAZXZRZITOEMXJZYWQKDUPWFPVNHNT
MEESPFV
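
A small sketch of how such a file could be generated, assuming each record has a label and a sequence. `toFasta` and `FastaRecord` are hypothetical names. The default line width of 80 is an assumption based on how the example above appears to be wrapped.

```typescript
type FastaRecord = { header: string; sequence: string };

/** Serialize records as FASTA, wrapping sequence lines at a fixed width. */
function toFasta(records: FastaRecord[], lineWidth = 80): string {
  const width = Math.max(1, Math.floor(lineWidth));
  const lines: string[] = [];
  for (const { header, sequence } of records) {
    // Keep the header on a single line.
    lines.push(">" + header.replace(/\s+/g, " ").trim());
    for (let start = 0; start < sequence.length; start += width) {
      lines.push(sequence.slice(start, start + width));
    }
  }
  return lines.length > 0 ? lines.join("\n") + "\n" : "";
}
```

This sketch writes the header directly after `>` with no space, since some downstream tools may treat the text up to the first space as the sequence identifier.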

vincerubinetti · 2025-04-08T14:27:31Z

I believe I've implemented the requests above, though I would like someone to take a look at the clustal coloring logic to make sure I've interpreted it correctly.

epbrenner · 2025-04-15T21:24:59Z

Sorry again to keep you waiting on this @vincerubinetti! This looks good and ready to go. Will approve shortly.

The only (non-blocker) quirk I really see is when downloading the PDF in Dark Mode, the MSA goes to a white background (which I think is fine and a reasonable expected behavior), but it still retains the black background on amino acid positions that don't have a coloring assigned. Since you're using non-standard characters like J, this behavior isn't really an issue for them since you'll never see those in a biological sequence, but for gaps (-), these won't have properties so they'll have that black background normally. We could keep that, but I also think just going with the light mode white background for gaps would work well too.

tl;dr: The gap coloring is usually a lack of coloring because there's nothing there. The PDF gaps from a dark mode download are black against a white page background. That could be switched to white. Low priority.

epbrenner

Downloads and coloring logic all check out to me. Minor dark/light mode PDF download issue left in comments won't block merge. @ me if you'd like me to make an issue out of it to keep track of it for later!

vincerubinetti · 2025-04-15T23:25:18Z

Regarding the print background, I've changed it so it will respect the dark mode black background. But, since we're just using the browser's built in print dialog for this, the user will still have to allow background images/colors to be drawn. In Chrome, you have to manually check this option under "More settings", and there is no way we (via JavaScript) can check it for them.

Unfortunately this became a big PR because everything here was intertwined, and I wanted to (hopefully) solve all these related issues permanently. So, the diff is very big and difficult to review, and the individual commits won't be very clean or helpful. Instead I will mark certain parts of the code to look at. And know that converting the old SVGs to use the new chart wrapper component involved a lot of copy-pasting large blocks of code. Most importantly, please re-test every visualization component on the Netlify preview.

At a high level, here are the changes:

- implement upset plot
- make generic chart wrapper component that handles most things automatically. auto-fits view box to content. SVG units match DOM units 1:1. see discussion below for how i landed on this particular design.
- abstract out download button with various download formats as its own component. integrate into chart component.
- chart text can be truncated automatically by actual rendered width (not number of chars)
- include chart titles in every chart
- include analysis id (or anything else we want) as part of every filename download. add fake analysis id to testbed page.
- turn the legend component into fully SVG instead of DOM elements. implement simple wrapping strategy.
- make MSA and IPR charts fully SVG, with no DOM elements
- MSA now auto-wraps by default, and accurately (instead of loose estimates). when printing,
- make charts resizable
- add SVG group elements where appropriate for easier editing in vector software
- consistently put settings vars and Props type at the top of each component file for quick reference
- util func simplifications and refactors

---

If you poke through the commit history, you can see some of the many complicated approaches I experimented with for handling sizing and positioning in the SVGs. I wanted to scale the SVG shapes up/down based on the available width but keep the font size matching the document's. This was already sort of implemented before this PR, where the viz components made repeated use of a useSvgTransform hook. But it was verbose, so I first attempted to extract it out to a single hook, and then a component. It required iteratively content-fitting and font-sizing back and forth, which would (usually) converge to a stable configuration. But it was very difficult to find a foolproof way to prevent infinite re-renders or weird edge cases.

I also tried leveraging simply defining font size in the root of the SVG that matched the document's, then specifying positions/sizes in terms of `em` units, which the browser could use synchronously with no JS. But there are some limitations to this, like not being able to use relative units in things like `transform`s, and some weird exceptions that caused runaway calculations like when an `em`-sized element exceeded the available width of the container. Trying to implement automatic text truncation made this even more complicated.

The MSA & IPR viz's also required a "fill remaining space" feature, i.e. have a column of labels on the left of a fixed width, and use all the remaining space on the right for the tracks/sequences. Another small problem to solve was that `getBBox`, the method used for fitting the view box to the SVG contents, doesn't work with `clip-path`'ed elements, which is needed for IPR's zoom area.

There's much more that I tried or thought about and didn't commit, which has already left my brain, but I think it's not that valuable.

What I landed on keeps things much simpler. The SVG units always match the DOM units, no exception (hopefully). If there isn't enough room on the page to show the SVG's content, instead of trying to shrink dynamically, it will just overflow and show scrollbars. Not only does this avoid needing iterative fitting, it's probably better UX wise since graphical elements stay in the same position and size -- instead of shrinking to an unreadable size -- and you just need to scroll over to see it all. Also this makes the testbed page much more performant. The "fill remaining" feature is accomplished by providing the available width to the consuming component, and letting it use it however it wants.

---

Regarding the MSA auto-wrapping, currently the code is basically:

```ts
setWrap(40);
print();
setWrap(oldWrap);
```

This is just a loose, hard-coded wrapping point that would generally not overflow the page width when printing portrait mode, US letter. This PR doesn't change our inability to look inside the print dialog, as discussed in #50 and [this comment](#52 (comment)). But I've taken out the hard-coded `40`. I figure now with the resize handle available on every chart, the user can pre-size the chart to whatever they want before opening the print dialog. I wanted to mention this because, if you look at the preview on a normal desktop or laptop screen width and have your print dialog on defaults, you'll usually see the MSA overflow the page, and I'm aware of it.
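
The "truncated by actual rendered width" change mentioned above could be sketched as follows. This is an illustration only, not the chart component's code. The `measure` callback is a hypothetical stand-in for whatever reports rendered text width in the browser (for example an SVG text measurement), and the sketch assumes width never decreases as characters are added.

```typescript
/** Hypothetical measurer: returns the rendered width of a string. */
type Measure = (text: string) => number;

/** Find the longest prefix that, with an ellipsis, fits within maxWidth. */
function truncateToWidth(
  text: string,
  maxWidth: number,
  measure: Measure,
): string {
  if (measure(text) <= maxWidth) return text;
  let low = 0;
  let high = text.length;
  // Binary search over prefix length.
  while (low < high) {
    const mid = Math.ceil((low + high) / 2);
    if (measure(text.slice(0, mid) + "…") <= maxWidth) low = mid;
    else high = mid - 1;
  }
  return text.slice(0, low) + "…";
}
```

If not even the ellipsis fits, this sketch still returns the ellipsis alone; a real implementation might prefer to hide the text entirely.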

vincerubinetti added 4 commits March 5, 2025 17:16

basic msa viz

2f2e025

change color palette again, rearrange testbed

4b443ea

tweaks

1617d3b

import tailwind hex palette

ac624a7

vincerubinetti added 8 commits March 5, 2025 22:20

add index ticks

04970c4

use subgrid

5b6958a

add more labels

7914484

fix safari text vert align

3234af7

add basic download options

efb6072

implement print wrapping

10977d9

debounce network resize

9eb5639

style tweaks

8d6df9b

vincerubinetti requested review from epbrenner, falquaddoomi and jananiravi and removed request for falquaddoomi March 7, 2025 18:21

vincerubinetti added 5 commits March 7, 2025 13:33

fix build error

5c41804

fix axe failures

08efa14

increase timeout more

f333386

switch image libs, add jpg/png download

da81852

fix type error

c876370

falquaddoomi reviewed Mar 12, 2025

vincerubinetti added 2 commits March 12, 2025 16:47

tune auto-wrap to window width on png/jpg export

264a58e

fix prev commit

5b2525f

Merge remote-tracking branch 'origin/main' into msa

d649c11

improve tuning when expanded

6ba994f

falquaddoomi self-requested a review March 12, 2025 22:03

falquaddoomi approved these changes Mar 12, 2025

vincerubinetti added 5 commits March 13, 2025 13:54

fix continuous re-render bug in sunburst

5479f47

consistently use dash as "no value" indication

2feb485

implement clustal coloring

fd621de

simplify colors

c44e101

add legend component

2a3a1c7

vincerubinetti added 3 commits April 7, 2025 16:23

single col for seq

613f544

add info col

80282f5

allow arbitrary extra fields in download

6cbd2a1

add fasta download

5a3fbf6

space to dash

9719378

epbrenner approved these changes Apr 15, 2025

fix print bg color

2823b68

vincerubinetti merged commit c7430e9 into main Apr 15, 2025
4 checks passed

vincerubinetti deleted the msa branch April 15, 2025 23:26

vincerubinetti mentioned this pull request Apr 15, 2025

MSA viz #8

Closed

vincerubinetti mentioned this pull request Jun 5, 2025

Upset plot, svg/charting overhaul #55

Merged
