-
Notifications
You must be signed in to change notification settings - Fork 10
Msa #50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Msa #50
Conversation
✅ Deploy Preview for molevolvr ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The graphic looks great IMO! I tried out the following features:
- Keyboard navigation: I can tab through each panel and scroll them with the arrow keys as expected.
- Wrapping: seems to work as expected, with additional rows being added as the wrap width is decreased. The window does indeed remain scrolled so that the mouse is under the button you're clicking, even as the height of the page is changing, which is pretty cool. 🙂
- Width Toggle: works as expected, and IMO a nice quality-of-life feature.
- Exports:
- the PNG and JPEG exports appear to export the image as it appears in the page. One thing I noticed is that the PNG and JPEG previews appear to show the current view, but the panel in the resulting image is scrolled to 0, rather than showing the portion of the panel where the user had scrolled. (More thoughts in [1].)
- The TSV export works well, although if it's not a standard format you might consider adding a header column.
- The PDF export works, and IMO the result looks great.
[1] I don't know how big of a problem this is, if at all; I also don't know if there's a way to get the package you're using to manipulate the scroll position in the exported image. If there isn't, you could consider setting the wrap prior to export to a value for which the contents of the panel won't overflow and require scrolling.
I think in this case column headers aren't super useful? After the first column, it'd just be the sequence position. I guess I'd leave that call up to Janani & Co. It would just look something like: Maybe it could be more useful if we include the top "combined" row to the tsv export. Not sure how to format that... just choose the most common character in each column? show all the %s like |
|
Regarding the jpg/png export issue: I didn't notice that scroll issue. I would've assumed that the library would preserve whatever was on screen. It looks like this is a known issue: https://github.com/bubkoo/html-to-image/issues?q=is%3Aissue%20state%3Aopen%20scroll%20position I'm not sure the library could even fix the issue though, as I think it would lie in the browser's internal painting of the elements to canvas. One workaround is instead of centering a particular element in the view using the browser's normal overflow, you'd (temporarily) hard code a fixed offset position of everything in the container with CSS, as described in some of those issues. That's very dirty though and I'd like to avoid it. Also, my thought is that if a user wants something to be visible in the png and were going to manually scroll to it anyway, they could also manually set the wrap point to ensure it's visible without a scrollbar at all. Another option is to temporarily automatically set the wrap point ourselves right before the user exports as png, then reset it afterward. This is what I'm doing with the print option. But in the case of printing, we can usually expect the screen size to be a standard US letter, pretty wide. But with the png, it depends on the user's window width. I'll try experimenting with this. An exact calculation of the wrap point would be complex, but I could either loosely base it on window width, or iteratively reduce it until there's no scrollbars. |
The iterative can work, but causes a visible lag because you need to decrease the wrap, wait for it to render, check if a scrollbar is there, decrease again, etc. The rough tuning seems to work well enough, though when CSS on the page changes (widths of stuff) in the future, a maintainer will have to remember to re-tune it. |
|
@falquaddoomi good catch, the latest commit should improve the spacing there (but still not perfect because it's just a rough tuning). |
People going to download MSAs are usually expecting one of a few possible plaintext formats. These are usually Clustal, FASTA/Pearson, and sometimes NEXUS. If somebody wants to download the MSA not in a graphical format, they'll probably want it in a format like that so they can use it in other software downstream (like building their own custom phylogenetic trees). The raw input MSA that goes into this is the simplest plaintext output to provide a user, but if they want the And this separate file would contain only the combined consensus sequence as a regular FASTA format sequence and no other sequences. At least that's what I think may be the most broadly applicable, but I'd like thoughts from @jananiravi too! |
@vincerubinetti: Your latest change seems to have fixed it for me! I haven't noticed much roughness myself, but IMO whoever's downloading the PNG/JPEG format can edit to their liking, if their intent is to use it for a publication or something. If they want a more predictable output format they can download it as a PDF, and for a more flexible format there's the TSV. |
|
@falquaddoomi After you approved, I noticed some critical bugs I wanted to include, and also ended up including a legend and the clustal coloring scheme that was requested. I'm also still waiting to hear back about what TSV download format is desired, which I'll also add to this PR. @epbrenner Could you take a look at the code in |
The MSA would be in a fasta format -- it will work later w/ any of these tools: https://www.ebi.ac.uk/jdispatcher/msa Reg. @falquaddoomi's comment:
I agree that for a simple tsv, this should suffice Reg. @epbrenner's comment: Was the MSA coloring part of a different PR or resolved comments above? I know Evan had some thoughts on those -- can't find that anymore. |
If you're referring to the clustal coloring, that is implemented and in the preview.
I've just made changes to download in this format. You can put label/name, sequence, and any additional field (not just accnumber) as a column in the download. |
|
Hey @vincerubinetti, as @jananiravi mentioned above, in our prior meeting @epbrenner mentioned that having FASTA as an additional download type could be useful for downstream analyses. Sorry that I'm just mentioning this now, and hopefully it's not too hard to add. FASTA is a text format in which each sequence is preceded by a "header" line. The header starts with "> " and then lines that follow it are the sequence for that header. Multiple sequences can be included in one file, each initiated by a header line. AFAIK it's common for a FASTA header to start with the accession number, but I'm unaware of specifics beyond that; maybe @jananiravi and @epbrenner can provide some advice. In our case, it might look like |
|
I believe I've implemented the requests above, though I would like someone to take a look at the clustal coloring logic to make sure I've interpreted it correctly. |
|
Sorry again to keep you waiting on this @vincerubinetti! This looks good and ready to go. Will approve shortly. The only (non-blocker) quirk I really see is when downloading the PDF in Dark Mode, the MSA goes to a white background (which I think is fine and a reasonable expected behavior), but it still retains the black background on amino acid positions that don't have a coloring assigned. Since you're using non-standard characters like J, this behavior isn't really an issue for them since you'll never see those in a biological sequence, but for gaps (-), these won't have properties so they'll have that black background normally. We could keep that, but I also think just going with the light mode white background for gaps would work well too. tl;dr: The gap coloring is usually a lack of coloring because there's nothing there. The PDF gaps from a dark mode download are black against a white page background. That could be switched to white. Low priority.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Downloads and coloring logic all check out to me. Minor dark/light mode PDF download issue left in comments won't block merge. @ me if you'd like me to make an issue out of it to keep track of it for later!
Unfortunately this became a big PR because everything here was intertwined, and I wanted to (hopefully) solve all these related issues permanently. So, the diff is very big and difficult to review, and the individual commits won't be very clean or helpful. Instead I will mark certain parts of the code to look at. And know that converting the old SVGs to use the new chart wrapper component involved a lot of copy-pasting large blocks of code. Most importantly, please re-test every visualization component on the Netlify preview. At a high level, here are the changes: - implement upset plot - make generic chart wrapper component that handles most things automatically. auto-fits view box to content. SVG units match DOM units 1:1. see discussion below for how i landed on this particular design. - abstract out download button with various download formats as its own component. integrate into chart component. - chart text can be truncated automatically by actual rendered width (not number of chars) - include chart titles in every chart - include analysis id (or anything else we want) as part of every filename download. add fake analysis id to testbed page. - turn the legend component into fully SVG instead of DOM elements. implement simple wrapping strategy. - make MSA and IPR charts fully SVG, with no DOM elements - MSA now auto-wraps by default, and accurately (instead of loose estimates). when printing, - make charts resizable - add SVG group elements where appropriate for easier editing in vector software - consistently put settings vars and Props type at the top of each component file for quick reference - util func simplifications and refactors --- If you poke through the commit history, you can see some of the many complicated approaches I experimented with for handling sizing and positioning in the SVGs. I wanted to scale the SVG shapes up/down based on the available width but keep the font size matching the document's. This was already sort of implemented before this PR, where the viz components made repeated use of a useSvgTransform hook. But it was verbose, so I first attempted to extract it out to a single hook, and then a component. It required iteratively content-fitting and font-sizing back and forth, which would (usually) converge to a stable configuration. But it was very difficult to find a foolproof way to prevent infinite re-renders or weird edge cases. I also tried leveraging simply defining font size in the root of the SVG that matched the document's, then specifying positions/sizes in terms of `em` units, which the browser could use synchronously with no JS. But there are some limitations to this, like not being able to use relative units in things like `transform`s, and some weird exceptions that caused runaway calculations like when an `em`-sized element exceeded the available width of the container. Trying to implement automatic text truncation made this even more complicated. The MSA & IPR viz's also required a "fill remaining space" feature, i.e. have a column of labels on the left of a fixed width, and use all the remaining space on the right for the tracks/sequences. Another small problem to solve was that `getBBox`, the method used for fitting the view box to the SVG contents, doesn't work with `clip-path`'ed elements, which is needed for IPR's zoom area. There's much more that I tried or thought about and didn't commit, which has already left my brain, but I think it's not that valuable. What I landed on keeps things much simpler. The SVG units always match the DOM units, no exception (hopefully). If there isn't enough room on the page to show the SVG's content, instead of trying to shrink dynamically, it will just overflow and show scrollbars. Not only does this avoid needing iterative fitting, it's probably better UX wise since graphical elements stay in the same position and size -- instead of shrinking to an unreadable size -- and you just need to scroll over to see it all. Also this makes the testbed page much more performant. The "fill remaining" feature is accomplished by providing the available width to the consuming component, and letting it use it however it wants. --- Regarding the MSA auto-wrapping, currently the code is basically: ```ts setWrap(40); print() setWrap(oldWrap); ``` This is just a loose, hard-coded wrapping point that would generally not overflow the page width when printing portrait mode, US letter. This PR doesn't change our inability to look inside the print dialog, as discussed in #50 and [this comment](#52 (comment)). But I've taken out the hard-coded `40`. I figure now with the resize handle available on every chart, the user can pre-size the chart to whatever they want before opening the print dialog. I wanted to mention this because, if you look at the preview on a normal desktop or laptop screen width and have your print dialog on defaults, you'll usually see the MSA overflow the page, and I'm aware of it.


html-to-image(better maintained fork ofdom-to-imagethandom-to-image-more)I'll shortly also be adding JPEG/PNG export to the MSA viz as well. I won't be adding SVG export, because the way it's currently constructed using several separate SVGs. I could reconstruct the whole viz as one large SVG, which would have the following tradeoffs: allow single SVG download, more straightforward but much more verbose positioning of elements, less flexible and harder to make changes to layout, not easily made responsive to screen size width.