Salti

salti is a terminal-based multiple sequence alignment (MSA) viewer for FASTA files. It is designed for fast interactive browsing, primarily on remote servers and HPC environments, or any time you don't want to leave the terminal.

Quick start

On Linux/macOS:

curl --proto '=https' --tlsv1.2 -LsSf https://github.com/Sam-Sims/salti/releases/download/salti-0.8.0/salti-installer.sh | sh

Or with conda (all supported platforms):

conda install -c bioconda salti

Feature showcase

Fast

salti is built for fast browsing and loading of large alignments, using tokio for async processing.

It can handle alignments with thousands of sequences and positions without slowdown.

Transparent support for HTTP/SSH/Compressed files

Thanks to the cool Paraseq library, salti can transparently load compressed FASTA files, as well as files over HTTP/HTTPS or SSH. Just pass the URL or SSH path to the load command, e.g. :load https://example.com/alignment.fasta or :load ssh://user@host/path/to/alignment.fasta.

Command palette

Press : to open a command palette for most actions. See Usage for details.

Mouse support

Left click to select a sequence/position, Ctrl+left click to select a region.

Hold middle mouse to pan around the alignment.

Minimap

Press m to open the minimap and drag to quickly pan around.

Translation

salti can translate nucleotide (NT) codons to amino acids (AA) on the fly, with support for all 3 frames. This is designed for browsing rather than as a dedicated translation tool.

Useful viz tools

  • Collapse positions that match the reference or consensus to . for easier visualisation of differences.
  • Use mouse selection to highlight regions or sequences.
  • Pin important sequences to the top while browsing.

GFF support

salti can load GFF3 annotations and display them in a global minimap and a local feature track. You can jump straight to features with jump-feature.

See the GFF support section below for more detail.

Themes

salti supports multiple colour themes, which can be switched with the set-theme command. Available themes so far are:

  • everforest-dark - the default theme, based on the everforest colorscheme.
  • solarized-light - a light theme based on the solarized palette.
  • tokyo-night - a dark theme based on the tokyo night palette.
  • terminal-default - uses terminal-provided ANSI colours and defaults.

Installation

Conda:

conda install -c bioconda salti

Binaries:

Precompiled binaries for Linux, macOS and Windows are attached to the latest release.

You can also use the installer script:

curl --proto '=https' --tlsv1.2 -LsSf https://github.com/Sam-Sims/salti/releases/download/salti-0.8.0/salti-installer.sh | sh

Cargo:

Requires cargo

cargo install salti

Build from source:

Install the Rust toolchain:

To install it, please refer to the Rust documentation.

Clone the repository:

git clone https://github.com/Sam-Sims/salti

Build and add to path:

cd salti
cargo build --release
export PATH="$PATH:$(pwd)/target/release"

All executables will be in the directory salti/target/release.

Usage

I would recommend a modern terminal, with GPU acceleration and 24-bit colour (true colour) support for the best experience. Additionally, a font with Unicode support is required to render all characters correctly.

I would also recommend a large terminal window for the best experience, but the app is designed to be usable even in smaller windows.

salti is primarily developed and benchmarked in Ghostty. I have tested it on several other terminals and it works well in most cases, though performance may vary. If you have responsiveness issues, please open an issue and include your terminal emulator and font information.

Note: If you're using Ghostty 1.3.0+ and the middle mouse button isn't working, check that gtk-enable-primary-paste is enabled:

gsettings set org.gnome.desktop.interface gtk-enable-primary-paste true

See here for details.

How to run

salti <alignment.fasta>

If no file is passed, the app starts and waits for you to load one via the command palette.

Quick start keybinds

I plan to add a help screen for in-app reference in the future, but for now here are the most useful keybindings:

Global keybindings

  • q - Quit.
  • : - Opens the command palette.
  • Up / Down - Scroll vertically 1 row.
  • Left / Right - Scroll horizontally 1 column.
  • Shift + Left/Right/Up/Down - Scroll 10 columns/rows in that direction.
  • Alt+Left / Alt+Right - Scroll sequence name pane.
  • Left click - Select a sequence or position. Click again to clear selection.
  • Ctrl + Left click - Select a range of sequences or positions.
  • Middle click + drag - Pan.
  • m - Open the minimap.
  • t - Toggle quick translate of the current view.
  • Shift+t - Toggle between the original alignment and the reloaded protein alignment.
  • Alt+1|2|3 - Change reading frame for quick/full translation.

Command palette

Most features can be accessed through the command palette (this is heavily inspired by the Helix editor's implementation!).

Open with :, then type a command.

  • Enter confirms selection.
  • Tab / Shift+Tab cycles through candidates.
  • Esc closes the palette.

Commands:

  • jump-position - Jump to a 1-based alignment position.
  • jump-sequence - Jump to a sequence by name.
  • jump-feature - Jump to a feature if a GFF file has been loaded.
  • pin-sequence - Pin a visible sequence to the top of the alignment view.
  • unpin-sequence - Remove a sequence from the pinned group.
  • filter-rows - Filter rows by their IDs (fasta headers) via regex.
  • filter-gaps - Filter columns by their gap percentage.
  • filter-constant - Filter columns by their similarity.
  • clear-filter - Clear the active filter.
  • set-reference - Set a reference sequence.
  • toggle-translate - Toggle AA translation.
  • reload-as-protein - Reloads the entire alignment as a protein alignment.
  • set-diff-mode - Set diff rendering mode (off, reference, or consensus).
  • load-alignment (alias: load) - Load an alignment file.
  • load-gff - Load a GFF3 annotation file.
  • set-consensus-method - Choose majority or majority-non-gap.
  • set-translation-frame - Set translation frame (1, 2, or 3).
  • set-theme - Set active theme (everforest-dark, solarized-light, tokyo-night, or terminal-default).
  • set-sequence-type - Override auto-detection if it fails (dna, protein, or generic).
  • check-update - Check for updates and show the latest version.
  • quit - Quit the app.

Some notes on features

until I write proper docs

Fuzzy matching

All commands that take string input support fuzzy matching. For example, jump-sequence will match any sequence name that contains the input string. If multiple candidates match, you can cycle through them with Tab / Shift+Tab.
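The "contains" matching and Tab cycling described above can be sketched roughly as follows. This is a Python illustration only, not salti's code (salti is written in Rust). The function names candidates and cycle are hypothetical, and case-sensitive matching is an assumption.

```python
def candidates(names, query):
    """Return the names that contain the query string, in alignment order.

    Assumption: matching is a plain, case-sensitive substring test.
    """
    return [name for name in names if query in name]


def cycle(matches, index, step=1):
    """Move the highlighted candidate forward (Tab, step=1) or backward
    (Shift+Tab, step=-1), wrapping around at either end."""
    if not matches:
        return None
    return (index + step) % len(matches)


names = ["MPXV/UK/2022/1", "MPXV/UK/2022/2", "MPXV/US/2022/1"]
matches = candidates(names, "UK")
print(matches)  # ['MPXV/UK/2022/1', 'MPXV/UK/2022/2']
```

With two matches, pressing Tab on the last candidate wraps back to the first one.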

Consensus method

Two methods are available for consensus calculation:

  • majority - The most common character at each position, including gaps.
  • majority-non-gap - The most common character at each position, excluding gaps.

If there is a tie for most common character, one is chosen at random.

The default is majority-non-gap.
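To make the difference between the two methods concrete, here is a minimal Python sketch. It is not salti's implementation: the names consensus_char and consensus are hypothetical, the gap character is assumed to be "-", and reporting a gap for an all-gap column is also an assumption.

```python
import random
from collections import Counter

GAP = "-"  # assumption: gaps are written as "-"


def consensus_char(column, method="majority-non-gap", rng=random):
    """Pick the consensus character for one alignment column."""
    counts = Counter(column)
    if method == "majority-non-gap":
        counts.pop(GAP, None)  # gaps do not get a vote
    if not counts:
        return GAP  # assumption: an all-gap column reports a gap
    top = max(counts.values())
    tied = sorted(ch for ch, n in counts.items() if n == top)
    return rng.choice(tied)  # ties are broken at random


def consensus(sequences, method="majority-non-gap"):
    """Build the consensus string for equal-length aligned sequences."""
    return "".join(consensus_char(col, method) for col in zip(*sequences))


seqs = ["A-GT", "A-GT", "AC-A"]
print(consensus(seqs, "majority"))          # A-GT
print(consensus(seqs, "majority-non-gap"))  # ACGT
```

In the second column two of three sequences have a gap, so majority reports a gap while majority-non-gap reports the only remaining residue, C.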

Quick translate vs full translate

salti offers two methods of translating. The first is "quick translate", toggled by pressing t or with the toggle-translate command. It is named as such because it only translates the visible view, and so is technically faster than the full translate.

Quick translate maps the currently visible nucleotides into their respective codons according to the current frame and then renders them as an amino acid overlay. This does not change the coordinate space, and the ruler will stay in nucleotides. Clicking on an amino acid, however, will display the coordinate in protein space in the bottom status bar.
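As a rough sketch of the idea, the Python snippet below translates a nucleotide string in a given frame while keeping the nucleotide position of each codon. It is not salti's code: the function name translate is hypothetical, the standard genetic code is assumed, and rendering codons that contain gaps or ambiguous bases as X is an assumption.

```python
BASES = "TCAG"
# Standard genetic code, listed in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nucleotides, frame=1):
    """Translate in frame 1, 2 or 3.

    Returns (nt_start, amino_acid) pairs, where nt_start is the 1-based
    nucleotide position of the first base of each codon.
    """
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2 or 3")
    seq = nucleotides.upper()
    overlay = []
    for start in range(frame - 1, len(seq) - 2, 3):
        codon = seq[start:start + 3]
        # Assumption: codons with gaps or ambiguous bases are shown as X.
        overlay.append((start + 1, CODON_TABLE.get(codon, "X")))
    return overlay


print(translate("ATGGCCTAA"))           # [(1, 'M'), (4, 'A'), (7, '*')]
print(translate("ATGGCCTAA", frame=2))  # [(2, 'W'), (5, 'P')]
```

The nucleotide positions correspond to what the ruler keeps showing, while the index of each pair in the list (plus one) is the protein-space coordinate.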

Importantly, quick translate cannot be used if any of the columns have been filtered, as this would cause frameshifts and would not make sense to represent.

In contrast, full translate (Shift+T or the reload-as-protein command) takes the full alignment and reloads it as a protein alignment, as if you had loaded a FASTA file containing the translated amino acids instead of nucleotides. The coordinate space becomes amino acids, with each amino acid represented by one column, so full translate is compatible with column filtering, unlike quick translate.

In practice this is still very fast (~3000 mpox virus sequences, each about 200 kb long, can be translated in under 0.1 seconds on my PC), so it is practical to switch between translation modes depending on what is useful.

Reading frame can be changed at any time using Alt+1, Alt+2, Alt+3, or the set-translation-frame command.

Gap filtering

filter-gaps hides columns whose gap fraction is above the threshold you give it. The threshold is a percentage, so filter-gaps 25 removes columns where more than 25% of positions are gaps. You can use filter-gaps 0 to clear the filter.
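The column selection can be sketched in Python like this. It is an illustration under assumptions, not salti's implementation: the function name gap_filter is hypothetical, gaps are assumed to be "-", and a column sitting exactly on the threshold is assumed to be kept.

```python
GAP = "-"  # assumption: gaps are written as "-"


def gap_filter(sequences, threshold_percent):
    """Return the 0-based indices of the columns that stay visible."""
    width = len(sequences[0])
    if threshold_percent == 0:
        return list(range(width))  # 0 clears the filter
    kept = []
    for index, column in enumerate(zip(*sequences)):
        gap_percent = 100 * column.count(GAP) / len(column)
        # Assumption: only columns strictly above the threshold are hidden.
        if gap_percent <= threshold_percent:
            kept.append(index)
    return kept


seqs = ["A--T", "A-GT", "ACGT", "ACGT"]
print(gap_filter(seqs, 25))  # [0, 2, 3]
print(gap_filter(seqs, 10))  # [0, 3]
```

Here the second column is 50% gaps and the third is 25% gaps, so a threshold of 25 hides only the second column.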

Gap filtering changes the visible coordinate space. The ruler still shows absolute positions, but marks jumps where hidden columns have been skipped. A single jump is shown with an arrow pointing towards the side that has a jump. Dense regions of skipped columns are shown as a run of ~ characters rather than individual arrows.

Gap filtering and quick translation cannot be used at the same time. If quick translation is active, filter-gaps will be rejected. Likewise, if a gap filter is active, quick translation cannot be enabled until the filter is cleared. Full translation (e.g. using reload-as-protein), however, supports column filtering.

All column filters are applied after row filters.

Constant filtering

filter-constant hides columns where the share of any single character reaches the threshold you give it. Like gap filtering, the threshold is a percentage, so filter-constant 100 hides columns where 100% of the positions are the same (i.e. removes constant sites).

As with gap filtering, removing columns can change the visible coordinate space. See gap filtering for a more detailed breakdown of what that means.

filter-constant 99 hides columns where 99% of the positions are the same and so on. filter-constant 0 clears the filter.

NOTE: Gaps are ignored when calculating constant fractions. In DNA alignments N is also ignored, and in protein alignments X is also ignored.
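A minimal Python sketch of this calculation follows. It is not salti's code: the names constant_percent and constant_filter are hypothetical, and the "-" gap character, uppercase input, and the treatment of columns with no countable characters are all assumptions.

```python
from collections import Counter

# Characters left out of the calculation, per sequence type.
# Assumption: gaps are "-" and characters are uppercase.
IGNORED = {"dna": {"-", "N"}, "protein": {"-", "X"}, "generic": {"-"}}


def constant_percent(column, sequence_type="dna"):
    """Percentage of counted characters taken by the most common one."""
    ignored = IGNORED[sequence_type]
    counts = Counter(ch for ch in column if ch not in ignored)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # assumption: columns with nothing to count are kept
    return 100 * max(counts.values()) / total


def constant_filter(sequences, threshold_percent, sequence_type="dna"):
    """Return the 0-based indices of the columns that stay visible."""
    columns = list(zip(*sequences))
    if threshold_percent == 0:
        return list(range(len(columns)))  # 0 clears the filter
    return [
        index
        for index, column in enumerate(columns)
        if constant_percent(column, sequence_type) < threshold_percent
    ]


seqs = ["ACGT", "ACNT", "AG-T", "ATGA"]
print(constant_filter(seqs, 100))  # [1, 3]
print(constant_filter(seqs, 75))   # [1]
```

The third column counts as constant because the N and the gap are ignored, leaving only G.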

GFF support

salti can load and display GFF3 annotations from a file with the load-gff command. This is currently experimental.

At the moment only gene features are supported.

When a GFF is loaded salti will show:

  • a global feature pane
  • a local feature track above the alignment
  • feature details in the Feature Info pane when you hover over a feature
  • support for the jump-feature command in the command palette

You can also drag in the global feature pane to pan around the alignment.

Currently annotations are treated as "global", and per-sequence annotations are not supported. This also means there is no fancy business that tries to match GFF coordinates to gaps etc. This is mainly useful in specific use cases, e.g.:

When you have multiple alignments to a reference sequence where that reference does not have gaps inserted (i.e. insertions are ignored). For example, running MAFFT with something like mafft --add --keeplength, or using alignments from something like Nextclade or fastalign.

In the normal nucleotide view, GFF coordinates are shown in nucleotide space. Quick translate keeps this the same - because quick translate is just an amino acid overlay on top of the nucleotide view. In full translate / reload-as-protein, the features are projected into protein columns using the active reading frame.

If columns are filtered, features are projected onto the remaining visible columns. This means a feature can look compressed, or disappear entirely if all of its columns are filtered out.
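One way to picture the projection onto visible columns is the Python sketch below. It is not salti's implementation. The function name project_feature is hypothetical, and representing the filter as a sorted list of still-visible absolute positions is an assumption. GFF3 coordinates are 1-based and inclusive, which the sketch follows.

```python
from bisect import bisect_left, bisect_right


def project_feature(start, end, visible_columns):
    """Project a feature onto the columns that are still shown.

    start, end: 1-based inclusive feature coordinates (as in GFF3).
    visible_columns: sorted list of 1-based absolute positions still visible.
    Returns (first, last) as 0-based indices into visible_columns,
    or None if every column of the feature has been filtered out.
    """
    first = bisect_left(visible_columns, start)
    last = bisect_right(visible_columns, end) - 1
    if first > last:
        return None
    return first, last


visible = [1, 2, 5, 6, 7, 10]
print(project_feature(2, 6, visible))  # (1, 3)
print(project_feature(3, 4, visible))  # None
```

The first feature spans 5 nucleotides but only 3 visible columns, so it looks compressed. The second lies entirely within hidden columns and disappears.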

Pinned behaviour

  • Pinned sequences stay visible and remain at the top, even when they do not match the active filter.
  • Setting a sequence as reference removes it from pinned state and hides it as the reference row.
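The interaction between pinned sequences and filter-rows can be sketched as follows. This is a Python illustration only: the function name visible_rows is hypothetical, and both the use of an unanchored regex search and the ordering of pinned rows (alignment order) are assumptions.

```python
import re


def visible_rows(names, pattern, pinned=()):
    """Return the row names to display: pinned rows first, then matches.

    Assumptions: the pattern may match anywhere in the ID, and pinned
    rows are listed in alignment order.
    """
    regex = re.compile(pattern)
    pinned_rows = [name for name in names if name in pinned]
    other_rows = [
        name for name in names
        if name not in pinned and regex.search(name)
    ]
    return pinned_rows + other_rows


names = ["ref", "sampleA1", "sampleB1", "sampleA2"]
print(visible_rows(names, r"^sampleA", pinned={"sampleB1"}))
# ['sampleB1', 'sampleA1', 'sampleA2']
```

sampleB1 does not match the filter but stays at the top because it is pinned.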

Data/Rendering

  • Input must be FASTA with equal sequence lengths across records.
  • Sequence type is auto-detected on load; you can override it if it's wrong.
    • It samples up to 100 random sequences and compares NT and AA character fractions. If neither crosses 50%, it falls back to generic mode.

Update check:

salti will check for updates on startup and notify you if a new version is available. It does this by querying the crates.io API: https://crates.io/api/v1/crates. However, no network connection is required to use salti, and a failed check will not cause any issues.

This behaviour can be disabled entirely by setting SALTI_SKIP_UPDATE_CHECK=true. You can still check for updates manually with the check-update command.
