PyIGV

A Python alignment viewer library, styled after the Integrative Genomics Viewer (IGV), for visualizing DNA/RNA sequence alignments.

Overview

PyIGV provides a simple, intuitive way to visualize pairwise sequence alignments in Python. It displays alignments in an IGV-like format, with color-coded mismatches, insertions, and deletions.

Installation

pip install pyigv

Features

  • Color-coded visualization: Mismatches are highlighted with base-specific colors (A=green, T=red, G=gold, C=blue)
  • Automatic alignment: Uses Biopython's PairwiseAligner when alignment strings aren't provided
  • Gap handling: Automatically detects and visualizes insertions and deletions
  • Mutation counting: Tracks the number of insertions, deletions, and substitutions
  • PDF export: Save alignment visualizations to PDF files
  • Flexible display: Option to show full alignments or truncated views (hiding insertions)

Quick Start

from pyigv import Alignment, plot_alignments

# Define your sequences
target = "AAATAAA"
query = "AAAGAAA"

# Option 1: Auto-alignment (recommended)
aln = Alignment(target, query)

# Option 2: Provide pre-aligned sequences with gaps
alignment = ["AAATAAA", "AAAGAAA"]
aln = Alignment(target, query, alignment)

# Print alignment information
print(aln)
print(f"Mutations: {aln.mutation_ct}")
print(f"Insertions: {aln.insertion_ct}")
print(f"Deletions: {aln.deletion_ct}")

# Visualize alignments
alignments = [aln]
plot_alignments(alignments, title="Sample Alignment")

Usage Examples

Auto-Alignment (No Pre-alignment Required)

from pyigv import Alignment

# PyIGV automatically aligns sequences using Biopython
target = "AAACCCGGG"
query = "AAATTTGGG"

aln = Alignment(target, query)
print(f"Mutations: {aln.mutation_ct}")
print(f"Insertions: {aln.insertion_ct}")
print(f"Deletions: {aln.deletion_ct}")
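
To make concrete what "auto-alignment" produces, here is a minimal pure-Python sketch of global alignment (Needleman-Wunsch). It is not PyIGV's implementation, which relies on Biopython's PairwiseAligner; the function name `global_align` and the scoring values are assumptions chosen for illustration only.

```python
def global_align(target, query, match=1, mismatch=-1, gap=-1):
    """Return (aligned_target, aligned_query) with '-' marking gaps.

    Illustrative Needleman-Wunsch with linear gap cost; scores are assumed.
    """
    n, m = len(target), len(query)
    # score[i][j] = best score for aligning target[:i] with query[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = match if target[i - 1] == query[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + step,  # match or mismatch
                score[i - 1][j] + gap,       # gap in query (deletion)
                score[i][j - 1] + gap,       # gap in target (insertion)
            )

    # Trace back from the bottom-right corner, preferring the diagonal.
    t_aln, q_aln = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = match if target[i - 1] == query[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + step:
                t_aln.append(target[i - 1])
                q_aln.append(query[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            t_aln.append(target[i - 1])
            q_aln.append("-")
            i -= 1
        else:
            t_aln.append("-")
            q_aln.append(query[j - 1])
            j -= 1
    return "".join(reversed(t_aln)), "".join(reversed(q_aln))


t_aln, q_aln = global_align("AAAA", "AAAAA")
print(t_aln)
print(q_aln)
```

The two returned strings have the same shape as the optional alignment argument shown in the manual examples: equal length, with '-' marking gaps. Where several alignments score equally, gap placement depends on the tie-breaking rule, so a real aligner may place the gap elsewhere.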

Basic Alignment with Manual Alignment Strings

from pyigv import Alignment

# Perfect match
target = "AAAA"
query = "AAAA"
alignment = ["AAAA", "AAAA"]
aln = Alignment(target, query, alignment)
print(f"Mutations: {aln.mutation_ct}")  # Output: 0

Alignment with Mismatch

from pyigv import Alignment

# Single mismatch at the last base (0-based index 3)
target = "AAAA"
query = "AAAT"
alignment = ["AAAA", "AAAT"]
aln = Alignment(target, query, alignment)
print(f"Mutations: {aln.mutation_ct}")  # Output: 1

Alignment with Insertion

from pyigv import Alignment

# Insertion in query (one extra base relative to the target)
target = "AAAA"
query = "AAAAA"
alignment = ["AAAA-", "AAAAA"]  # '-' indicates gap in target
aln = Alignment(target, query, alignment)
print(f"Insertions: {aln.insertion_ct}")  # Output: 1

Alignment with Deletion

from pyigv import Alignment

# Deletion in query (one base missing relative to the target)
target = "AAAAA"
query = "AAAA"
alignment = ["AAAAA", "AAAA-"]  # '-' indicates gap in query
aln = Alignment(target, query, alignment)
print(f"Deletions: {aln.deletion_ct}")  # Output: 1

Plotting Multiple Alignments

from pyigv import Alignment, plot_alignments
import matplotlib.pyplot as plt

target = "AAACCCGGGTTTATATATAT"

# Create multiple query sequences
queries = [
    "AAACCCGGGTTTATATATAT",  # Perfect match
    "AAAGCCGGGTTTATATATAT",  # One mismatch
    "AAACCCGGGTTTTATATAT",   # One deletion
    "AAACCCGGGTTTATATATATAT", # One insertion (2 extra bases)
    "AAATTTGGGAAACCCCCCCC",  # Multiple changes
]

# Auto-align all queries against the target
alignments = [Alignment(target, query) for query in queries]

# Plot and display
plot_alignments(alignments, title="Multiple Query Comparison")
plt.show()

Saving to PDF

from pyigv import Alignment, plot_alignments
from matplotlib.backends.backend_pdf import PdfPages

# Create your alignments
target = "AAATAAA"
queries = ["AAAGAAA", "AAACAAA", "AAAAAAA"]
alignments = [Alignment(target, q) for q in queries]

# Save to PDF
with PdfPages("alignment_output.pdf") as pdf:
    plot_alignments(alignments, title="My Alignments", pdf=pdf)

Truncated View (Default)

By default, PyIGV uses a truncated view that keeps the display aligned to the reference sequence. In this mode, each insertion is drawn as a purple box containing a number that gives the insertion length:

plot_alignments(
    alignments,
    title="Truncated View"
    # truncate=True is the default
)

To show full alignments including all insertions, set truncate=False:

plot_alignments(
    alignments,
    title="Full View",
    truncate=False  # Show all insertions in full
)
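
One way to picture truncation: every column where the aligned target has a gap is dropped from the row, and each run of dropped columns is recorded as a position and a length. The sketch below illustrates that idea in plain Python. The helper name `collapse_insertions` and its return format are hypothetical and are not taken from PyIGV's source.

```python
def collapse_insertions(target_aln, query_aln):
    """Return (truncated_query, insertions) for two gapped strings.

    insertions is a list of (ref_pos, length), where ref_pos is the number
    of reference bases that precede the inserted run. Hypothetical helper.
    """
    truncated = []
    insertions = []
    ref_pos = 0  # reference bases consumed so far
    run = 0      # length of the insertion run currently being read
    for t, q in zip(target_aln, query_aln):
        if t == "-":
            # Column exists only in the query: part of an insertion.
            run += 1
            continue
        if run:
            insertions.append((ref_pos, run))
            run = 0
        truncated.append(q)
        ref_pos += 1
    if run:
        # Insertion at the very end of the alignment.
        insertions.append((ref_pos, run))
    return "".join(truncated), insertions


print(collapse_insertions("AA--AA", "AATTAA"))  # ('AAAA', [(2, 2)])
```

The truncated row always has the same length as the ungapped reference, which is what lets every query line up column by column under the reference row.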

API Reference

Alignment Class

Constructor

Alignment(target: str, query: str, alignment: Optional[Sequence[str]] = None)

Parameters:

  • target: The target (reference) sequence
  • query: The query sequence
  • alignment (optional): A list/tuple of two strings representing the aligned sequences with gaps marked as '-'. If not provided, uses Biopython's PairwiseAligner to automatically align the sequences.
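
When supplying the alignment argument by hand, it helps to check that the two gapped strings are consistent with target and query before constructing the object. The following is a small sanity check in plain Python. `check_alignment` is a hypothetical helper, and whether PyIGV performs similar validation internally is not stated here.

```python
def check_alignment(target, query, alignment):
    """Raise ValueError if the gapped strings do not match target/query."""
    if len(alignment) != 2:
        raise ValueError("alignment must contain exactly two strings")
    t_aln, q_aln = alignment
    if len(t_aln) != len(q_aln):
        raise ValueError("aligned strings must have equal length")
    if t_aln.replace("-", "") != target:
        raise ValueError("aligned target does not match target")
    if q_aln.replace("-", "") != query:
        raise ValueError("aligned query does not match query")
    if any(t == "-" and q == "-" for t, q in zip(t_aln, q_aln)):
        raise ValueError("a column cannot be a gap in both sequences")


# Passes silently: one insertion in the query.
check_alignment("AAAA", "AAAAA", ["AAAA-", "AAAAA"])
```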

Attributes

  • target: Target sequence (without gaps)
  • query: Query sequence (without gaps)
  • target_alignment: Aligned target sequence with gaps
  • query_alignment: Aligned query sequence with gaps
  • symbols: Processed alignment symbols
  • edits: Edit operations (I=insertion, D=deletion, M=mismatch, space=match)
  • insertion_ct: Number of insertions
  • deletion_ct: Number of deletions
  • mutation_ct: Number of mismatches/substitutions
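
The edits encoding described above can be illustrated with a column-by-column comparison of the two gapped strings. This is a sketch, not PyIGV's code: the function name `edit_string` is hypothetical, and whether the library counts insertions and deletions per base or per contiguous run is an assumption left open here.

```python
def edit_string(target_aln, query_aln):
    """Build one edit symbol per alignment column.

    'I' = insertion (gap in target), 'D' = deletion (gap in query),
    'M' = mismatch, ' ' = match. Hypothetical helper for illustration.
    """
    if len(target_aln) != len(query_aln):
        raise ValueError("aligned strings must have equal length")
    ops = []
    for t, q in zip(target_aln, query_aln):
        if t == "-":
            ops.append("I")
        elif q == "-":
            ops.append("D")
        elif t != q:
            ops.append("M")
        else:
            ops.append(" ")
    return "".join(ops)


edits = edit_string("AAAT-AA", "AAAGCA-")
print(repr(edits))       # '   MI D'
print(edits.count("M"))  # 1 (per-column count)
```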

Methods

  • get_color_row(truncate: bool = False): Get color codes for visualization
  • get_symbols(truncate: bool = False): Get alignment symbols
  • get_insertion_indices(): Get positions and lengths of insertions
  • __lt__(other): Compare alignments by number of edits (for sorting)
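
Defining `__lt__` is enough for Python's `sorted()` and `list.sort()` to order objects. The stand-in class below shows the mechanism with a total edit count. `EditSummary` is a hypothetical class, and summing the three counters is an assumption about what "number of edits" means.

```python
from dataclasses import dataclass


@dataclass
class EditSummary:
    """Stand-in for an alignment's edit counters (illustrative only)."""
    name: str
    insertion_ct: int = 0
    deletion_ct: int = 0
    mutation_ct: int = 0

    @property
    def total_edits(self):
        return self.insertion_ct + self.deletion_ct + self.mutation_ct

    def __lt__(self, other):
        # Fewer edits sorts first, so the best matches come out on top.
        return self.total_edits < other.total_edits


rows = [
    EditSummary("many", insertion_ct=1, deletion_ct=2, mutation_ct=3),
    EditSummary("perfect"),
    EditSummary("one", mutation_ct=1),
]
print([r.name for r in sorted(rows)])  # ['perfect', 'one', 'many']
```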

plot_alignments Function

plot_alignments(
    alignments,
    title: Optional[str] = None,
    pdf: Optional[PdfPages] = None,
    truncate: bool = True,
    return_fig: bool = False
) -> Optional[plt.Figure]

Parameters:

  • alignments: List of Alignment objects to visualize
  • title (optional): Title for the plot. If not provided, defaults to "Alignments"
  • pdf (optional): PdfPages object for saving to PDF
  • truncate (optional): If True (default), removes insertions from display and shows them as numbered purple boxes. Set to False to show full alignments.
  • return_fig (optional): If True, returns the Figure object instead of None

Returns:

  • matplotlib Figure object if return_fig=True, otherwise None

Color Scheme

  • Green (A): Adenine mismatches or insertions
  • Red (T): Thymine mismatches or insertions
  • Gold (G): Guanine mismatches or insertions
  • Blue (C): Cytosine mismatches or insertions
  • Gray: Matches
  • White: Deletions
  • Purple boxes (truncate mode): Insertion indicators with length
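
The scheme above amounts to a lookup from an edit type and a base to a color. A minimal sketch of that lookup follows. The names `BASE_COLORS` and `cell_color` are hypothetical, the color strings are standard matplotlib named colors rather than PyIGV's exact values, and the gray fallback for bases other than A, T, G, C is an assumption.

```python
# Base-specific colors for mismatches and insertions.
BASE_COLORS = {"A": "green", "T": "red", "G": "gold", "C": "blue"}


def cell_color(edit, base):
    """Pick a color for one alignment cell.

    edit: ' ' (match), 'M' (mismatch), 'I' (insertion), or 'D' (deletion).
    base: the query base shown in that cell.
    """
    if edit == " ":
        return "gray"   # matches
    if edit == "D":
        return "white"  # deletions
    # Mismatch or insertion: color by base; unknown bases fall back to gray
    # (assumed behavior, e.g. for 'N').
    return BASE_COLORS.get(base.upper(), "gray")


print(cell_color("M", "T"))  # red
print(cell_color("I", "g"))  # gold
```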

Example Output

Here's what a typical PyIGV visualization looks like:

from pyigv import Alignment, plot_alignments

target = "AAACCCGGGTTTATATATAT"
queries = [
    "AAACCCGGGTTTATATATAT",  # Perfect match
    "AAAGCCGGGTTTATATATAT",  # One mismatch
    "AAACCCGGGTTTTATATAT",   # One deletion
    "AAACCCGGGTTTATATATATAT", # One insertion (2 extra bases)
    "AAATTTGGGAAACCCCCCCC",  # Multiple changes
]

alignments = [Alignment(target, q) for q in queries]
plot_alignments(alignments, title="Example Alignment Visualization")

Normal View

The output shows:

  • The reference sequence in the top row
  • Each query alignment in subsequent rows
  • Color-coded differences (mismatches, insertions, deletions)
  • Sorted by alignment quality (best matches first)

Truncated View (Default)

By default, sequences with insertions are shown in a truncated view, with each insertion collapsed to a purple box labeled with its length:

# Default behavior (truncate=True)
plot_alignments(alignments, title="Example Truncated View")

Development

Running Tests

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/

# Run specific test
pytest tests/test_alignment.py::test_plot_alignments_with_multiple_queries -v -s

Code Quality

# Format code with Black
black src/ tests/

# Lint code
flake8 src/ tests/

Requirements

  • Python 3.7+
  • numpy >= 1.19.0
  • matplotlib >= 3.3.0
  • biopython >= 1.86

License

MIT License - see LICENSE file for details

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Citation

If you use PyIGV in your research, please cite:

PyIGV: Python alignment viewer library
https://github.com/regev-lab/PyIGV

Support

For issues, questions, or contributions, please visit:

Changelog

v0.1.0

  • Initial release
  • Color-coded alignment visualization
  • Automatic alignment using Biopython
  • PDF export support
  • Truncated view mode for insertions
  • Comprehensive test suite
