MotifFinding – Mini-Project (Bioinformatics Practice)

This mini-project implements a simple motif search tool in Python, as part of my bioinformatics learning roadmap.
The goal is to practice working with DNA strings, file input/output, and clean, readable code that could fit into a basic bioinformatics pipeline.

Duration: 1 full day (11/16/2025)

1. Project structure

MotifFinding/
    data/
        sequence.txt        # Input DNA sequence
        motif.txt           # Input motif/pattern to search for
    outputs/
        motif_positions.txt # Output: 1-based positions of the motif in the sequence
    src_perso/
        motif_finding.py    # Main script with functions and examples
    README.md

2. Core idea

Given:

a DNA sequence (for example: GATATATGCATATACTT)
a motif/pattern (for example: ATAT)

the script finds every 1-based starting position where the motif appears in the sequence, including overlapping occurrences, and writes the positions to a text file.

Example output:

2 4 10

This corresponds to the classical “Finding a Motif in DNA” exercise (Rosalind problem SUBS), a common starting point in introductory bioinformatics training. Note that the matches at positions 2 and 4 overlap.
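
The sliding-window idea can be sketched as follows. This is a minimal illustration and not necessarily the code in src_perso/motif_finding.py: the function name and signature follow this README, while the whitespace handling and the empty-motif guard are assumptions. typing.List is used so the sketch also runs on Python 3.8.

```python
from typing import List


def find_motif(sequence: str, pattern: str) -> List[int]:
    """Return every 1-based start position of pattern in sequence (overlaps included)."""
    # Normalise both inputs: drop all whitespace, compare in uppercase.
    seq = "".join(sequence.split()).upper()
    pat = "".join(pattern.split()).upper()
    if not pat:
        return []  # assumption: an empty motif yields no positions

    positions = []
    # Slide a window of len(pat) across seq, one base at a time.
    for i in range(len(seq) - len(pat) + 1):
        if seq[i:i + len(pat)] == pat:
            positions.append(i + 1)  # convert the 0-based index to 1-based
    return positions


print(*find_motif("GATATATGCATATACTT", "ATAT"))  # prints: 2 4 10
```

Advancing the window by a single base, rather than jumping past each match, is what lets the overlapping hits at positions 2 and 4 both be reported.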

3. Implementation details

All the core logic lives in src_perso/motif_finding.py and is organized into three main functions:

find_motif(sequence: str, pattern: str) -> list[int]
- Cleans the inputs (removes spaces/newlines, converts to uppercase)
- Slides a window across the sequence
- Returns all 1-based positions where the motif matches
load_from_files(seq_path: str, motif_path: str) -> tuple[str, str]
- Reads a DNA sequence and a motif from two text files
- Returns both as raw strings
save_positions(positions: list[int], out_path: str)
- Saves the list of positions into a text file
- If there are no matches, writes: No occurrences found.
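
The two file helpers described above might look roughly like this. It is a sketch under stated assumptions, not the repository's actual code: the space-separated output format is taken from the example output in this README, while the UTF-8 encoding, the trailing newline, and the automatic creation of the output folder are my own choices.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def load_from_files(seq_path: str, motif_path: str) -> Tuple[str, str]:
    """Read the sequence and the motif as raw strings (cleaning happens later)."""
    sequence = Path(seq_path).read_text(encoding="utf-8")
    motif = Path(motif_path).read_text(encoding="utf-8")
    return sequence, motif


def save_positions(positions: List[int], out_path: str) -> None:
    """Write the positions on one space-separated line, or a fallback message."""
    out = Path(out_path)
    out.parent.mkdir(parents=True, exist_ok=True)  # assumption: create the folder if missing
    if positions:
        text = " ".join(str(p) for p in positions)
    else:
        text = "No occurrences found."
    out.write_text(text + "\n", encoding="utf-8")  # assumption: trailing newline


# Round trip in a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    out_file = Path(tmp) / "outputs" / "motif_positions.txt"
    save_positions([2, 4, 10], str(out_file))
    print(out_file.read_text(encoding="utf-8").strip())  # prints: 2 4 10
```

Returning raw strings from load_from_files keeps the cleaning in one place (find_motif), so trailing newlines in the input files do not need special handling at read time.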

The if __name__ == "__main__": block contains:

Example 1: simple in-memory example with hard-coded strings
Example 2: realistic use case using data/ → outputs/

4. How to run the project

From the repository root, move into the src_perso directory and run the script:

cd src_perso
python motif_finding.py

You should see in the terminal:

Example 1: direct test with sequence and motif
Example 2: results loaded from the files in data/

The positions will be saved automatically in:

../outputs/motif_positions.txt
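
Because this output path is relative, it depends on the directory the script is launched from. One possible way to make the paths independent of the working directory is to derive them from the script's own location. The helper below is hypothetical (project_paths is not part of the repository) and only assumes the folder layout shown in section 1.

```python
from pathlib import Path
from typing import Dict


def project_paths(script_file: str) -> Dict[str, Path]:
    """Hypothetical helper: build data/ and outputs/ paths from the script location."""
    # src_perso/motif_finding.py -> src_perso/ -> project root
    root = Path(script_file).resolve().parent.parent
    return {
        "sequence": root / "data" / "sequence.txt",
        "motif": root / "data" / "motif.txt",
        "output": root / "outputs" / "motif_positions.txt",
    }


# Inside the real script one would pass __file__; a literal path is used here
# so the sketch runs on its own.
paths = project_paths("MotifFinding/src_perso/motif_finding.py")
project_root = Path("MotifFinding").resolve()
print(paths["output"].relative_to(project_root).as_posix())  # prints: outputs/motif_positions.txt
```

With paths built this way, running python src_perso/motif_finding.py from the repository root would read and write the same files as running it from inside src_perso.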

5. Example input files

data/sequence.txt

GATATATGCATATACTT

data/motif.txt

ATAT

This produces the following output:

2 4 10

6. Requirements

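Python 3.8+ (note: the built-in generic annotations list[int] and tuple[str, str] shown in section 3 are only evaluated without error on Python 3.9+, unless the script uses from __future__ import annotations)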
No external libraries required

7. Learning notes

This mini-project is part of a bioinformatics learning track where I practice:

DNA string manipulation in Python
Reading and writing text files
Building clean and well-documented functions
Organizing mini-projects for GitHub (folders, scripts, README)

Possible extensions include:

FASTA file handling
Using Biopython
Searching motifs in larger genomes
Scanning for multiple motifs
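
As a rough idea of what two of these extensions could look like without external libraries, here is a sketch of a minimal FASTA reader combined with a multi-motif scan. All three function names (read_fasta, find_all, scan_motifs) are hypothetical and not part of the current script. The reader does no validation, and Biopython's parsers would be the more robust choice for real data.

```python
from typing import Dict, List


def read_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into {header: sequence} (minimal, no validation)."""
    records: Dict[str, str] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = ""
        elif header is not None:
            records[header] += line.upper()  # join wrapped sequence lines
    return records


def find_all(sequence: str, motif: str) -> List[int]:
    """1-based start positions of motif, overlaps included, using str.find."""
    if not motif:
        return []
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(motif, start + 1)  # step by one to keep overlaps
    return positions


def scan_motifs(sequence: str, motifs: List[str]) -> Dict[str, List[int]]:
    """Map each motif to the list of positions where it occurs."""
    return {motif: find_all(sequence, motif) for motif in motifs}


fasta_text = ">demo\nGATATATG\nCATATACTT\n"
demo = read_fasta(fasta_text)["demo"]
print(scan_motifs(demo, ["ATAT", "GCA", "CCC"]))
# prints: {'ATAT': [2, 4, 10], 'GCA': [8], 'CCC': []}
```

For larger genomes, str.find avoids building a slice at every position, although each motif still requires its own pass over the sequence.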

yasmina-bioinfo/MotifFinding
