Releases: shenwei356/seqkit

SeqKit v2.4.0

17 Mar 09:05

shenwei356

Changes

SeqKit v2.4.0 - 2023-03-17
- seqkit:
  - support bzip2 format. #361
  - support setting compression level for gzip, zstd, and bzip2 format via --compress-level. #320
  - the global flag --infile-list accepts stdin (-) now.
  - wrap the help message of flags.
- seqkit locate:
  - do not remove embedded regions when searching with regular expressions. #368
- seqkit amplicon:
  - fix BED coordinates for amplicons found in the minus strand. #367
- seqkit split:
  - fix forgetting to add extension for --two-pass. #332
- seqkit stats:
  - fix computing Q1 and Q3 of sequence length when there is only one record. #353
- seqkit grep:
  - fix count number (-C) for matching with mismatch (-m > 0). #370
- seqkit replace:
  - add flags for selecting a subset of records to edit; these flags are ported from seqkit grep. #348
- seqkit faidx:
  - allow empty lines at the end of sequences.
- seqkit faidx/sort/shuffle/split/subseq:
  - new flag -U/--update-faidx: update the FASTA index file if it exists, to guarantee the index file matches the FASTA files. #364
  - improve log info and update help message. #365
- seqkit seq:
  - allow filtering sequences of length zero. thanks to @penglbio.
- seqkit rename:
  - new flag -s/--separator for setting separator between original ID/name and the counter (default "_"). #360
  - new flag -N/--start-num for setting starting count number for duplicated IDs/names (default 2). #360
  - new flag -1/--rename-1st-rec for renaming the first record as well. #360
  - do not append a space if there's no description after the sequence ID.
- seqkit sliding:
  - new flag -S/--suffix for changing the suffix added to the sequence ID (default: "_sliding").
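
The new seqkit rename flags can be pictured with a small Python sketch. It is not SeqKit's Go implementation: the function name `rename_duplicates` and its parameters are hypothetical, and it assumes that the first occurrence of an ID is left unchanged while later occurrences get the separator plus a counter that begins at the start number. The -1/--rename-1st-rec flag is not modelled here.

```python
from collections import defaultdict


def rename_duplicates(ids, separator="_", start_num=2):
    """Append separator + counter to repeated IDs (hypothetical helper).

    Assumption: the first occurrence keeps its ID, the second gets
    start_num, the third start_num + 1, and so on.
    """
    seen = defaultdict(int)  # how many times each ID has appeared so far
    renamed = []
    for seq_id in ids:
        seen[seq_id] += 1
        occurrence = seen[seq_id]
        if occurrence == 1:
            renamed.append(seq_id)
        else:
            counter = start_num + occurrence - 2
            renamed.append(f"{seq_id}{separator}{counter}")
    return renamed
```

With the defaults, the IDs a, b, a, a would become a, b, a_2, a_3 under these assumptions; passing `separator="|"` and `start_num=1` changes only the appended part.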

Contributors

penglbio

SeqKit v2.3.1

22 Sep 09:25

shenwei356

Changes

SeqKit v2.3.1 - 2022-09-22
- seqkit grep/locate: fix bug of FMIndex building for empty sequences. #321
- seqkit split2: fix bug of splitting two FASTA files. #325
- seqkit faidx: --id-regexp works now.

SeqKit v2.3.0

12 Aug 15:19

shenwei356

Changes

SeqKit v2.3.0 - 2022-08-12
- seqkit grep/rename:
  - reduce memory consumption for a large number of search patterns, and it's faster. #305
  - 2X faster -s/--by-seq.
- seqkit split:
  - fix outputting an empty file when the number of sequences equals the split size. #293
  - add options to set output file prefix and extension. #296
- seqkit split2:
  - reduce memory consumption. #304
  - add options to set output file prefix.
- seqkit stats:
  - add GC content. #294
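
As a rough illustration of the GC content statistic added to seqkit stats, here is a minimal Python sketch. The function name `gc_content` is hypothetical, and the handling of ambiguous bases, gaps, and empty input is an assumption; the release notes do not say how SeqKit treats them.

```python
def gc_content(seq):
    """Return the percentage of G and C bases in seq (hypothetical helper).

    Assumption: every character counts toward the length, and an empty
    sequence yields 0.0 rather than an error.
    """
    if not seq:
        return 0.0
    gc = sum(1 for base in seq.upper() if base in "GC")
    return 100.0 * gc / len(seq)
```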

SeqKit v2.2.0

14 Mar 11:42

shenwei356

Changes

SeqKit v2.2.0 - 2022-03-14
- seqkit:
  - add support of xz and zstd input/output formats. #274
  - fix panic when reading records with header of ID + blanks.
- new command seqkit sum: computing message digest for all sequences in FASTA/Q files.
  The idea comes from @photocyte and the format borrows from seqhash #262
- new command seqkit fa2fq: retrieving corresponding FASTQ records by a FASTA file
- seqkit split2:
  - new flag -e/--extension for forcing compression or changing compression format. #276
  - support changing output prefix via -o/--out-file. #275
- seqkit concat:
  - fix handling of multiple seqs with the same ID in one file. #269
  - performing outer/full join. #270
  - preserve the comments. #271
- seqkit locate:
  - parallelizing -F/--use-fmi and -m for large number of search patterns.
- seqkit amplicon:
  - new flag -M/--output-mismatches to append the total mismatches and mismatches of 5' end and 3' end. #286
- seqkit grep:
  - detect FASTA/Q symbol @ and > in the searching patterns and show warnings.
  - add new flag -C/--count, like grep -c in GNU grep. #267
- seqkit range:
  - support removing leading 100 seqs (seqkit range -r 101:-1 == tail -n +101). #279
- seqkit subseq:
  - report error when no options were given.
- update doc:
  - seqkit head: add doc for "seqkit tail": seqkit range -N:-1 seqs.fasta. #272
  - seqkit rmdup: add the note of only the first record being saved for duplicates. #265
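
The range expressions mentioned above (-r 101:-1 and -N:-1) use 1-based, inclusive positions where negative numbers count from the end. A minimal Python sketch of that indexing follows; `select_range` is a hypothetical helper written for illustration, not SeqKit code, and the clamping of out-of-range values is an assumption.

```python
def select_range(records, start, end):
    """Select records[start..end], 1-based and inclusive (hypothetical helper).

    Negative positions count from the end, so -1 is the last record.
    Assumption: positions before the first record are clamped to 1.
    """
    n = len(records)
    if start < 0:
        start = n + start + 1
    if end < 0:
        end = n + end + 1
    start = max(start, 1)
    return records[start - 1:end]
```

Under these assumptions, `select_range(records, 101, -1)` drops the leading 100 records, like `tail -n +101`, and `select_range(records, -3, -1)` keeps the last three.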

Contributors

photocyte

SeqKit v2.1.0

15 Nov 11:37

shenwei356

Changelog

SeqKit v2.1.0 - 2021-11-15
- seqkit seq:
  - fix filtering by average quality -Q/-R. #257
- seqkit convert:
  - fix quality encoding checking, change default value of -N/--thresh-B-in-n-most-common from 4 to 2.
    #254 and #239
- seqkit split:
  - fix writing an extra empty file when using --two-pass. #244
- seqkit subseq:
  - fix --bed, which failed to recognize strand ".".
- seqkit fq2fa:
  - faster, and do not wrap sequences.
- seqkit grep/locate/mutate:
  - detect unquoted comma and show warning message, e.g., -p 'A{2,}'. #250
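
The comma warning makes sense if the pattern flag is split on commas with CSV-style quoting, which is an assumption here rather than something the release notes state. The sketch below uses Python's `csv` module as a stand-in to show why an unquoted 'A{2,}' would fall apart and why wrapping it in double quotes keeps it whole; `split_patterns` is a hypothetical name.

```python
import csv


def split_patterns(flag_value):
    """Split a flag value on commas, honouring double quotes.

    This mimics CSV-style parsing of a multi-value flag; it is an
    illustration, not SeqKit's actual argument parser.
    """
    return next(csv.reader([flag_value]))


# An unquoted comma splits the regular expression into two broken pieces.
unquoted = split_patterns("A{2,}")
# Double quotes inside the shell's single quotes keep it as one pattern.
quoted = split_patterns('"A{2,}"')
```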

SeqKit v2.0.0

28 Aug 09:00

shenwei356

Changelogs

SeqKit v2.0.0 - 2021-08-27
- Performance improvements
  - seqkit:
    - faster FASTA/Q reading and writing, especially on FASTQ, see the benchmark.
      - reading (plain text): 4X faster. seqkit stats dataset_C.fq
      - reading (gzip files): 45% faster. seqkit stats dataset_C.fq.gz
      - reading + writing (plain text): 3.5X faster. seqkit grep -p . -v dataset_C.fq -o t
      - reading + writing (gzip files): 2.2X faster. seqkit grep -p . -v dataset_C.fq.gz -o t.gz
    - change default value of -j/--threads from 2 to 4, which is faster for writing gzip files.
  - seqkit seq:
    - fix writing speed, which was slowed down in v0.12.1.
- Breaking changes
  - seqkit grep/rmdup/common:
    - consider reverse complement sequence by default for comparing by sequence, add flag -P/--only-positive-strand. #215
  - seqkit rename:
    - rename ID only, do not append original header to new ID. #236
  - seqkit fx2tab:
    - for -s/--seq-hash: outputting MD5 instead of hash value (integers) of xxhash. #219
- Bugfixes
  - seqkit seq:
    - fix failing to output gzipped format for file name with extension of .gz since v0.12.1.
  - seqkit tab2fx:
    - fix bug for very long sequences. #214
  - seqkit fish:
    - fix range check. #213
  - seqkit grep:
    - not exactly a bug: multiple threads were not used for -m > 0.
- New features/enhancements
  - seqkit grep:
    - allow empty pattern files.
  - seqkit faidx:
    - support region with begin > end, i.e., returning reverse complement sequence.
    - add new flag -l/--region-file: file containing a list of regions.
  - seqkit fx2tab:
    - new flag -Q/--no-qual for disabling outputting quality even for FASTQ file. #221
  - seqkit amplicon:
    - new flag -u/--save-unmatched for saving records that do not match any primer.
  - seqkit sort:
    - new flag -b/--by-bases for sorting by non-gap bases, for multiple sequence alignment files. #216
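
The seqkit faidx entry on regions with begin > end can be sketched in Python as follows. The helper names are hypothetical, coordinates are assumed to be 1-based and inclusive, and only unambiguous DNA letters are complemented; this is an illustration of the idea, not SeqKit's implementation.

```python
# Translation table for unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def fetch_region(seq, begin, end):
    """Fetch seq[begin..end], 1-based and inclusive (hypothetical helper).

    When begin > end, the same span is returned as its reverse complement.
    """
    if begin <= end:
        return seq[begin - 1:end]
    return reverse_complement(seq[end - 1:begin])
```

For the sequence AAACCCGGGT, the region 2 to 5 gives AACC, while 5 to 2 gives the reverse complement GGTT.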

SeqKit v0.16.1

20 May 00:40

shenwei356

Changelog

SeqKit v0.16.1
- seqkit shuffle --two-pass: fix bug introduced in #173. #209
- seqkit pair: fix a dangerous bug: when input files are not in current directory, input files were overwritten.

SeqKit v0.16.0

16 Apr 05:41

shenwei356

Changes

SeqKit v0.16.0
- new command seqkit head-genome:
  - print sequences of the first genome with common prefixes in name
- seqkit grep/locate/amplicon -m:
  - much faster (300-400x) searching with mismatch allowed by optimizing FM-indexing and parallelization.
  - new flag -I/--immediate-output.
- seqkit grep/locate:
  - fix bug of -m when the query contains letters not in the alphabet, usually for protein sequences. #178, #179
  - only search on positive strand when searching unlimited or protein sequences.
- seqkit locate:
  - removing debug info for -r introduced in a0f6b6e. #180
- seqkit amplicon:
  - fix bug of -m, when mismatch is allowed.
- seqkit fx2tab:
  - new flag -C/--base-count for counting bases. #183
- seqkit tab2fx:
  - fix a rare bug. #197
- seqkit subseq:
  - fix bug for BED with empty columns. #195
- seqkit genautocomplete:
  - support bash|zsh|fish|powershell.

SeqKit v0.15.0

12 Jan 14:39

shenwei356

Changes

SeqKit v0.15.0
- seqkit grep/locate: update help message.
- seqkit grep: search on both strands when searching by sequence.
- seqkit split2: fix redundant log when using -s.
- seqkit bam: new field RightSoftClipSeq. #172
- seqkit sample -2: remove extra \n. #173
- seqkit split2 -l: fix bug for splitting by accumulative length. This bug occurs when the first record is longer than -l; no sequences are lost.

SeqKit v0.14.0

30 Oct 01:17

shenwei356

Changes

SeqKit v0.14.0
- new command seqkit pair: match up paired-end reads from two fastq files, faster than fastq-pair.
- seqkit translate: new flag -F/--append-frame for optionally adding frame info to ID. #159
- seqkit stats: reduce memory usage when using -a for calculating N50. #153
- seqkit mutate: fix inserting sequence with -i/--insertion;
  this bug occurs in some cases when the insert site is big. Don't worry if no error was reported.
- seqkit replace:
  - new flag -U/--keep-untouched: do not change anything when no value found for the key (only for sequence name).
  - do not support editing FASTQ sequence.
- seqkit grep/locate: new flag --circular for supporting circular genome. #158
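
To show what circular-genome support means for a search, the sketch below finds matches that span the origin by temporarily appending the first bases of the sequence to its end. This is one common way to handle circularity and is only an assumption about the approach; the name `find_circular` is hypothetical, and the search is a plain exact scan rather than anything SeqKit uses internally.

```python
def find_circular(seq, pattern):
    """Return 1-based start positions of pattern in a circular sequence.

    Hypothetical helper: matches may wrap around the end of seq.
    Patterns that are empty or longer than seq return no hits.
    """
    m = len(pattern)
    if m == 0 or m > len(seq):
        return []
    # Extend the sequence so windows starting near the end can wrap around.
    extended = seq + seq[:m - 1]
    return [i + 1 for i in range(len(seq)) if extended[i:i + m] == pattern]
```

For the sequence AACCGGTT, the pattern TTAA has no linear match, but in the circular case it is found starting at position 7.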
