Corrected usage in cdhit-utility.c++ #51
Open
ucpete wants to merge 1 commit into
The usage presented by the tools in the CD-HIT suite that take nucleotide sequences as input is incorrect. For example, when one calls cd-hit-est (v4.7), there are references to amino acids:
<pre>
   -c	sequence identity threshold, default 0.9
 	this is the default cd-hit's "global sequence identity" calculated as:
 	number of identical <b>amino acids</b> in alignment
 	divided by the full length of the shorter sequence
</pre>
There are several small inconsistencies throughout the usage; I have fixed them throughout the cdhit-utility.c++ file.
With the fixes to the cdhit-utility.c++ file, calling a nucleotide sequence tool shows 'nucleotide' or 'nt', and calling a protein sequence tool shows 'amino acid' or 'aa'.

These changes are all cosmetic, but as a user of the tool, I have been confused in the past about which tool I was using, or wanted to use, after reading the usage.
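For illustration only, here is a minimal sketch of how usage text could pick its wording from the kind of sequences a tool accepts. It is not the code in cdhit-utility.c++; `SeqType`, `residue_name`, and `identity_usage` are hypothetical names introduced for this example, and the option text is copied from the cd-hit-est (v4.7) output quoted in this description.

```cpp
#include <string>

// Hypothetical type, not taken from cdhit-utility.c++: the kind of
// sequences a given tool accepts as input.
enum class SeqType { Protein, Nucleotide };

// Returns the residue wording that matches the tool's input type.
std::string residue_name(SeqType type) {
    return type == SeqType::Nucleotide ? "nucleotides" : "amino acids";
}

// Builds the description of the -c option with tool-appropriate wording.
std::string identity_usage(SeqType type) {
    return "   -c\tsequence identity threshold, default 0.9\n"
           "\tthis is the default cd-hit's \"global sequence identity\" calculated as:\n"
           "\tnumber of identical " + residue_name(type) + " in alignment\n"
           "\tdivided by the full length of the shorter sequence\n";
}
```

Under these assumptions, a nucleotide tool would print `identity_usage(SeqType::Nucleotide)` and a protein tool would print `identity_usage(SeqType::Protein)`, so the two usages differ only in the residue wording.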