Hidden Markov Models
See also the Python, Java, C++, C, Swift, Js, Php, and C# repositories.
To check if you have a compatible version of Python installed, use the following command:
python -V
You can find the latest version of Python here.
Install the latest version of Git.
pip3 install NlpToolkit-Hmm-Cy
To work on the code, fork the repository from its GitHub page, then clone your fork with Git. On Ubuntu, use the line below:
git clone <your-fork-git-link>
A directory called Hmm will be created. Alternatively, clone the original repository with the line below to explore the code:
git clone https://github.com/starlangsoftware/Hmm-Py.git
Steps for opening the cloned project:
- Start IDE
- Select File | Open from main menu
- Choose the Hmm-PY file
- Select the open as project option
- After a couple of seconds, the dependencies will be downloaded.
To create an Hmm model:
Hmm(self, states: set, observations: list, emittedSymbols: list)
To obtain the most probable list of states with the Viterbi algorithm:
viterbi(self, s: list) -> list
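To illustrate what the Viterbi step computes, here is a minimal pure-Python sketch. It does not call the `Hmm` class above. The function name `viterbi_decode`, the dictionary-based probability tables, and the toy weather model are all assumptions made for this example.

```python
import math


def viterbi_decode(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence for the observations.

    Hypothetical helper for illustration only; it is not part of the library.
    Probabilities are combined in log space to avoid underflow.
    """
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}]
    # back[t][s]: predecessor of s on that best path
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log(emit_p[s][observations[t]])
            back[t][s] = prev
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Toy model with assumed values, not taken from the library.
states = ["Rainy", "Sunny"]
start_p = {"Rainy": 0.6, "Sunny": 0.4}
trans_p = {
    "Rainy": {"Rainy": 0.7, "Sunny": 0.3},
    "Sunny": {"Rainy": 0.4, "Sunny": 0.6},
}
emit_p = {
    "Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
    "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
}
decoded = viterbi_decode(["walk", "shop", "clean"], states, start_p, trans_p, emit_p)
```

For this toy model, `decoded` holds one state per observation, in the same order as the input.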
- Do not forget to set package list. All subfolders should be added to the package list.
packages=['Classification', 'Classification.Model', 'Classification.Model.DecisionTree',
'Classification.Model.Ensemble', 'Classification.Model.NeuralNetwork',
'Classification.Model.NonParametric', 'Classification.Model.Parametric',
'Classification.Filter', 'Classification.DataSet', 'Classification.Instance', 'Classification.Attribute',
'Classification.Parameter', 'Classification.Experiment',
'Classification.Performance', 'Classification.InstanceList', 'Classification.DistanceMetric',
'Classification.StatisticalTest', 'Classification.FeatureSelection'],
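Keeping a long package list in sync by hand is error-prone, so one option is to derive it from the folder tree. The sketch below uses only the standard library; `list_packages` is a hypothetical helper, and skipping folders that start with `.` or `_` is an assumption. setuptools also offers `find_packages()`, which only picks up folders containing an `__init__.py` file.

```python
import os


def list_packages(root):
    """Return dotted package names for root and every subfolder below it.

    Hypothetical helper: folders starting with '.' or '_' (such as
    __pycache__) are skipped.
    """
    parent = os.path.dirname(os.path.abspath(root))
    packages = []
    for current, dirs, _files in os.walk(root):
        # Prune hidden and cache folders in place so os.walk does not enter them.
        dirs[:] = sorted(d for d in dirs if not d.startswith((".", "_")))
        relative = os.path.relpath(os.path.abspath(current), parent)
        packages.append(relative.replace(os.sep, "."))
    return packages


# Possible use inside setup.py (not executed here):
# packages=list_packages("Classification")
```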
- Package name should be lowercase and may only include the _ character.
name='nlptoolkit_math',
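A quick check for the naming rule could look like the sketch below. The helper name `is_valid_package_name` is hypothetical, and allowing digits after the first letter is an assumption, since the rule above only mentions lowercase letters and underscores.

```python
import re

# Assumed reading of the rule: a lowercase letter first, then lowercase
# letters, digits, or underscores.
_NAME_PATTERN = re.compile(r"[a-z][a-z0-9_]*")


def is_valid_package_name(name):
    """Return True if name follows the lowercase-and-underscore rule."""
    return _NAME_PATTERN.fullmatch(name) is not None
```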
- Package data should be defined and must include pyx, pxd, c and py files.
package_data={'NGram': ['*.pxd', '*.pyx', '*.c', '*.py']},
- Setup should include ext_modules with compiler directives.
ext_modules=cythonize(["NGram/*.pyx"],
compiler_directives={'language_level': "3"}),
- Define the class variables and class methods in the pxd file.
cdef class DiscreteDistribution(dict):
    cdef float __sum
    cpdef addItem(self, str item)
    cpdef removeItem(self, str item)
    cpdef addDistribution(self, DiscreteDistribution distribution)
- For default values in class method declarations, use *.
cpdef list constructIdiomLiterals(self, FsmMorphologicalAnalyzer fsm, MorphologicalParse morphologicalParse1,
MetamorphicParse metaParse1, MorphologicalParse morphologicalParse2,
MetamorphicParse metaParse2, MorphologicalParse morphologicalParse3 = *,
MetamorphicParse metaParse3 = *)
- Define the class name as cdef, class methods as cpdef, and __init__ as def.
cdef class DiscreteDistribution(dict):
    def __init__(self, **kwargs):
        """
        A constructor of DiscreteDistribution class which calls its super class.
        """
        super().__init__(**kwargs)
        self.__sum = 0.0
    cpdef addItem(self, str item):
- Do not forget to comment each function.
cpdef addItem(self, str item):
    """
    The addItem method takes a String item as an input and if this map contains a mapping for the item it puts the
    item with given value + 1, else it puts item with value of 1.
    PARAMETERS
    ----------
    item : string
        String input.
    """
- Function names should follow camel case.
cpdef addItem(self, str item):
- Local variables should follow snake case.
det = 1.0
copy_of_matrix = copy.deepcopy(self)
- Variable types should be defined for function parameters and class variables.
cpdef double getValue(self, int rowNo, int colNo):
- Local variables should be defined with types.
cpdef sortDefinitions(self):
    cdef int i, j
    cdef str tmp
- For abstract methods, use ABC package and declare them with @abstractmethod.
@abstractmethod
def train(self, train_set: list[Tensor]):
    pass
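The effect of `@abstractmethod` can be seen in a small self-contained sketch. The class names `Model` and `MeanModel` are hypothetical and only `train` mirrors the snippet above; the element type is left as a plain `list` so the example needs no extra imports.

```python
from abc import ABC, abstractmethod


class Model(ABC):
    """Hypothetical base class; it cannot be instantiated directly."""

    @abstractmethod
    def train(self, train_set: list):
        pass


class MeanModel(Model):
    """Toy subclass that stores the mean of the training values."""

    def __init__(self):
        self.mean = 0.0

    def train(self, train_set: list):
        self.mean = sum(train_set) / len(train_set)


model = MeanModel()
model.train([1.0, 2.0, 3.0])
```

Calling `Model()` directly raises a `TypeError`, because `train` has no concrete implementation there.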
- For private methods, use __ as prefix in their names.
cpdef list __linearRegressionOnCountsOfCounts(self, list countsOfCounts)
- For private class variables, use __ as prefix in their names.
cdef class NGram:
    cdef int __N
    cdef double __lambda1, __lambda2
    cdef bint __interpolated
    cdef set __vocabulary
    cdef list __probability_of_unseen
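In a plain Python class, a `__` prefix triggers name mangling, which is what keeps such members out of casual reach. The sketch below shows that behaviour with a hypothetical `Counter` class; it makes no claim about how Cython stores `cdef` attributes.

```python
class Counter:
    """Hypothetical class showing name mangling for private members."""

    def __init__(self):
        self.__count = 0  # stored on the instance as _Counter__count

    def __increment(self):  # stored on the class as _Counter__increment
        self.__count += 1

    def add(self):
        # Inside the class body the short private names work as written.
        self.__increment()
        return self.__count


counter = Counter()
counter.add()
```

From outside the class, `counter.__count` raises an `AttributeError`; only the mangled name `counter._Counter__count` resolves.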
- Write toString-style methods as __repr__ class methods.
- Write getter and setter class methods.
cpdef int getN(self)
cpdef setN(self, int N)
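The `__repr__` and getter/setter rules can be sketched together in plain Python. The class below is a stand-in for the Cython `NGram` class, and the text returned by `__repr__` is an assumed format.

```python
class NGram:
    """Plain-Python stand-in for the Cython class; the body is illustrative."""

    def __init__(self, N: int):
        self.__N = N

    def getN(self) -> int:
        return self.__N

    def setN(self, N: int):
        self.__N = N

    def __repr__(self) -> str:
        # Plays the role a toString method has in the Java version.
        return f"NGram(N={self.__N})"


n_gram = NGram(2)
n_gram.setN(3)
```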
- If there are multiple constructors for a class, define them as constructor1, constructor2, ..., then from the original constructor call these methods.
cdef class NGram:
    cpdef constructor1(self, int N, list corpus):
    cpdef constructor2(self, str fileName):
    def __init__(self,
                 NorFileName,
                 corpus=None):
        if isinstance(NorFileName, int):
            self.constructor1(NorFileName, corpus)
        else:
            self.constructor2(NorFileName)
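A runnable plain-Python version of this dispatch pattern might look as follows. The attribute names and the file name `model.txt` are assumptions, and file parsing is omitted on purpose.

```python
class NGram:
    """Plain-Python stand-in; attribute names are assumptions."""

    def constructor1(self, N: int, corpus: list):
        # Build from an in-memory corpus.
        self.N = N
        self.corpus = corpus
        self.file_name = None

    def constructor2(self, fileName: str):
        # Remember which file to load from; actual parsing is omitted here.
        self.N = None
        self.corpus = None
        self.file_name = fileName

    def __init__(self, NorFileName, corpus=None):
        # Dispatch on the type of the first argument.
        if isinstance(NorFileName, int):
            self.constructor1(NorFileName, corpus)
        else:
            self.constructor2(NorFileName)


from_corpus = NGram(2, [["a", "b"], ["c"]])
from_file = NGram("model.txt")  # hypothetical file name
```

The type check keeps a single public entry point, at the cost of a less self-explanatory first parameter.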
- Extend test classes from unittest.TestCase and use separate unit test methods.
class NGramTest(unittest.TestCase):
    def test_GetCountSimple(self):
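A complete test class in this style could look like the sketch below. The function under test, `count_items`, and the test names are hypothetical; the suite is run through a loader so that the script does not exit when the tests finish.

```python
import unittest


def count_items(items):
    """Hypothetical function under test: count occurrences of each item."""
    counts = {}
    for item in items:
        counts[item] = counts.get(item, 0) + 1
    return counts


class CountItemsTest(unittest.TestCase):

    def test_SingleItem(self):
        self.assertEqual({"a": 1}, count_items(["a"]))

    def test_RepeatedItem(self):
        self.assertEqual({"a": 2, "b": 1}, count_items(["a", "b", "a"]))


# Load and run the two test methods explicitly.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CountItemsTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```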
- For undefined types use object as type in the type declarations.
cdef class WordNet:
    cdef object __syn_set_list
    cdef object __literal_list
- For boolean types use bint as type in the type declarations.
cdef bint is_done
- Enumerated types should be used when necessary as enum classes, and should be declared in py files.
class AttributeType(Enum):
    """
    Continuous Attribute
    """
    CONTINUOUS = auto()
- Resource files should be loaded with the pkg_resources package.
fileName = pkg_resources.resource_filename(__name__, 'data/turkish_wordnet.xml')

