Natural Language Processing programs and projects (implemented in Prolog)
Problem: Prolog project to evaluate the correctness of an English sentence using a bigram model
Approach: The project constructs a Prolog bigram language model from the small DA_Corpus.text corpus.
Steps taken (bigram_model.pl):
- The DA_Corpus.text corpus is normalized using Unix commands.
- A Prolog-readable unigram.pl and bigram.pl database is created from the normalized corpus.
- In the final step, bigram_model.pl is implemented. It computes the probability of a word sequence of any length via the predicate calc_prob/2, which works in log space and applies Laplace smoothing on the fly.
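The last step can be sketched as follows. This is a minimal illustration for SWI-Prolog, not the contents of bigram_model.pl: the unigram/2 and bigram/3 fact formats, the toy counts, the argument order of calc_prob/2, and the helper predicates are all assumptions, and sentence-boundary markers are ignored.

```prolog
% Hypothetical counts. The real unigram.pl and bigram.pl are generated
% from DA_Corpus.text and may use a different fact format.
unigram(the, 3).
unigram(book, 2).
unigram(fell, 2).
unigram(i, 1).
unigram(on, 1).

bigram(the, book, 2).
bigram(book, fell, 1).
bigram(i, fell, 1).
bigram(fell, on, 1).
bigram(on, the, 1).

% Vocabulary size V = number of distinct unigram entries.
vocab_size(V) :-
    aggregate_all(count, unigram(_, _), V).

% Count lookups that default to 0 for unseen words and word pairs.
unigram_count(W, C) :- ( unigram(W, C0) -> C = C0 ; C = 0 ).
bigram_count(W1, W2, C) :- ( bigram(W1, W2, C0) -> C = C0 ; C = 0 ).

% log P(W2 | W1) with Laplace (add-one) smoothing:
% (count(W1, W2) + 1) / (count(W1) + V).
log_bigram_prob(W1, W2, LogP) :-
    bigram_count(W1, W2, CB),
    unigram_count(W1, CU),
    vocab_size(V),
    LogP is log((CB + 1) / (CU + V)).

% calc_prob(+Words, -LogP): sum of the smoothed log bigram probabilities.
calc_prob([], 0.0).
calc_prob([_], 0.0).
calc_prob([W1, W2 | Rest], LogP) :-
    log_bigram_prob(W1, W2, P),
    calc_prob([W2 | Rest], RestP),
    LogP is P + RestP.
```

With these toy counts, the query `calc_prob([the, book, fell], A), calc_prob([i, fell, on, the, book], B)` gives `A > B`. Summing logs rather than multiplying probabilities avoids underflow on long sentences, and smoothing at lookup time means unseen bigrams never produce a zero probability.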
Sample outputs: A sentence like "the book fell" scores higher than "i fell on the book".
Similarly, a sentence like "the book that he wanted fell on my feet" scores higher than "book the that he wanted fell on my feet".
Problem: Prolog program to convert Roman numerals to decimal and vice versa, for numbers up to 20
Program: RomanDecimalConversion.pl
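One way such a conversion could look is sketched below for SWI-Prolog. The predicate names decimal_to_roman/2 and roman_to_decimal/2, the symbol table, and the choice to represent numerals as atoms are assumptions for illustration; RomanDecimalConversion.pl may be organized differently.

```prolog
% Symbol table, largest value first. These symbols are enough for 1..20.
roman_value('X', 10).
roman_value('IX', 9).
roman_value('V', 5).
roman_value('IV', 4).
roman_value('I', 1).

% decimal_to_roman(+N, -Roman): repeatedly take the largest symbol that fits.
decimal_to_roman(N, Roman) :-
    integer(N),
    between(1, 20, N),
    to_symbols(N, Symbols),
    atomic_list_concat(Symbols, Roman).

to_symbols(0, []) :- !.
to_symbols(N, [S | Rest]) :-
    roman_value(S, V),
    V =< N, !,
    N1 is N - V,
    to_symbols(N1, Rest).

% roman_to_decimal(+Roman, -N): search 1..20 for the number whose
% numeral equals Roman. Roman must be an atom such as 'XIV'.
roman_to_decimal(Roman, N) :-
    between(1, 20, N),
    decimal_to_roman(N, Roman), !.
```

For example, `decimal_to_roman(14, R)` binds `R = 'XIV'` and `roman_to_decimal('XIX', N)` binds `N = 19`. Because the range is only 1 to 20, the reverse direction can simply reuse the forward conversion instead of parsing the numeral.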
Problem: Identify all possible tags for a given sentence, with their correctness probabilities.
Approach: The project uses the Viterbi algorithm to compute the possible tag sequences for a given sentence, together with their probabilities (tagger.pl).
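The core Viterbi recurrence can be sketched as follows for SWI-Prolog. Everything here is an assumption for illustration: the tag set, the trans/3 and emit/3 probabilities, and the predicate names are hypothetical and are not taken from tagger.pl. The sketch returns only the single most probable tag sequence, and it multiplies plain probabilities for readability; a real tagger would more likely work in log space.

```prolog
% Hypothetical toy model: tags, transition and emission probabilities.
tag(det).
tag(noun).
tag(verb).

trans(start, det, 0.6).  trans(start, noun, 0.3). trans(start, verb, 0.1).
trans(det, noun, 0.9).   trans(det, verb, 0.05).  trans(det, det, 0.05).
trans(noun, verb, 0.6).  trans(noun, noun, 0.3).  trans(noun, det, 0.1).
trans(verb, det, 0.5).   trans(verb, noun, 0.4).  trans(verb, verb, 0.1).

emit(det, the, 0.7).
emit(noun, book, 0.4).
emit(verb, book, 0.1).
emit(noun, fell, 0.05).
emit(verb, fell, 0.3).

% viterbi(+Words, -Tags, -Prob): most probable tag sequence and its probability.
viterbi([W | Ws], Tags, Prob) :-
    findall(P-[T],
            ( tag(T), trans(start, T, PT), emit(T, W, PE), P is PT * PE ),
            Col0),
    Col0 \== [],
    forward(Ws, Col0, Col),
    max_member(Prob-RevTags, Col),
    reverse(RevTags, Tags).

% forward(+Words, +Column, -FinalColumn): a column is a list of
% Prob-ReversedPath entries, one per tag that can end a path so far.
forward([], Col, Col).
forward([W | Ws], Col0, Col) :-
    findall(Best,
            ( tag(T), emit(T, W, PE), best_extension(T, PE, Col0, Best) ),
            Col1),
    Col1 \== [],
    forward(Ws, Col1, Col).

% best_extension(+Tag, +Emit, +Column, -Entry): keep only the best
% predecessor path for Tag (the max step of the Viterbi recurrence).
best_extension(T, PE, Col0, P-[T | Path]) :-
    findall(P1-Prev,
            ( member(P0-Prev, Col0), Prev = [T0 | _],
              trans(T0, T, PT), P1 is P0 * PT * PE ),
            Cands),
    max_member(P-Path, Cands).
```

With this toy model, `viterbi([the, book, fell], Tags, P)` binds `Tags = [det, noun, verb]`. Keeping one best path per tag in each column is what makes the search linear in sentence length rather than exponential in the number of tag combinations.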
Problem: Prolog project for finding the cosine similarity between two given words and finding the words most similar to a given word
Approach: The project uses cosine similarity to score how similar two words are. This is then extended to identify and rank all the words similar to a given word.
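A minimal sketch of the computation for SWI-Prolog is shown below. The vector/2 facts and their values are hypothetical, since the source does not say how word vectors are built, and the predicate names are assumptions for illustration.

```prolog
% Hypothetical word vectors (for example, co-occurrence counts).
% All vectors must have the same length.
vector(apple,  [3, 0, 1, 2]).
vector(orange, [2, 0, 1, 3]).
vector(car,    [0, 4, 2, 0]).

% dot(+Xs, +Ys, -D): dot product of two equal-length lists.
dot([], [], 0).
dot([X | Xs], [Y | Ys], D) :-
    dot(Xs, Ys, D0),
    D is D0 + X * Y.

% norm(+V, -N): Euclidean length of a vector.
norm(V, N) :-
    dot(V, V, D),
    N is sqrt(D).

% cosine(+W1, +W2, -Sim): dot(A, B) / (|A| * |B|).
cosine(W1, W2, Sim) :-
    vector(W1, V1),
    vector(W2, V2),
    dot(V1, V2, D),
    norm(V1, N1),
    norm(V2, N2),
    N1 > 0,
    N2 > 0,
    Sim is D / (N1 * N2).

% most_similar(+Word, -Ranked): all other words as Sim-Word pairs,
% most similar first.
most_similar(Word, Ranked) :-
    findall(Sim-Other,
            ( vector(Other, _), Other \== Word, cosine(Word, Other, Sim) ),
            Pairs),
    sort(0, @>=, Pairs, Ranked).
```

With these toy vectors, `most_similar(apple, Ranked)` lists orange before car. Ranking is just a descending sort of the pairwise scores, so the second task follows directly from the first.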
Problem: Develop a natural language interface for a fridge. The interface should be capable of parsing English sentences, evaluating them against a data model (a mini database), and responding to the user's query appropriately.
Approach: The project is divided into three submodules: Parsing, ModelChecker, and Response.
The Parsing module builds a first-order logic formula for the tokenized input sentence by applying a lexicon and the rules of English grammar. It uses an augmented version of a shift-reduce (SR) parser to parse the sentence.
ModelChecker evaluates the output of the parser against the model data (a Prolog database for the fridge). It identifies whether the sentence was declarative, interrogative, or a content question.
The Response module prints the result of ModelChecker according to the response type.
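The shift-reduce step of the Parsing module can be sketched as follows for SWI-Prolog. This is a plain shift-reduce parser that only builds a syntax tree. It does not include the project's augmentation with first-order logic semantics, and the lexicon, grammar rules, and predicate names are hypothetical.

```prolog
% Hypothetical lexicon and grammar. The project's own lexicon and rules
% are richer and also carry semantic information.
lex(the, det).
lex(a, det).
lex(fridge, n).
lex(milk, n).
lex(contains, tv).

% rule(LHS, RHS): RHS lists the daughter categories from left to right.
rule(np, [det, n]).
rule(vp, [tv, np]).
rule(s,  [np, vp]).

% parse(+Words, -Tree): succeeds when the input reduces to a single s tree.
parse(Words, Tree) :-
    sr(Words, [], [Tree]),
    Tree = t(s, _).

% sr(+Words, +Stack, -FinalStack): the stack holds trees
% t(Category, Children), most recent first.
sr([], Stack, Stack).
sr(Words, Stack0, Final) :-            % reduce the top of the stack
    reduce(Stack0, Stack1),
    sr(Words, Stack1, Final).
sr([W | Ws], Stack, Final) :-          % shift the next word
    lex(W, Cat),
    sr(Ws, [t(Cat, [W]) | Stack], Final).

% reduce(+Stack0, -Stack): replace the topmost daughters of a rule
% by a tree labelled with the rule's left-hand side.
reduce(Stack0, [t(LHS, Daughters) | Rest]) :-
    rule(LHS, RHS),
    length(RHS, N),
    length(Top, N),
    append(Top, Rest, Stack0),
    reverse(Top, Daughters),
    maplist(has_cat, Daughters, RHS).

has_cat(t(Cat, _), Cat).
```

For example, `parse([the, fridge, contains, a, milk], T)` binds `T` to a tree of the form `t(s, [t(np, ...), t(vp, ...)])`. The parser tries a reduce before a shift and relies on Prolog backtracking when a choice leads to a dead end.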