I decided to document this code, idk why but I am bored in class and just got access to my server, so I've got time.
This is based on my markov chain project. model_generator is from there, so you can see its documentation there.
The main part of model_user is also the same as in my markov chain project, but I've added an "evolution" algorithm to it. I wanted word generation to be based on probability sum (higher = better, this is the getbest function), but that caused an issue: it almost always returned single-word sentences, because they always had a prob of 1. I didn't want to make them impossible, so I needed a way to determine what probability single-word sentences should have so that they wouldn't be impossible, but also wouldn't show up too often. So there it is, TheMyOwnEvolutionKindaAlgorithmThatIsDumbButItWasFunToMadeSoILikeIt® (not an actual trademark).
There is a class called student (Hehe, class called student, it's funny). It stores 3 pieces of data: the student's setting, the student's score and the student's iteration. Setting is the probability to set for single-word sentences. Score is the calculated score of how close the student was to getting the wanted ratio of single-word sentences to multi-word ones. Iteration is just the iteration in which the student was created.
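A minimal sketch of what such a class could look like in Python, assuming a dataclass is acceptable. The field names follow the description above, but the types and defaults are guesses, not the actual code:

```python
from dataclasses import dataclass


@dataclass
class Student:
    # Probability assigned to single-word sentences (assumed to be a float in [0, 1]).
    setting: float
    # Filled in after testing; higher means closer to the wanted ratio.
    score: float = 0.0
    # Iteration in which this student was created.
    iteration: int = 0


first = Student(setting=0.25)
print(first)
```

The defaults mean a freshly created student starts with no score and belongs to iteration 0, which matches how the first generation is described.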
Here is how the algorithm runs, step by step:

1. The selected number of students is generated in a list. Since this is the first iteration, the students have random settings. Iteration is set to 0.
2. Function test_student is called for each student.
3. Function test_student measures the performance of each student by:
   - 3.1 Calling the getbest function with the student.
   - 3.2 Checking the length of the returned sentence.
   - 3.3 If the sentence is a single word, adding 1 to the int ones.
   - 3.4 Getting the ratio by dividing ones by the total number of iterations.
   - 3.5 Repeating steps 3.1 to 3.4 until it reaches the number of iterations specified in the testing_steps variable.
   - 3.6 Calculating the student's score by the formula:
     - if expected ratio is bigger than actual ratio: score = 100 / (expected - ratio)
     - if expected ratio is smaller than actual ratio: score = 100 / (ratio - expected)
     - if expected ratio is equal to actual ratio: score = 100 / (1 / testing_steps)
   - 3.7 Setting the student's score to the calculated score.
4. Compare students in the student list by score.
5. Set the student with the best score as best_student.
6. Next iteration.
7. The selected number of students is generated in a list. Half of the students are randomly generated, the other half is generated based on the best student's setting, randomly changed by the selected amount (step variable). The iteration value in each student is set to the iteration it's currently in.
8. Function test_student is called again for each student.
9. Function test_student again measures the performance of each student.
10. Compare students in the student list by score again.
11. Set the student with the best score as best_student. If there isn't a student with a better score than best_student, then best_student stays the same.
12. Check if best_student improved:
   - 12.1 Compare the current iteration with the best_student iteration.
   - 12.2 If the iterations are the same, save its values to a file and continue.
   - 12.3 If the iterations aren't the same, count for how long it hasn't improved. If it hasn't improved for 3 iterations, break out of the iterations loop.
13. Repeat steps 7 to 12 until the iterations end or best_student stops improving.
14. Save the final best_student to a file and return it.
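The steps above can be sketched roughly like this in Python. This is not the actual code: fake_getbest is a made-up stand-in for the real getbest, measure_student plays the role of test_student, and evolve, expected, population and patience are hypothetical names. Saving to a file is left out, and the two "bigger/smaller" score cases are folded into one abs() call.

```python
import random
from dataclasses import dataclass


@dataclass
class Student:
    setting: float      # probability assigned to single-word sentences
    score: float = 0.0
    iteration: int = 0


def fake_getbest(setting, rng):
    # Hypothetical stand-in: returns a one-word sentence with probability `setting`.
    return ["word"] if rng.random() < setting else ["two", "words"]


def score_from_ratio(ratio, expected, testing_steps):
    # Score formula from the steps above; abs() covers both unequal cases.
    diff = abs(expected - ratio)
    if diff == 0:
        diff = 1 / testing_steps
    return 100 / diff


def measure_student(student, expected, testing_steps, rng):
    # Count single-word sentences over testing_steps runs, then score the ratio.
    ones = 0
    for _ in range(testing_steps):
        if len(fake_getbest(student.setting, rng)) == 1:
            ones += 1
    student.score = score_from_ratio(ones / testing_steps, expected, testing_steps)


def evolve(expected=0.1, population=10, iterations=20,
           testing_steps=200, step=0.05, patience=3, seed=0):
    rng = random.Random(seed)
    best = None
    stale = 0  # iterations in a row without a new best_student
    for it in range(iterations):
        # First iteration: everyone is random. Later: half random, half near the best.
        n_random = population if best is None else population // 2
        students = [Student(rng.random(), iteration=it) for _ in range(n_random)]
        while len(students) < population:
            mutated = best.setting + rng.uniform(-step, step)
            students.append(Student(min(1.0, max(0.0, mutated)), iteration=it))
        for s in students:
            measure_student(s, expected, testing_steps, rng)
        challenger = max(students, key=lambda s: s.score)
        if best is None or challenger.score > best.score:
            best = challenger
        if best.iteration == it:
            stale = 0  # improved this iteration (the real code saves to a file here)
        else:
            stale += 1
            if stale >= patience:
                break
    return best


best_student = evolve()
print(best_student)
```

One thing the sketch makes visible: a student that hits the expected ratio exactly gets the same score as one that misses it by a single sentence, since both end up dividing by 1 / testing_steps.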
This code can be optimized big time. Am I willing to? Idk, maybe. It works for me and I don't see it really being used in any kind of production, so my only fuel to make it better would be self-determination. I might do it later.