Skip to content

Data Structure

TakenouchiTR edited this page May 4, 2022 · 4 revisions

Our dictionary of words is stored using three separate collections: an unordered_set and two vectors.

Original Implementation

Initially, we only used a single unordered_set as our data structure. We could randomly select words by generating a pseudo-index, then iterating through the set that many times. Since there will always be more operations checking if a word is valid than operations to get a randomly selected word, we preferred this over using a linear data structure like an array or vector.

Current Implementation

We noticed that application stop for seconds when selecting a word sometimes, therefore we created two vectors to store the words with instant random access: one that holds every word, and one that holds only words with unique letters. This allowed us to quickly generate a number for the index, then return the word stored at that index.

Other considerations

We did consider alternatives for faster random access on the set, but the implementations had uneven distribution of words. One example is to randomly generate a string of five letters, then try to add it to the set. If the word already exists, simply return the word. If it doesn't already exist, then get its iterator, go to the next element (or the first if the word is the last element), and return that instead. We felt that it was important to make sure that the distribution of words was even, therefore we decided to use extra collections over this method.

Clone this wiki locally