Our dataset now contains much of the course-related content from the LSE calendar page. However, if the database lacks some information on a particular course, the system answers with information from a different course instead.
Potential Solution
Collect all the course-related data from the dataset and run a network analysis over it (using NetworkX, as recommended by Jon).
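As a minimal sketch of what such a network analysis could look like, the snippet below links documents to the programmes and courses they mention and then looks for communities. The document IDs, course codes, and the choice of greedy modularity as the community algorithm are all assumptions made for illustration; the real dataset and the analysis eventually chosen may differ.

```python
import networkx as nx
from networkx.algorithms import community

# Hypothetical records: each document ID maps to the programmes/courses it mentions.
documents = {
    "doc_bsc_econ_overview": {"BSc Economics", "EC100", "MA100"},
    "doc_ec100_guide": {"BSc Economics", "EC100"},
    "doc_ma100_guide": {"BSc Economics", "MA100"},
    "doc_bsc_history_overview": {"BSc History", "HY113"},
    "doc_hy113_guide": {"BSc History", "HY113"},
}


def build_graph(docs):
    """Build an undirected graph linking each document to the entities it mentions."""
    graph = nx.Graph()
    for doc_id, entities in docs.items():
        graph.add_node(doc_id, kind="document")
        for entity in entities:
            graph.add_node(entity, kind="entity")
            graph.add_edge(doc_id, entity)
    return graph


def community_of(communities, node):
    """Return the community (a set of nodes) that contains `node`."""
    for members in communities:
        if node in members:
            return set(members)
    return set()


graph = build_graph(documents)
# Assumption: greedy modularity maximisation is one reasonable community detector here.
communities = community.greedy_modularity_communities(graph)

# Documents that share a community with a course are candidates for answering about it.
ec100_docs = sorted(
    n for n in community_of(communities, "EC100")
    if graph.nodes[n]["kind"] == "document"
)
print(ec100_docs)
```

In this toy graph the economics and history material are not connected, so they cannot end up in the same community. In the real data the programmes are likely to share courses, so the community boundaries would need checking by hand.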
Hypothetical Pipeline (with vector search).
- Use function calling to categorise the user query as being course-related.
- Focus the pipeline on the communities and similarities found in the course-related content, both among the documents themselves and between the documents and the user query.
- For example, for the BSc Economics programme, the relevant data should include all courses available on this programme (as well as other relevant information about it), and nothing else.
- Point the vector search only at the documents that fit these criteria.
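The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: the corpus, the programme tags, and the three-dimensional "embeddings" are made up, and `classify_query` is a simple keyword stand-in for the function-calling step rather than a real LLM call.

```python
import math

# Hypothetical corpus: each document carries a programme tag and a toy embedding.
DOCUMENTS = [
    {"id": "econ_ec100", "programme": "BSc Economics", "vector": [0.9, 0.1, 0.0]},
    {"id": "econ_ma100", "programme": "BSc Economics", "vector": [0.2, 0.9, 0.1]},
    {"id": "hist_hy113", "programme": "BSc History", "vector": [0.95, 0.05, 0.0]},
]


def classify_query(query):
    """Keyword stand-in for function calling: return the programme named in the query, or None."""
    lowered = query.lower()
    for programme in sorted({doc["programme"] for doc in DOCUMENTS}):
        if programme.lower() in lowered:
            return programme
    return None


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query, query_vector, top_k=2):
    """Restrict the candidates to the matched programme, then rank them by similarity."""
    programme = classify_query(query)
    candidates = [
        doc for doc in DOCUMENTS
        if programme is None or doc["programme"] == programme
    ]
    ranked = sorted(
        candidates,
        key=lambda doc: cosine(query_vector, doc["vector"]),
        reverse=True,
    )
    return [doc["id"] for doc in ranked[:top_k]]


# With the programme filter, the closest History document cannot leak into the answer.
print(search("Which courses are on the BSc Economics programme?", [1.0, 0.0, 0.0]))
# Without a recognised programme, the search falls back to the whole corpus.
print(search("Tell me about first-year courses", [1.0, 0.0, 0.0]))
```

The filter runs before ranking on purpose. In this toy data the History document is the nearest neighbour of the query vector, which mirrors the failure described above, where a question about one course is answered with information from another.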
Note: A preliminary experiment uses a filtered dataset of course-related content, which is stored on SharePoint because it is too large to upload to GitHub.