
RAG & Text Embedding #84

Draft
mhughes2k wants to merge 21 commits into bycs-lp:main from mhughes2k:RAG

Conversation

@mhughes2k

Following conversations with Peter at Moot Global 25, I started looking at an approach to implement the key steps for RAG (embedding and retrieval) as "purposes" that would be available.

So this is some really early precursor work, trying to understand the bycs-lp model and concepts.

At the moment this is "scaffolding": the back-end parts are not yet implemented and will require much more work. But by following the manager's models, I have also been able to update my "reference" activity chat module (https://github.com/mhughes2k/moodle-mod_xaichat/tree/ai_manager_version) to use the AI Manager as its AI provider, and to call these two new purposes to perform RAG.

The "RAG" purpose at the moment simply returns exactly one "document" (which lies that "yellow is a shade of blue"), but it's enough to see that the orchestration of these purposes should work.

This seems to be about 9 lines of code (once migrated to AI manager):

// $data is a stdClass object with the data from an mform.

$chatmanager = new \local_ai_manager\manager('chat');
$embeddingmanager = new \local_ai_manager\manager('embedding');
$ragmanager = new \local_ai_manager\manager('rag');
[...]
$embeddingrequest = $embeddingmanager->perform_request($data->userprompt, 'local_xaichat', $modulecontext->id);
$embedding = $embeddingrequest->get_content();

/*
 * Docs are returned as a simple string:
 * Title: Document Title
 * URL: https://someurl
 * document / fragment content
 *
 * The RAG purpose would basically implement dynamic access checking on the resulting documents from the underlying
 * store *before* returning them back out to a developer/user.
 */
$ragrequest = $ragmanager->perform_request($embedding, 'local_xaichat', $modulecontext->id);
$docs = $ragrequest->get_content();
$prompt = $data->userprompt;
if (!empty($docs)) {
    debugging("Got RAG content returned: " . $docs);
    $prompt = "Use the following information\n\n{$docs} to answer: \n\n{$prompt}";
}
[...]
$response = $chatmanager->perform_request(
    $prompt,
    'mod_xaichat',
    $modulecontext->id
);

$result = $response->get_content();

// Do stuff with the AI-synthesised response.

Anyway I thought I'd open this draft PR early to allow for a conversation about this.

One issue I've already discovered, relating to text embedding, is that OpenAI has a different endpoint URL for text embedding vs the endpoint that is currently encoded into the chatgpt tool, and so I'm not sure how to approach this aspect.

I'd originally thought that this could simply leverage the existing "tools" plugins, but does this mean each tool needs to support a slightly different option/configuration for this extra action it can perform?

I also don't know what this would look like across all the AI tools, or whether they each approach the text-embedding process differently.

My next step is to start looking at what back-end engineering is necessary to implement text embedding, and at the more complicated aspects of indexing, processing and storing the vector data from Moodle content. (On this last one, it occurred to me that the text embedding for the user prompt and for the stored vectors needs to be the same, so having them as separate purposes could create conflicts through the configurations admins set up, i.e. if they use different embedding models.)
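That embedding-model mismatch could be guarded against mechanically. A minimal Python sketch of the idea (all names here are hypothetical; in the real plugin such a check would live in the purposes' PHP code): each stored vector is stamped with the model that produced it, and a query embedded with a different model is rejected rather than silently returning degraded results.

```python
class VectorStore:
    """Toy store that records which embedding model produced its vectors."""

    def __init__(self, embedding_model):
        self.embedding_model = embedding_model
        self.points = []

    def store(self, vector, payload, model):
        # Refuse to mix vectors from different embedding models in one collection.
        if model != self.embedding_model:
            raise ValueError(f"collection expects {self.embedding_model!r}, got {model!r}")
        self.points.append((vector, payload))

    def search(self, query_vector, model):
        # The query must be embedded with the same model as the stored vectors.
        if model != self.embedding_model:
            raise ValueError(f"query embedded with {model!r}, collection uses {self.embedding_model!r}")
        return [payload for _, payload in self.points]

store = VectorStore("text-embedding-3-small")
store.store([0.1, 0.2], {"title": "Test Document"}, "text-embedding-3-small")
```

In practice the model identifier could come from the aitool configuration, so a mismatch would surface as a configuration error at setup time.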

Anyway, thanks to Peter for his time. I hope this is useful/interesting, and please do redirect me if I'm drifting away from the intended approach or misunderstanding anything (also, the dev docs branch was really useful!).

Regards

Michael

@PhMemmel
Member

Hi @mhughes2k ,

thank you very much for this. I will try to have a look and pick up on your questions hopefully next week!

@PhMemmel
Member

Also, I'm not sure if you noticed #76.

I started to create some (dev) docs. Maybe this already helps as well.

@mhughes2k
Author

Also, I'm not sure if you noticed #76.

I started to create some (dev) docs. Maybe this already helps as well.

Yes, those were the docs I meant! Peter pointed me at them; they're really helpful. If there's anything I can contribute to them too, or you want an external person to look through them, I'm happy to do so!

@PhMemmel
Member

Oh yes, sorry, I totally missed the line you were mentioning in your post!

@mhughes2k
Author

After doing some more work I fudged getting text-embedding to work with the OpenAI plugin.
This had a few quirks and "hacks", since the endpoint part of the connector objects presumes that the endpoint URL is the same for every operation, but it's fundamentally different for text embedding vs chat completion...

At the same time, I'm not sure that having a second connector implementation for each AI provider, simply so it can call a different URL, is a good idea... so I've done some internal logic switching to offer an "endpoint" choice.

I think the limitation is due to the concept (correct me if I'm wrong) that a connector offers different models, but they all have to go to the same endpoint...

So this allowed me to create OpenAI provider A with a purpose for chat completion, and provider B with a purpose for text embedding. As far as I could see, getting one provider to serve both purposes would mean the connector has to know more about the purpose than I'd like.
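To make the endpoint-switching idea concrete, here is a sketch (in Python rather than the plugin's PHP, purely for illustration) of one connector serving two purposes by switching the URL. The two OpenAI endpoint paths are the real API ones; the function and table names are hypothetical.

```python
# One hypothetical connector serving two purposes by switching the URL,
# instead of duplicating the whole connector per purpose.
OPENAI_ENDPOINTS = {
    "chat": "https://api.openai.com/v1/chat/completions",
    "embedding": "https://api.openai.com/v1/embeddings",
}

def endpoint_for(purpose):
    """Return the endpoint URL for a purpose, or fail loudly for unknown ones."""
    try:
        return OPENAI_ENDPOINTS[purpose]
    except KeyError:
        raise ValueError(f"connector does not support purpose {purpose!r}") from None
```

The trade-off is exactly the one discussed above: the connector now knows about purposes, but only through one small lookup rather than scattered if/else branches.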

Making these changes, I've got a rudimentary function that does a perform_request() call against an "embedding" purpose and gets back a string representation of the vector (you can see this working in https://github.com/mhughes2k/moodle-mod_xaichat/blob/ai_manager_version/view.php, around about L123).

RAG, however, became more problematic, because it isn't actually an LLM interaction at all... it's a vector DB search. It just needs to be a $somevectordbplugin->search($embedding); call, but with its own ecosystem of vector DBs, indexing etc. (which is basically just Moodle Global Search with some bells and whistles)...

However, integrating the text-embedding function and the vector DB search into the same purpose (text embedding could of course still be kept as a standalone purpose) does now actually seem to make more sense than having a "just" embedding purpose and a "just" vector search purpose...

(hope this makes sense)

@mhughes2k
Author

OK I have a whole new approach to try out :-) will close this and re-raise once I've got my head around it.

@mhughes2k
Author

I have done a complete rework of this. I have implemented the qdrant vector DB as a backend, exposed as a "tool" class, on the basis that it is effectively an HTTP endpoint.

The qdrant aitool can be added via the AI tools menu:
[screenshot: AI tools menu]
With the following settings form:
[screenshot: qdrant settings form]

For testing purposes I created a qdrant instance using Docker and access it via the host.docker.internal name (which required relaxing the endpoint validation slightly).

Not entirely sure why, but I ended up creating a qdrant collection with (multiple) named vectors using:

PUT http://localhost:6333/collections/moodle
{
    "vectors": {
        "contentvector": {
            "size": 1536,
            "distance": "Cosine"
        }
    }
}
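For reference, "Cosine" in that collection config means qdrant scores points by cosine similarity (higher is closer). A plain-Python sketch of the definition, just to pin down what the 1536-dimensional "contentvector" entries will be compared with:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two vectors; 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # identical direction
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal
```

Since cosine similarity ignores magnitude, only the direction of the embedding matters for retrieval.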

In addition, I have defined both "(text-)embedding" and "rag" purposes:
[screenshot: purposes configuration]
In this case I've configured text embedding to use an OpenAI back end, and I extended that connector to accommodate a second endpoint:
[screenshot: extended OpenAI connector settings]

With all three of these in place, I have implemented a very simple test script in the "rag" purpose.

This has two functions, "store" and "retrieve" (pass these values via ?action=XXX):

Store

This simply places a "document" into the vector db:

    $storeprompt = json_encode([
        'action' => 'store',
        'content' => 'This is a test document. It is only a test document.',
        'metadata' => [
            'title' => 'Test Document',
            'author' => 'Moodle AI Manager',
            'source' => 'Generated',
        ],
    ]);

The perform_request() method will use the chatgpt connector (in this case) to perform a "rag" purpose with a "store" action.

This is connected to the qdrant aitool, which in turn calls the "embedding" purpose to get the "content" vector and then stores the document and vector in qdrant.

Retrieve

The Retrieve action is simply a test that the same document is found again.
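The store/retrieve round trip can be mimicked end to end with an in-memory stand-in. Everything below is a toy: fake_embed replaces the real "embedding" purpose, the DB list replaces qdrant, and only the JSON "prompt" shape follows the store snippet above.

```python
import json
import math

DB = []  # toy replacement for the qdrant collection

def fake_embed(text):
    # Stand-in for the "embedding" purpose: a deterministic toy vector.
    return [float(len(text) % 7 + 1), float(sum(map(ord, text)) % 11 + 1)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def handle(prompt_json):
    """Toy version of the rag purpose's store/retrieve dispatch."""
    request = json.loads(prompt_json)
    if request["action"] == "store":
        DB.append({"vector": fake_embed(request["content"]), "payload": request})
        return {"stored": len(DB)}
    if request["action"] == "retrieve":
        queryvector = fake_embed(request["query"])
        best = max(DB, key=lambda point: cosine(point["vector"], queryvector))
        return {"document": best["payload"]["content"]}
    raise ValueError("unknown action")

handle(json.dumps({
    "action": "store",
    "content": "This is a test document. It is only a test document.",
    "metadata": {"title": "Test Document"},
}))
result = handle(json.dumps({"action": "retrieve", "query": "test document"}))
```

The real flow is the same shape, just with perform_request() in front and qdrant behind.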

Next steps

  1. Move the "options" for the request out of the prompt and into the request_options. This should make the "prompt" parameter clearer. In the case of a "store" action it should just be "document" or some reference to a "document", and for retrieval it should just be the query from the user.
  2. Clean up debugging code.
  3. Model is a redundant setting on the aitool for qdrant / vector DBs, so this should probably get suppressed.
  4. Azure settings do odd things with endpoints (probably because aitools "expect" only one endpoint, but I didn't want to double the number of connectors simply to add an extra operation), so I had to disable the "freeze" on the endpoint to allow the selector to work...

@PhMemmel
Member

Hi @mhughes2k ,

thank you so much for all your work. I had a quick look and it looks really promising to me. Regarding your next steps:

  1. I agree. Besides that, if the purpose/call does not need a prompt, it's totally OK for the prompt to be empty and for all request data to come via the request options. The $prompt is really intended for direct user input. If a call has no user input to pass, only technical information (which should go into the request options), an empty prompt is totally fine. Everything else the connector should take care of.
  2. Always great ;-)
  3. Yes, but you will have to make sure that a model is specified in some way, since it is used for logging etc. For example, aitool_option_azure also defines a hardcoded string and removes the model option from the mform, because the model is configured in the Azure backend.
  4. Yes, I saw the issue about different endpoints. Let me think about how this can be solved easily.

Also, we still have to think about how to make other connectors work. I'm still torn on whether we should use a separate connector plugin, which makes implementation a lot easier and reduces the need for "if-else" logic. If you look at the current structure, I basically could also have created a "unified OpenAI connector" able to serve text, image and speech generation. But I decided to split it up into three connectors: aitool_chatgpt, aitool_dalle and aitool_openaitts.

Would it be helpful to make a different connector that only allows embedding? For this I've already used two basic techniques. First: inheriting from chatgpt and adapting just what I need (overriding the endpoint, for example); in this case it should also be easy to handle the Azure endpoint issue. Second: creating a different connector object and passing the calls through to it, like I did in the telli connector, which is basically a wrapper around the chatgpt and dalle connectors.
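The first option can be as small as overriding a single method. Schematically (a Python stand-in with illustrative class names, not the plugin's actual API):

```python
class ChatGPTConnector:
    """Stand-in for aitool_chatgpt: all shared request logic lives here."""

    def get_endpoint(self):
        return "https://api.openai.com/v1/chat/completions"

class OpenAIEmbeddingConnector(ChatGPTConnector):
    """Inherits the whole connector, overrides only the URL it talks to."""

    def get_endpoint(self):
        return "https://api.openai.com/v1/embeddings"
```

The payload mapping for embeddings would be overridden the same way, leaving everything else (auth, retries, logging) shared with the parent.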

Maybe you're open to discussing some of the points and could show/explain some more things to me. I'm going to reach out to you via Matrix.

Looking forward to it! I'm very excited about your code! :)

@mhughes2k
Author

Absolutely open to discussing all the points :-) so please do grab me via Matrix.

I did wrestle with the idea of a wholly separate connector, since text embedding is a different "mode"; that probably makes the most sense, but I did wonder how much code duplication ends up being generated for what is really just a different endpoint and payload shape. However, for text embedding, moving the code out into an aitool_openaite ("OpenAI text embedding") plugin makes sense. It deals with (3) in the best way, and allows the same approach to be used for OpenAI-on-Azure text embedding to provide the extra configuration.

However, for the qdrant connector, which has lots of different suffixes appended to the base endpoint, I implemented some switching logic to make endpoint generation more dynamic, as I didn't think a connector for each endpoint it needs to call made sense.
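That switching logic might reduce to a small action-to-suffix table. A hypothetical Python sketch (the path suffixes follow the qdrant HTTP API; the function and table names are illustrative, not the connector's actual code):

```python
# Hypothetical mapping from the qdrant aitool's internal actions to the REST
# path suffixes appended to the base endpoint (paths as in the qdrant HTTP API).
QDRANT_SUFFIXES = {
    "create_collection": "/collections/{collection}",               # PUT
    "store":             "/collections/{collection}/points",        # PUT (upsert)
    "retrieve":          "/collections/{collection}/points/search", # POST
}

def build_url(base, action, collection):
    """Join the configured base endpoint with the suffix for one action."""
    return base.rstrip("/") + QDRANT_SUFFIXES[action].format(collection=collection)

print(build_url("http://localhost:6333", "retrieve", "moodle"))
# → http://localhost:6333/collections/moodle/points/search
```

One table per backend keeps the dynamic endpoint generation in a single place instead of scattering URL strings through the connector.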

The option here would be to define a new sub-plugin type, like "aidb_" (I had a different branch that played with this), but after I tried it, and duplicated lots of plumbing code simply to manage it, I found that just having it as an "aitool_" plugin worked quite well and was simpler.

Maybe a softer categorisation of "aitools", distinguishing between "AI provider connectors" and "vector DB backends", is all that's needed; it would also eliminate an extra UI.

Other next steps that I forgot:

  • I'm going to think about content indexing. I'd have liked to just re-use the Moodle Global Search indexer, but I think it's too closely coupled to its own back-end DB engines, so I was going to replicate the relevant parts and put them under the "control" of the "rag" purpose. This indexer would then work in tandem with a configured aitool_* plugin that supports the rag purpose to get the "documents" into the backend.

@mhughes2k
Author

I've put the work I've been doing specifically on "indexing" into a separate branch for the moment: https://github.com/mhughes2k/moodle-local_ai_manager/tree/RAG_indexer

@PM84
Contributor

PM84 commented Dec 16, 2025

Hello Michael,
I hope you are well!
What is the current status of the RAG development?
I am currently planning for the first half of 2026 and am considering whether, and if so how many, resources we need to allocate for this on our side. :-)
Best regards
Peter

@mhughes2k
Author

mhughes2k commented Dec 17, 2025 via email
