Conversation
Hi @mhughes2k, thank you very much for this. I will try to have a look and pick up on your questions hopefully next week!
Also, I'm not sure if you realized that in #76 I started to create some (dev) docs. Maybe this already helps.
Yes, those were the docs I was meaning! Peter pointed me at them, it's really helpful. If there's anything I can contribute to those too, or if you want an external person to look through them, I'm happy to do so!
oh yes, sorry, I totally missed the line you were mentioning in your post!
After doing some more work, I fudged getting text-embedding to work with the OpenAI plugin. At the same time, I'm not sure that having a second connector implementation for each AI provider simply to call a different URL is a good idea, so I've done some internal logic switching to offer an "end point" choice. I think the limitation comes from the concept (correct me if I'm wrong) that a connector offers different models, but they all have to go to the same endpoint. This allowed me to create an OpenAI provider A with the purpose of doing chat completion and a provider B with the purpose of doing text-embedding. As far as I could see, getting one provider to do both purposes would mean the connector has to know about the purpose more than I'd like.

Having made these changes, I've got a rudimentary function to do RAG. However, this became more problematic, because the retrieval isn't actually an LLM interaction at all; it's a vector DB search. Still, integrating the text-embedding function and the vector DB search into the same purpose (and of course text-embedding could also be left as a standalone purpose) now actually seems to make more sense than having a "just" embedding and a "just" vector search purpose... (hope this makes sense)
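To illustrate the point above that retrieval is "just" a vector DB search rather than an LLM interaction, here is a minimal sketch. This is Python for brevity (the plugin itself is PHP), and every name here is mine, not the plugin's: a query embedding is ranked against stored embeddings by cosine similarity, with no model call involved.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vector, store, top_k=3):
    """Rank stored (document, vector) pairs by similarity to the query.

    `store` is a list of (document, embedding) tuples; a real backend
    like Qdrant would do this ranking server-side.
    """
    scored = [(cosine_similarity(query_vector, vec), doc) for doc, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]
```

The only "AI" involvement is producing the embeddings in the first place, which is why bundling embedding and search into one purpose, as suggested above, keeps the pair consistent.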
OK, I have a whole new approach to try out :-) I'll close this and re-raise once I've got my head around it.
Hi @mhughes2k, thank you so much for all your work. I had a quick look and it looks really promising to me. Regarding your next steps:
Also, we still have to think about how to make other connectors work. I'm still torn on whether we should use a separate connector plugin, which makes the implementation a lot easier and reduces the need for "if-else" branching. If you look at the current structure, I could basically also have created a "unified OpenAI connector" able to serve text, image and speech generation. But I decided to split it up into three connectors: aitool_chatgpt, aitool_dalle and aitool_openaitts.

Would it be helpful to make a different connector that only allows embedding? In this case I've already used two basic techniques. The first is inheriting from chatgpt and just adapting what I need (overwriting the endpoint, for example); this way it should also be easy to handle the Azure endpoint thing. The second option would be to create a different connector object and pass the calls to it, like I did in the telli connector, which is basically a wrapper around the chatgpt and dalle connectors.

Maybe you're open to discussing some of the points and could show/explain some more things to me. I'm gonna reach out to you via Matrix. Looking forward to it! I'm very excited about your code! :)
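The two techniques described above, inheriting from chatgpt and overriding what differs, versus a telli-style wrapper that delegates to existing connectors, can be sketched side by side. This is a Python transliteration with invented class names, not the plugin's actual PHP API:

```python
class ChatGptConnector:
    """Stand-in for the existing chatgpt connector (names are illustrative)."""

    def endpoint(self):
        return "https://api.openai.com/v1/chat/completions"

# Option 1: inherit and override only what differs (here, the endpoint).
class EmbeddingConnector(ChatGptConnector):
    def endpoint(self):
        return "https://api.openai.com/v1/embeddings"

# Option 2: a wrapper that delegates to existing connectors, the way the
# telli connector wraps chatgpt and dalle.
class WrapperConnector:
    def __init__(self):
        self.chat = ChatGptConnector()
        self.embed = EmbeddingConnector()

    def endpoint(self, purpose):
        # The "if-else" the split-connector design avoids lives here instead.
        if purpose == "embedding":
            return self.embed.endpoint()
        return self.chat.endpoint()
```

Inheritance keeps each connector single-purpose; the wrapper concentrates the purpose-switching in one place, at the cost of the connector knowing about purposes.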
Absolutely open to discussing all the points :-) so please do grab me via Matrix.

I did wrestle with the idea of a whole separate connector. Since text-embedding is a different "mode", that probably makes the most sense, but I did wonder how much duplicated code ends up being generated for what is really just a different endpoint and payload format.

The option here would be to define (and I had a different branch that played with this) a new sub-plugin type, like "aidb_", but after I tried it (and duplicated lots of plumbing code simply to manage it), I found that just having it as an "aitool_" plugin worked quite well and was simpler. Maybe a "softer" categorisation of "aitools" to distinguish between "AI Provider Connectors" and "Vector DB Backends" is all that's needed, and it would eliminate an extra UI. Other next steps that I forgot:
I've put the work I've been doing specifically on "indexing" into a separate branch for the moment: https://github.com/mhughes2k/moodle-local_ai_manager/tree/RAG_indexer
Hi Peter,
I’m hoping to pick some work up on this again over the holidays and into the new year.
At present there is a basic indexer and a Qdrant vector store implementation. The discussion around the implementation centred on how to extend to other backends. So far I have created a generic Tool (abstract class) to represent vector DBs, in much the same way that multiple AI providers are implemented, so that the interfaces are the same or at least similar. The Qdrant back end is a "concrete" implementation of this abstract class, which the admin is able to instantiate through the current UI.
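The abstract-class arrangement described above might look roughly like this sketch (Python for brevity; all names are illustrative and not taken from the RAG_indexer branch):

```python
from abc import ABC, abstractmethod

class VectorStore(ABC):
    """Common interface so vector DB backends are interchangeable,
    analogous to how multiple AI providers share one connector shape."""

    @abstractmethod
    def upsert(self, doc_id, vector, payload):
        """Store or replace one embedded document."""

    @abstractmethod
    def search(self, vector, top_k=5):
        """Return the payloads of the top_k nearest stored vectors."""

class InMemoryStore(VectorStore):
    """Stand-in concrete backend; a Qdrant implementation would issue
    HTTP calls to its collections API instead of using a dict."""

    def __init__(self):
        self.rows = {}

    def upsert(self, doc_id, vector, payload):
        self.rows[doc_id] = (vector, payload)

    def search(self, vector, top_k=5):
        def score(item):
            vec, _ = item[1]
            return sum(a * b for a, b in zip(vector, vec))
        ranked = sorted(self.rows.items(), key=score, reverse=True)
        return [payload for _, (_, payload) in ranked[:top_k]]
```

Any code that indexes or retrieves only ever sees the `VectorStore` interface, so swapping Qdrant for another backend is an admin-side configuration change.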
M
From: Peter Mayer
Sent: Tuesday, December 16, 2025
Subject: Re: [bycs-lp/moodle-local_ai_manager] RAG & Text Embedding (PR #84)
Hello Michael,
I hope you are well!
What is the current status of the RAG development?
I am currently planning for the first half of 2026 and am considering whether and, if so, how many resources we need to allocate for this on our side. :-)
Best regards
Peter




Following conversations with Peter at Moot Global 25, I started looking at an approach to implement the key steps for RAG (embedding and retrieval) as "purposes" that would be available.
So this is some really early precursor work, trying to understand the bycs-lp model and concepts.
At the moment this is "scaffolding": the back-end parts are not yet implemented and will require much more work. But by following the manager's models, I have also been able to update my "reference" activity chat module (https://github.com/mhughes2k/moodle-mod_xaichat/tree/ai_manager_version) to use the AI Manager as its AI provider, and also to call these two new purposes to perform RAG.
The "RAG" purpose at the moment simply returns exactly 1 "document" (that lies that "yellow is a shade of blue"), but it's enough to see that the orchestration of these purposes should work.
This seems to be about 9 lines of code (once migrated to AI manager):
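The snippet referenced above isn't included in this thread. Purely as a hedged illustration (Python rather than Moodle PHP, with placeholder function names standing in for the manager's API), the orchestration of the two new purposes plus a normal chat completion might look like:

```python
def answer_with_rag(question, embed, search, chat):
    """Orchestrate RAG: embed the question, retrieve context, then ask the LLM.

    `embed`, `search` and `chat` are placeholders for the three purposes;
    they are invented here, not the AI Manager's real call signatures.
    """
    query_vector = embed(question)      # "text-embedding" purpose
    documents = search(query_vector)    # "RAG" (vector search) purpose
    context = "\n".join(documents)
    prompt = f"Answer using this context:\n{context}\n\nQuestion: {question}"
    return chat(prompt)                 # existing chat-completion purpose
```

The point the description makes holds either way: the caller's orchestration stays tiny once the purposes exist, regardless of which backends sit behind them.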
Anyway I thought I'd open this draft PR early to allow for a conversation about this.
One issue I've already discovered, relating to text-embedding, is that OpenAI has a different endpoint URL for text embedding vs the endpoint that is currently encoded into the chatgpt tool (at around
moodle-local_ai_manager/tools/chatgpt/classes/instance.php
Line 62 in 4d769d6
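For context, OpenAI's public API exposes chat completions and embeddings at different paths under the same base URL, which is why a single hard-coded endpoint can't serve both. A minimal sketch (the helper function is illustrative, not part of the plugin; no network I/O is performed):

```python
import json

OPENAI_BASE = "https://api.openai.com/v1"
CHAT_ENDPOINT = f"{OPENAI_BASE}/chat/completions"   # what chatgpt encodes today
EMBEDDINGS_ENDPOINT = f"{OPENAI_BASE}/embeddings"   # what text-embedding needs

def build_embedding_request(text, model="text-embedding-3-small"):
    """Return (url, body) for an OpenAI embeddings call, per its public API."""
    body = json.dumps({"model": model, "input": text})
    return EMBEDDINGS_ENDPOINT, body
```

The payload shape also differs from chat completions (`input` text rather than a `messages` array), so it's a different endpoint and a different request body, which is what makes "one connector, one endpoint" limiting here.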
I'd originally thought that this could simply leverage the existing "tools" plugins, but does this mean each tool needs to support a slightly different option/configuration for this extra action it can do?
Also, I don't know what this would look like across all the AI tools, or whether they each approach the text-embedding process differently.
My next step is to start looking at what back-end engineering is necessary to implement text-embedding, and at the more complicated aspects relating to indexing, processing and storing the vector data from Moodle content. (On this last one, it did occur to me that the text-embedding for the user prompt and for the stored vectors needs to be the same, so having them as separate purposes could create conflicts through configurations the admins set up, i.e. they use different embedding models.)
Anyway, thanks to Peter for his time. I hope this is useful/interesting, and please do redirect me if I'm straying from the approach or misunderstanding anything (also, the dev docs branch was really useful!).
Regards
Michael